Commit Graph

33509 Commits

Author SHA1 Message Date
Dr. David Alan Gilbert
8d00cfd5e6 drm/amdgpu: Remove ppatomfwctrl deadcode
pp_atomfwctrl_get_pp_assign_pin() and pp_atomfwctrl_get_pp_assign_pin()
were added in 2017 by
commit 0d2c7569e1 ("drm/amdgpu: add new atomfirmware based helpers for
powerplay")
but have remained unused.

Remove them, and the helper functions they used.

Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-05 10:43:11 -05:00
Mario Limonciello
36d63ce5db drm/amd/display: Add a new dcdebugmask to allow turning off brightness curve
Upgrading the kernel may cause some systems that were previously not using
a firmware specified brightness curve to use one.

In the event of problems with this curve (for example an interpolation
error) add a new dcdebugmask value that can be used to turn it off.  Also
add an info message to show that custom brightness curves are currently in
use.

Reviewed-by: Alex Hung <alex.hung@amd.com>
Link: https://lore.kernel.org/r/20250228185145.186319-6-mario.limonciello@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-05 10:42:57 -05:00
Mario Limonciello
578df37b1b drm/amd/display: Add support for custom brightness curve
Some systems specify in the firmware a brightness curve that better
reflects the characteristics of the panel used. This is done in the
form of data points and matching luminance percentage.

When converting a userspace requested brightness value use that curve
to convert to a firmware intended brightness value.

Reviewed-by: Alex Hung <alex.hung@amd.com>
Link: https://lore.kernel.org/r/20250228185145.186319-5-mario.limonciello@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-05 10:42:51 -05:00
Mario Limonciello
f25c0f0d4f drm/amd/display: Avoid operating on copies of backlight caps
Making a copy of the backlight caps structure between uses is unnecessary.
Refer to pointers to the same structure when using it.

Reviewed-by: Alex Hung <alex.hung@amd.com>
Link: https://lore.kernel.org/r/20250228185145.186319-4-mario.limonciello@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-05 10:42:46 -05:00
Mario Limonciello
f729e63743 drm/amd: Pass luminance data to amdgpu_dm_backlight_caps
The ATIF method on some systems will provide a backlight curve. Pass
this curve into amdgpu_dm add it to the structures.

Reviewed-by: Alex Hung <alex.hung@amd.com>
Link: https://lore.kernel.org/r/20250228185145.186319-3-mario.limonciello@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-05 10:42:42 -05:00
Mario Limonciello
1c79b5fcdf drm/amd: Copy entire structure in amdgpu_acpi_get_backlight_caps()
As new members are introduced to the structure copying the entire
structure will help avoid missing them.

Reviewed-by: Alex Hung <alex.hung@amd.com>
Link: https://lore.kernel.org/r/20250228185145.186319-2-mario.limonciello@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-05 10:42:37 -05:00
Taimur Hassan
43e88e20d3 drm/amd/display: Promote DAL to 3.2.323
This version brings along following fixes:
- Various cleanups to amdgpu dm
- Add DP tunneling IRQ handler
- Fix display corruption for dcn35
- Fix dmcub reset problem
- Adjust BW determination for PCON
- DIO encoder refactor
- Fix performance with SubVP under gaming

Acked-by: Tom Chung <chiahsuan.chung@amd.com>
Signed-off-by: Taimur Hassan <Syed.Hassan@amd.com>
Signed-off-by: Wayne Lin <wayne.lin@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-05 10:42:34 -05:00
Mario Limonciello
130d8324ea drm/amd/display: Use drm_err() for handle_hpd_irq_helper()
drm_err() will show which device has the error.

Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Wayne Lin <wayne.lin@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-05 10:42:29 -05:00
Mario Limonciello
f123fda197 drm/amd/display: Use scoped guards for handle_hpd_irq_helper()
Scoped guards will release the mutex when they go out of scope.

Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Wayne Lin <wayne.lin@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-05 10:42:22 -05:00
Mario Limonciello
981a47429e drm/amd/display: Use _free() macro for amdgpu_dm_update_connector_after_detect()
By using a _free() macro multiple duplicated snippets of code to free
the sink can be dropped. The sink will be released when leaving scope.

Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Wayne Lin <wayne.lin@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-05 10:42:18 -05:00
Mario Limonciello
aca9ec9b05 drm/amd/display: Use scoped guard for amdgpu_dm_update_connector_after_detect()
A scoped guard will release the mutex when it goes out of scope.

Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Wayne Lin <wayne.lin@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-05 10:42:14 -05:00
Mario Limonciello
d13fbeb74b drm/amd/display: Use _free(kfree) for dm_gpureset_commit_state()
Using a _free(kfree) macro drops the need for a goto statement
as it will be freed when it goes out of scope.

Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Wayne Lin <wayne.lin@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-05 10:42:08 -05:00
Mario Limonciello
7b3e14acc1 drm/amd/display: Change amdgpu_dm_irq_resume_*() to void
amdgpu_dm_irq_resume_early() and amdgpu_dm_irq_resume_late() don't
have any error flows. Change the return type from integer to void.

Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Wayne Lin <wayne.lin@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-05 10:41:51 -05:00
Mario Limonciello
c2bd614bf8 drm/amd/display: Change amdgpu_dm_irq_resume_*() to use drm_dbg()
drm_dbg() is helpful to show which device had the debug statement.
Adjust to using this instead for debug messages.

Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Wayne Lin <wayne.lin@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-05 10:41:37 -05:00
Mario Limonciello
f24a74d59e drm/amd/display: Use scoped guard for dm_resume()
Scoped guards will release the mutex when they go out of scope.
Adjust the code to use these instead.

Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Wayne Lin <wayne.lin@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-05 10:41:34 -05:00
Mario Limonciello
180998bf30 drm/amd/display: Use drm_err() instead of DRM_ERROR in dm_resume()
drm_err() is helpful to show which device had the error. Adjust to
using this instead for error messages.

Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Wayne Lin <wayne.lin@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-05 10:41:29 -05:00
Mario Limonciello
e3bc320c4b drm/amd/display: Use _free() macro for amdgpu_dm_commit_zero_streams()
All cases except a failure to create a copy of the current context will
call dc_state_release() on the copied context.

Use a _free() macro to free the context and then adjust the error handling
flow to drop the unnecessary use of goto statements.

Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Wayne Lin <wayne.lin@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-05 10:41:25 -05:00
Mario Limonciello
3cf7a0bc87 drm/amd/display: Catch failures for amdgpu_dm_commit_zero_streams()
amdgpu_dm_commit_zero_streams() returns a DC error code that isn't
checked. Add an explicit check to this and fail dm_suspend() if it
is not DC_OK.

Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Wayne Lin <wayne.lin@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-05 10:41:21 -05:00
Mario Limonciello
65890cad2e drm/amd/display: Drop ret variable from dm_suspend()
The `ret` variable in dm_suspend() doesn't get set and is just used
to return 0.  Drop the needless declaration.

Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Wayne Lin <wayne.lin@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-05 10:41:17 -05:00
Mario Limonciello
20ea047768 drm/amd/display: Change amdgpu_dm_irq_suspend() to void
amdgpu_dm_irq_suspend() doesn't have any error flows and always
returns zero.

Change the function to void.

Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Wayne Lin <wayne.lin@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-05 10:41:12 -05:00
Cruise Hung
c286e8501a drm/amd/display: Add tunneling IRQ handler
USB4 DP BW Allocation uses DP_TUNNELING_IRQ to indicate the status update.
The DP_TUNNELING_IRQ is defined in LINK_SERVICE_IRQ_VECTOR_ESI0. When
receiving DP HPD IRQ in USB4, read the LINK_SERVICE_IRQ_VECTOR_ESI0.

Reviewed-by: Wenjing Liu <wenjing.liu@amd.com>
Signed-off-by: Cruise Hung <Cruise.Hung@amd.com>
Signed-off-by: Wayne Lin <wayne.lin@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-05 10:41:08 -05:00
Leo Zeng
5ad8eed172 drm/amd/display: Added visual confirm for DCC
[WHY]
We want to add a visual confirm mode for DCC and MCache for
debugging purpose.

[HOW]
color pipes based on whether DCC is enabled and what MCache id
is used.
black - DCC disabled
red - DCC enabled
grey - 2 different MCaches used
other colors - 1 MCache used

Reviewed-by: Dillon Varone <dillon.varone@amd.com>
Signed-off-by: Leo Zeng <Leo.Zeng@amd.com>
Signed-off-by: Wayne Lin <wayne.lin@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-05 10:41:02 -05:00
Nicholas Kazlauskas
c707ea82c7 drm/amd/display: Ensure DMCUB idle before reset on DCN31/DCN35
[Why]
If we soft reset before halt finishes and there are outstanding
memory transactions then the memory interface may produce unexpected
results, such as out of order transactions when the firmware next runs.

These can manifest as random or unexpected load/store violations.

[How]
Increase the timeout before soft reset to ensure the DMCUB has quiesced.
This is effectively 1s maximum based on experimentation.

Use the enable bit check on DCN31 like we're doing on DCN35 and reorder
the reset writes to follow the HW programming guide.

Ensure we're reading SCRATCH7 instead of SCRATCH8 for the HALT code.
No current versions of DMCUB firmware use the SCRATCH8 boot bit to
dynamically switch where the HALT code goes to maintain backwards
compatibility with PSP.

Reviewed-by: Dillon Varone <dillon.varone@amd.com>
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Wayne Lin <wayne.lin@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-05 10:40:56 -05:00
Nicholas Kazlauskas
a2f72c0717 drm/amd/display: Revert "Increase halt timeout for DMCUB to 1s"
This reverts commit 50f040c53e ("drm/amd/display: Increase halt
timeout for DMCUB to 1s")

There's two issues here:
1. Each poll is closer to 10us than 1us so it stalls for 15s on PNP.
2. We're reading the wrong scratch register to check for the HALT code.

Reviewed-by: Dillon Varone <dillon.varone@amd.com>
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Wayne Lin <wayne.lin@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-05 10:39:38 -05:00
Alex Hung
02b2c97824 drm/amd/display: Check NULL connector before it is used
[Why & How]
amdgpu_dm_find_first_crtc_matching_connector can return NULL.
It is necessary to the returned connector before passing it
drm_atomic_get_new_connector_state which always assumes connector is
not NULL.

Reviewed-by: Roman Li <roman.li@amd.com>
Signed-off-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Wayne Lin <wayne.lin@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-05 10:39:35 -05:00
George Shen
79fc4e856e drm/amd/display: Remove unused struct definition
[Why/How]
The struct is not and will not be used, as it is no longer relevant nor
supported.

Reviewed-by: Wenjing Liu <wenjing.liu@amd.com>
Signed-off-by: George Shen <george.shen@amd.com>
Signed-off-by: Wayne Lin <wayne.lin@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-05 10:39:32 -05:00
George Shen
0584bbcf0c drm/amd/display: Skip checking FRL_MODE bit for PCON BW determination
[Why/How]
Certain PCON will clear the FRL_MODE bit despite supporting the link BW
indicated in the other bits.

Thus, skip checking the FRL_MODE bit when interpreting the
hdmi_encoded_link_bw struct.

Reviewed-by: Wenjing Liu <wenjing.liu@amd.com>
Signed-off-by: George Shen <george.shen@amd.com>
Signed-off-by: Wayne Lin <wayne.lin@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-05 10:39:28 -05:00
Peichen Huang
54743ca151 drm/amd/display: misc for dio encoder refactor
[WHY]
These are left required changes for dio encoder refactor.

[HOW]
1. original logic is separated by config option
2. new link encoder dp enable/disable code for dcn35
3. process fec only for DP 8b10b encoding

Reviewed-by: Cruise Hung <cruise.hung@amd.com>
Signed-off-by: Peichen Huang <PeiChen.Huang@amd.com>
Signed-off-by: Wayne Lin <wayne.lin@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-05 10:39:20 -05:00
Hansen Dsouza
fc215e83d0 drm/amd/display: read mso dpcd caps
[Why & How]
Read if panel support multi-sst links

Reviewed-by: Charlene Liu <charlene.liu@amd.com>
Signed-off-by: Hansen Dsouza <Hansen.Dsouza@amd.com>
Signed-off-by: Wayne Lin <wayne.lin@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-05 10:39:16 -05:00
Dillon Varone
0dfcc2bf26 drm/amd/display: Fix DMUB reset sequence for DCN401
[WHY]
It should no longer use DMCUB_SOFT_RESET as it can result
in the memory request path becoming desynchronized.

[HOW]
To ensure robustness in the reset sequence:
1) Extend timeout on the "halt" command sent via gpint, and check for
controller to enter "wait" as a stronger guarantee that there are no
requests to memory still in flight.
2) Remove usage of DMCUB_SOFT_RESET
3) Rely on PSP to reset the controller safely

Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Dillon Varone <Dillon.Varone@amd.com>
Signed-off-by: Wayne Lin <wayne.lin@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-05 10:39:10 -05:00
Dillon Varone
a025f424af drm/amd/display: Fix p-state type when p-state is unsupported
[WHY&HOW]
P-state type would remain on previously used when unsupported which
causes confusion in logging and visual confirm, so set back to zero
when unsupported.

Reviewed-by: Aric Cyr <aric.cyr@amd.com>
Signed-off-by: Dillon Varone <Dillon.Varone@amd.com>
Signed-off-by: Wayne Lin <wayne.lin@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-05 10:39:05 -05:00
Aric Cyr
b74f46f3ce drm/amd/display: Request HW cursor on DCN3.2 with SubVP
[why]
When SubVP is active the HW cursor size is limited to 64x64, and
anything larger will force composition which is bad for gaming on
DCN3.2 if the game uses a larger cursor.

[how]
If HW cursor is requested, typically by a fullscreen game, do not
enable SubVP so that up to 256x256 cursor sizes are available for
DCN3.2.

Reviewed-by: Aric Cyr <aric.cyr@amd.com>
Signed-off-by: Aric Cyr <Aric.Cyr@amd.com>
Signed-off-by: Wayne Lin <wayne.lin@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-05 10:39:00 -05:00
Vitaliy Shevtsov
c3c584c18c drm/amd/display: fix type mismatch in CalculateDynamicMetadataParameters()
There is a type mismatch between what CalculateDynamicMetadataParameters()
takes and what is passed to it. Currently this function accepts several
args as signed long but it's called with unsigned integers and integer. On
some systems where long is 32 bits and one of these unsigned int params is
greater than INT_MAX it may cause passing input params as negative values.

Fix this by changing these argument types from long to unsigned int and to
int respectively. Also this will align the function's definition with
similar functions in other dcn* drivers.

Found by Linux Verification Center (linuxtesting.org) with Svace.

Fixes: 6725a88f88 ("drm/amd/display: Add DCN3 DML")
Signed-off-by: Vitaliy Shevtsov <v.shevtsov@mt-integration.ru>
Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-05 10:38:07 -05:00
Lijo Lazar
a734a717dc drm/amdgpu: Avoid HDP flush on JPEG v5.0.1
Similar to JPEG v4.0.3, HDP flush shouldn't be performed by JPEG engine.
Keep it empty.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-05 10:38:01 -05:00
Lijo Lazar
6fcfaac604 drm/amdgpu: Initialize RRMT status on JPEG v5.0.1
Initialize RRMT enablement status from register.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-05 10:37:56 -05:00
Jesse.zhang@amd.com
77bd621d14 drm/amdgpu: Update SDMA scheduler mask handling to include page queue
This patch updates the SDMA scheduler mask handling to include the page queue
if it exists. The scheduler mask is calculated based on the number of SDMA
instances and the presence of the page queue. The mask is updated to reflect
the state of both the SDMA gfx ring and the page queue.

Changes:
- Add handling for the SDMA page queue in `amdgpu_debugfs_sdma_sched_mask_set`.
- Update scheduler mask calculations to include the page queue.
- Modify `amdgpu_debugfs_sdma_sched_mask_get` to return the correct mask value.

This change is necessary to verify multiple queues (SDMA gfx queue + page queue)
and ensure proper scheduling and state management for SDMA instances.

Signed-off-by: Jesse Zhang <jesse.zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-05 10:37:42 -05:00
Lijo Lazar
0b9647d40e drm/amdgpu: Add offset normalization in VCN v5.0.1
VCN v5.0.1 also will need register offset normalization. Reuse the logic
from VCN v4.0.3. Also, avoid HDP flush similar to VCN v4.0.3

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-05 10:37:35 -05:00
Lijo Lazar
b5a3fc54e8 drm/amdgpu: Initialize RRMT status on VCN v5.0.1
Initialize RRMT status from register.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-05 10:37:31 -05:00
Xiang Liu
677ae51f49 drm/amdgpu: Free CPER entry after committing to ring
Free CPER entry when it's committed to CPER ring to avoid memory leak.

Signed-off-by: Xiang Liu <xiang.liu@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-05 10:37:27 -05:00
Alexandre Demers
899634a57a drm/amdgpu: fix spelling typos in SI
Fix typos

Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-05 10:37:23 -05:00
Alexandre Demers
ce43abd7ec drm/amdgpu: fix spelling typos
Found some typos while exploring amdgpu code.

Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-05 10:37:13 -05:00
Dave Airlie
e21cba7047 Merge tag 'drm-misc-next-2025-02-27' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-next
drm-misc-next for v6.15:

Cross-subsystem Changes:

bus:
- mhi: Avoid access to uninitialized field

Core Changes:

- Fix docmentation

dp:
- Add helpers for LTTPR transparent mode

sched:
- Improve job peek/pop operations
- Optimize layout of struct drm_sched_job

Driver Changes:

arc:
- Convert to devm_platform_ioremap_resource()

aspeed:
- Convert to devm_platform_ioremap_resource()

bridge:
- ti-sn65dsi86: Support CONFIG_PWM tristate

i915:
- dp: Use helpers for LTTPR transparent mode

mediatek:
- Convert to devm_platform_ioremap_resource()

msm:
- dp: Use helpers for LTTPR transparent mode

nouveau:
- dp: Use helpers for LTTPR transparent mode

panel:
- raydium-rm67200: Add driver for Raydium RM67200
- simple: Add support for BOE AV123Z7M-N17, BOE AV123Z7M-N17
- sony-td4353-jdi: Use MIPI-DSI multi-func interface
- summit: Add driver for Apple Summit display panel
- visionox-rm692e5: Add driver for Visionox RM692E5

repaper:
- Fix integer overflows

stm:
- Convert to devm_platform_ioremap_resource()

vc4:
- Convert to devm_platform_ioremap_resource()

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20250227094041.GA114623@linux.fritz.box
2025-02-28 12:36:01 +10:00
Srinivasan Shanmugam
7d83c129a8 drm/amdgpu: Fix parameter annotation in vcn_v5_0_0_is_idle
Update parameter description in the vcn_v5_0_0_is_idle function

Fixes the below with gcc W=1:
drivers/gpu/drm/amd/amdgpu/vcn_v5_0_0.c:1231: warning: Function parameter or struct member 'ip_block' not described in 'vcn_v5_0_0_is_idle'
drivers/gpu/drm/amd/amdgpu/vcn_v5_0_0.c:1231: warning: Excess function parameter 'handle' description in 'vcn_v5_0_0_is_idle'

Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-27 16:50:05 -05:00
Philip Yang
fe9d0061c4 drm/amdkfd: debugfs hang_hws skip GPU with MES
debugfs hang_hws is used by GPU reset test with HWS, for MES this crash
the kernel with NULL pointer access because dqm->packet_mgr is not setup
for MES path.

Skip GPU with MES for now, MES hang_hws debugfs interface will be
supported later.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Kent Russell <kent.russell@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-27 16:50:05 -05:00
Philip Yang
7919b4cad5 drm/amdkfd: Fix pqm_destroy_queue race with GPU reset
If GPU in reset, destroy_queue return -EIO, pqm_destroy_queue should
delete the queue from process_queue_list and free the resource.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-27 16:50:05 -05:00
Srinivasan Shanmugam
0f3fda3117 drm/amdgpu: Fix parameter annotations for VCN clock gating functions
The previous references to a non-existent `adev` parameter have been
removed & corrected to reflect the use of the `vinst` pointer, which
points to the VCN instance structure, in the below files:

- vcn_v1_0.c
- vcn_v2_0.c
- vcn_v3_0.c

Fixes the below with gcc W=1:
drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c:624: warning: Function parameter or struct member 'vinst' not described in 'vcn_v1_0_enable_clock_gating'
drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c:624: warning: Excess function parameter 'adev' description in 'vcn_v1_0_enable_clock_gating'
drivers/gpu/drm/amd/amdgpu/vcn_v2_0.c:376: warning: Function parameter or struct member 'vinst' not described in 'vcn_v2_0_mc_resume'
drivers/gpu/drm/amd/amdgpu/vcn_v2_0.c:376: warning: Excess function parameter 'adev' description in 'vcn_v2_0_mc_resume'
drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c:776: warning: Function parameter or struct member 'vinst' not described in 'vcn_v3_0_disable_clock_gating'
drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c:776: warning: Excess function parameter 'adev' description in 'vcn_v3_0_disable_clock_gating'
drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c:776: warning: Excess function parameter 'inst' description in 'vcn_v3_0_disable_clock_gating'
drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c:965: warning: Function parameter or struct member 'vinst' not described in 'vcn_v3_0_enable_clock_gating'
drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c:965: warning: Excess function parameter 'adev' description in 'vcn_v3_0_enable_clock_gating'
drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c:965: warning: Excess function parameter 'inst' description in 'vcn_v3_0_enable_clock_gating'

Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-27 16:50:05 -05:00
Philip Yang
f0b4440cdc drm/amdkfd: Fix mode1 reset crash issue
If HW scheduler hangs and mode1 reset is used to recover GPU, KFD signal
user space to abort the processes. After process abort exit, user queues
still use the GPU to access system memory before h/w is reset while KFD
cleanup worker free system memory and free VRAM.

There is use-after-free race bug that KFD allocate and reuse the freed
system memory, and user queue write to the same system memory to corrupt
the data structure and cause driver crash.

To fix this race, KFD cleanup worker terminate user queues, then flush
reset_domain wq to wait for any GPU ongoing reset complete, and then
free outstanding BOs.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-27 16:50:04 -05:00
Philip Yang
1b9366c601 drm/amdkfd: KFD release_work possible circular locking
If waiting for gpu reset done in KFD release_work, thers is WARNING:
possible circular locking dependency detected

  #2  kfd_create_process
        kfd_process_mutex
          flush kfd release work

  #1  kfd release work
        wait for amdgpu reset work

  #0  amdgpu_device_gpu_reset
        kgd2kfd_pre_reset
          kfd_process_mutex

  Possible unsafe locking scenario:

        CPU0                    CPU1
        ----                    ----
   lock((work_completion)(&p->release_work));
                  lock((wq_completion)kfd_process_wq);
                  lock((work_completion)(&p->release_work));
   lock((wq_completion)amdgpu-reset-dev);

To fix this, KFD create process move flush release work outside
kfd_process_mutex.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-27 16:50:04 -05:00
Philip Yang
ee3ed10066 drm/amdkfd: Remove kfd_process_hw_exception worker
With GPU reset-domain worker implemented, KFD hw_exception worker is not
needed any more, just call amdgpu_amdkfd_gpu_reset directly from
kfd_hws_hang.

Suggested-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-27 16:50:04 -05:00
Asad Kamal
f9234217d0 drm/amd/amdgpu: Add support for xgmi_v6_4_1
Add support for xgmi_v6_4_1 and use it appropriate places

Signed-off-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-27 16:50:04 -05:00