Commit Graph

33509 Commits

Author SHA1 Message Date
Jani Nikula
cb2e1c2136 drm: remove driver date from struct drm_driver and all drivers
We stopped using the driver initialized date in commit 7fb8af6798
("drm: deprecate driver date") and (eventually) started returning "0"
for drm_version ioctl instead.

Finish the job, and remove the unused date member from struct
drm_driver, its initialization from drivers, along with the common
DRIVER_DATE macros.

v2: Also update drivers/accel (kernel test robot)

Reviewed-by: Javier Martinez Canillas <javierm@redhat.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Simon Ser <contact@emersion.fr>
Acked-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Acked-by: Lucas De Marchi <lucas.demarchi@intel.com>
Acked-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> # msm
Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/1f2bf2543aed270a06f6c707fd6ed1b78bf16712.1733322525.git.jani.nikula@intel.com
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2024-12-05 12:35:42 +02:00
Alex Deucher
73dae652dc drm/amdgpu: rework resume handling for display (v2)
Split resume into a 3rd step to handle displays when DCC is
enabled on DCN 4.0.1.  Move display after the buffer funcs
have been re-enabled so that the GPU will do the move and
properly set the DCC metadata for DCN.

v2: fix fence irq resume ordering

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org # 6.11.x
2024-12-03 18:19:23 -05:00
Alex Deucher
1443dd3c67 drm/amd/pm: fix and simplify workload handling
smu->workload_mask is IP specific and should not be messed with in
the common code. The mask bits vary across SMU versions.

Move all handling of smu->workload_mask in to the backends and
simplify the code.  Store the user's preference in smu->power_profile_mode
which will be reflected in sysfs.  For internal driver profile
switches for KFD or VCN, just update the workload mask so that the
user's preference is retained.  Remove all of the extra now unused
workload related elements in the smu structure.

v2: use refcounts for workload profiles
v3: rework based on feedback from Lijo
v4: fix the refcount on failure, drop backend mask
v5: rework custom handling
v6: handle failure cleanup with custom profile
v7: Update documentation

Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: Kenneth Feng <kenneth.feng@amd.com>
Cc: Lijo Lazar <lijo.lazar@amd.com>
Cc: stable@vger.kernel.org # 6.11.x
2024-12-02 18:36:15 -05:00
Alex Deucher
c3d06a3b6a Revert "drm/amd/pm: correct the workload setting"
This reverts commit 74e1006430.

This causes a regression in the workload selection.
A more extensive fix is being worked on.
For now, revert.

This came back after a merge in 6.13-rc1, so revert again.

Link: https://gitlab.freedesktop.org/drm/amd/-/issues/3618
Fixes: 74e1006430 ("drm/amd/pm: correct the workload setting")
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 44f392fbf6)
2024-12-02 18:35:57 -05:00
Yiqing Yao
f3bb57b66d drm/amdgpu: fix sriov reinit late orders
Use found block to call correct init/resume function on the block.
Set status.hw for resume and init.

Print re-init result again. Change to use dev_info.
Use amdgpu_device_ip_get_ip_block to get target block instead of
loop.

Fixes: 502d76308d ("drm/amdgpu: validate resume before function call")
Signed-off-by: Yiqing Yao <YiQing.Yao@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-02 18:35:42 -05:00
Pratap Nirujogi
274e3f4596 drm/amdgpu: Fix ISP hw init issue
ISP hw_init is not called with the recent changes related
to hw init levels. AMDGPU_INIT_LEVEL_DEFAULT is ignoring
the ISP IP block as AMDGPU_IP_BLK_MASK_ALL is derived using
incorrect max number of IP blocks.

Update AMDGPU_IP_BLK_MASK_ALL to use AMDGPU_MAX_IP_NUM
instead of (AMDGPU_MAX_IP_NUM - 1) to fix the issue.

Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Fixes: 14f2fe34f5 ("drm/amdgpu: Add init levels")
Signed-off-by: Pratap Nirujogi <pratap.nirujogi@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-02 18:35:36 -05:00
Chris Park
0c0a19430b drm/amd/display: Add hblank borrowing support
[WHY]
Some DSC timing failed at bandwidth validation due to hactive
can't be evenly divided on each ODM segment.

[HOW]
Borrow from hblank to increase hactive to support these timing.

Cc: Mario Limonciello <mario.limonciello@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Reviewed-by: Wenjing Liu <wenjing.liu@amd.com>
Signed-off-by: Chris Park <chris.park@amd.com>
Signed-off-by: Alex Hung <alex.hung@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-02 18:35:32 -05:00
Dillon Varone
a29997b7ac drm/amd/display: Limit VTotal range to max hw cap minus fp
[WHY & HOW]
Hardware does not support the VTotal to be between fp2 lines of the
maximum possible VTotal, so add a capability flag to track it and apply
where necessary.

Cc: Mario Limonciello <mario.limonciello@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Reviewed-by: Jun Lei <jun.lei@amd.com>
Reviewed-by: Anthony Koo <anthony.koo@amd.com>
Signed-off-by: Dillon Varone <dillon.varone@amd.com>
Signed-off-by: Alex Hung <alex.hung@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-02 18:35:28 -05:00
Lo-an Chen
24d3749c11 drm/amd/display: Correct prefetch calculation
[WHY]
The minimum value of the dst_y_prefetch_equ was not correct
in prefetch calculation whice causes OPTC underflow.

[HOW]
Add the min operation of dst_y_prefetch_equ in prefetch calculation.

Cc: Mario Limonciello <mario.limonciello@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Lo-an Chen <lo-an.chen@amd.com>
Signed-off-by: Alex Hung <alex.hung@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-02 18:35:25 -05:00
Sung Lee
6a7fd76b94 drm/amd/display: Add option to retrieve detile buffer size
[WHY]
For better power profiling knowing the detile
buffer size at a given point in time
would be useful.

[HOW]
Add interface to retrieve detile buffer from
dc state.

Cc: Mario Limonciello <mario.limonciello@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Reviewed-by: Aric Cyr <aric.cyr@amd.com>
Signed-off-by: Sung Lee <Sung.Lee@amd.com>
Signed-off-by: Alex Hung <alex.hung@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-02 18:35:20 -05:00
Peterson Guo
63e7ee677c drm/amd/display: Add a left edge pixel if in YCbCr422 or YCbCr420 and odm
[WHY]
On some cards when odm is used, the monitor will have 2 separate pipes
split vertically. When compression is used on the YCbCr colour space on
the second pipe to have correct colours, we need to read a pixel from the
end of first pipe to accurately display colours. Hardware was programmed
properly to account for this extra pixel but it was not calculated
properly in software causing a split screen on some monitors.

[HOW]
The fix adjusts the second pipe's viewport and timings if the pixel
encoding is YCbCr422 or YCbCr420.

Cc: Mario Limonciello <mario.limonciello@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Reviewed-by: George Shen <george.shen@amd.com>
Signed-off-by: Peterson Guo <peterson.guo@amd.com>
Signed-off-by: Alex Hung <alex.hung@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-02 18:35:16 -05:00
David Yat Sin
55ed120dcf drm/amdkfd: hard-code cacheline for gc943,gc944
Cacheline size is not available in IP discovery for gc943,gc944.

Signed-off-by: David Yat Sin <David.YatSin@amd.com>
Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2024-12-02 18:34:57 -05:00
Sreekant Somasekharan
33114f1057 drm/amdkfd: add MEC version that supports no PCIe atomics for GFX12
Add MEC version from which alternate support for no PCIe atomics
is provided so that device is not skipped during KFD device init in
GFX1200/GFX1201.

Signed-off-by: Sreekant Somasekharan <sreekant.somasekharan@amd.com>
Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org # 6.11.x
2024-12-02 18:34:52 -05:00
Mario Limonciello
1c09386201 drm/amd/display: Fix programming backlight on OLED panels
commit 38077562e0 ("drm/amd/display: Implement new
backlight_level_params structure") adjusted DC core to require
the backlight type to be programmed in the dc link when changing
brightness.  This isn't initialized in amdgpu_dm for OLED panels
though which broke brightness.

Explicitly initialize when aux support is enabled.

Reported-and-tested-by: Luke Jones <luke@ljones.dev>
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3792
Fixes: 38077562e0 ("drm/amd/display: Implement new backlight_level_params structure")
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Link: https://lore.kernel.org/r/20241128032200.2085398-1-superm1@kernel.org
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-02 18:34:48 -05:00
Mario Limonciello
0f15cbc203 drm/amd: Sanity check the ACPI EDID
An HP Pavilion Aero Laptop 13-be0xxx/8916 has an ACPI EDID, but using
it is causing corruption. It's got illogical values of not specifying
a digital interface. Sanity check the ACPI EDID to avoid tripping such
problems.

Suggested-by: Tobias Jakobi <tjakobi@math.uni-bielefeld.de>
Reported-and-tested-by: Chris Bainbridge <chris.bainbridge@gmail.com>
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3782
Fixes: c6a837088b ("drm/amd/display: Fetch the EDID from _DDC if available for eDP")
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Link: https://lore.kernel.org/r/20241128032500.2088288-1-superm1@kernel.org
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-02 18:34:43 -05:00
Alex Deucher
689275140c drm/amdgpu/hdp7.0: do a posting read when flushing HDP
Need to read back to make sure the write goes through.

Cc: David Belanger <david.belanger@amd.com>
Reviewed-by: Frank Min <frank.min@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2024-12-02 18:05:04 -05:00
Alex Deucher
abe1cbaec6 drm/amdgpu/hdp6.0: do a posting read when flushing HDP
Need to read back to make sure the write goes through.

Cc: David Belanger <david.belanger@amd.com>
Reviewed-by: Frank Min <frank.min@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2024-12-02 17:38:47 -05:00
Alex Deucher
f756dbac1c drm/amdgpu/hdp5.2: do a posting read when flushing HDP
Need to read back to make sure the write goes through.

Cc: David Belanger <david.belanger@amd.com>
Reviewed-by: Frank Min <frank.min@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2024-12-02 17:38:41 -05:00
Alex Deucher
cf424020e0 drm/amdgpu/hdp5.0: do a posting read when flushing HDP
Need to read back to make sure the write goes through.

Cc: David Belanger <david.belanger@amd.com>
Reviewed-by: Frank Min <frank.min@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2024-12-02 17:38:26 -05:00
Alex Deucher
c9b8dcabb5 drm/amdgpu/hdp4.0: do a posting read when flushing HDP
Need to read back to make sure the write goes through.

Cc: David Belanger <david.belanger@amd.com>
Reviewed-by: Frank Min <frank.min@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2024-12-02 17:38:05 -05:00
Alex Deucher
c6c2f66372 drm/amdgpu/jpeg1.0: fix idle work handler
On VCN 1.0, VCN and JPEG use the same worker thread so cancel
the vcn worker rather than jpeg.  On VCN 2.0 and newer
there are separate workers for each.

Fixes: 93df748737 ("drm/amdgpu/jpeg: cancel the jpeg worker")
Tested-by: George Zhang <george.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-02 17:37:31 -05:00
Peter Zijlstra
cdd30ebb1b module: Convert symbol namespace to string literal
Clean up the existing export namespace code along the same lines of
commit 33def8498f ("treewide: Convert macro and uses of __section(foo)
to __section("foo")") and for the same reason, it is not desired for the
namespace argument to be a macro expansion itself.

Scripted using

  git grep -l -e MODULE_IMPORT_NS -e EXPORT_SYMBOL_NS | while read file;
  do
    awk -i inplace '
      /^#define EXPORT_SYMBOL_NS/ {
        gsub(/__stringify\(ns\)/, "ns");
        print;
        next;
      }
      /^#define MODULE_IMPORT_NS/ {
        gsub(/__stringify\(ns\)/, "ns");
        print;
        next;
      }
      /MODULE_IMPORT_NS/ {
        $0 = gensub(/MODULE_IMPORT_NS\(([^)]*)\)/, "MODULE_IMPORT_NS(\"\\1\")", "g");
      }
      /EXPORT_SYMBOL_NS/ {
        if ($0 ~ /(EXPORT_SYMBOL_NS[^(]*)\(([^,]+),/) {
  	if ($0 !~ /(EXPORT_SYMBOL_NS[^(]*)\(([^,]+), ([^)]+)\)/ &&
  	    $0 !~ /(EXPORT_SYMBOL_NS[^(]*)\(\)/ &&
  	    $0 !~ /^my/) {
  	  getline line;
  	  gsub(/[[:space:]]*\\$/, "");
  	  gsub(/[[:space:]]/, "", line);
  	  $0 = $0 " " line;
  	}

  	$0 = gensub(/(EXPORT_SYMBOL_NS[^(]*)\(([^,]+), ([^)]+)\)/,
  		    "\\1(\\2, \"\\3\")", "g");
        }
      }
      { print }' $file;
  done

Requested-by: Masahiro Yamada <masahiroy@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://mail.google.com/mail/u/2/#inbox/FMfcgzQXKWgMmjdFwwdsfgxzKpVHWPlc
Acked-by: Greg KH <gregkh@linuxfoundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2024-12-02 11:34:44 -08:00
Maxime Ripard
3aba2eba84 Merge drm/drm-next into drm-misc-next
Kickstart 6.14 cycle.

Signed-off-by: Maxime Ripard <mripard@kernel.org>
2024-12-02 12:44:18 +01:00
Linus Torvalds
2ba9f676d0 Merge tag 'drm-next-2024-11-29' of https://gitlab.freedesktop.org/drm/kernel
Pull drm fixes from Dave Airlie:
 "Merge window fixes, mostly amdgpu and xe, with a few other minor ones,
  all looks fairly normal,

  i915:
   - hdcp: Fix when the first read and write are retried

  xe:
   - Wake up waiters after wait condition set to true
   - Mark the preempt fence workqueue as reclaim
   - Update xe2 graphics name string
   - Fix a couple of guc submit races
   - Fix pat index usage in migrate
   - Ensure non-cached migrate pagetable bo mappings
   - Take a PM ref in the delayed snapshot capture worker

  amdgpu:
   - SMU 13.0.6 fixes
   - XGMI fixes
   - SMU 13.0.7 fixes
   - Misc code cleanups
   - Plane refcount fixes
   - DCN 4.0.1 fixes
   - DC power fixes
   - DTO fixes
   - NBIO 7.11 fixes
   - SMU 14.0.x fixes
   - Reset fixes
   - Enable DC on LoongArch
   - Sysfs hotplug warning fix
   - Misc small fixes
   - VCN 4.0.3 fix
   - Slab usage fix
   - Jpeg delayed work fix

  amdkfd:
   - wptr handling fixes

  radeon:
   - Use ttm_bo_move_null()
   - Constify struct pci_device_id
   - Fix spurious hotplug
   - HPD fix

  rockchip
   - fix 32-bit build"

* tag 'drm-next-2024-11-29' of https://gitlab.freedesktop.org/drm/kernel: (48 commits)
  drm/xe: Take PM ref in delayed snapshot capture worker
  drm/xe/migrate: use XE_BO_FLAG_PAGETABLE
  drm/xe/migrate: fix pat index usage
  drm/xe/guc_submit: fix race around suspend_pending
  drm/xe/guc_submit: fix race around pending_disable
  drm/xe: Update xe2_graphics name string
  drm/rockchip: avoid 64-bit division
  Revert "drm/radeon: Delay Connector detecting when HPD singals is unstable"
  drm/amdgpu/jpeg: cancel the jpeg worker
  drm/amdgpu: fix usage slab after free
  drm/amdgpu/vcn: reset fw_shared when VCPU buffers corrupted on vcn v4.0.3
  drm/amdgpu: Fix sysfs warning when hotplugging
  drm/amdgpu: Add sysfs interface for vcn reset mask
  drm/amdgpu/gmc7: fix wait_for_idle callers
  drm/amd/pm: Remove arcturus min power limit
  drm/amd/pm: skip setting the power source on smu v14.0.2/3
  drm/amd/pm: disable pcie speed switching on Intel platform for smu v14.0.2/3
  drm/amdkfd: Use the correct wptr size
  drm/xe: Mark preempt fence workqueue as reclaim
  drm/xe/ufence: Wake up waiters after setting ufence->signalled
  ...
2024-11-29 13:06:06 -08:00
Linus Torvalds
55cb93fd24 Merge tag 'driver-core-6.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core updates from Greg KH:
 "Here is a small set of driver core changes for 6.13-rc1.

  Nothing major for this merge cycle, except for the two simple merge
  conflicts are here just to make life interesting.

  Included in here are:

   - sysfs core changes and preparations for more sysfs api cleanups
     that can come through all driver trees after -rc1 is out

   - fw_devlink fixes based on many reports and debugging sessions

   - list_for_each_reverse() removal, no one was using it!

   - last-minute seq_printf() format string bug found and fixed in many
     drivers all at once.

   - minor bugfixes and changes full details in the shortlog"

* tag 'driver-core-6.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (35 commits)
  Fix a potential abuse of seq_printf() format string in drivers
  cpu: Remove spurious NULL in attribute_group definition
  s390/con3215: Remove spurious NULL in attribute_group definition
  perf: arm-ni: Remove spurious NULL in attribute_group definition
  driver core: Constify bin_attribute definitions
  sysfs: attribute_group: allow registration of const bin_attribute
  firmware_loader: Fix possible resource leak in fw_log_firmware_info()
  drivers: core: fw_devlink: Fix excess parameter description in docstring
  driver core: class: Correct WARN() message in APIs class_(for_each|find)_device()
  cacheinfo: Use of_property_present() for non-boolean properties
  cdx: Fix cdx_mmap_resource() after constifying attr in ->mmap()
  drivers: core: fw_devlink: Make the error message a bit more useful
  phy: tegra: xusb: Set fwnode for xusb port devices
  drm: display: Set fwnode for aux bus devices
  driver core: fw_devlink: Stop trying to optimize cycle detection logic
  driver core: Constify attribute arguments of binary attributes
  sysfs: bin_attribute: add const read/write callback variants
  sysfs: implement all BIN_ATTR_* macros in terms of __BIN_ATTR()
  sysfs: treewide: constify attribute callback of bin_attribute::llseek()
  sysfs: treewide: constify attribute callback of bin_attribute::mmap()
  ...
2024-11-29 11:43:29 -08:00
Linus Torvalds
28eb75e178 Merge tag 'drm-next-2024-11-21' of https://gitlab.freedesktop.org/drm/kernel
Pull drm updates from Dave Airlie:
 "There's a lot of rework, the panic helper support is being added to
  more drivers, v3d gets support for HW superpages, scheduler
  documentation, drm client and video aperture reworks, some new
  MAINTAINERS added, amdgpu has the usual lots of IP refactors, Intel
  has some Pantherlake enablement and xe is getting some SRIOV bits, but
  just lots of stuff everywhere.

  core:
   - split DSC helpers from DP helpers
   - clang build fixes for drm/mm test
   - drop simple pipeline support for gem vram
   - document submission error signaling
   - move drm_rect to drm core module from kms helper
   - add default client setup to most drivers
   - move to video aperture helpers instead of drm ones

  tests:
   - new framebuffer tests

  ttm:
   - remove swapped and pinned BOs from TTM lru

  panic:
   - fix uninit spinlock
   - add ABGR2101010 support

  bridge:
   - add TI TDP158 support
   - use standard PM OPS

  dma-fence:
   - use read_trylock instead of read_lock to help lockdep

  scheduler:
   - add errno to sched start to report different errors
   - add locking to drm_sched_entity_modify_sched
   - improve documentation

  xe:
   - add drm_line_printer
   - lots of refactoring
   - Enable Xe2 + PES disaggregation
   - add new ARL PCI ID
   - SRIOV development work
   - fix exec unnecessary implicit fence
   - define and parse OA sync props
   - forcewake refactoring

  i915:
   - Enable BMG/LNL ultra joiner
   - Enable 10bpx + CCS scanout on ICL+, fp16/CCS on TGL+
   - use DSB for plane/color mgmt
   - Arrow lake PCI IDs
   - lots of i915/xe display refactoring
   - enable PXP GuC autoteardown
   - Pantherlake (PTL) Xe3 LPD display enablement
   - Allow fastset HDR infoframe changes
   - write DP source OUI for non-eDP sinks
   - share PCI IDs between i915 and xe

  amdgpu:
   - SDMA queue reset support
   - SMU 13.0.6, JPEG 4.0.3 updates
   - Initial runtime repartitioning support
   - rework IP structs for multiple IP instances
   - Fetch EDID from _DDC if available
   - SMU13 zero rpm user control
   - lots of fixes/cleanups

  amdkfd:
   - Increase event FIFO size
   - add topology cap flag for per queue reset

  msm:
   - DPU:
      - SA8775P support
      - (disabled by default) MSM8917, MSM8937, MSM8953 and MSM8996 support
      - Enable large framebuffer support
      - Drop MSM8998 and SDM845
   - DP:
      - SA8775P support
   - GPU:
      - a7xx preemption support
      - Adreno A663 support

  ast:
   - warn about unsupported TX chips

  ivpu:
   - add coredump
   - add pantherlake support

  rockchip:
   - 4K@60Hz display enablement
   - generate pll programming tables

  panthor:
   - add timestamp query API
   - add realtime group priority
   - add fdinfo support

  etnaviv:
   - improve handling of DMA address limits
   - improve GPU hangcheck

  exynos:
   - Decon Exynos7870 support

  mediatek:
   - add OF graph support

  omap:
   - locking fixes

  bochs:
   - convert to gem/shmem from simpledrm

  v3d:
   - support big/super pages
   - add gemfs

  vc4:
   - BCM2712 support refactoring
   - add YUV444 format support

  udmabuf:
   - folio related fixes

  nouveau:
   - add panic support on nv50+"

* tag 'drm-next-2024-11-21' of https://gitlab.freedesktop.org/drm/kernel: (1583 commits)
  drm/xe/guc: Fix dereference before NULL check
  drm/amd: Fix initialization mistake for NBIO 7.7.0
  Revert "drm/amd/display: parse umc_info or vram_info based on ASIC"
  drm/amd/display: Fix failure to read vram info due to static BP_RESULT
  drm/amdgpu: enable GTT fallback handling for dGPUs only
  drm/amd/amdgpu: limit single process inside MES
  drm/fourcc: add AMD_FMT_MOD_TILE_GFX9_4K_D_X
  drm/amdgpu/mes12: correct kiq unmap latency
  drm/amdgpu: Support vcn and jpeg error info parsing
  drm/amd : Update MES API header file for v11 & v12
  drm/amd/amdkfd: add/remove kfd queues on start/stop KFD scheduling
  drm/amdkfd: change kfd process kref count at creation
  drm/amdgpu: Cleanup shift coding style
  drm/amd/amdgpu: Increase MES log buffer to dump mes scratch data
  drm/amdgpu: Implement virt req_ras_err_count
  drm/amdgpu: VF Query RAS Caps from Host if supported
  drm/amdgpu: Add msg handlers for SRIOV RAS Telemetry
  drm/amdgpu: Update SRIOV Exchange Headers for RAS Telemetry Support
  drm/amd/display: 3.2.309
  drm/amd/display: Adjust VSDB parser for replay feature
  ...
2024-11-21 14:56:17 -08:00
Alex Deucher
93df748737 drm/amdgpu/jpeg: cancel the jpeg worker
Looks like these got missed when jpeg was split from vcn.
Cancel the jpeg workers rather than vcn workers.

Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-11-21 15:56:23 -05:00
Vitaly Prosyak
b61badd20b drm/amdgpu: fix usage slab after free
[  +0.000021] BUG: KASAN: slab-use-after-free in drm_sched_entity_flush+0x6cb/0x7a0 [gpu_sched]
[  +0.000027] Read of size 8 at addr ffff8881b8605f88 by task amd_pci_unplug/2147

[  +0.000023] CPU: 6 PID: 2147 Comm: amd_pci_unplug Not tainted 6.10.0+ #1
[  +0.000016] Hardware name: ASUS System Product Name/ROG STRIX B550-F GAMING (WI-FI), BIOS 1401 12/03/2020
[  +0.000016] Call Trace:
[  +0.000008]  <TASK>
[  +0.000009]  dump_stack_lvl+0x76/0xa0
[  +0.000017]  print_report+0xce/0x5f0
[  +0.000017]  ? drm_sched_entity_flush+0x6cb/0x7a0 [gpu_sched]
[  +0.000019]  ? srso_return_thunk+0x5/0x5f
[  +0.000015]  ? kasan_complete_mode_report_info+0x72/0x200
[  +0.000016]  ? drm_sched_entity_flush+0x6cb/0x7a0 [gpu_sched]
[  +0.000019]  kasan_report+0xbe/0x110
[  +0.000015]  ? drm_sched_entity_flush+0x6cb/0x7a0 [gpu_sched]
[  +0.000023]  __asan_report_load8_noabort+0x14/0x30
[  +0.000014]  drm_sched_entity_flush+0x6cb/0x7a0 [gpu_sched]
[  +0.000020]  ? srso_return_thunk+0x5/0x5f
[  +0.000013]  ? __kasan_check_write+0x14/0x30
[  +0.000016]  ? __pfx_drm_sched_entity_flush+0x10/0x10 [gpu_sched]
[  +0.000020]  ? srso_return_thunk+0x5/0x5f
[  +0.000013]  ? __kasan_check_write+0x14/0x30
[  +0.000013]  ? srso_return_thunk+0x5/0x5f
[  +0.000013]  ? enable_work+0x124/0x220
[  +0.000015]  ? __pfx_enable_work+0x10/0x10
[  +0.000013]  ? srso_return_thunk+0x5/0x5f
[  +0.000014]  ? free_large_kmalloc+0x85/0xf0
[  +0.000016]  drm_sched_entity_destroy+0x18/0x30 [gpu_sched]
[  +0.000020]  amdgpu_vce_sw_fini+0x55/0x170 [amdgpu]
[  +0.000735]  ? __kasan_check_read+0x11/0x20
[  +0.000016]  vce_v4_0_sw_fini+0x80/0x110 [amdgpu]
[  +0.000726]  amdgpu_device_fini_sw+0x331/0xfc0 [amdgpu]
[  +0.000679]  ? mutex_unlock+0x80/0xe0
[  +0.000017]  ? __pfx_amdgpu_device_fini_sw+0x10/0x10 [amdgpu]
[  +0.000662]  ? srso_return_thunk+0x5/0x5f
[  +0.000014]  ? __kasan_check_write+0x14/0x30
[  +0.000013]  ? srso_return_thunk+0x5/0x5f
[  +0.000013]  ? mutex_unlock+0x80/0xe0
[  +0.000016]  amdgpu_driver_release_kms+0x16/0x80 [amdgpu]
[  +0.000663]  drm_minor_release+0xc9/0x140 [drm]
[  +0.000081]  drm_release+0x1fd/0x390 [drm]
[  +0.000082]  __fput+0x36c/0xad0
[  +0.000018]  __fput_sync+0x3c/0x50
[  +0.000014]  __x64_sys_close+0x7d/0xe0
[  +0.000014]  x64_sys_call+0x1bc6/0x2680
[  +0.000014]  do_syscall_64+0x70/0x130
[  +0.000014]  ? srso_return_thunk+0x5/0x5f
[  +0.000014]  ? irqentry_exit_to_user_mode+0x60/0x190
[  +0.000015]  ? srso_return_thunk+0x5/0x5f
[  +0.000014]  ? irqentry_exit+0x43/0x50
[  +0.000012]  ? srso_return_thunk+0x5/0x5f
[  +0.000013]  ? exc_page_fault+0x7c/0x110
[  +0.000015]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
[  +0.000014] RIP: 0033:0x7ffff7b14f67
[  +0.000013] Code: ff e8 0d 16 02 00 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 03 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 41 c3 48 83 ec 18 89 7c 24 0c e8 73 ba f7 ff
[  +0.000026] RSP: 002b:00007fffffffe378 EFLAGS: 00000246 ORIG_RAX: 0000000000000003
[  +0.000019] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007ffff7b14f67
[  +0.000014] RDX: 0000000000000000 RSI: 00007ffff7f6f47a RDI: 0000000000000003
[  +0.000014] RBP: 00007fffffffe3a0 R08: 0000555555569890 R09: 0000000000000000
[  +0.000014] R10: 0000000000000000 R11: 0000000000000246 R12: 00007fffffffe5c8
[  +0.000013] R13: 00005555555552a9 R14: 0000555555557d48 R15: 00007ffff7ffd040
[  +0.000020]  </TASK>

[  +0.000016] Allocated by task 383 on cpu 7 at 26.880319s:
[  +0.000014]  kasan_save_stack+0x28/0x60
[  +0.000008]  kasan_save_track+0x18/0x70
[  +0.000007]  kasan_save_alloc_info+0x38/0x60
[  +0.000007]  __kasan_kmalloc+0xc1/0xd0
[  +0.000007]  kmalloc_trace_noprof+0x180/0x380
[  +0.000007]  drm_sched_init+0x411/0xec0 [gpu_sched]
[  +0.000012]  amdgpu_device_init+0x695f/0xa610 [amdgpu]
[  +0.000658]  amdgpu_driver_load_kms+0x1a/0x120 [amdgpu]
[  +0.000662]  amdgpu_pci_probe+0x361/0xf30 [amdgpu]
[  +0.000651]  local_pci_probe+0xe7/0x1b0
[  +0.000009]  pci_device_probe+0x248/0x890
[  +0.000008]  really_probe+0x1fd/0x950
[  +0.000008]  __driver_probe_device+0x307/0x410
[  +0.000007]  driver_probe_device+0x4e/0x150
[  +0.000007]  __driver_attach+0x223/0x510
[  +0.000006]  bus_for_each_dev+0x102/0x1a0
[  +0.000007]  driver_attach+0x3d/0x60
[  +0.000006]  bus_add_driver+0x2ac/0x5f0
[  +0.000006]  driver_register+0x13d/0x490
[  +0.000008]  __pci_register_driver+0x1ee/0x2b0
[  +0.000007]  llc_sap_close+0xb0/0x160 [llc]
[  +0.000009]  do_one_initcall+0x9c/0x3e0
[  +0.000008]  do_init_module+0x241/0x760
[  +0.000008]  load_module+0x51ac/0x6c30
[  +0.000006]  __do_sys_init_module+0x234/0x270
[  +0.000007]  __x64_sys_init_module+0x73/0xc0
[  +0.000006]  x64_sys_call+0xe3/0x2680
[  +0.000006]  do_syscall_64+0x70/0x130
[  +0.000007]  entry_SYSCALL_64_after_hwframe+0x76/0x7e

[  +0.000015] Freed by task 2147 on cpu 6 at 160.507651s:
[  +0.000013]  kasan_save_stack+0x28/0x60
[  +0.000007]  kasan_save_track+0x18/0x70
[  +0.000007]  kasan_save_free_info+0x3b/0x60
[  +0.000007]  poison_slab_object+0x115/0x1c0
[  +0.000007]  __kasan_slab_free+0x34/0x60
[  +0.000007]  kfree+0xfa/0x2f0
[  +0.000007]  drm_sched_fini+0x19d/0x410 [gpu_sched]
[  +0.000012]  amdgpu_fence_driver_sw_fini+0xc4/0x2f0 [amdgpu]
[  +0.000662]  amdgpu_device_fini_sw+0x77/0xfc0 [amdgpu]
[  +0.000653]  amdgpu_driver_release_kms+0x16/0x80 [amdgpu]
[  +0.000655]  drm_minor_release+0xc9/0x140 [drm]
[  +0.000071]  drm_release+0x1fd/0x390 [drm]
[  +0.000071]  __fput+0x36c/0xad0
[  +0.000008]  __fput_sync+0x3c/0x50
[  +0.000007]  __x64_sys_close+0x7d/0xe0
[  +0.000007]  x64_sys_call+0x1bc6/0x2680
[  +0.000007]  do_syscall_64+0x70/0x130
[  +0.000007]  entry_SYSCALL_64_after_hwframe+0x76/0x7e

[  +0.000014] The buggy address belongs to the object at ffff8881b8605f80
               which belongs to the cache kmalloc-64 of size 64
[  +0.000020] The buggy address is located 8 bytes inside of
               freed 64-byte region [ffff8881b8605f80, ffff8881b8605fc0)

[  +0.000028] The buggy address belongs to the physical page:
[  +0.000011] page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x1b8605
[  +0.000008] anon flags: 0x17ffffc0000000(node=0|zone=2|lastcpupid=0x1fffff)
[  +0.000007] page_type: 0xffffefff(slab)
[  +0.000009] raw: 0017ffffc0000000 ffff8881000428c0 0000000000000000 dead000000000001
[  +0.000006] raw: 0000000000000000 0000000000200020 00000001ffffefff 0000000000000000
[  +0.000006] page dumped because: kasan: bad access detected

[  +0.000012] Memory state around the buggy address:
[  +0.000011]  ffff8881b8605e80: fa fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
[  +0.000015]  ffff8881b8605f00: 00 00 00 00 00 00 00 00 fc fc fc fc fc fc fc fc
[  +0.000015] >ffff8881b8605f80: fa fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
[  +0.000013]                       ^
[  +0.000011]  ffff8881b8606000: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fc
[  +0.000014]  ffff8881b8606080: fc fc fc fc fc fc fc fa fb fb fb fb fb fb fb fb
[  +0.000013] ==================================================================

The issue reproduced on VG20 during the IGT pci_unplug test.
The root cause of the issue is that the function drm_sched_fini is called before drm_sched_entity_kill.
In drm_sched_fini, the drm_sched_rq structure is freed, but this structure is later accessed by
each entity within the run queue, leading to invalid memory access.
To resolve this, the order of cleanup calls is updated:

    Before:
        amdgpu_fence_driver_sw_fini
        amdgpu_device_ip_fini

    After:
        amdgpu_device_ip_fini
        amdgpu_fence_driver_sw_fini

This updated order ensures that all entities in the IPs are cleaned up first, followed by proper
cleanup of the schedulers.

Additional Investigation:

During debugging, another issue was identified in the amdgpu_vce_sw_fini function. The vce.vcpu_bo
buffer must be freed only as the final step in the cleanup process to prevent any premature
access during earlier cleanup stages.

v2: Using Christian suggestion call drm_sched_entity_destroy before drm_sched_fini.

Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Vitaly Prosyak <vitaly.prosyak@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2024-11-21 15:56:22 -05:00
Xiang Liu
928cd772e1 drm/amdgpu/vcn: reset fw_shared when VCPU buffers corrupted on vcn v4.0.3
It is not necessarily corrupted. When there is RAS fatal error, device
memory access is blocked. Hence vcpu bo cannot be saved to system memory
as in a regular suspend sequence before going for reset. In other full
device reset cases, that gets saved and restored during resume.

v2: Remove redundant code like vcn_v4_0 did
v2: Refine commit message
v3: Drop the volatile
v3: Refine commit message

Signed-off-by: Xiang Liu <xiang.liu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Stanley.Yang <Stanley.Yang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-11-21 15:56:22 -05:00
Jesse.zhang@amd.com
2f1b13521d drm/amdgpu: Fix sysfs warning when hotplugging
Fix the similar warning when hotplugging:

[  155.585721] kernfs: can not remove 'enforce_isolation', no directory
[  155.592201] WARNING: CPU: 3 PID: 6960 at fs/kernfs/dir.c:1683 kernfs_remove_by_name_ns+0xb9/0xc0
[  155.601145] Modules linked in: xt_MASQUERADE xt_comment nft_compat veth bridge stp llc overlay nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip_set nf_tables nfnetlink qrtr intel_rapl_msr amd_atl intel_rapl_common amd64_edac edac_mce_amd amdgpu kvm_amd kvm ipmi_ssif amdxcp rapl drm_exec gpu_sched drm_buddy i2c_algo_bit drm_suballoc_helper drm_ttm_helper ttm pcspkr drm_display_helper acpi_cpufreq drm_kms_helper video wmi k10temp i2c_piix4 acpi_ipmi ipmi_si drm zram ip_tables loop squashfs dm_multipath crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel sha512_ssse3 sha256_ssse3 sha1_ssse3 sp5100_tco ixgbe rfkill ccp dca sunrpc be2iscsi bnx2i cnic uio cxgb4i cxgb4 tls cxgb3i cxgb3 mdio libcxgbi libcxgb qla4xxx iscsi_boot_sysfs iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi ipmi_devintf ipmi_msghandler fuse
[  155.685224] systemd-journald[1354]: Compressed data object 957 -> 524 using ZSTD
[  155.685687] CPU: 3 PID: 6960 Comm: amd_pci_unplug Not tainted 6.10.0-1148853.1.zuul.164395107d6642bdb451071313e9378d #1
[  155.704149] Hardware name: TYAN B8021G88V2HR-2T/S8021GM2NR-2T, BIOS V1.03.B10 04/01/2019
[  155.712383] RIP: 0010:kernfs_remove_by_name_ns+0xb9/0xc0
[  155.717805] Code: a0 00 48 89 ef e8 37 96 c7 ff 5b b8 fe ff ff ff 5d 41 5c 41 5d e9 f7 96 a0 00 0f 0b eb ab 48 c7 c7 48 ba 7e 8f e8 f7 66 bf ff <0f> 0b eb dc 0f 1f 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90
[  155.736766] RSP: 0018:ffffb1685d7a3e20 EFLAGS: 00010296
[  155.742108] RAX: 0000000000000038 RBX: ffff929e94c80000 RCX: 0000000000000000
[  155.749363] RDX: ffff928e1efaf200 RSI: ffff928e1efa18c0 RDI: ffff928e1efa18c0
[  155.756612] RBP: 0000000000000008 R08: 0000000000000000 R09: 0000000000000003
[  155.763855] R10: ffffb1685d7a3cd8 R11: ffffffff8fb3e1c8 R12: ffffffffc1ef5341
[  155.771104] R13: ffff929e94cc5530 R14: 0000000000000000 R15: 0000000000000000
[  155.778357] FS:  00007fd9dd8d9c40(0000) GS:ffff928e1ef80000(0000) knlGS:0000000000000000
[  155.786594] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  155.792450] CR2: 0000561245ceee38 CR3: 0000000113018000 CR4: 00000000003506f0
[  155.799702] Call Trace:
[  155.802254]  <TASK>
[  155.804460]  ? __warn+0x80/0x120
[  155.807798]  ? kernfs_remove_by_name_ns+0xb9/0xc0
[  155.812617]  ? report_bug+0x164/0x190
[  155.816393]  ? handle_bug+0x3c/0x80
[  155.819994]  ? exc_invalid_op+0x17/0x70
[  155.823939]  ? asm_exc_invalid_op+0x1a/0x20
[  155.828235]  ? kernfs_remove_by_name_ns+0xb9/0xc0
[  155.833058]  amdgpu_gfx_sysfs_fini+0x59/0xd0 [amdgpu]
[  155.838637]  gfx_v9_0_sw_fini+0x123/0x1c0 [amdgpu]
[  155.843887]  amdgpu_device_fini_sw+0xbc/0x3e0 [amdgpu]
[  155.849432]  amdgpu_driver_release_kms+0x16/0x30 [amdgpu]
[  155.855235]  drm_dev_put.part.0+0x3c/0x60 [drm]
[  155.859914]  drm_release+0x8b/0xc0 [drm]
[  155.863978]  __fput+0xf1/0x2c0
[  155.867141]  __x64_sys_close+0x3c/0x80
[  155.870998]  do_syscall_64+0x64/0x170

V2: Add details in comments (Tim)

Signed-off-by: Jesse Zhang <jesse.zhang@amd.com>
Reported-by: Andy Dong <andy.dong@amd.com>
Reviewed-by: Tim Huang <tim.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-11-21 15:56:22 -05:00
Jesse.zhang@amd.com
fb9898243a drm/amdgpu: Add sysfs interface for vcn reset mask
Add the sysfs interface for vcn:
vcn_reset_mask

The interface is read-only and show the resets supported by the IP.
For example, full adapter reset (mode1/mode2/BACO/etc),
soft reset, queue reset, and pipe reset.

V2: the sysfs node returns a text string instead of some flags (Christian)

V2: the sysfs node returns a text string instead of some flags (Christian)
v3: add a generic helper which takes the ring as parameter
    and print the strings in the order they are applied (Christian)

    check amdgpu_gpu_recovery  before creating sysfs file itself,
    and initialize supported_reset_types in IP version files (Lijo)
v4: s/sdma/vcn/ in the reset mask setup

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com>
Suggested-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Tim Huang <tim.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-11-21 15:56:22 -05:00
Alex Deucher
4c28e645aa drm/amdgpu/gmc7: fix wait_for_idle callers
The wait_for_idle signature was changed, but the callers
were not.

Reviewed-by: Sunil Khatri <sunil.khatri@amd.com>
Reported-by: Michel Dänzer <michel@daenzer.net>
Fixes: 82ae6619a4 ("drm/amdgpu: update the handle ptr in wait_for_idle")
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: Sunil Khatri <sunil.khatri@amd.com>
2024-11-21 15:56:22 -05:00
Lijo Lazar
da868898cf drm/amd/pm: Remove arcturus min power limit
As per power team, there is no need to impose a lower bound on arcturus
power limit. Any unreasonable limit set will result in frequent
throttling.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2024-11-21 15:56:06 -05:00
Kenneth Feng
76c7f08094 drm/amd/pm: skip setting the power source on smu v14.0.2/3
skip setting power source on smu v14.0.2/3

Signed-off-by: Kenneth Feng <kenneth.feng@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org # 6.11.x
2024-11-21 15:55:58 -05:00
Kenneth Feng
b0df0e7778 drm/amd/pm: disable pcie speed switching on Intel platform for smu v14.0.2/3
disable pcie speed switching on Intel platform for smu v14.0.2/3
based on Intel's requirement.
v2: align the setting with smu v13.

Signed-off-by: Kenneth Feng <kenneth.feng@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org # 6.11.x
2024-11-21 15:55:40 -05:00
Lijo Lazar
cdc6705f98 drm/amdkfd: Use the correct wptr size
Write pointer could be 32-bit or 64-bit. Use the correct size during
initialization.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2024-11-21 15:55:20 -05:00
Thomas Weißschuh
c2753b2471 drm/amd/display: Add support for minimum backlight quirk
Not all platforms provide the full range of PWM backlight capabilities
supported by the hardware through ATIF.
Use the generic drm panel minimum backlight quirk infrastructure to
override the capabilities where necessary.

Testing the backlight quirk together with the "panel_power_savings"
sysfs file has not shown any negative impact.
One quirk seems to be that 0% at panel_power_savings=0 seems to be
slightly darker than at panel_power_savings=4.

Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Tested-by: Dustin L. Howett <dustin@howett.net>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241111-amdgpu-min-backlight-quirk-v7-2-f662851fda69@weissschuh.net
2024-11-21 09:28:13 -06:00
Huacai Chen
4217ef9ab7 drm/amd/display: Allow building DC with clang on LoongArch
Clang on LoongArch (18+) appears to be unaffected by the bug causing
excessive stack usage in calculate_bandwidth(). But when building DC_FP
support the stack frame size can be as large as 2816 bytes, which causes
the FRAME_WARN build warnings. So on LoongArch we allow building DC with
clang, but disable DC_FP by default.

The help message is also updated.

Tested-by: Rui Wang <wangrui@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-11-20 10:03:06 -05:00
Zicheng Qu
2bc96c9507 drm/amd/display: Fix null check for pipe_ctx->plane_state in hwss_setup_dpp
This commit addresses a null pointer dereference issue in
hwss_setup_dpp(). The issue could occur when pipe_ctx->plane_state is
null. The fix adds a check to ensure `pipe_ctx->plane_state` is not null
before accessing. This prevents a null pointer dereference.

Fixes: 0baae62463 ("drm/amd/display: Refactor fast update to use new HWSS build sequence")
Reviewed-by: Tom Chung <chiahsuan.chung@amd.com>
Signed-off-by: Zicheng Qu <quzicheng@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-11-20 10:03:06 -05:00
Zicheng Qu
6a057072dd drm/amd/display: Fix null check for pipe_ctx->plane_state in dcn20_program_pipe
This commit addresses a null pointer dereference issue in
dcn20_program_pipe(). Previously, commit 8e4ed3cf16 ("drm/amd/display:
Add null check for pipe_ctx->plane_state in dcn20_program_pipe")
partially fixed the null pointer dereference issue. However, in
dcn20_update_dchubp_dpp(), the variable pipe_ctx is passed in, and
plane_state is accessed again through pipe_ctx. Multiple if statements
directly call attributes of plane_state, leading to potential null
pointer dereference issues. This patch adds necessary null checks to
ensure stability.

Fixes: 8e4ed3cf16 ("drm/amd/display: Add null check for pipe_ctx->plane_state in dcn20_program_pipe")
Reviewed-by: Tom Chung <chiahsuan.chung@amd.com>
Signed-off-by: Zicheng Qu <quzicheng@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-11-20 10:03:05 -05:00
Lijo Lazar
e283f4fb08 drm/amdgpu: Use reset recovery state checks
Some in_reset checks are infact checking whether the state is
reinitialization after reset. Replace with reset_in_recovery calls to
identify that it's really checking for recovery stage after reset.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Acked-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-11-20 10:03:05 -05:00
Lijo Lazar
a86e0c0e94 drm/amdgpu: Add init level for post reset reinit
When device needs to be reset before initialization, it's not required
for all IPs to be initialized before a reset. In such cases, it needs to
identify whether the IP/feature is initialized for the first time or
whether it's reinitialized after a reset.

Add RESET_RECOVERY init level to identify post reset reinitialization
phase. This only provides a device level identification, IP/features may
choose to track their state independently also.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Acked-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-11-20 10:03:05 -05:00
Kenneth Feng
6719ab8234 drm/amdgpu/pm: add gen5 display to the user on smu v14.0.2/3
add gen5 display to the user on smu v14.0.2/3

Signed-off-by: Kenneth Feng <kenneth.feng@amd.com>
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org # 6.11.x
2024-11-20 10:03:05 -05:00
Victor Zhao
097c69d46c drm/amdkfd: make sure ring buffer is flushed before update wptr
In a consecutive packet submission, for example unmap and query status,
when CP is reading wptr caused by unmap packet doorbell ring, if in some
case CP operates slower (e.g. doorbell_mode=1) and wptr has been updated
to next packet (query status), but the query status packet content has
not been flushed to memory yet, it will cause CP fetched stalled data.

Adding mb to ensure ring buffer has been updated before updating wptr.
Also adding a mb to ensure wptr updated before doorbell ring.

Signed-off-by: Victor Zhao <Victor.Zhao@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-11-20 10:03:05 -05:00
Mario Limonciello
349af06a3a drm/amd: Fix initialization mistake for NBIO 7.11 devices
There is a strapping issue on NBIO 7.11.x that can lead to spurious PME
events while in the D0 state.

Cc: stable@vger.kernel.org
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Link: https://lore.kernel.org/r/20241118174611.10700-2-mario.limonciello@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-11-20 10:03:05 -05:00
Mario Limonciello
902fbbf429 drm/amd: Add some missing straps from NBIO 7.11.0
Earlier ASICs have strap information exported, and this is missing
for NBIO 7.11.0.

Cc: stable@vger.kernel.org
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Fixes: ca8c68142a ("drm/amdgpu: add nbio 7.11 registers")
Link: https://lore.kernel.org/r/20241118174611.10700-1-mario.limonciello@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-11-20 10:02:53 -05:00
Aric Cyr
1c1929d6ab drm/amd/display: 3.2.310
This version brings along the following:

- DC core fixes
- DCN35 fix
- DCN4+ fixes
- DML2 fix
- New SPL features

Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Aric Cyr <aric.cyr@amd.com>
Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-11-20 09:41:29 -05:00
Ovidiu Bunea
a3e6079bd9 drm/amd/display: Remove PIPE_DTO_SRC_SEL programming from set_dtbclk_dto
There are cases where an OTG is remapped from driving a regular HDMI
display to a DP/eDP display. There are also cases where DTBCLK needs to
be enabled for HPO, but DTBCLK DTO programming may be done while OTG is
still enabled which is dangerous as the PIPE_DTO_SRC_SEL programming may
change the pixel clock generator source for a mapped and running OTG and
cause it to hang.

Remove the PIPE_DTO_SRC_SEL programming from this sequence since it is
already done in program_pixel_clk(). Additionally, make sure that
program_pixel_clk sets DTBCLK DTO as source for special HDMI cases.

Cc: stable@vger.kernel.org # 6.11+
Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Ovidiu Bunea <Ovidiu.Bunea@amd.com>
Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-11-20 09:41:22 -05:00
Samson Tam
1df1d452d2 drm/amd/display: allow chroma 1:1 scaling when sharpness is off
[Why]
SPL code forces taps to 1 when ratio is 1:1 and sharpness is off
But for chroma 1:1, need taps > 1 to handle cositing

[How]
Do not force chroma taps to 1 when ratio is 1:1 for YUV420
Remove 420_CHROMA_BYPASS mode for scaler

Reviewed-by: Navid Assadian <navid.assadian@amd.com>
Signed-off-by: Samson Tam <Samson.Tam@amd.com>
Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-11-20 09:41:16 -05:00
Austin Zheng
c3ea03c2a1 drm/amd/display: Populate Power Profile In Case of Early Return
Early return possible if context has no clk_mgr.
This will lead to an invalid power profile being returned
which looks identical to a profile with the lowest power level.
Add back logic that populated the power profile and overwrite
the value if needed.

Cc: stable@vger.kernel.org
Fixes: d016d0dd5a ("drm/amd/display: Update Interface to Check UCLK DPM")
Reviewed-by: Dillon Varone <dillon.varone@amd.com>
Signed-off-by: Austin Zheng <Austin.Zheng@amd.com>
Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-11-20 09:39:58 -05:00