Commit Graph

683 Commits

Author SHA1 Message Date
Alex Deucher
25e07c8403 drm/amdgpu: fix pm notifier handling
commit 4aaffc8575 upstream.

Set the s3/s0ix and s4 flags in the pm notifier so that we can skip
the resource evictions properly in pm prepare based on whether
we are suspending or hibernating.  Drop the eviction as processes
are not frozen at this time, we we can end up getting stuck trying
to evict VRAM while applications continue to submit work which
causes the buffers to get pulled back into VRAM.

v2: Move suspend flags out of pm notifier (Mario)

Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4178
Fixes: 2965e6355d ("drm/amd: Add Suspend/Hibernate notification callback support")
Cc: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 06f2dcc241)
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-05-22 14:29:54 +02:00
Mario Limonciello
4d45a5f1e2 drm/amd: Add Suspend/Hibernate notification callback support
[ Upstream commit 2965e6355d ]

As part of the suspend sequence VRAM needs to be evicted on dGPUs.
In order to make suspend/resume more reliable we moved this into
the pmops prepare() callback so that the suspend sequence would fail
but the system could remain operational under high memory usage suspend.

Another class of issues exist though where due to memory fragementation
there isn't a large enough contiguous space and swap isn't accessible.

Add support for a suspend/hibernate notification callback that could
evict VRAM before tasks are frozen. This should allow paging out to swap
if necessary.

Link: https://github.com/ROCm/ROCK-Kernel-Driver/issues/174
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/3476
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/2362
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3781
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Link: https://lore.kernel.org/r/20241128032656.2090059-2-superm1@kernel.org
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Stable-dep-of: d0ce1aaa85 ("Revert "drm/amd: Stop evicting resources on APUs in suspend"")
Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-05-22 14:29:37 +02:00
Mario Limonciello
b542559343 drm/amd: Handle being compiled without SI or CIK support better
commit 5f054ddead upstream.

If compiled without SI or CIK support but amdgpu tries to load it
will run into failures with uninitialized callbacks.

Show a nicer message in this case and fail probe instead.

Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4050
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-04-25 10:47:58 +02:00
Mario Limonciello
cf1b904867 drm/amd: Keep display off while going into S4
[ Upstream commit 4afacc9948 ]

When userspace invokes S4 the flow is:

1) amdgpu_pmops_prepare()
2) amdgpu_pmops_freeze()
3) Create hibernation image
4) amdgpu_pmops_thaw()
5) Write out image to disk
6) Turn off system

Then on resume amdgpu_pmops_restore() is called.

This flow has a problem that because amdgpu_pmops_thaw() is called
it will call amdgpu_device_resume() which will resume all of the GPU.

This includes turning the display hardware back on and discovering
connectors again.

This is an unexpected experience for the display to turn back on.
Adjust the flow so that during the S4 sequence display hardware is
not turned back on.

Reported-by: Xaver Hugl <xaver.hugl@gmail.com>
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/2038
Cc: Muhammad Usama Anjum <usama.anjum@collabora.com>
Tested-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Harry Wentland <harry.wentland@amd.com>
Link: https://lore.kernel.org/r/20250306185124.44780-1-mario.limonciello@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 68bfdc8dc0)
Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-04-10 14:39:31 +02:00
Alex Deucher
27b929c45d drm/amdgpu: bump version for RV/PCO compute fix
commit 55ed2b1b50 upstream.

Bump the driver version for RV/PCO compute stability fix
so mesa can use this check to enable compute queues on
RV/PCO.

Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org # 6.12.x
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-02-27 04:30:24 -08:00
Marek Olšák
20a57f68db drm/amdgpu: add a BO metadata flag to disable write compression for Vulkan
commit 2255b40cac upstream.

Vulkan can't support DCC and Z/S compression on GFX12 without
WRITE_COMPRESS_DISABLE in this commit or a completely different DCC
interface.

AMDGPU_TILING_GFX12_SCANOUT is added because it's already used by userspace.

Cc: stable@vger.kernel.org # 6.12.x
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-02-17 10:05:09 +01:00
Linus Torvalds
994aeacbb3 Merge tag 'drm-next-2024-09-28' of https://gitlab.freedesktop.org/drm/kernel
Pull drm fixes from Dave Airlie:
 "Regular fixes for the week to end the merge window, i915 and xe have a
  few each, amdgpu makes up most of it with a bunch of SR-IOV related
  fixes amongst others.

  i915:
   - Fix BMG support to UHBR13.5
   - Two PSR fixes
   - Fix colorimetry detection for DP

  xe:
   - Fix macro for checking minimum GuC version
   - Fix CCS offset calculation for some BMG SKUs
   - Fix locking on memory usage reporting via fdinfo and BO destroy
   - Fix GPU page fault handler on a closed VM
   - Fix overflow in oa batch buffer

  amdgpu:
   - MES 12 fix
   - KFD fence sync fix
   - SR-IOV fixes
   - VCN 4.0.6 fix
   - SDMA 7.x fix
   - Bump driver version to note cleared VRAM support
   - SWSMU fix
   - CU occupancy logic fix
   - SDMA queue fix"

* tag 'drm-next-2024-09-28' of https://gitlab.freedesktop.org/drm/kernel: (79 commits)
  drm/amd/pm: update workload mask after the setting
  drm/amdgpu: bump driver version for cleared VRAM
  drm/amdgpu: fix vbios fetching for SR-IOV
  drm/amdgpu: fix PTE copy corruption for sdma 7
  drm/amdkfd: Add SDMA queue quantum support for GFX12
  drm/amdgpu/vcn: enable AV1 on both instances
  drm/amdkfd: Fix CU occupancy for GFX 9.4.3
  drm/amdkfd: Update logic for CU occupancy calculations
  drm/amdgpu: skip coredump after job timeout in SRIOV
  drm/amdgpu: sync to KFD fences before clearing PTEs
  drm/amdgpu/mes12: set enable_level_process_quantum_check
  drm/i915/dp: Fix colorimetry detection
  drm/amdgpu/mes12: reduce timeout
  drm/amdgpu/mes11: reduce timeout
  drm/amdgpu: use GEM references instead of TTMs v2
  drm/amd/display: Allow backlight to go below `AMDGPU_DM_DEFAULT_MIN_BACKLIGHT`
  drm/amd/display: Fix kdoc entry for 'tps' in 'dc_process_dmub_dpia_set_tps_notification'
  drm/amdgpu: update golden regs for gfx12
  drm/amdgpu: clean up vbios fetching code
  drm/amd/display: handle nulled pipe context in DCE110's set_drr()
  ...
2024-09-28 08:47:46 -07:00
Alex Deucher
34ad56a467 drm/amdgpu: bump driver version for cleared VRAM
Driver now clears VRAM on allocation.  Bump the
driver version so mesa knows when it will get
cleared vram by default.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org # 6.11.x
2024-09-26 17:04:47 -04:00
Linus Torvalds
de848da12f Merge tag 'drm-next-2024-09-19' of https://gitlab.freedesktop.org/drm/kernel
Pull drm updates from Dave Airlie:
 "This adds a couple of patches outside the drm core, all should be
  acked appropriately, the string and pstore ones are the main ones that
  come to mind.

  Otherwise it's the usual drivers, xe is getting enabled by default on
  some new hardware, we've changed the device number handling to allow
  more devices, and we added some optional rust code to create QR codes
  in the panic handler, an idea first suggested I think 10 years ago :-)

  string:
   - add mem_is_zero()

  core:
   - support more device numbers
   - use XArray for minor ids
   - add backlight constants
   - Split dma fence array creation into alloc and arm

  fbdev:
   - remove usage of old fbdev hooks

  kms:
   - Add might_fault() to drm_modeset_lock priming
   - Add dynamic per-crtc vblank configuration support

  dma-buf:
   - docs cleanup

  buddy:
   - Add start address support for trim function

  printk:
   - pass description to kmsg_dump

  scheduler:
   - Remove full_recover from drm_sched_start

  ttm:
   - Make LRU walk restartable after dropping locks
   - Allow direct reclaim to allocate local memory

  panic:
   - add display QR code (in rust)

  displayport:
   - mst: GUID improvements

  bridge:
   - Silence error message on -EPROBE_DEFER
   - analogix: Clean aup
   - bridge-connector: Fix double free
   - lt6505: Disable interrupt when powered off
   - tc358767: Make default DP port preemphasis configurable
   - lt9611uxc: require DRM_BRIDGE_ATTACH_NO_CONNECTOR
   - anx7625: simplify OF array handling
   - dw-hdmi: simplify clock handling
   - lontium-lt8912b: fix mode validation
   - nwl-dsi: fix mode vsync/hsync polarity

  xe:
   - Enable LunarLake and Battlemage support
   - Introducing Xe2 ccs modifiers for integrated and discrete graphics
   - rename xe perf to xe observation
   - use wb caching on DGFX for system memory
   - add fence timeouts
   - Lunar Lake graphics/media/display workarounds
   - Battlemage workarounds
   - Battlemage GSC support
   - GSC and HuC fw updates for LL/BM
   - use dma_fence_chain_free
   - refactor hw engine lookup and mmio access
   - enable priority mem read for Xe2
   - Add first GuC BMG fw
   - fix dma-resv lock
   - Fix DGFX display suspend/resume
   - Use xe_managed for kernel BOs
   - Use reserved copy engine for user binds on faulting devices
   - Allow mixing dma-fence jobs and long-running faulting jobs
   - fix media TLB invalidation
   - fix rpm in TTM swapout path
   - track resources and VF state by PF

  i915:
   - Type-C programming fix for MTL+
   - FBC cleanup
   - Calc vblank delay more accurately
   - On DP MST, Enable LT fallback for UHBR<->non-UHBR rates
   - Fix DP LTTPR detection
   - limit relocations to INT_MAX
   - fix long hangs in buddy allocator on DG2/A380

  amdgpu:
   - Per-queue reset support
   - SDMA devcoredump support
   - DCN 4.0.1 updates
   - GFX12/VCN4/JPEG4 updates
   - Convert vbios embedded EDID to drm_edid
   - GFX9.3/9.4 devcoredump support
   - process isolation framework for GFX 9.4.3/4
   - take IOMMU mappings into account for P2P DMA

  amdkfd:
   - CRIU fixes
   - HMM fix
   - Enable process isolation support for GFX 9.4.3/4
   - Allow users to target recommended SDMA engines
   - KFD support for targetting queues on recommended SDMA engines

  radeon:
   - remove .load and drm_dev_alloc
   - Fix vbios embedded EDID size handling
   - Convert vbios embedded EDID to drm_edid
   - Use GEM references instead of TTM
   - r100 cp init cleanup
   - Fix potential overflows in evergreen CS offset tracking

  msm:
   - DPU:
      - implement DP/PHY mapping on SC8180X
      - Enable writeback on SM8150, SC8180X, SM6125, SM6350
   - DP:
      - Enable widebus on all relevant chipsets
      - MSM8998 HDMI support
   - GPU:
      - A642L speedbin support
      - A615/A306/A621 support
      - A7xx devcoredump support

  ast:
   - astdp: Support AST2600 with VGA
   - Clean up HPD
   - Fix timeout loop for DP link training
   - reorganize output code by type (VGA, DP, etc)
   - convert to struct drm_edid
   - fix BMC handling for all outputs

  exynos:
   - drop stale MAINTAINERS pattern
   - constify struct

  loongson:
   - use GEM refcount over TTM

  mgag200:
   - Improve BMC handling
   - Support VBLANK intterupts
   - transparently support BMC outputs

  nouveau:
   - Refactor and clean up internals
   - Use GEM refcount over TTM's

  gm12u320:
   - convert to struct drm_edid

  gma500:
   - update i2c terms

  lcdif:
   - pixel clock fix

  host1x:
   - fix syncpoint IRQ during resume
   - use iommu_paging_domain_alloc()

  imx:
   - ipuv3: convert to struct drm_edid

  omapdrm:
   - improve error handling
   - use common helper for_each_endpoint_of_node()

  panel:
   - add support for BOE TV101WUM-LL2 plus DT bindings
   - novatek-nt35950: improve error handling
   - nv3051d: improve error handling
   - panel-edp:
      - add support for BOE NE140WUM-N6G
      - revert support for SDC ATNA45AF01
   - visionox-vtdr6130:
      - improve error handling
      - use devm_regulator_bulk_get_const()
   - boe-th101mb31ig002:
      - Support for starry-er88577 MIPI-DSI panel plus DT
      - Fix porch parameter
   - edp: Support AOU B116XTN02.3, AUO B116XAN06.1, AOU B116XAT04.1, BOE
     NV140WUM-N41, BOE NV133WUM-N63, BOE NV116WHM-A4D, CMN N116BCA-EA2,
     CMN N116BCP-EA2, CSW MNB601LS1-4
   - himax-hx8394: Support Microchip AC40T08A MIPI Display panel plus DT
   - ilitek-ili9806e: Support Densitron DMT028VGHMCMI-1D TFT plus DT
   - jd9365da:
      - Support Melfas lmfbx101117480 MIPI-DSI panel plus DT
      - Refactor for code sharing
   - panel-edp: fix name for HKC MB116AN01
   - jd9365da: fix "exit sleep" commands
   - jdi-fhd-r63452: simplify error handling with DSI multi-style
     helpers
   - mantix-mlaf057we51: simplify error handling with DSI multi-style
     helpers
   - simple:
      - support Innolux G070ACE-LH3 plus DT bindings
      - support On Tat Industrial Company KD50G21-40NT-A1 plus DT
        bindings
   - st7701:
      - decouple DSI and DRM code
      - add SPI support
      - support Anbernic RG28XX plus DT bindings

  mediatek:
   - support alpha blending
   - remove cl in struct cmdq_pkt
   - ovl adaptor fix
   - add power domain binding for mediatek DPI controller

  renesas:
   - rz-du: add support for RZ/G2UL plus DT bindings

  rockchip:
   - Improve DP sink-capability reporting
   - dw_hdmi: Support 4k@60Hz
   - vop:
      - Support RGB display on Rockchip RK3066
      - Support 4096px width

  sti:
   - convert to struct drm_edid

  stm:
   - Avoid UAF wih managed plane and CRTC helpers
   - Fix module owner
   - Fix error handling in probe
   - Depend on COMMON_CLK
   - ltdc:
      - Fix transparency after disabling plane
      - Remove unused interrupt

  tegra:
   - gr3d: improve PM domain handling
   - convert to struct drm_edid
   - Call drm_atomic_helper_shutdown()

  vc4:
   - fix PM during detect
   - replace DRM_ERROR() with drm_error()
   - v3d: simplify clock retrieval

  v3d:
   - Clean up perfmon

  virtio:
   - add DRM capset"

* tag 'drm-next-2024-09-19' of https://gitlab.freedesktop.org/drm/kernel: (1326 commits)
  drm/xe: Fix missing conversion to xe_display_pm_runtime_resume
  drm/xe/xe2hpg: Add Wa_15016589081
  drm/xe: Don't keep stale pointer to bo->ggtt_node
  drm/xe: fix missing 'xe_vm_put'
  drm/xe: fix build warning with CONFIG_PM=n
  drm/xe: Suppress missing outer rpm protection warning
  drm/xe: prevent potential UAF in pf_provision_vf_ggtt()
  drm/amd/display: Add all planes on CRTC to state for overlay cursor
  drm/i915/bios: fix printk format width
  drm/i915/display: Fix BMG CCS modifiers
  drm/amdgpu: get rid of bogus includes of fdtable.h
  drm/amdkfd: CRIU fixes
  drm/amdgpu: fix a race in kfd_mem_export_dmabuf()
  drm: new helper: drm_gem_prime_handle_to_dmabuf()
  drm/amdgpu/atomfirmware: Silence UBSAN warning
  drm/amdgpu: Fix kdoc entry in 'amdgpu_vm_cpu_prepare'
  drm/amd/amdgpu: apply command submission parser for JPEG v1
  drm/amd/amdgpu: apply command submission parser for JPEG v2+
  drm/amd/pm: fix the pp_dpm_pcie issue on smu v14.0.2/3
  drm/amd/pm: update the features set on smu v14.0.2/3
  ...
2024-09-19 10:18:15 +02:00
Ramesh Errabolu
01be2b62c0 drm/amdgpu: Surface svm_default_granularity, a RW module parameter
Enables users to update SVM's default granularity, used in
buffer migration and handling of recoverable page faults.
Param value is set in terms of log(numPages(buffer)),
e.g. 9 for a 2 MIB buffer

Signed-off-by: Ramesh Errabolu <Ramesh.Errabolu@amd.com>
Reviewed-by: Philip Yang <Philip.Yang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-09-06 17:55:05 -04:00
Christian Brauner
641bb4394f fs: move FMODE_UNSIGNED_OFFSET to fop_flags
This is another flag that is statically set and doesn't need to use up
an FMODE_* bit. Move it to ->fop_flags and free up another FMODE_* bit.

(1) mem_open() used from proc_mem_operations
(2) adi_open() used from adi_fops
(3) drm_open_helper():
    (3.1) accel_open() used from DRM_ACCEL_FOPS
    (3.2) drm_open() used from
    (3.2.1) amdgpu_driver_kms_fops
    (3.2.2) psb_gem_fops
    (3.2.3) i915_driver_fops
    (3.2.4) nouveau_driver_fops
    (3.2.5) panthor_drm_driver_fops
    (3.2.6) radeon_driver_kms_fops
    (3.2.7) tegra_drm_fops
    (3.2.8) vmwgfx_driver_fops
    (3.2.9) xe_driver_fops
    (3.2.10) DRM_GEM_FOPS
    (3.2.11) DEFINE_DRM_GEM_DMA_FOPS
(4) struct memdev sets fmode flags based on type of device opened. For
    devices using struct mem_fops unsigned offset is used.

Mark all these file operations as FOP_UNSIGNED_OFFSET and add asserts
into the open helper to ensure that the flag is always set.

Link: https://lore.kernel.org/r/20240809-work-fop_unsigned-v1-1-658e054d893e@kernel.org
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-08-30 08:22:36 +02:00
Alex Deucher
a9b67c036c drm/amdgpu: add experimental resets debug flag
Add this flag to enable experimental resets for testing before they
are fully validated.

Reviewed-and-tested-by: Jiadong Zhu <Jiadong.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-29 13:38:36 -04:00
Thomas Zimmermann
7a26f18119 drm/amdgpu: Do not set struct drm_driver.lastclose
Remove the implementation of struct drm_driver.lastclose. The hook
was only necessary before in-kernel DRM clients existed, but is now
obsolete. The code in amdgpu_driver_lastclose_kms() is performed by
drm_lastclose().

v2:
- update commit message

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240812083000.337744-3-tzimmermann@suse.de
2024-08-13 16:21:08 +02:00
Aurabindo Pillai
28814be882 drm/amd: Bump KMS_DRIVER_MINOR version
Increase the KMS minor version to indicate GFX12 DCC support since this
contains a major change in how DCC is managed across IPs like GFX, DCN
etc. This will be used mainly by userspace like Mesa to figure out
DCC support on GFX12 hardware.

v2: fix version number (Alex)

Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-07-15 17:41:23 -04:00
Kenneth Feng
bf826ba9b4 Revert "drm/amd/amdgpu: add module parameter for jpeg"
This reverts commit d3620eeae8.
Revert this due to a final solution:
commit ed3165d660 ("drm/amdgpu/jpeg5: reprogram doorbell setting after power up for each playback")

Signed-off-by: Kenneth Feng <kenneth.feng@amd.com>
Reviewed-by: Sonny Jiang <sonjiang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-27 17:10:40 -04:00
Dave Airlie
f680df51ca Merge tag 'drm-misc-next-2024-05-30' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-next
drm-misc-next for 6.11:

UAPI Changes:
  - Deprecate DRM date and return a 0 date in DRM_IOCTL_VERSION

Core Changes:
  - connector: Create a set of helpers to help with HDMI support
  - fbdev: Create memory manager optimized fbdev emulation
  - panic: Allow to select fonts, improve drm_fb_dma_get_scanout_buffer

Driver Changes:
  - Remove driver owner assignments
  - Allow more drivers to compile with COMPILE_TEST
  - Conversions to drm_edid
  - ivpu: hardware scheduler support, profiling support, improvements
    to the platform support layer
  - mgag200: general reworks and improvements
  - nouveau: Add NVreg_RegistryDwords command line option
  - rockchip: Conversion to the hdmi helpers
  - sun4i: Conversion to the hdmi helpers
  - vc4: Conversion to the hdmi helpers
  - v3d: Perf counters improvements
  - zynqmp: IRQ and debugfs improvements
  - bridge:
    - Remove redundant checks on bridge->encoder
  - panels:
    - Switch panels from register table initialization to proper code
    - Now that the panel code tracks the panel state, remove every
      ad-hoc implementation in the panel drivers
    - New panels: Lincoln Tech Sol LCD185-101CT, Microtips Technology
      13-101HIEBCAF0-C, Microtips Technology MF-103HIEB0GA0, BOE
      nv110wum-l60, IVO t109nw41

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Maxime Ripard <mripard@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240530-hilarious-flat-magpie-5fa186@houat
2024-06-21 10:30:31 +10:00
Kenneth Feng
d3620eeae8 drm/amd/amdgpu: add module parameter for jpeg
add module parameter for jpeg. this is a temporary workaround for jpeg unit test fail
on vcn 5.0 now. will be removed later.

Signed-off-by: Kenneth Feng <kenneth.feng@amd.com>
Reviewed-by: Sonny Jiang <sonny.jiang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-05-17 17:40:37 -04:00
Likun Gao
7be73af53b drm/amdgpu: switch default mes to uni mes
Switch the default mes to uni mes for gfx v12.
V2: remove uni_mes set for gfx v11.

Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Jack Xiao <Jack.Xiao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-05-17 17:40:36 -04:00
Yang Wang
b2aa3d4b30 drm/amdgpu: add debug flag to enable RAS ACA
Use debug_mask=0x10 (BIT.4) param to help enable RAS ACA.
(RAS ACA is disabled by default.)

Signed-off-by: Yang Wang <kevinyang.wang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-05-17 17:40:36 -04:00
Saleemkhan Jamadar
98a2e3a0d1 drm/amdgpu/umsch: add support to capture fw debug log
Added support to capture unsch fw debug logs in debugfs.
To enable set amdgpu_umschfw_log =1 in boot args.

v1 - rename variable to umsch_mm_fwlog (Veera)

Signed-off-by: Saleemkhan Jamadar <saleemkhan.jamadar@amd.com>
Reviewed-by: Veerabadhran Gopalakrishnan <Veerabadhran.Gopalakrishnan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-05-13 15:45:41 -04:00
Jack Xiao
3dc434ad26 drm/amdgpu: add module parameter 'amdgpu_uni_mes'
Add module parameter 'amdgpu_uni_mes' to enable/disable unified
mes fw support.

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-05-02 16:18:14 -04:00
Thomas Zimmermann
aae4682e5d drm/fbdev-generic: Convert to fbdev-ttm
Only TTM-based drivers use fbdev-generic. Rename it to fbdev-ttm and
change the symbol infix from _generic_ to _ttm_. Link the source file
into TTM helpers, so that it is only build if TTM-based drivers have
been selected. Select DRM_TTM_HELPER for loongson.

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Javier Martinez Canillas <javierm@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240419083331.7761-43-tzimmermann@suse.de
2024-05-02 11:33:32 +02:00
Stanley.Yang
939c475181 drm/amdgpu: Support setting reset_method at runtime
In order to support more test cases, support user
change reset_method at runtime.

Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-04-23 12:08:30 -04:00
Ahmad Rehman
ea137071ad drm/amdgpu: Skip the coredump collection on reset during driver reload
In passthrough environment, the driver triggers the mode-1 reset on
reload. The reset causes the core dump collection which is delayed task
and prevents driver from unloading until it is completed. Since we do
not need to collect data on "reset on reload" case, we can skip core
dump collection.

v2: Use the same flag to avoid calling amdgpu_reset_reg_dumps as well.

Signed-off-by: Ahmad Rehman <Ahmad.Rehman@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-04-18 23:46:45 -04:00
Ma Jun
fcc0735b00 drm/amdgpu: Add support for BAMACO mode checking
Optimize the code to add support for BAMACO mode checking

Signed-off-by: Ma Jun <Jun.Ma2@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-04-09 22:08:18 -04:00
shaoyunl
e58acb7613 drm/amdgpu : Add mes_log_enable to control mes log feature
The MES log might slow down the performance for extra step of log the data,
disable it by default and introduce a parameter can enable it when necessary

Signed-off-by: shaoyunl <shaoyun.liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-04-09 21:50:27 -04:00
Ahmad Rehman
f679fd6057 drm/amdgpu: Init zone device and drm client after mode-1 reset on reload
In passthrough environment, when amdgpu is reloaded after unload, mode-1
is triggered after initializing the necessary IPs, That init does not
include KFD, and KFD init waits until the reset is completed. KFD init
is called in the reset handler, but in this case, the zone device and
drm client is not initialized, causing app to create kernel panic.

v2: Removing the init KFD condition from amdgpu_amdkfd_drm_client_create.
As the previous version has the potential of creating DRM client twice.

v3: v2 patch results in SDMA engine hung as DRM open causes VM clear to SDMA
before SDMA init. Adding the condition to in drm client creation, on top of v1,
to guard against drm client creation call multiple times.

Signed-off-by: Ahmad Rehman <Ahmad.Rehman@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-03-20 13:12:57 -04:00
Ma Jun
86e14a7386 drm/amdgpu: Use rpm_mode flag instead of checking it again for rpm
Because the rpm_mode flag is already set when the driver
is initialized, we use it directly for runtime suspend/resume
instead of checking it again

Signed-off-by: Ma Jun <Jun.Ma2@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-03-04 15:59:08 -05:00
Bjorn Helgaas
b07395d5d5 drm/amdgpu: remove misleading amdgpu_pmops_runtime_idle() comment
After 4020c22802 ("drm/amdgpu: don't runtime suspend if there are
displays attached (v3)"), "ret" is unconditionally set later before being
used, so there's point in initializing it and the associated comment is no
longer meaningful.

Remove the comment and the unnecessary initialization.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-02-29 20:35:39 -05:00
Alex Deucher
959143dab1 Revert "drm/amd: Remove freesync video mode amdgpu parameter"
This reverts commit e94e787e37.

This conflicts with how compositors want to handle VRR.  Now
that compositors actually handle VRR, we probably don't need
freesync video.

Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/2985
Acked-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-02-29 20:35:31 -05:00
Hamza Mahfooz
040fdcde28 drm/amdgpu: respect the abmlevel module parameter value if it is set
Currently, if the abmlevel module parameter is set, it is possible for
user space to override the ABM level at some point after boot. However,
that is undesirable because it means that we aren't respecting the
user's wishes with regard to the level that they want to use. So,
prevent user space from changing the ABM level if the module parameter
is set to a non-auto value.

Tested-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-02-12 16:14:05 -05:00
Hamza Mahfooz
fc184dbe9f drm/amdgpu: make damage clips support configurable
We have observed that there are quite a number of PSR-SU panels on the
market that are unable to keep up with what user space throws at them,
resulting in hangs and random black screens. So, make damage clips
support configurable and disable it by default for PSR-SU displays.

Cc: stable@vger.kernel.org
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-02-12 16:10:24 -05:00
Prike Liang
0326de4c44 drm/amdgpu: skip to program GFXDEC registers for suspend abort
In the suspend abort cases, the gfx power rail doesn't turn off so
some GFXDEC registers/CSB can't reset to default value and at this
moment reinitialize GFXDEC/CSB will result in an unexpected error.
So let skip those program sequence for the suspend abort case.

Signed-off-by: Prike Liang <Prike.Liang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-02-07 12:26:22 -05:00
Alex Deucher
39a82d304b drm/amdgpu: fix typo in parameter description
Missing space.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-02-07 10:01:05 -05:00
Le Ma
c0125b848a drm/amdgpu: move the drm client creation behind drm device registration
This patch is to eliminate interrupt warning below:

  "[drm] Fence fallback timer expired on ring sdma0.0".

An early vm pt clearing job is sent to SDMA ahead of interrupt enabled.
And re-locating the drm client creation following after drm_dev_register
looks like a more proper flow.

v2: wrap the drm client creation

Fixes: 1819200166 ("drm/amdkfd: Export DMABufs from KFD using GEM handles")
Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-01-29 15:35:13 -05:00
Christian König
087a3e13ec drm/amdgpu: revert "Adjust removal control flow for smu v13_0_2"
Calling amdgpu_device_ip_resume_phase1() during shutdown leaves the
HW in an active state and is an unbalanced use of the IP callbacks.

Using the IP callbacks like this can lead to memory leaks, double
free and imbalanced reference counters.

Leaving the HW in an active state can lead to DMA accesses to memory now
freed by the driver.

Both is a complete no-go for driver unload so completely revert the
workaround for now.

This reverts commit f5c7e77970.

Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-01-18 15:42:20 -05:00
chenxuebing
b679566bf0 drm/amdgpu: Clean up errors in amdgpu_drv.c
Fix the following errors reported by checkpatch:

ERROR: do not initialise globals to 0

Signed-off-by: chenxuebing <chenxb_99091@126.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-01-15 18:35:40 -05:00
Le Ma
51258acdc4 drm/amdgpu: move debug options init prior to amdgpu device init
To bring debug options into effect in early initialization phase

Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-01-15 18:32:54 -05:00
Le Ma
d20e1aec88 drm/amdgpu: add debug flag to place fw bo on vram for frontdoor loading
Use debug_mask=0x8 param to help isolating data path issues
on new systems in early phase.

v2: rename the flag for explicitness (lijo)

Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-01-15 18:32:49 -05:00
Le Ma
6c5683bd9e Revert "drm/amdgpu: add param to specify fw bo location for front-door loading"
This reverts commit c572abffe9.

Will use debug module param instead of independent module param.

Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-01-15 18:32:23 -05:00
Le Ma
c572abffe9 drm/amdgpu: add param to specify fw bo location for front-door loading
This param can help isolating data path issues on new systems in early phase.

Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-01-05 16:10:43 -05:00
Friedrich Vock
91963397c4 drm/amdgpu: Enable tunneling on high-priority compute queues
This improves latency if the GPU is already busy with other work.
This is useful for VR compositors that submit highly latency-sensitive
compositing work on high-priority compute queues while the GPU is busy
rendering the next frame.

Userspace merge request:
https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26462

v2: bump driver version (Alex)

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Friedrich Vock <friedrich.vock@gmx.de>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-12-13 15:23:59 -05:00
Evan Quan
b8b39de646 drm/amd/pm: setup the framework to support Wifi RFI mitigation feature
With WBRF feature supported, as a driver responding to the frequencies,
amdgpu driver is able to do shadow pstate switching to mitigate possible
interference(between its (G-)DDR memory clocks and local radio module
frequency bands used by Wifi 6/6e/7).

--
v1->v2:
  - update the prompt for feature support(Lijo)
v8->v9:
  - update parameter document for smu_wbrf_event_handler(Simon)
v9->v10:
v10->v11:
 - correct the logics for wbrf range sorting(Lijo)
v13:
 - Fix the format issue (IIpo Jarvinen)

Signed-off-by: Evan Quan <quanliangl@hotmail.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Ma Jun <Jun.Ma2@amd.com>
Signed-off-by: Ma Jun <Jun.Ma2@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-12-13 15:23:50 -05:00
Mario Limonciello
bd1f6a31e7 drm/amd: Enable PCIe PME from D3
When dGPU is put into BOCO it may be in D3cold but still able send
PME on display hotplug event. For this to work it must be enabled
as wake source from D3.

When runpm is enabled use pci_wake_from_d3() to mark wakeup as
enabled by default.

Cc: stable@vger.kernel.org # 6.1+
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-11-29 16:49:23 -05:00
Alex Deucher
6ba5b61383 drm/amdgpu: add a module parameter to control the AGP aperture
Add a module parameter to control the AGP aperture.  The AGP
aperture is an aperture in the GPU's internal address space
which provides direct non-paged access to the platform address
space.  This access is non-snooped so only uncached memory
can be accessed.

Add a knob so that we can toggle this for debugging.

Fixes: 67318cb843 ("drm/amdgpu/gmc11: set gart placement GC11")
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Tested-by: Mario Limonciello <mario.limonciello@amd.com> # PHX & Navi33
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-11-17 00:58:20 -05:00
Ma Jun
dbab63561b drm/amdgpu: Optimize the asic type fix code
Use a new struct array to define the asic information which
asic type needs to be fixed.

Signed-off-by: Ma Jun <Jun.Ma2@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-11-03 12:18:32 -04:00
Mario Limonciello
5095d54181 drm/amd: Evict resources during PM ops prepare() callback
Linux PM core has a prepare() callback run before suspend.

If the system is under high memory pressure, the resources may need
to be evicted into swap instead.  If the storage backing for swap
is offlined during the suspend() step then such a call may fail.

So move this step into prepare() to move evict majority of
resources and update all non-pmops callers to call the same callback.

Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2362
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-13 11:00:18 -04:00
Boyuan Zhang
6cb8e3ee3a drm/amdgpu: update ib start and size alignment
Update IB starting address alignment and size alignment with correct values
for decode and encode IPs.

Decode IB starting address alignment: 256 bytes
Decode IB size alignment: 64 bytes
Encode IB starting address alignment: 256 bytes
Encode IB size alignment: 4 bytes

Also bump amdgpu driver version for this update.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-09 16:51:39 -04:00
Alex Deucher
7a41ed8b59 drm/amdgpu: add new INFO ioctl query for the last GPU page fault
Add a interface to query the last GPU page fault for the process.
Useful for debugging context lost errors.

v2: split vmhub representation between kernel and userspace
v3: add locking when fetching fault info in INFO IOCTL

Mesa MR: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23238
libdrm MR: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23238

Cc: samuel.pitoiset@gmail.com
Reviewed-by: Christian König <christian.koenig@amd.com>
Acked-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-05 17:49:39 -04:00
Mario Limonciello
5dc270d366 drm/amd: Add a module parameter for seamless boot
The module parameter can be used to test more easily enabling seamless
boot support on additional ASICs.

Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-09-20 12:24:39 -04:00