drm/amdgpu: disable peer-to-peer access for DCC-enabled GC12 VRAM surfaces

commit 22a36e660d upstream.

Certain multi-GPU configurations (especially GFX12) may hit
data corruption when a DCC-compressed VRAM surface is shared across GPUs
using peer-to-peer (P2P) DMA transfers.

Such surfaces rely on device-local metadata and cannot be safely accessed
through a remote GPU’s page tables. Attempting to import a DCC-enabled
surface through P2P leads to incorrect rendering or GPU faults.

This change disables P2P for DCC-enabled VRAM buffers that are contiguous
and allocated on GFX12+ hardware.  In these cases, the importer falls back
to the standard system-memory path, avoiding invalid access to compressed
surfaces.

Future work could consider optional migration (VRAM→System→VRAM) if a
performance regression is observed when `attach->peer2peer = false`.

Tested on:
 - Dual RX 9700 XT (Navi4x) setup
 - GNOME and Wayland compositor scenarios
 - Confirmed no corruption after disabling P2P under these conditions
v2: Remove check TTM_PL_VRAM & TTM_PL_FLAG_CONTIGUOUS.
v3: simplify for upsteam and fix ip version check (Alex)

Suggested-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Vitaly Prosyak <vitaly.prosyak@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 9dff2bb709e6fbd97e263fd12bf12802d2b5a0cf)
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This commit is contained in:
Vitaly Prosyak
2025-11-06 12:35:53 -05:00
committed by Greg Kroah-Hartman
parent 325aa07165
commit d9db9abf66

View File

@@ -81,6 +81,18 @@ static int amdgpu_dma_buf_attach(struct dma_buf *dmabuf,
struct amdgpu_bo *bo = gem_to_amdgpu_bo(obj); struct amdgpu_bo *bo = gem_to_amdgpu_bo(obj);
struct amdgpu_device *adev = amdgpu_ttm_adev(bo->tbo.bdev); struct amdgpu_device *adev = amdgpu_ttm_adev(bo->tbo.bdev);
/*
* Disable peer-to-peer access for DCC-enabled VRAM surfaces on GFX12+.
* Such buffers cannot be safely accessed over P2P due to device-local
* compression metadata. Fallback to system-memory path instead.
* Device supports GFX12 (GC 12.x or newer)
* BO was created with the AMDGPU_GEM_CREATE_GFX12_DCC flag
*
*/
if (amdgpu_ip_version(adev, GC_HWIP, 0) >= IP_VERSION(12, 0, 0) &&
bo->flags & AMDGPU_GEM_CREATE_GFX12_DCC)
attach->peer2peer = false;
if (!amdgpu_dmabuf_is_xgmi_accessible(attach_adev, bo) && if (!amdgpu_dmabuf_is_xgmi_accessible(attach_adev, bo) &&
pci_p2pdma_distance(adev->pdev, attach->dev, false) < 0) pci_p2pdma_distance(adev->pdev, attach->dev, false) < 0)
attach->peer2peer = false; attach->peer2peer = false;