[ Upstream commit cd90511557 ]
In amdgpu_vkms_conn_get_modes(), the return value of drm_cvt_mode()
is assigned to mode, which will lead to a NULL pointer dereference
on failure of drm_cvt_mode(). Add a check to avoid null pointer
dereference.
Signed-off-by: Ma Ke <make_ruc2021@163.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit fa1f1cc09d ]
err_event_athub will corrupt VCPU buffer and not good to
be restored in amdgpu_vcn_resume() and in this case
the VCPU buffer needs to be cleared for VCN firmware to
work properly.
Acked-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: David (Ming Qiang) Wu <David.Wu3@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit bcfb9cee61 ]
On GFX v9.4.3 dGPU, applications have random timeout failure when XNACK
on, dmesg log has "amdgpu: IH soft ring buffer overflow 0x900, 0x900",
because dGPU mode has 272 cam entries. After increasing IH soft ring
to 512 entries, no more IH soft ring overflow message and application
passed.
Fixes: bf80d34b6c ("drm/amdgpu: Increase soft IH ring size")
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
In amdgpu_dma_buf_move_notify reserve fences for the page table updates
in amdgpu_vm_clear_freed and amdgpu_vm_handle_moved. This fixes a BUG_ON
in dma_resv_add_fence when using SDMA for page table updates.
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
SI hardware does not have doorbells at all, however currently the code
will try to do the allocation and thus fail, makes SI AMDGPU not usable.
Fix this failure by skipping doorbells allocation when doorbells count
is zero.
Fixes: 54c30d2a8d ("drm/amdgpu: create kernel doorbell pages")
Reviewed-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Icenowy Zheng <uwu@icenowy.me>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
On some systems with Navi3x dGPU will attempt to use BACO for runtime
PM but fails to resume properly. This is because on these systems
the root port goes into D3cold which is incompatible with BACO.
This happens because in this case dGPU is connected to a bridge between
root port which causes BOCO detection logic to fail. Fix the intent of
the logic by looking at root port, not the immediate upstream bridge for
_PR3.
Cc: stable@vger.kernel.org
Suggested-by: Jun Ma <Jun.Ma2@amd.com>
Tested-by: David Perry <David.Perry@amd.com>
Fixes: b10c1c5b3a ("drm/amdgpu: add check for ACPI power resources")
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This patch fixes a memory leak in the amdgpu_ras_feature_enable() function.
The leak occurs when the function sends a command to the firmware to enable
or disable a RAS feature for a GFX block. If the command fails, the kfree()
function is not called to free the info memory.
Fixes: 9f051d6ff1 ("drm/amdgpu: Free ras cmd input buffer properly")
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Cong Liu <liucong2@kylinos.cn>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This reverts commit 7748ce5b69.
vbios_version sysfs node is used to identify Part Number also. Revert to
the same so that it doesn't break scripts/software which parse this.
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
On some APU systems, there is no atom context and so the
atom_context struct is null.
Add a check to the VBIOS_INFO branch of amdgpu_info_ioctl
to handle this case, returning all zeroes.
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: David Francis <David.Francis@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Currently, we store CU info only for a single XCC assuming
that it is the same for all XCCs. However, that may not be
true. As a result, store CU info for all XCCs. This info is
later used for CU masking.
Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This patch fixes the case where the code currently passes
absolute register address and not the reg offset, which HWS
expects, when sending the PM4 packet to set/update CWSR grace
period. Additionally, cleanup the signature of
build_grace_period_packet_info function as it no longer needs
the inst parameter.
Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Reviewed-by: Jonathan Kim <jonathan.kim@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Pull drm fixes from Dave Airlie:
"Regular rounds of rc1 fixes, a large bunch for amdgpu since it's three
weeks in one go, one i915, one nouveau and one ivpu.
I think there might be a few more fixes in misc that I haven't pulled
in yet, but we should get them all for rc2.
amdgpu:
- Display replay fixes
- Fixes for headless boards
- Fix documentation breakage
- RAS fixes
- Handle newer IP discovery tables
- SMU 13.0.6 fixes
- SR-IOV fixes
- Display vstartup fixes
- NBIO 7.9 fixes
- Display scaling mode fixes
- Debugfs power reporting fix
- GC 9.4.3 fixes
- Dirty framebuffer fixes for fbcon
- eDP fixes
- DCN 3.1.5 fix
- Display ODM fixes
- GPU core dump fix
- Re-enable zops property now that IGT test is fixed
- Fix possible UAF in CS code
- Cursor degamma fix
amdkfd:
- HMM fixes
- Interrupt masking fix
- GFX11 MQD fixes
i915:
- Mark requests for GuC virtual engines to avoid use-after-free
nouveau:
- Fix fence state in nouveau_fence_emit()
ivpu:
- replace strncpy"
* tag 'drm-next-2023-09-08' of git://anongit.freedesktop.org/drm/drm: (51 commits)
drm/amdgpu: Restrict bootloader wait to SMUv13.0.6
drm/amd/display: prevent potential division by zero errors
drm/amd/display: enable cursor degamma for DCN3+ DRM legacy gamma
drm/amd/display: limit the v_startup workaround to ASICs older than DCN3.1
Revert "drm/amd/display: Remove v_startup workaround for dcn3+"
drm/amdgpu: fix amdgpu_cs_p1_user_fence
Revert "Revert "drm/amd/display: Implement zpos property""
drm/amdkfd: Add missing gfx11 MQD manager callbacks
drm/amdgpu: Free ras cmd input buffer properly
drm/amdgpu: Hide xcp partition sysfs under SRIOV
drm/amdgpu: use read-modify-write mode for gfx v9_4_3 SQ setting
drm/amdkfd: use mask to get v9 interrupt sq data bits correctly
drm/amdgpu: Allocate coredump memory in a nonblocking way
drm/amdgpu: Support query ecc cap for aqua_vanjaram
drm/amdgpu: Add umc_info v4_0 structure
drm/amd/display: always switch off ODM before committing more streams
drm/amd/display: Remove wait while locked
drm/amd/display: update blank state on ODM changes
drm/amd/display: Add smu write msg id fail retry process
drm/amdgpu: Add SMU v13.0.6 default reset methods
...
Restrict the wait for boot loader steady state only to SMUv13.0.6. For
older SOCs, ASIC init has a longer wait period and that takes care.
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Asad Kamal <asad.kamal@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The offset is just 32bits here so this can potentially overflow if
somebody specifies a large value. Instead reduce the size to calculate
the last possible offset.
The error handling path incorrectly drops the reference to the user
fence BO resulting in potential reference count underflow.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
During a GPU reset, a normal memory reclaim could block to reclaim
memory. Giving that coredump is a best effort mechanism, it shouldn't
disturb the reset path. Change its memory allocation flag to a
nonblocking one.
Signed-off-by: André Almeida <andrealmeid@igalia.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
fbcon requires that we implement &drm_framebuffer_funcs.dirty.
Otherwise, the framebuffer might take a while to flush (which would
manifest as noticeable lag). However, we can't enable this callback for
non-fbcon cases since it may cause too many atomic commits to be made at
once. So, implement amdgpu_dirtyfb() and only enable it for fbcon
framebuffers (we can use the "struct drm_file file" parameter in the
callback to check for this since it is only NULL when called by fbcon,
at least in the mainline kernel) on devices that support atomic KMS.
Cc: Aurabindo Pillai <aurabindo.pillai@amd.com>
Cc: Mario Limonciello <mario.limonciello@amd.com>
Cc: stable@vger.kernel.org # 6.1+
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2519
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>