linux

mirror of https://github.com/raspberrypi/linux.git synced 2025-12-07 18:40:10 +00:00

Author	SHA1	Message	Date
Vitaly Prosyak	de1c09598b	drm/amdgpu: fix software pci_unplug on some chips [ Upstream commit `4638e0c29a` ] When software 'pci unplug' using IGT is executed we got a sysfs directory entry is NULL for differant ras blocks like hdp, umc, etc. Before call 'sysfs_remove_file_from_group' and 'sysfs_remove_group' check that 'sd' is not NULL. [ +0.000001] RIP: 0010:sysfs_remove_group+0x83/0x90 [ +0.000002] Code: 31 c0 31 d2 31 f6 31 ff e9 9a a8 b4 00 4c 89 e7 e8 f2 a2 ff ff eb c2 49 8b 55 00 48 8b 33 48 c7 c7 80 65 94 82 e8 cd 82 bb ff <0f> 0b eb cc 66 0f 1f 84 00 00 00 00 00 90 90 90 90 90 90 90 90 90 [ +0.000001] RSP: 0018:ffffc90002067c90 EFLAGS: 00010246 [ +0.000002] RAX: 0000000000000000 RBX: ffffffff824ea180 RCX: 0000000000000000 [ +0.000001] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 [ +0.000001] RBP: ffffc90002067ca8 R08: 0000000000000000 R09: 0000000000000000 [ +0.000001] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 [ +0.000001] R13: ffff88810a395f48 R14: ffff888101aab0d0 R15: 0000000000000000 [ +0.000001] FS: 00007f5ddaa43a00(0000) GS:ffff88841e800000(0000) knlGS:0000000000000000 [ +0.000002] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ +0.000001] CR2: 00007f8ffa61ba50 CR3: 0000000106432000 CR4: 0000000000350ef0 [ +0.000001] Call Trace: [ +0.000001] <TASK> [ +0.000001] ? show_regs+0x72/0x90 [ +0.000002] ? sysfs_remove_group+0x83/0x90 [ +0.000002] ? __warn+0x8d/0x160 [ +0.000001] ? sysfs_remove_group+0x83/0x90 [ +0.000001] ? report_bug+0x1bb/0x1d0 [ +0.000003] ? handle_bug+0x46/0x90 [ +0.000001] ? exc_invalid_op+0x19/0x80 [ +0.000002] ? asm_exc_invalid_op+0x1b/0x20 [ +0.000003] ? sysfs_remove_group+0x83/0x90 [ +0.000001] dpm_sysfs_remove+0x61/0x70 [ +0.000002] device_del+0xa3/0x3d0 [ +0.000002] ? ktime_get_mono_fast_ns+0x46/0xb0 [ +0.000002] device_unregister+0x18/0x70 [ +0.000001] i2c_del_adapter+0x26d/0x330 [ +0.000002] arcturus_i2c_control_fini+0x25/0x50 [amdgpu] [ +0.000236] smu_sw_fini+0x38/0x260 [amdgpu] [ +0.000241] amdgpu_device_fini_sw+0x116/0x670 [amdgpu] [ +0.000186] ? mutex_lock+0x13/0x50 [ +0.000003] amdgpu_driver_release_kms+0x16/0x40 [amdgpu] [ +0.000192] drm_minor_release+0x4f/0x80 [drm] [ +0.000025] drm_release+0xfe/0x150 [drm] [ +0.000027] __fput+0x9f/0x290 [ +0.000002] ____fput+0xe/0x20 [ +0.000002] task_work_run+0x61/0xa0 [ +0.000002] exit_to_user_mode_prepare+0x150/0x170 [ +0.000002] syscall_exit_to_user_mode+0x2a/0x50 Cc: Hawking Zhang <hawking.zhang@amd.com> Cc: Luben Tuikov <luben.tuikov@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Christian Koenig <christian.koenig@amd.com> Signed-off-by: Vitaly Prosyak <vitaly.prosyak@amd.com> Reviewed-by: Luben Tuikov <luben.tuikov@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-11-28 17:19:47 +00:00
Wayne Lin	df8bc953ee	drm/amd/display: Avoid NULL dereference of timing generator [ Upstream commit `b1904ed480` ] [Why & How] Check whether assigned timing generator is NULL or not before accessing its funcs to prevent NULL dereference. Reviewed-by: Jun Lei <jun.lei@amd.com> Acked-by: Hersen Wu <hersenxs.wu@amd.com> Signed-off-by: Wayne Lin <wayne.lin@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-11-28 17:19:47 +00:00
Lin.Cao	09f617219f	drm/amd: check num of link levels when update pcie param [ Upstream commit `406e884535` ] In SR-IOV environment, the value of pcie_table->num_of_link_levels will be 0, and num_of_levels - 1 will cause array index out of bounds Signed-off-by: Lin.Cao <lincao12@amd.com> Acked-by: Jingwen Chen <Jingwen.Chen2@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-11-28 17:19:41 +00:00
Samson Tam	b58ace0e8c	drm/amd/display: fix num_ways overflow error [ Upstream commit `79f3f1b667` ] [Why] Helper function calculates num_ways using 32-bit. But is returned as 8-bit. If num_ways exceeds 8-bit, then it reports back the incorrect num_ways and erroneously uses MALL when it should not [How] Make returned value 32-bit and convert after it checks against caps.cache_num_ways, which is under 8-bit Reviewed-by: Alvin Lee <alvin.lee2@amd.com> Acked-by: Roman Li <roman.li@amd.com> Signed-off-by: Samson Tam <samson.tam@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-11-28 17:19:41 +00:00
Mario Limonciello	745682e634	drm/amd: Disable PP_PCIE_DPM_MASK when dynamic speed switching not supported [ Upstream commit `fbf1035b03` ] Rather than individual ASICs checking for the quirk, set the quirk at the driver level. Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-11-28 17:19:41 +00:00
Qu Huang	ba3c0796d2	drm/amdgpu: Fix a null pointer access when the smc_rreg pointer is NULL [ Upstream commit `5104fdf50d` ] In certain types of chips, such as VEGA20, reading the amdgpu_regs_smc file could result in an abnormal null pointer access when the smc_rreg pointer is NULL. Below are the steps to reproduce this issue and the corresponding exception log: 1. Navigate to the directory: /sys/kernel/debug/dri/0 2. Execute command: cat amdgpu_regs_smc 3. Exception Log:: [4005007.702554] BUG: kernel NULL pointer dereference, address: 0000000000000000 [4005007.702562] #PF: supervisor instruction fetch in kernel mode [4005007.702567] #PF: error_code(0x0010) - not-present page [4005007.702570] PGD 0 P4D 0 [4005007.702576] Oops: 0010 [#1] SMP NOPTI [4005007.702581] CPU: 4 PID: 62563 Comm: cat Tainted: G OE 5.15.0-43-generic #46-Ubunt u [4005007.702590] RIP: 0010:0x0 [4005007.702598] Code: Unable to access opcode bytes at RIP 0xffffffffffffffd6. [4005007.702600] RSP: 0018:ffffa82b46d27da0 EFLAGS: 00010206 [4005007.702605] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffa82b46d27e68 [4005007.702609] RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff9940656e0000 [4005007.702612] RBP: ffffa82b46d27dd8 R08: 0000000000000000 R09: ffff994060c07980 [4005007.702615] R10: 0000000000020000 R11: 0000000000000000 R12: 00007f5e06753000 [4005007.702618] R13: ffff9940656e0000 R14: ffffa82b46d27e68 R15: 00007f5e06753000 [4005007.702622] FS: 00007f5e0755b740(0000) GS:ffff99479d300000(0000) knlGS:0000000000000000 [4005007.702626] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [4005007.702629] CR2: ffffffffffffffd6 CR3: 00000003253fc000 CR4: 00000000003506e0 [4005007.702633] Call Trace: [4005007.702636] <TASK> [4005007.702640] amdgpu_debugfs_regs_smc_read+0xb0/0x120 [amdgpu] [4005007.703002] full_proxy_read+0x5c/0x80 [4005007.703011] vfs_read+0x9f/0x1a0 [4005007.703019] ksys_read+0x67/0xe0 [4005007.703023] __x64_sys_read+0x19/0x20 [4005007.703028] do_syscall_64+0x5c/0xc0 [4005007.703034] ? do_user_addr_fault+0x1e3/0x670 [4005007.703040] ? exit_to_user_mode_prepare+0x37/0xb0 [4005007.703047] ? irqentry_exit_to_user_mode+0x9/0x20 [4005007.703052] ? irqentry_exit+0x19/0x30 [4005007.703057] ? exc_page_fault+0x89/0x160 [4005007.703062] ? asm_exc_page_fault+0x8/0x30 [4005007.703068] entry_SYSCALL_64_after_hwframe+0x44/0xae [4005007.703075] RIP: 0033:0x7f5e07672992 [4005007.703079] Code: c0 e9 b2 fe ff ff 50 48 8d 3d fa b2 0c 00 e8 c5 1d 02 00 0f 1f 44 00 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 0f 05 <48> 3d 00 f0 ff ff 77 56 c3 0f 1f 44 00 00 48 83 e c 28 48 89 54 24 [4005007.703083] RSP: 002b:00007ffe03097898 EFLAGS: 00000246 ORIG_RAX: 0000000000000000 [4005007.703088] RAX: ffffffffffffffda RBX: 0000000000020000 RCX: 00007f5e07672992 [4005007.703091] RDX: 0000000000020000 RSI: 00007f5e06753000 RDI: 0000000000000003 [4005007.703094] RBP: 00007f5e06753000 R08: 00007f5e06752010 R09: 00007f5e06752010 [4005007.703096] R10: 0000000000000022 R11: 0000000000000246 R12: 0000000000022000 [4005007.703099] R13: 0000000000000003 R14: 0000000000020000 R15: 0000000000020000 [4005007.703105] </TASK> [4005007.703107] Modules linked in: nf_tables libcrc32c nfnetlink algif_hash af_alg binfmt_misc nls_ iso8859_1 ipmi_ssif ast intel_rapl_msr intel_rapl_common drm_vram_helper drm_ttm_helper amd64_edac t tm edac_mce_amd kvm_amd ccp mac_hid k10temp kvm acpi_ipmi ipmi_si rapl sch_fq_codel ipmi_devintf ipm i_msghandler msr parport_pc ppdev lp parport mtd pstore_blk efi_pstore ramoops pstore_zone reed_solo mon ip_tables x_tables autofs4 ib_uverbs ib_core amdgpu(OE) amddrm_ttm_helper(OE) amdttm(OE) iommu_v 2 amd_sched(OE) amdkcl(OE) drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops cec rc_core drm igb ahci xhci_pci libahci i2c_piix4 i2c_algo_bit xhci_pci_renesas dca [4005007.703184] CR2: 0000000000000000 [4005007.703188] ---[ end trace ac65a538d240da39 ]--- [4005007.800865] RIP: 0010:0x0 [4005007.800871] Code: Unable to access opcode bytes at RIP 0xffffffffffffffd6. [4005007.800874] RSP: 0018:ffffa82b46d27da0 EFLAGS: 00010206 [4005007.800878] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffa82b46d27e68 [4005007.800881] RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff9940656e0000 [4005007.800883] RBP: ffffa82b46d27dd8 R08: 0000000000000000 R09: ffff994060c07980 [4005007.800886] R10: 0000000000020000 R11: 0000000000000000 R12: 00007f5e06753000 [4005007.800888] R13: ffff9940656e0000 R14: ffffa82b46d27e68 R15: 00007f5e06753000 [4005007.800891] FS: 00007f5e0755b740(0000) GS:ffff99479d300000(0000) knlGS:0000000000000000 [4005007.800895] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [4005007.800898] CR2: ffffffffffffffd6 CR3: 00000003253fc000 CR4: 00000000003506e0 Signed-off-by: Qu Huang <qu.huang@linux.dev> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-11-28 17:19:41 +00:00
Jesse Zhang	56649c43d4	drm/amdkfd: Fix shift out-of-bounds issue [ Upstream commit `282c1d7930` ] [ 567.613292] shift exponent 255 is too large for 64-bit type 'long unsigned int' [ 567.614498] CPU: 5 PID: 238 Comm: kworker/5:1 Tainted: G OE 6.2.0-34-generic #34~22.04.1-Ubuntu [ 567.614502] Hardware name: AMD Splinter/Splinter-RPL, BIOS WS43927N_871 09/25/2023 [ 567.614504] Workqueue: events send_exception_work_handler [amdgpu] [ 567.614748] Call Trace: [ 567.614750] <TASK> [ 567.614753] dump_stack_lvl+0x48/0x70 [ 567.614761] dump_stack+0x10/0x20 [ 567.614763] __ubsan_handle_shift_out_of_bounds+0x156/0x310 [ 567.614769] ? srso_alias_return_thunk+0x5/0x7f [ 567.614773] ? update_sd_lb_stats.constprop.0+0xf2/0x3c0 [ 567.614780] svm_range_split_by_granularity.cold+0x2b/0x34 [amdgpu] [ 567.615047] ? srso_alias_return_thunk+0x5/0x7f [ 567.615052] svm_migrate_to_ram+0x185/0x4d0 [amdgpu] [ 567.615286] do_swap_page+0x7b6/0xa30 [ 567.615291] ? srso_alias_return_thunk+0x5/0x7f [ 567.615294] ? __free_pages+0x119/0x130 [ 567.615299] handle_pte_fault+0x227/0x280 [ 567.615303] __handle_mm_fault+0x3c0/0x720 [ 567.615311] handle_mm_fault+0x119/0x330 [ 567.615314] ? lock_mm_and_find_vma+0x44/0x250 [ 567.615318] do_user_addr_fault+0x1a9/0x640 [ 567.615323] exc_page_fault+0x81/0x1b0 [ 567.615328] asm_exc_page_fault+0x27/0x30 [ 567.615332] RIP: 0010:__get_user_8+0x1c/0x30 Signed-off-by: Jesse Zhang <jesse.zhang@amd.com> Suggested-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-11-28 17:19:41 +00:00
Ma Ke	70f831f211	drm/amdgpu/vkms: fix a possible null pointer dereference [ Upstream commit `cd90511557` ] In amdgpu_vkms_conn_get_modes(), the return value of drm_cvt_mode() is assigned to mode, which will lead to a NULL pointer dereference on failure of drm_cvt_mode(). Add a check to avoid null pointer dereference. Signed-off-by: Ma Ke <make_ruc2021@163.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-11-28 17:19:41 +00:00
Stanley.Yang	da46e63482	drm/amdgpu: Fix potential null pointer derefernce [ Upstream commit `80285ae1ec` ] The amdgpu_ras_get_context may return NULL if device not support ras feature, so add check before using. Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-11-28 17:19:41 +00:00
Mario Limonciello	b3b8b7c040	drm/amd: Fix UBSAN array-index-out-of-bounds for Polaris and Tonga [ Upstream commit `0f0e59075b` ] For pptable structs that use flexible array sizes, use flexible arrays. Link: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2036742 Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-11-28 17:19:40 +00:00
Mario Limonciello	92a775e7c9	drm/amd: Fix UBSAN array-index-out-of-bounds for SMU7 [ Upstream commit `760efbca74` ] For pptable structs that use flexible array sizes, use flexible arrays. Suggested-by: Felix Held <felix.held@amd.com> Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2874 Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-11-28 17:19:40 +00:00
Wenjing Liu	dec0a70e52	drm/amd/display: use full update for clip size increase of large plane source [ Upstream commit `05b78277ef` ] [why] Clip size increase will increase viewport, which could cause us to switch to MPC combine. If we skip full update, we are not able to change to MPC combine in fast update. This will cause corruption showing on the video plane. [how] treat clip size increase of a surface larger than 5k as a full update. Reviewed-by: Jun Lei <jun.lei@amd.com> Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Wenjing Liu <wenjing.liu@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-11-28 17:19:40 +00:00
Mario Limonciello	82c4cf2c05	drm/amd: Update `update_pcie_parameters` functions to use uint8_t arguments [ Upstream commit `7752ccf85b` ] The matching values for `pcie_gen_cap` and `pcie_width_cap` when fetched from powerplay tables are 1 byte, so narrow the arguments to match to ensure min() and max() comparisons without casts. Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-11-28 17:19:40 +00:00
Tao Zhou	b39f1303b9	drm/amdgpu: update retry times for psp vmbx wait [ Upstream commit `fc59889071` ] Increase the retry loops and replace the constant number with macro. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-11-28 17:19:40 +00:00
Xiaogang Chen	fc02107201	drm/amdkfd: Fix a race condition of vram buffer unref in svm code [ Upstream commit `709c348261` ] prange->svm_bo unref can happen in both mmu callback and a callback after migrate to system ram. Both are async call in different tasks. Sync svm_bo unref operation to avoid random "use-after-free". Signed-off-by: Xiaogang Chen <xiaogang.chen@amd.com> Reviewed-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Jesse Zhang <Jesse.Zhang@amd.com> Tested-by: Jesse Zhang <Jesse.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-11-28 17:19:40 +00:00
David (Ming Qiang) Wu	0a810a3fc6	drm/amdgpu: not to save bo in the case of RAS err_event_athub [ Upstream commit `fa1f1cc09d` ] err_event_athub will corrupt VCPU buffer and not good to be restored in amdgpu_vcn_resume() and in this case the VCPU buffer needs to be cleared for VCN firmware to work properly. Acked-by: Leo Liu <leo.liu@amd.com> Signed-off-by: David (Ming Qiang) Wu <David.Wu3@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-11-28 17:19:39 +00:00
Wenjing Liu	ea971fd05f	drm/amd/display: add seamless pipe topology transition check [ Upstream commit `15c6798ae2` ] [why] We have a few cases where we need to perform update topology update in dc update interface. However some of the updates are not seamless This could cause user noticible glitches. To enforce seamless transition we are adding a checking condition and error logging so the corruption as result of non seamless transition can be easily spotted. Reviewed-by: Dillon Varone <dillon.varone@amd.com> Acked-by: Stylon Wang <stylon.wang@amd.com> Signed-off-by: Wenjing Liu <wenjing.liu@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-11-28 17:19:39 +00:00
Alvin Lee	f95d571bb5	drm/amd/display: Don't lock phantom pipe on disabling [ Upstream commit `cbb4c9bc55` ] [Description] - When disabling a phantom pipe, we first enable the phantom OTG so the double buffer update can successfully take place - However, want to avoid locking the phantom otherwise setting DPG_EN=1 for the phantom pipe is blocked (without this we could hit underflow due to phantom HUBP being blanked by default) Reviewed-by: Samson Tam <samson.tam@amd.com> Acked-by: Stylon Wang <stylon.wang@amd.com> Signed-off-by: Alvin Lee <alvin.lee2@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-11-28 17:19:39 +00:00
Alvin Lee	67af2e23e2	drm/amd/display: Blank phantom OTG before enabling [ Upstream commit `e87a6c5b77` ] [Description] Before enabling the phantom OTG for an update we must enable DPG to avoid underflow. Reviewed-by: Samson Tam <samson.tam@amd.com> Acked-by: Stylon Wang <stylon.wang@amd.com> Signed-off-by: Alvin Lee <Alvin.Lee2@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-11-28 17:19:39 +00:00
Harish Kasiviswanathan	de1ce1081b	drm/amdkfd: ratelimited SQ interrupt messages [ Upstream commit `37fb879107` ] No functional change. Use ratelimited version of pr_ to avoid overflowing of dmesg buffer Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Reviewed-by: Philip Yang <philip.yang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-11-28 17:19:39 +00:00
Alex Deucher	6d8d01d591	drm/amdgpu: don't put MQDs in VRAM on ARM \| ARM64 [ Upstream commit `ba0fb4b48c` ] Issues were reported with commit `1cfb4d6121` ("drm/amdgpu: put MQDs in VRAM") on an ADLINK Ampere Altra Developer Platform (AVA developer platform). Various ARM systems seem to have problems related to PCIe and MMIO access. In this case, I'm not sure if this is specific to the ADLINK platform or ARM in general. Seems to be some coherency issue with VRAM. For now, just don't put MQDs in VRAM on ARM. Link: https://lists.freedesktop.org/archives/amd-gfx/2023-October/100453.html Fixes: `1cfb4d6121` ("drm/amdgpu: put MQDs in VRAM") Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: alexey.klimov@linaro.org Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-11-20 11:59:33 +01:00
Alex Deucher	aad60bce9b	drm/amdgpu/gfx10,11: use memcpy_to/fromio for MQDs [ Upstream commit `b3c942bb6c` ] Since they were moved to VRAM, we need to use the IO variants of memcpy. Fixes: `1cfb4d6121` ("drm/amdgpu: put MQDs in VRAM") Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-11-20 11:59:33 +01:00
Kunwu.Chan	cc9742165a	drm/amd/pm: Fix a memory leak on an error path [ Upstream commit `828f8e3137` ] Add missing free on an error path. Fixes: `511a95552e` ("drm/amd/pm: Add SMU 13.0.6 support") Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Kunwu.Chan <chentao@kylinos.cn> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-11-20 11:59:13 +01:00
Michel Dänzer	89eea870d7	drm/amd/display: Bail from dm_check_crtc_cursor if no relevant change [ Upstream commit `bc0b79ce20` ] If no plane was newly enabled or changed scaling, there can be no new scaling mismatch with the cursor plane. By not pulling non-cursor plane states into all atomic commits while the cursor plane is enabled, this avoids synchronizing all cursor plane changes to vertical blank, which caused the following IGT tests to fail: kms_cursor_legacy@cursor-vs-flip.* kms_cursor_legacy@flip-vs-cursor.* Fixes: `003048ddf4` ("drm/amd/display: Check all enabled planes in dm_check_crtc_cursor") Signed-off-by: Michel Dänzer <mdaenzer@redhat.com> Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-11-20 11:59:11 +01:00
Michel Dänzer	0995fb3dd4	drm/amd/display: Refactor dm_get_plane_scale helper [ Upstream commit `ec4d770bbb` ] Cleanup, no functional change intended. Signed-off-by: Michel Dänzer <mdaenzer@redhat.com> Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Stable-dep-of: `bc0b79ce20` ("drm/amd/display: Bail from dm_check_crtc_cursor if no relevant change") Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-11-20 11:59:11 +01:00
Michel Dänzer	c9106fb186	drm/amd/display: Check all enabled planes in dm_check_crtc_cursor [ Upstream commit `003048ddf4` ] It was only checking planes which had any state changes in the same commit. However, it also needs to check other enabled planes. Not doing this meant that a commit might spuriously "succeed", resulting in the cursor plane displaying with incorrect scaling. See https://gitlab.gnome.org/GNOME/mutter/-/merge_requests/3177#note_1824263 for an example. Fixes: `d1bfbe8a32` ("amd/display: check cursor plane matches underlying plane") Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Michel Dänzer <mdaenzer@redhat.com> Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-11-20 11:59:11 +01:00
Cong Liu	8b72c5d4a5	drm/amd/display: Fix null pointer dereference in error message [ Upstream commit `0c3601a2fb` ] This patch fixes a null pointer dereference in the error message that is printed when the Display Core (DC) fails to initialize. The original message includes the DC version number, which is undefined if the DC is not initialized. Fixes: `9788d087ca` ("drm/amd/display: improve the message printed when loading DC") Signed-off-by: Cong Liu <liucong2@kylinos.cn> Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-11-20 11:59:11 +01:00
Philip Yang	49ca8b9035	drm/amdkfd: Handle errors from svm validate and map [ Upstream commit `eb3c357bcb` ] If new range is splited to multiple pranges with max_svm_range_pages alignment and added to update_list, svm validate and map should keep going after error to make sure prange->mapped_to_gpu flag is up to date for the whole range. svm validate and map update set prange->mapped_to_gpu after mapping to GPUs successfully, otherwise clear prange->mapped_to_gpu flag (for update mapping case) instead of setting error flag, we can remove the redundant error flag to simpliy code. Refactor to remove goto and update prange->mapped_to_gpu flag inside svm_range_lock, to guarant we always evict queues or unmap from GPUs if there are invalid ranges. After svm validate and map return error -EAGIN, the caller retry will update the mapping for the whole range again. Fixes: `c22b044070` ("drm/amdkfd: flag added to handle errors from svm validate and map") Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Tested-by: James Zhu <james.zhu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-11-20 11:59:10 +01:00
Philip Yang	33b9170fef	drm/amdkfd: Remove svm range validated_once flag [ Upstream commit `c99b161280` ] The validated_once flag is not used after the prefault was removed, The prefault was needed to ensure validate all system memory pages at least once before mapping or migrating the range to GPU. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Stable-dep-of: `eb3c357bcb` ("drm/amdkfd: Handle errors from svm validate and map") Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-11-20 11:59:10 +01:00
Xiaogang Chen	da27d73ec9	drm/amdkfd: fix some race conditions in vram buffer alloc/free of svm code [ Upstream commit `7bfaa160ca` ] This patch fixes: 1: ref number of prange's svm_bo got decreased by an async call from hmm. When wait svm_bo of prange got released we shoul also wait prang->svm_bo become NULL, otherwise prange->svm_bo may be set to null after allocate new vram buffer. 2: During waiting svm_bo of prange got released in a while loop should reschedule current task to give other tasks oppotunity to run, specially the the workque task that handles svm_bo ref release, otherwise we may enter to softlock. Signed-off-by: Xiaogang.Chen <xiaogang.chen@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-11-20 11:59:10 +01:00
Philip Yang	03b98d789a	drm/amdgpu: Increase IH soft ring size for GFX v9.4.3 dGPU [ Upstream commit `bcfb9cee61` ] On GFX v9.4.3 dGPU, applications have random timeout failure when XNACK on, dmesg log has "amdgpu: IH soft ring buffer overflow 0x900, 0x900", because dGPU mode has 272 cam entries. After increasing IH soft ring to 512 entries, no more IH soft ring overflow message and application passed. Fixes: `bf80d34b6c` ("drm/amdgpu: Increase soft IH ring size") Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-11-20 11:59:10 +01:00
Nicholas Kazlauskas	2a79e92f23	drm/amd/display: Don't use fsleep for PSR exit waits [ Upstream commit `79df45dc4b` ] [Why] These functions can be called from high IRQ levels and the OS will hang if it tries to use a usleep_highres or a msleep. [How] Replace the fsleep with a udelay. Reviewed-by: Aric Cyr <aric.cyr@amd.com> Acked-by: Tom Chung <chiahsuan.chung@amd.com> Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-11-08 11:56:20 +01:00
Dave Airlie	44117828ed	Merge tag 'amd-drm-fixes-6.6-2023-10-25' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes amd-drm-fixes-6.6-2023-10-25: amdgpu: - Extend VI APSM quirks to more platforms Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231026035452.14921-1-alexander.deucher@amd.com	2023-10-27 12:17:26 +10:00
Dave Airlie	6366ffa6ed	Merge tag 'drm-misc-fixes-2023-10-26' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes Short summary of fixes pull: amdgpu: - ignore duplicated BOs in CS parser - remove redundant call to amdgpu_ctx_priority_is_valid() amdkfd: - reserve fence slot while locking BO dp_mst: - Fix NULL deref in get_mst_branch_device_by_guid_helper() logicvc: - Kconfig: Select REGMAP and REGMAP_MMIO ivpu: - Fix missing VPUIP interrupts Signed-off-by: Dave Airlie <airlied@redhat.com> From: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patchwork.freedesktop.org/patch/msgid/20231026110132.GA10591@linux-uq9g.fritz.box	2023-10-27 11:51:35 +10:00
Mario Limonciello	64ffd2f1d0	drm/amd: Disable ASPM for VI w/ all Intel systems Originally we were quirking ASPM disabled specifically for VI when used with Alder Lake, but it appears to have problems with Rocket Lake as well. Like we've done in the case of dpm for newer platforms, disable ASPM for all Intel systems. Cc: stable@vger.kernel.org # 5.15+ Fixes: `0064b0ce85` ("drm/amd/pm: enable ASPM by default") Reported-and-tested-by: Paolo Gentili <paolo.gentili@canonical.com> Closes: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2036742 Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-25 09:53:17 -04:00
Christian König	4984fc578a	drm/amdkfd: reserve a fence slot while locking the BO Looks like the KFD still needs this. Signed-off-by: Christian König <christian.koenig@amd.com> Fixes: `8abc1eb298` ("drm/amdkfd: switch over to using drm_exec v3") Acked-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Felix Kuehling <Felix.Kuehling@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231020123306.43978-1-christian.koenig@amd.com	2023-10-23 14:48:47 +02:00
Luben Tuikov	d3df66fd98	drm/amdgpu: Remove redundant call to priority_is_valid() Remove a redundant call to amdgpu_ctx_priority_is_valid() from amdgpu_ctx_priority_permit(), which is called from amdgpu_ctx_init() which is called from amdgpu_ctx_alloc() which is called from amdgpu_ctx_ioctl(), where we've called amdgpu_ctx_priority_is_valid() already first thing in the function. Cc: Alex Deucher <Alexander.Deucher@amd.com> Cc: Christian König <christian.koenig@amd.com> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Link: https://lore.kernel.org/r/20231018010359.30393-1-luben.tuikov@amd.com	2023-10-21 20:27:15 -04:00
Dave Airlie	d43c76c820	Merge tag 'drm-misc-fixes-2023-10-19' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes Short summary of fixes pull: amdgpu: - Disable AMD_CTX_PRIORITY_UNSET bridge: - ti-sn65dsi86: Fix device lifetime edid: - Add quirk for BenQ GW2765 ivpu: - Extend address range for MMU mmap nouveau: - DP-connector fixes - Documentation fixes panel: - Move AUX B116XW03 into panel-simple scheduler: - Eliminate DRM_SCHED_PRIORITY_UNSET ttm: - Fix possible NULL-ptr deref in cleanup Signed-off-by: Dave Airlie <airlied@redhat.com> From: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patchwork.freedesktop.org/patch/msgid/20231019114605.GA22540@linux-uq9g	2023-10-20 14:07:58 +10:00
Felix Kuehling	316baf09d3	drm/amdgpu: Reserve fences for VM update In amdgpu_dma_buf_move_notify reserve fences for the page table updates in amdgpu_vm_clear_freed and amdgpu_vm_handle_moved. This fixes a BUG_ON in dma_resv_add_fence when using SDMA for page table updates. Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-19 18:56:57 -04:00
Felix Kuehling	51b79f3381	drm/amdgpu: Fix possible null pointer dereference abo->tbo.resource may be NULL in amdgpu_vm_bo_update. Fixes: `1802537820` ("drm/ttm: stop allocating dummy resources during BO creation") Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-19 18:56:50 -04:00
Christian König	6b18ef481f	drm/amdgpu: ignore duplicate BOs again Looks like RADV is actually hitting this. Signed-off-by: Christian König <christian.koenig@amd.com> Fixes: `ca6c1e210a` ("drm/amdgpu: use the new drm_exec object for CS v3") Acked-by: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231017121015.1336786-1-christian.koenig@amd.com	2023-10-19 13:19:44 +02:00
Luben Tuikov	fa8391ad68	gpu/drm: Eliminate DRM_SCHED_PRIORITY_UNSET Eliminate DRM_SCHED_PRIORITY_UNSET, value of -2, whose only user was amdgpu. Furthermore, eliminate an index bug, in that when amdgpu boots, it calls drm_sched_entity_init() with DRM_SCHED_PRIORITY_UNSET, which uses it to index sched->sched_rq[]. Cc: Alex Deucher <Alexander.Deucher@amd.com> Cc: Christian König <christian.koenig@amd.com> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Acked-by: Alex Deucher <Alexander.Deucher@amd.com> Link: https://lore.kernel.org/r/20231017035656.8211-2-luben.tuikov@amd.com	2023-10-17 20:35:38 -04:00
Luben Tuikov	eab0261967	drm/amdgpu: Unset context priority is now invalid A context priority value of AMD_CTX_PRIORITY_UNSET is now invalid--instead of carrying it around and passing it to the Direct Rendering Manager--and it becomes AMD_CTX_PRIORITY_NORMAL in amdgpu_ctx_ioctl(), the gateway to context creation. Cc: Alex Deucher <Alexander.Deucher@amd.com> Cc: Christian König <christian.koenig@amd.com> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Acked-by: Alex Deucher <Alexander.Deucher@amd.com> Link: https://lore.kernel.org/r/20231017035656.8211-1-luben.tuikov@amd.com	2023-10-17 20:35:38 -04:00
Icenowy Zheng	3806a8c647	drm/amdgpu: fix SI failure due to doorbells allocation SI hardware does not have doorbells at all, however currently the code will try to do the allocation and thus fail, makes SI AMDGPU not usable. Fix this failure by skipping doorbells allocation when doorbells count is zero. Fixes: `54c30d2a8d` ("drm/amdgpu: create kernel doorbell pages") Reviewed-by: Shashank Sharma <shashank.sharma@amd.com> Signed-off-by: Icenowy Zheng <uwu@icenowy.me> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-09 17:59:29 -04:00
Christian König	ff89f064dc	drm/amdgpu: add missing NULL check bo->tbo.resource can easily be NULL here. Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2902 Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> CC: stable@vger.kernel.org	2023-10-09 17:59:29 -04:00
Daniel Miess	23645bca98	drm/amd/display: Don't set dpms_off for seamless boot [Why] eDPs fail to light up with seamless boot enabled [How] When seamless boot is enabled don't configure dpms_off in disable_vbios_mode_if_required. Reviewed-by: Charlene Liu <charlene.liu@amd.com> Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Acked-by: Tom Chung <chiahsuan.chung@amd.com> Signed-off-by: Daniel Miess <daniel.miess@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-09 17:59:29 -04:00
Samson Tam	b206011bf0	drm/amd/display: apply edge-case DISPCLK WDIVIDER changes to master OTG pipes only [Why] The edge-case DISPCLK WDIVIDER changes call stream_enc functions. But with MPC pipes, downstream pipes have null stream_enc and will cause crash. [How] Only call stream_enc functions for pipes that are OTG master. Reviewed-by: Alvin Lee <alvin.lee2@amd.com> Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Samson Tam <samson.tam@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-04 22:55:05 -04:00
Mario Limonciello	134b8c5d86	drm/amd: Fix detection of _PR3 on the PCIe root port On some systems with Navi3x dGPU will attempt to use BACO for runtime PM but fails to resume properly. This is because on these systems the root port goes into D3cold which is incompatible with BACO. This happens because in this case dGPU is connected to a bridge between root port which causes BOCO detection logic to fail. Fix the intent of the logic by looking at root port, not the immediate upstream bridge for _PR3. Cc: stable@vger.kernel.org Suggested-by: Jun Ma <Jun.Ma2@amd.com> Tested-by: David Perry <David.Perry@amd.com> Fixes: `b10c1c5b3a` ("drm/amdgpu: add check for ACPI power resources") Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-04 22:52:05 -04:00
Mario Limonciello	2a1fe39a5b	drm/amd: Fix logic error in sienna_cichlid_update_pcie_parameters() While aligning SMU11 with SMU13 implementation an assumption was made that `dpm_context->dpm_tables.pcie_table` was populated in dpm table initialization like in SMU13 but it isn't. So restore some of the original logic and instead just check for amdgpu_device_pcie_dynamic_switching_supported() to decide whether to hardcode values; erring on the side of performance. Cc: stable@vger.kernel.org # 6.1+ Reported-and-tested-by: Umio Yasuno <coelacanth_dream@protonmail.com> Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/1447#note_2101382 Fixes: `e701156ccc` ("drm/amd: Align SMU11 SMU_MSG_OverridePcieParameters implementation with SMU13") Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-04 22:52:05 -04:00
Luben Tuikov	5d061675b7	drm/amdgpu: Fix a memory leak Fix a memory leak in amdgpu_fru_get_product_info(). Cc: Alex Deucher <Alexander.Deucher@amd.com> Reported-by: Yang Wang <kevinyang.wang@amd.com> Fixes: `0dbf2c5626` ("drm/amdgpu: Interpret IPMI data for product information (v2)") Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Reviewed-by: Alex Deucher <Alexander.Deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-04 22:43:26 -04:00

1 2 3 4 5 ...

27115 Commits