Panel replay was enabled by default in commit 5950efe25e
("drm/amd/display: Enable Panel Replay for static screen use case"), but
it isn't working properly at least on some BOE and AUO panels. Instead
of being static the screen is solid black when active. As it's a new
feature that was just introduced that regressed VRR disable it for now
so that problem can be properly root caused.
Cc: Tom Chung <chiahsuan.chung@amd.com>
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3344
Fixes: 5950efe25e ("drm/amd/display: Enable Panel Replay for static screen use case")
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Acked-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
There is a race condition when re-creating a kfd_process for a process.
This has been observed when a process under the debugger executes
exec(3). In this scenario:
- The process executes exec.
- This will eventually release the process's mm, which will cause the
kfd_process object associated with the process to be freed
(kfd_process_free_notifier decrements the reference count to the
kfd_process to 0). This causes kfd_process_ref_release to enqueue
kfd_process_wq_release to the kfd_process_wq.
- The debugger receives the PTRACE_EVENT_EXEC notification, and tries to
re-enable AMDGPU traps (KFD_IOC_DBG_TRAP_ENABLE).
- When handling this request, KFD tries to re-create a kfd_process.
This eventually calls kfd_create_process and kobject_init_and_add.
At this point the call to kobject_init_and_add can fail because the
old kfd_process.kobj has not been freed yet by kfd_process_wq_release.
This patch proposes to avoid this race by making sure to drain
kfd_process_wq before creating a new kfd_process object. This way, we
know that any cleanup task is done executing when we reach
kobject_init_and_add.
Signed-off-by: Lancelot SIX <lancelot.six@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[Why && How]
Screen flickering saw on 4K@60 eDP with high refresh rate external
monitor when booting up in DC mode. DC Mode Capping is disabled
which caused wrong UCLK being used.
Reviewed-by: Alvin Lee <alvin.lee2@amd.com>
Acked-by: Wayne Lin <wayne.lin@amd.com>
Signed-off-by: Leo Ma <hanghong.ma@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This reverts drm/amdgpu: fix ftrace event amdgpu_bo_move always move
on same heap. The basic problem here is that after the move the old
location is simply not available any more.
Some fixes were suggested, but essentially we should call the move
notification before actually moving things because only this way we have
the correct order for DMA-buf and VM move notifications as well.
Also rework the statistic handling so that we don't update the eviction
counter before the move.
v2: add missing NULL check
Signed-off-by: Christian König <christian.koenig@amd.com>
Fixes: 94aeb41173 ("drm/amdgpu: fix ftrace event amdgpu_bo_move always move on same heap")
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3171
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
CC: stable@vger.kernel.org
[Why]
During DP tunnel creation, CM preallocates BW and reduces
estimated BW of other DPIA. CM release preallocation only
when allocation is complete. Display mode validation logic
validates timings based on bw available per host router.
In multi display setup, this causes bw allocation failure
when allocation greater than estimated bw.
[How]
Do zero alloc to make the CM to release preallocation and
update estimated BW correctly for all DPIAs per host router.
Reviewed-by: PeiChen Huang <peichen.huang@amd.com>
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Meenakshikumar Somasundaram <meenakshikumar.somasundaram@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[Why] DSC debugfs, such as dp_dsc_clock_en_read,
use aconnector->dc_link to find pipe_ctx for display.
Displays connected to MST hub share the same dc_link.
DSC instance is from pipe_ctx. This causes incorrect
DSC instance for display connected to MST hub.
[How] Add aconnector->sink check to find pipe_ctx.
CC: stable@vger.kernel.org
Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Hersen Wu <hersenxs.wu@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
New request from KMD/VBIOS in order to support new UMA carveout
model. This fixes a null dereference from accessing
Ctx->dc_bios->integrated_info while it was NULL.
DAL parses through the BIOS and extracts the necessary
integrated_info but was missing a case for the new BIOS
version 2.3.
Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Gabe Teeger <gabe.teeger@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[Why & How]
Currently DCN315 clk manager is missing code to enable/disable dtbclk.
Because of this, "optimized_required" flag is constantly set
and this prevents FreeSync from engaging for certain high bandwidth
display Modes which require DTBCLK.
Reviewed-by: Dmytro Laktyushkin <dmytro.laktyushkin@amd.com>
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Swapnil Patel <swapnil.patel@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Subtract the VRAM pinned memory when checking for available memory
in amdgpu_amdkfd_reserve_mem_limit function since that memory is not
available for use.
Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The spreadsheet defines the PLL register block as having
the dwords in the following order:
block dwords offsets
PLL1 0x0-0x7 0x00-0x1f
PLL2 0x0-0x7 0x20-0x3f
PLL1ext 0x10-0x1f 0x40-0x5f
PLL2ext 0x10-0x1f 0x60-0x7f
So dword indexes 0x8-0xf don't even exist. Renumber
our register defines to match.
Note that the spreadsheet used hex numbering whereas our
defiens are in decimal. Perhaps we should change that?
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240422083457.23815-5-ville.syrjala@linux.intel.com
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
VLV_PLL_DW9_BCAST is actually VLV_PCS_DW17_BCAST. The address
does kinda look like it goes to the PLL block on a first glance,
but broadcast is special and doesn't even exist for the PLL
(only PCS and TX have it).
The fact that we use a broadcast write here is a bit sketchy
IMO since we're now blasting the register to all PCS splines
across the whole PHY. So the PCS registers in the other channel
(ie. other pipe/port) will also be written. But I guess the
fact that we always write the same value should make this a nop
even if the other channel is already enabled (assuming the VBIOS/GOP
didn't screw up and use some other value...).
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240422083457.23815-4-ville.syrjala@linux.intel.com
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Currently we allocate all 3 levels of radix3 page tables using
nvkm_gsp_mem_ctor(), which uses dma_alloc_coherent() for allocating all of
the relevant memory. This can end up failing in scenarios where the system
has very high memory fragmentation, and we can't find enough contiguous
memory to allocate level 2 of the page table.
Currently, this can result in runtime PM issues on systems where memory
fragmentation is high - as we'll fail to allocate the page table for our
suspend/resume buffer:
kworker/10:2: page allocation failure: order:7, mode:0xcc0(GFP_KERNEL),
nodemask=(null),cpuset=/,mems_allowed=0
CPU: 10 PID: 479809 Comm: kworker/10:2 Not tainted
6.8.6-201.ChopperV6.fc39.x86_64 #1
Hardware name: SLIMBOOK Executive/Executive, BIOS N.1.10GRU06 02/02/2024
Workqueue: pm pm_runtime_work
Call Trace:
<TASK>
dump_stack_lvl+0x64/0x80
warn_alloc+0x165/0x1e0
? __alloc_pages_direct_compact+0xb3/0x2b0
__alloc_pages_slowpath.constprop.0+0xd7d/0xde0
__alloc_pages+0x32d/0x350
__dma_direct_alloc_pages.isra.0+0x16a/0x2b0
dma_direct_alloc+0x70/0x270
nvkm_gsp_radix3_sg+0x5e/0x130 [nouveau]
r535_gsp_fini+0x1d4/0x350 [nouveau]
nvkm_subdev_fini+0x67/0x150 [nouveau]
nvkm_device_fini+0x95/0x1e0 [nouveau]
nvkm_udevice_fini+0x53/0x70 [nouveau]
nvkm_object_fini+0xb9/0x240 [nouveau]
nvkm_object_fini+0x75/0x240 [nouveau]
nouveau_do_suspend+0xf5/0x280 [nouveau]
nouveau_pmops_runtime_suspend+0x3e/0xb0 [nouveau]
pci_pm_runtime_suspend+0x67/0x1e0
? __pfx_pci_pm_runtime_suspend+0x10/0x10
__rpm_callback+0x41/0x170
? __pfx_pci_pm_runtime_suspend+0x10/0x10
rpm_callback+0x5d/0x70
? __pfx_pci_pm_runtime_suspend+0x10/0x10
rpm_suspend+0x120/0x6a0
pm_runtime_work+0x98/0xb0
process_one_work+0x171/0x340
worker_thread+0x27b/0x3a0
? __pfx_worker_thread+0x10/0x10
kthread+0xe5/0x120
? __pfx_kthread+0x10/0x10
ret_from_fork+0x31/0x50
? __pfx_kthread+0x10/0x10
ret_from_fork_asm+0x1b/0x30
Luckily, we don't actually need to allocate coherent memory for the page
table thanks to being able to pass the GPU a radix3 page table for
suspend/resume data. So, let's rewrite nvkm_gsp_radix3_sg() to use the sg
allocator for level 2. We continue using coherent allocations for lvl0 and
1, since they only take a single page.
V2:
* Don't forget to actually jump to the next scatterlist when we reach the
end of the scatterlist we're currently on when writing out the page table
for level 2
Signed-off-by: Lyude Paul <lyude@redhat.com>
Cc: stable@vger.kernel.org
Reviewed-by: Ben Skeggs <bskeggs@nvidia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240429182318.189668-2-lyude@redhat.com
UAPI Changes:
- drm/i915/guc: Use context hints for GT frequency
Allow user to provide a low latency context hint. When set, KMD
sends a hint to GuC which results in special handling for this
context. SLPC will ramp the GT frequency aggressively every time
it switches to this context. The down freq threshold will also be
lower so GuC will ramp down the GT freq for this context more slowly.
We also disable waitboost for this context as that will interfere with
the strategy.
We need to enable the use of SLPC Compute strategy during init, but
it will apply only to contexts that set this bit during context
creation.
Userland can check whether this feature is supported using a new param-
I915_PARAM_HAS_CONTEXT_FREQ_HINT. This flag is true for all guc submission
enabled platforms as they use SLPC for frequency management.
The Mesa usage model for this flag is here -
https://gitlab.freedesktop.org/sushmave/mesa/-/commits/compute_hint
- drm/i915/gt: Enable only one CCS for compute workload
Enable only one CCS engine by default with all the compute sices
allocated to it.
While generating the list of UABI engines to be exposed to the
user, exclude any additional CCS engines beyond the first
instance
***
NOTE: This W/A will make all DG2 SKUs appear like single CCS SKUs by
default to mitigate a hardware bug. All the EUs will still remain
usable, and all the userspace drivers have been confirmed to be able
to dynamically detect the change in number of CCS engines and adjust.
For the smaller percent of applications that get perf benefit from
letting the userspace driver dispatch across all 4 CCS engines we will
be introducing a sysfs control as a later patch to choose 4 CCS each
with 25% EUs (or 50% if 2 CCS).
NOTE: A regression has been reported at
https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/10895
However Andi has been triaging the issue and we're closing in a fix
to the gap in the W/A implementation:
https://lists.freedesktop.org/archives/intel-gfx/2024-April/348747.html
Driver Changes:
- Add new and fix to existing workarounds: Wa_14018575942 (MTL),
Wa_16019325821 (Gen12.70), Wa_14019159160 (MTL), Wa_16015675438,
Wa_14020495402 (Gen12.70) (Tejas, John, Lucas)
- Fix UAF on destroy against retire race and remove two earlier
partial fixes (Janusz)
- Limit the reserved VM space to only the platforms that need it (Andi)
- Reset queue_priority_hint on parking for execlist platforms (Chris)
- Fix gt reset with GuC submission is disabled (Nirmoy)
- Correct capture of EIR register on hang (John)
- Remove usage of the deprecated ida_simple_xx() API
- Refactor confusing __intel_gt_reset() (Nirmoy)
- Fix the fix for GuC reset lock confusion (John)
- Simplify/extend platform check for Wa_14018913170 (John)
- Replace dev_priv with i915 (Andi)
- Add and use gt_to_guc() wrapper (Andi)
- Remove bogus null check (Rodrigo, Dan)
. Selftest improvements (Janusz, Nirmoy, Daniele)
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/ZitVBTvZmityDi7D@jlahtine-mobl.ger.corp.intel.com
Thomas needs the defio fixes, Maíra needs the vkms fixes and Joonas
has some fun with i915-gem conflicts.
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>