Pull drm fixes from Dave Airlie:
"Regular weekly fixes. This is a bit larger than usual but doesn't seem
too crazy.
Most of it is vmwgfx changes that fix a bunch of issues with wayland
userspaces with dma-buf/external buffers and modesetting fixes.
Otherwise it's kinda spread out, v3d fixes some new ioctls, nouveau
has regression revert and fixes, amdgpu, i915 and ast have some small
fixes, and some core fixes spread about.
client:
- fix error code
atomic:
- allow damage clips with async flips
- allow explicit sync with async flips
kselftests:
- fix dmabuf-heaps test
panic:
- fix schedule_work in panic paths
panel:
- fix OrangePi Neo orientation
gpuvm:
- fix missing dependency
amdgpu:
- SMU 14.x update
- Fix contiguous VRAM handling for IB parsing
- GFX 12 fix
- Regression fix for old APUs
i915:
- Static analysis fix for int overflow
- Fix for HDCP2_STREAM_STATUS macro and removal of PWR_CLK_STATE for gen12
nouveau:
- revert busy wait change that caused a resume regression
- fix buffer placement fault on dynamic pm s/r
- fix refcount underflow
ast:
- fix black screen on resume
- wake during connector status detect
v3d:
- fix issues with perf/timestamp ioctls
vmwgfx:
- fix deadlock in dma-buf fence polling
- fix screen surface refcounting
- fix dumb buffer handling
- fix support for external buffers
- fix overlay with screen targets
- trigger modeset on screen moves"
* tag 'drm-fixes-2024-08-02' of https://gitlab.freedesktop.org/drm/kernel: (31 commits)
Revert "nouveau: rip out busy fence waits"
nouveau: set placement to original placement on uvmm validate.
drm/atomic: Allow userspace to use damage clips with async flips
drm/atomic: Allow userspace to use explicit sync with atomic async flips
drm/i915: Fix possible int overflow in skl_ddi_calculate_wrpll()
drm/i915/hdcp: Fix HDCP2_STREAM_STATUS macro
drm/ast: astdp: Wake up during connector status detection
i915/perf: Remove code to update PWR_CLK_STATE for gen12
kselftests: dmabuf-heaps: Ensure the driver name is null-terminated
drm/client: Fix error code in drm_client_buffer_vmap_local()
drm/amdgpu: Fix APU handling in amdgpu_pm_load_smu_firmware()
drm/amdgpu: increase mes log buffer size for gfx12
drm/amdgpu: fix contiguous handling for IB parsing v2
drm/amdgpu/pm: support gpu_metrics sysfs interface for smu v14.0.2/3
drm/vmwgfx: Trigger a modeset when the screen moves
drm/vmwgfx: Fix overlay when using Screen Targets
drm/vmwgfx: Add basic support for external buffers
drm/vmwgfx: Fix handling of dumb buffers
drm/vmwgfx: Make sure the screen surface is ref counted
drm/vmwgfx: Fix a deadlock in dma buf fence polling
...
This reverts commit d45bb9c5f7.
Just got a report that this causes some suspend/resume issues,
so back it out and I'll investigate it later.
Reported-by: Mike Galbraith <efault@gmx.de>
Signed-off-by: Dave Airlie <airlied@redhat.com>
When a buffer is evicted for memory pressure or TTM evict all,
the placement is set to the eviction domain, this means the
buffer never gets revalidated on the next exec to the correct domain.
I think this should be fine to use the initial domain from the
object creation, as least with VM_BIND this won't change after
init so this should be the correct answer.
Fixes: b88baab828 ("drm/nouveau: implement new VM_BIND uAPI")
Cc: Danilo Krummrich <dakr@redhat.com>
Cc: <stable@vger.kernel.org> # v6.6
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20240515025542.2156774-1-airlied@gmail.com
On the off chance that clock value ends up being too high (by means
of skl_ddi_calculate_wrpll() having been called with big enough
value of crtc_state->port_clock * 1000), one possible consequence
may be that the result will not be able to fit into signed int.
Fix this issue by moving conversion of clock parameter from kHz to Hz
into the body of skl_ddi_calculate_wrpll(), as well as casting the
same parameter to u64 type while calculating the value for AFE clock.
This both mitigates the overflow problem and avoids possible erroneous
integer promotion mishaps.
Found by Linux Verification Center (linuxtesting.org) with static
analysis tool SVACE.
Fixes: 82d3543701 ("drm/i915/skl: Implementation of SKL DPLL programming")
Cc: stable@vger.kernel.org
Signed-off-by: Nikita Zhandarovich <n.zhandarovich@fintech.ru>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240729174035.25727-1-n.zhandarovich@fintech.ru
(cherry picked from commit 833cf12846)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
We only had a couple of array[] declarations, and changing them to just
use 'MAX()' instead of 'max()' fixes the issue.
This will allow us to simplify our min/max macros enormously, since they
can now unconditionally use temporary variables to avoid using the
argument values multiple times.
Cc: David Laight <David.Laight@aculab.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This just standardizes the use of MIN() and MAX() macros, with the very
traditional semantics. The goal is to use these for C constant
expressions and for top-level / static initializers, and so be able to
simplify the min()/max() macros.
These macro names were used by various kernel code - they are very
traditional, after all - and all such users have been fixed up, with a
few different approaches:
- trivial duplicated macro definitions have been removed
Note that 'trivial' here means that it's obviously kernel code that
already included all the major kernel headers, and thus gets the new
generic MIN/MAX macros automatically.
- non-trivial duplicated macro definitions are guarded with #ifndef
This is the "yes, they define their own versions, but no, the include
situation is not entirely obvious, and maybe they don't get the
generic version automatically" case.
- strange use case #1
A couple of drivers decided that the way they want to describe their
versioning is with
#define MAJ 1
#define MIN 2
#define DRV_VERSION __stringify(MAJ) "." __stringify(MIN)
which adds zero value and I just did my Alexander the Great
impersonation, and rewrote that pointless Gordian knot as
#define DRV_VERSION "1.2"
instead.
- strange use case #2
A couple of drivers thought that it's a good idea to have a random
'MIN' or 'MAX' define for a value or index into a table, rather than
the traditional macro that takes arguments.
These values were re-written as C enum's instead. The new
function-line macros only expand when followed by an open
parenthesis, and thus don't clash with enum use.
Happily, there weren't really all that many of these cases, and a lot of
users already had the pattern of using '#ifndef' guarding (or in one
case just using '#undef MIN') before defining their own private version
that does the same thing. I left such cases alone.
Cc: David Laight <David.Laight@aculab.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Commit 3a7e02c040 ("minmax: avoid overly complicated constant
expressions in VM code") added the simpler MIN_T/MAX_T macros in order
to avoid some excessive expansion from the rather complicated regular
min/max macros.
The complexity of those macros stems from two issues:
(a) trying to use them in situations that require a C constant
expression (in static initializers and for array sizes)
(b) the type sanity checking
and MIN_T/MAX_T avoids both of these issues.
Now, in the whole (long) discussion about all this, it was pointed out
that the whole type sanity checking is entirely unnecessary for
min_t/max_t which get a fixed type that the comparison is done in.
But that still leaves min_t/max_t unnecessarily complicated due to
worries about the C constant expression case.
However, it turns out that there really aren't very many cases that use
min_t/max_t for this, and we can just force-convert those.
This does exactly that.
Which in turn will then allow for much simpler implementations of
min_t()/max_t(). All the usual "macros in all upper case will evaluate
the arguments multiple times" rules apply.
We should do all the same things for the regular min/max() vs MIN/MAX()
cases, but that has the added complexity of various drivers defining
their own local versions of MIN/MAX, so that needs another level of
fixes first.
Link: https://lore.kernel.org/all/b47fad1d0cf8449886ad148f8c013dae@AcuMS.aculab.com/
Cc: David Laight <David.Laight@aculab.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull drm fixes from Dave Airlie:
"Fixes for rc1, mostly amdgpu, i915 and xe, with some other misc ones,
doesn't seem to be anything too serious.
amdgpu:
- Bump driver version for GFX12 DCC
- DC documention warning fixes
- VCN unified queue power fix
- SMU fix
- RAS fix
- Display corruption fix
- SDMA 5.2 workaround
- GFX12 fixes
- Uninitialized variable fix
- VCN/JPEG 4.0.3 fixes
- Misc display fixes
- RAS fixes
- VCN4/5 harvest fix
- GPU reset fix
i915:
- Reset intel_dp->link_trained before retraining the link
- Don't switch the LTTPR mode on an active link
- Do not consider preemption during execlists_dequeue for gen8
- Allow NULL memory region
xe:
- xe_exec ioctl minor fix on sync entry cleanup upon error
- SRIOV: limit VF LMEM provisioning
- Wedge mode fixes
v3d:
- fix indirect dispatch on newer v3d revs
panel:
- fix panel backlight bindings"
* tag 'drm-next-2024-07-26' of https://gitlab.freedesktop.org/drm/kernel: (39 commits)
drm/amdgpu: reset vm state machine after gpu reset(vram lost)
drm/amdgpu: add missed harvest check for VCN IP v4/v5
drm/amdgpu: Fix eeprom max record count
drm/amdgpu: fix ras UE error injection failure issue
drm/amd/display: Remove ASSERT if significance is zero in math_ceil2
drm/amd/display: Check for NULL pointer
drm/amdgpu/vcn: Use offsets local to VCN/JPEG in VF
drm/amdgpu: Add empty HDP flush function to VCN v4.0.3
drm/amdgpu: Add empty HDP flush function to JPEG v4.0.3
drm/amd/amdgpu: Fix uninitialized variable warnings
drm/amdgpu: Fix atomics on GFX12
drm/amdgpu/sdma5.2: Update wptr registers as well as doorbell
drm/i915: Allow NULL memory region
drm/i915/gt: Do not consider preemption during execlists_dequeue for gen8
dt-bindings: display: panel: samsung,atna33xc20: Document ATNA45AF01
drm/xe: Don't suspend device upon wedge
drm/xe: Wedge the entire device
drm/xe/pf: Limit fair VF LMEM provisioning
drm/xe/exec: Fix minor bug related to xe_sync_entry_cleanup
drm/amd/display: fix corruption with high refresh rates on DCN 3.0
...
When multi-monitor is cycled the X,Y position of the Screen Target will
likely change but the resolution will not. We need to trigger a modeset
when this occurs in order to recreate the Screen Target with the correct
X,Y position.
Fixes a bug where multiple displays are shown in a single scrollable
host window rather than in 2+ windows on separate host displays.
Fixes: 4268269331 ("drm/vmwgfx: Filter modes which exceed graphics memory")
Signed-off-by: Ian Forbes <ian.forbes@broadcom.com>
Signed-off-by: Zack Rusin <zack.rusin@broadcom.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240624205951.23343-1-ian.forbes@broadcom.com
Pull driver core updates from Greg KH:
"Here is the big set of driver core changes for 6.11-rc1.
Lots of stuff in here, with not a huge diffstat, but apis are evolving
which required lots of files to be touched. Highlights of the changes
in here are:
- platform remove callback api final fixups (Uwe took many releases
to get here, finally!)
- Rust bindings for basic firmware apis and initial driver-core
interactions.
It's not all that useful for a "write a whole driver in rust" type
of thing, but the firmware bindings do help out the phy rust
drivers, and the driver core bindings give a solid base on which
others can start their work.
There is still a long way to go here before we have a multitude of
rust drivers being added, but it's a great first step.
- driver core const api changes.
This reached across all bus types, and there are some fix-ups for
some not-common bus types that linux-next and 0-day testing shook
out.
This work is being done to help make the rust bindings more safe,
as well as the C code, moving toward the end-goal of allowing us to
put driver structures into read-only memory. We aren't there yet,
but are getting closer.
- minor devres cleanups and fixes found by code inspection
- arch_topology minor changes
- other minor driver core cleanups
All of these have been in linux-next for a very long time with no
reported problems"
* tag 'driver-core-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (55 commits)
ARM: sa1100: make match function take a const pointer
sysfs/cpu: Make crash_hotplug attribute world-readable
dio: Have dio_bus_match() callback take a const *
zorro: make match function take a const pointer
driver core: module: make module_[add|remove]_driver take a const *
driver core: make driver_find_device() take a const *
driver core: make driver_[create|remove]_file take a const *
firmware_loader: fix soundness issue in `request_internal`
firmware_loader: annotate doctests as `no_run`
devres: Correct code style for functions that return a pointer type
devres: Initialize an uninitialized struct member
devres: Fix memory leakage caused by driver API devm_free_percpu()
devres: Fix devm_krealloc() wasting memory
driver core: platform: Switch to use kmemdup_array()
driver core: have match() callback in struct bus_type take a const *
MAINTAINERS: add Rust device abstractions to DRIVER CORE
device: rust: improve safety comments
MAINTAINERS: add Danilo as FIRMWARE LOADER maintainer
MAINTAINERS: add Rust FW abstractions to FIRMWARE LOADER
firmware: rust: improve safety comments
...
Make vmwgfx go through the dma-buf interface to map/unmap imported
buffers. The driver used to try to directly manipulate external
buffers, assuming that everything that was coming to it had to live
in cpu accessible memory. While technically true because what's in the
vms is controlled by us, it's semantically completely broken.
Fix importing of external buffers by forwarding all memory access
requests to the importer.
Tested by the vmw_prime basic_vgem test.
Signed-off-by: Zack Rusin <zack.rusin@broadcom.com>
Reviewed-by: Maaz Mombasawala <maaz.mombasawala@broadcom.com>
Reviewed-by: Martin Krastev <martin.krastev@broadcom.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240722184313.181318-5-zack.rusin@broadcom.com
Dumb buffers can be used in kms but also through prime with gallium's
resource_from_handle. In the second case the dumb buffers can be
rendered by the GPU where with the regular DRM kms interfaces they
are mapped and written to by the CPU. Because the same buffer can
be written to by the GPU and CPU vmwgfx needs to use vmw_surface (object
which properly tracks dirty state of the guest and gpu memory)
instead of vmw_bo (which is just guest side memory).
Furthermore the dumb buffer handles are expected to be gem objects by
a lot of userspace.
Make vmwgfx accept gem handles in prime and kms but internally switch
to vmw_surface's to properly track the dirty state of the objects between
the GPU and CPU.
Fixes new kwin and kde on wayland.
Signed-off-by: Zack Rusin <zack.rusin@broadcom.com>
Fixes: b32233acce ("drm/vmwgfx: Fix prime import/export")
Cc: Broadcom internal kernel review list <bcm-kernel-feedback-list@broadcom.com>
Cc: dri-devel@lists.freedesktop.org
Cc: <stable@vger.kernel.org> # v6.9+
Reviewed-by: Maaz Mombasawala <maaz.mombasawala@broadcom.com>
Reviewed-by: Martin Krastev <martin.krastev@broadcom.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240722184313.181318-4-zack.rusin@broadcom.com
Introduce a version of the fence ops that on release doesn't remove
the fence from the pending list, and thus doesn't require a lock to
fix poll->fence wait->fence unref deadlocks.
vmwgfx overwrites the wait callback to iterate over the list of all
fences and update their status, to do that it holds a lock to prevent
the list modifcations from other threads. The fence destroy callback
both deletes the fence and removes it from the list of pending
fences, for which it holds a lock.
dma buf polling cb unrefs a fence after it's been signaled: so the poll
calls the wait, which signals the fences, which are being destroyed.
The destruction tries to acquire the lock on the pending fences list
which it can never get because it's held by the wait from which it
was called.
Old bug, but not a lot of userspace apps were using dma-buf polling
interfaces. Fix those, in particular this fixes KDE stalls/deadlock.
Signed-off-by: Zack Rusin <zack.rusin@broadcom.com>
Fixes: 2298e804e9 ("drm/vmwgfx: rework to new fence interface, v2")
Cc: Broadcom internal kernel review list <bcm-kernel-feedback-list@broadcom.com>
Cc: dri-devel@lists.freedesktop.org
Cc: <stable@vger.kernel.org> # v6.2+
Reviewed-by: Maaz Mombasawala <maaz.mombasawala@broadcom.com>
Reviewed-by: Martin Krastev <martin.krastev@broadcom.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240722184313.181318-2-zack.rusin@broadcom.com
[Why]
Page table of compute VM in the VRAM will lost after gpu reset.
VRAM won't be restored since compute VM has no shadows.
[How]
Use higher 32-bit of vm->generation to record a vram_lost_counter.
Reset the VM state machine when vm->genertaion is not equal to
the new generation token.
v2: Check vm->generation instead of calling drm_sched_entity_error
in amdgpu_vm_validate.
v3: Use new generation token instead of vram_lost_counter for check.
Signed-off-by: ZhenGuo Yin <zhenguo.yin@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
(cherry picked from commit 47c0388b05)
The eeprom table is empty before initializing,
set eeprom table version first before initializing.
Changed from V1:
Reuse amdgpu_ras_set_eeprom_table_version function
Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 015b8a2fdf)
The ras command shared memory is allocated from
VRAM and the response status of the command
buffer will not be zero due to gpu being in
fatal error state after ras UE error injection.
Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 8284951a6e)
In the DML math_ceil2 function, there is one ASSERT if the significance
is equal to zero. However, significance might be equal to zero
sometimes, and this is not an issue for a ceil function, but the current
ASSERT will trigger warnings in those cases. This commit removes the
ASSERT if the significance is equal to zero to avoid unnecessary noise.
Cc: Mario Limonciello <mario.limonciello@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Reviewed-by: Chaitanya Dhere <chaitanya.dhere@amd.com>
Signed-off-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 332315885d)
For VCN/JPEG 4.0.3, use only the local addressing scheme.
- Mask bit higher than AID0 range
v2
remain the case for mmhub use master XCC
Signed-off-by: Jane Jian <Jane.Jian@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit caaf576292)
VCN 4.0.3 does not HDP flush with RRMT enabled. Instead, mmsch
will do the HDP flush.
This change is necessary for VCN v4.0.3, no need for backward compatibility
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Jane Jian <Jane.Jian@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 49cfaebe48)
JPEG v4.0.3 doesn't support HDP flush when RRMT is enabled. Instead,
mmsch fw will do the flush.
This change is necessary for JPEG v4.0.3, no need for backward compatibility
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Jane Jian <Jane.Jian@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 585e3fdb36)
If PCIe supports atomics, configure register to prevent DF from
breaking atomics in separate load/store operations.
Signed-off-by: David Belanger <david.belanger@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 666f14cab2)
We seem to have a case where SDMA will sometimes miss a doorbell
if GFX is entering the powergating state when the doorbell comes in.
To workaround this, we can update the wptr via MMIO, however,
this is only safe because we disallow gfxoff in begin_ring() for
SDMA 5.2 and then allow it again in end_ring().
Enable this workaround while we are root causing the issue with
the HW team.
Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/3440
Tested-by: Friedrich Vock <friedrich.vock@gmx.de>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
(cherry picked from commit f2ac526349)