linux

mirror of https://github.com/raspberrypi/linux.git synced 2025-12-20 08:42:06 +00:00

Author	SHA1	Message	Date
Dafna Hirschfeld	2efba0c095	drm/xe: fix missing 'xe_vm_put' Fix memleak caused by missing xe_vm_put Fixes: `852856e3b6` ("drm/xe: Use reserved copy engine for user binds on faulting devices") Signed-off-by: Dafna Hirschfeld <dhirschfeld@habana.ai> Reviewed-by: Nirmoy Das <nirmoy.das@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240901044227.1177211-1-dhirschfeld@habana.ai Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> (cherry picked from commit `249df8cbec`) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>	2024-09-12 18:04:36 -05:00
Jani Nikula	87d8ecf015	drm/xe: replace #include <drm/xe_drm.h> with <uapi/drm/xe_drm.h> include/drm/xe_drm.h does not exist. Prefer the explicit uapi include. Signed-off-by: Jani Nikula <jani.nikula@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240827091539.4136838-1-jani.nikula@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2024-08-28 15:17:54 -04:00
Matthew Auld	7116c35aac	drm/xe: prevent UAF around preempt fence The fence lock is part of the queue, therefore in the current design anything locking the fence should then also hold a ref to the queue to prevent the queue from being freed. However, currently it looks like we signal the fence and then drop the queue ref, but if something is waiting on the fence, the waiter is kicked to wake up at some later point, where upon waking up it first grabs the lock before checking the fence state. But if we have already dropped the queue ref, then the lock might already be freed as part of the queue, leading to uaf. To prevent this, move the fence lock into the fence itself so we don't run into lifetime issues. Alternative might be to have device level lock, or only release the queue in the fence release callback, however that might require pushing to another worker to avoid locking issues. Fixes: `dd08ebf6c3` ("drm/xe: Introduce a new DRM driver for Intel GPUs") References: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/2454 References: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/2342 References: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/2020 Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: <stable@vger.kernel.org> # v6.8+ Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240814110129.825847-2-matthew.auld@intel.com	2024-08-19 12:38:09 +01:00
Francois Dugast	0d92cd8935	drm/xe/exec_queue: Prepare last fence for hw engine group resume context Ensure we can safely take a ref of the exec queue's last fence from the context of resuming jobs from the hw engine group. The locking requirements differ from the general case, hence the introduction of this new function. v2: Add kernel doc, rework the code to prevent code duplication v3: Fix kernel doc, remove now unnecessary lockdep variants (Matt Brost) v4: Remove new put function (Matt Brost) Signed-off-by: Francois Dugast <francois.dugast@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240809155156.1955925-7-francois.dugast@intel.com	2024-08-17 18:31:54 -07:00
Francois Dugast	7f0d7bee20	drm/xe/exec_queue: Remove duplicated code This code section is the same as the body of xe_exec_queue_last_fence_put_unlocked() so call the function instead and remove duplicated code to make maintenance easier. Signed-off-by: Francois Dugast <francois.dugast@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240809155156.1955925-6-francois.dugast@intel.com	2024-08-17 18:31:53 -07:00
Francois Dugast	7970cb3696	'drm/xe/hw_engine_group: Register hw engine group's exec queues Add helpers to safely add and delete the exec queues attached to a hw engine group, and make use them at the time of creation and destruction of the exec queues. Keeping track of them is required to control the execution mode of the hw engine group. v2: Improve error handling and robustness, suspend exec queues created in fault mode if group in dma-fence mode, init queue link (Matt Brost) v3: Delete queue from hw engine group when it is destroyed by the user, also clean up at the time of closing the file in case the user did not destroy the queue v4: Use correct list when checking if empty, do not add the queue if VM is in xe_vm_in_preempt_fence_mode (Matt Brost) v5: Remove unrelated newline, add checks and asserts for group, unwind on suspend failure (Matt Brost) Signed-off-by: Francois Dugast <francois.dugast@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240809155156.1955925-4-francois.dugast@intel.com	2024-08-17 18:31:51 -07:00
Matthew Brost	852856e3b6	drm/xe: Use reserved copy engine for user binds on faulting devices User binds map to engines with can fault, faults depend on user binds completion, thus we can deadlock. Avoid this by using reserved copy engine for user binds on faulting devices. While we are here, normalize bind queue creation with a helper. v2: - Pass in extensions to bind queue creation (CI) v3: - s/resevered/reserved (Lucas) - Fix NULL hwe check (Jonathan) Fixes: `dd08ebf6c3` ("drm/xe: Introduce a new DRM driver for Intel GPUs") Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240816034033.53837-1-matthew.brost@intel.com	2024-08-17 14:12:19 -07:00
Daniele Ceraolo Spurio	b0ee81dac3	drm/xe: Make exec_queue_kill safe to call twice An upcoming PXP patch will kill queues at runtime when a PXP invalidation event occurs, so we need exec_queue_kill to be safe to call multiple times. v2: Add documentation (Matt B) Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240814205654.1716586-1-daniele.ceraolospurio@intel.com	2024-08-15 13:08:31 -07:00
Matthew Brost	549dd786b6	drm/xe: Move VM dma-resv lock from xe_exec_queue_create to __xe_exec_queue_init The critical section which requires the VM dma-resv is the call xe_lrc_create in __xe_exec_queue_init. Move this lock to __xe_exec_queue_init holding it just around xe_lrc_create. Not only is good practice, this also fixes a locking double of the VM dma-resv in the error paths of __xe_exec_queue_init as xe_lrc_put tries to acquire this too resulting in a deadlock. Fixes: `dd08ebf6c3` ("drm/xe: Introduce a new DRM driver for Intel GPUs") Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240724152831.1848325-1-matthew.brost@intel.com	2024-08-09 13:11:01 -07:00
Dominik Grzegorzek	6f20fc0993	drm/xe: Move and export xe_hw_engine lookup. Move and export xe_hw_engine lookup. This is in preparation to use this in eudebug code where we want to find active engine. v2: s/tile/gt due to uapi changes (Mika) Signed-off-by: Dominik Grzegorzek <dominik.grzegorzek@intel.com> Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240729130152.100130-1-mika.kuoppala@linux.intel.com	2024-07-30 19:36:20 -07:00
Umesh Nerlige Ramappa	2149ded630	drm/xe: Fix use after free when client stats are captured xe_file_close triggers an asynchronous queue cleanup and then frees up the xef object. Since queue cleanup flushes all pending jobs and the KMD stores client usage stats into the xef object after jobs are flushed, we see a use-after-free for the xef object. Resolve this by taking a reference to xef from xe_exec_queue. While at it, revert an earlier change that contained a partial work around for this issue. v2: - Take a ref to xef even for the VM bind queue (Matt) - Squash patches relevant to that fix and work around (Lucas) v3: Fix typo (Lucas) Fixes: `ce62827bc2` ("drm/xe: Do not access xe file when updating exec queue run_ticks") Fixes: `6109f24f87` ("drm/xe: Add helper to accumulate exec queue runtime") Closes: https://gitlab.freedesktop.org/drm/xe/kernel/issues/1908 Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240718210548.3580382-5-umesh.nerlige.ramappa@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>	2024-07-19 00:32:37 -07:00
Matthew Brost	96e7ebb220	drm/xe: Add xe_exec_queue_last_fence_test_dep Helpful to determine if a bind can immediately use CPU or needs to be deferred a drm scheduler job. v7: - Better wording in kernel doc (Matthew Auld) Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240704041652.272920-4-matthew.brost@intel.com	2024-07-03 22:27:02 -07:00
Francois Dugast	731e46c032	drm/xe/exec_queue: Rename xe_exec_queue::compute to xe_exec_queue::lr The properties of this struct are used in long running context so make that clear by renaming it to lr, in alignment with the rest of the code. Cc: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Francois Dugast <francois.dugast@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240613170348.723245-1-francois.dugast@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2024-06-17 16:46:52 -04:00
Matthew Brost	4468d0488e	drm/xe: Drop EXEC_QUEUE_FLAG_BANNED Clean up laying violation of setting q->flags EXEC_QUEUE_FLAG_BANNED bit in GuC backend. Move banned to GuC owned bit and report banned status to upper layers via reset_status vfunc. This is a slight change in behavior as reset_status returns true if wedged or killed bits set too, but in all of these cases submission to queue is no longer allowed. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240604184700.1946918-1-matthew.brost@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2024-06-07 12:16:36 -04:00
Niranjana Vishwanathapura	264eecdba2	drm/xe: Decouple xe_exec_queue and xe_lrc Decouple xe_lrc from xe_exec_queue and reference count xe_lrc. Removing hard coupling between xe_exec_queue and xe_lrc allows flexible design where the user interface xe_exec_queue can be destroyed independent of the hardware/firmware interface xe_lrc. v2: Fix lrc indexing in wq_item_append() Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240530032211.29299-1-niranjana.vishwanathapura@intel.com	2024-05-29 23:44:41 -07:00
Umesh Nerlige Ramappa	ce62827bc2	drm/xe: Do not access xe file when updating exec queue run_ticks The current code is running into a use after free case where xe file is closed before the exec queue run_ticks can be updated. This is occurring in the xe_file_close path. To fix that, do not access xe file when updating the exec queue run_ticks. Instead store the exec queue run_ticks locally in the exec queue object and accumulate it when the user dumps the drm client stats. We know that the xe file is valid when user is dumping the run_ticks for the drm client, so this effectively removes the dependency on xe file object in xe_exec_queue_update_run_ticks(). v2: - Fix the accumulation of q->run_ticks delta into xe file run_ticks - s/runtime/run_ticks/ (Rodrigo) Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/1908 Fixes: `6109f24f87` ("drm/xe: Add helper to accumulate exec queue runtime") Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240524234744.1352543-2-umesh.nerlige.ramappa@intel.com	2024-05-27 14:07:46 -07:00
Umesh Nerlige Ramappa	45bb564de0	drm/xe: Use run_ticks instead of runtime for client stats Note that runtime is also used in the pm context, so it is confusing to use the same name to denote run time of the drm client. Use a more appropriate name for the client utilization. While at it, drop the incorrect multi-lrc comment in the helper description v2: s/show_runtime/show_run_ticks/ (Rodrigo) Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240524234744.1352543-1-umesh.nerlige.ramappa@intel.com	2024-05-27 14:07:44 -07:00
Thomas Hellström	0ac7a2c745	drm/xe: Don't initialize fences at xe_sched_job_create() Pre-allocate but don't initialize fences at xe_sched_job_create(), and initialize / arm them instead at xe_sched_job_arm(). This makes it possible to move xe_sched_job_create() with its memory allocation out of any lock that is required for fence initialization, and that may not allow memory allocation under it. Replaces the struct dma_fence_array for parallell jobs with a struct dma_fence_chain, since the former doesn't allow a split-up between allocation and initialization. v2: - Rebase. - Don't always use the first lrc when initializing parallel lrc fences. - Use dma_fence_chain_contained() to access the lrc fences. v4: - Add an assert that job->lrc_seqno == fence->seqno. (Matthew Brost) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> #v2 Link: https://patchwork.freedesktop.org/patch/msgid/20240527135912.152156-4-thomas.hellstrom@linux.intel.com	2024-05-27 21:26:03 +02:00
Matthew Brost	08f7200899	drm/xe: Decouple job seqno and lrc seqno Tightly coupling these seqno presents problems if alternative fences for jobs are used. Decouple these for correctness. v2: - Slightly reword commit message (Thomas) - Make sure the lrc fence ops are used in comparison (Thomas) - Assume seqno is unsigned rather than signed in format string (Thomas) Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240527135912.152156-2-thomas.hellstrom@linux.intel.com	2024-05-27 21:25:59 +02:00
Rodrigo Vivi	ad1e331fc4	drm/xe: Relax runtime pm protection during execution Limit the protection only during moments of actual job execution, and introduce protection for guc submit fini, which is currently unprotected due to the absence of exec_queue life protection. In the regular use case scenario, user space will create an exec queue, and keep it alive to reuse that until it is done with that kind of workload. For the regular desktop cases, it means that the exec_queue is alive even on idle scenarios where display goes off. This is unacceptable since this would entirely block runtime PM indefinitely, blocking deeper Package-C state. This would be a waste drainage of power. Cc: Matthew Brost <matthew.brost@intel.com> Tested-by: Francois Dugast <francois.dugast@intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240522170105.327472-3-rodrigo.vivi@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2024-05-23 11:52:56 -04:00
Umesh Nerlige Ramappa	6109f24f87	drm/xe: Add helper to accumulate exec queue runtime Add a helper to accumulate per-client runtime of all its exec queues. This is called every time a sched job is finished. v2: - Use guc_exec_queue_free_job() and execlist_job_free() to accumulate runtime when job is finished since xe_sched_job_completed() is not a notification that job finished. - Stop trying to update runtime from xe_exec_queue_fini() - that is redundant and may happen after xef is closed, leading to a use-after-free - Do not special case the first timestamp read: the default LRC sets CTX_TIMESTAMP to zero, so even the first sample should be a valid one. - Handle the parallel submission case by multiplying the runtime by width. v3: Update comments Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240517204310.88854-6-lucas.demarchi@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>	2024-05-21 06:33:40 -07:00
Rodrigo Vivi	783d6cdc82	drm/xe: Kill xe_device_mem_access_{get*,put} Let's simply convert all the current callers towards direct xe_pm_runtime access and remove this extra layer of indirection. No functional change is expected with this patch since xe_mem_access_get was already using the xe_pm_runtime_get_noresume at this point. v2: Convert all the current callers instead of a big refactor at once. v3: - Rebased - Squashed the GSC/HDCP - Added a new case: sriov_pf_policy - Improved commit message to highlight that there's no functional change in this patch. Reviewed-by: Matthew Auld <matthew.auld@intel.com> #v2 Cc: Suraj Kandpal <suraj.kandpal@intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Suraj Kandpal <suraj.kandpal@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240418143049.43231-1-rodrigo.vivi@intel.com	2024-04-22 09:03:09 -04:00
Rodrigo Vivi	cbb6a7413b	drm/xe: Introduce xe_pm_runtime_get_noresume for inner callers Let's ensure that we have an option for inner callers that will raise WARN if device is not active and not protected by outer callers. Make this also a void function forcing every caller to unconditionally put the reference back afterwards. This will be very important for cases where we want to hold the reference before scheduling a work in a queue. Then the work job will be responsible for putting it back. While at this, already convert a case from mem_access_get_ongoing where it is not checking for the reference and put it back, what would cause the underflow. v2: Fix identation. v3: Convert equivalent missing put from mem_access towards pm_runtime. Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240417203952.25503-1-rodrigo.vivi@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2024-04-18 08:31:39 -04:00
Bommu Krishnaiah	a3c86b6d7b	drm/xe: prefer snprintf over sprintf since the sprintf() function lacks built-in protection against buffer overflows using the snprintf() function. v2: Removed hard coded values and used sizeof() Signed-off-by: Bommu Krishnaiah <krishnaiah.bommu@intel.com> Cc: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Cc: Tejas Upadhyay <tejas.upadhyay@intel.com> Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231209235949.54524-2-krishnaiah.bommu@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2024-04-04 14:54:51 -04:00
Niranjana Vishwanathapura	c9cc3d6586	drm/xe: Use correct function pointer type Use xe_exec_queue_user_extension_fn type for exec_queue_user_extension_funcs.` Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240319174919.1847-1-niranjana.vishwanathapura@intel.com	2024-03-19 22:36:38 -07:00
Niranjana Vishwanathapura	260fa80d4a	drm/xe: Streamline exec queue freeing path Ensure exec queue freeing happens at one place, that is in __xe_exec_queue_free(). It releases q->vm reference also. Set q->vm before handling extensions as they can potentially reference it. Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240319175947.15890-1-niranjana.vishwanathapura@intel.com	2024-03-19 22:36:30 -07:00
Matthew Auld	fe87b7dfcb	drm/xe/queue: fix engine_class bounds check The engine_class is the index into the user_to_xe_engine_class, therefore it needs to be less than. Fixes: `dd08ebf6c3` ("drm/xe: Introduce a new DRM driver for Intel GPUs") Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Nirmoy Das <nirmoy.das@intel.com> Reviewed-by: Nirmoy Das <nirmoy.das@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240318180532.57522-4-matthew.auld@intel.com	2024-03-19 08:47:47 +00:00
Nirmoy Das	79f944eedd	drm/xe: Remove unused 'create' parameter from queue property logic The 'create' parameter in exec_queue_user_extensions was always true. This commit removes the dead parameter and all the relevant dead code. v2: rebase. Signed-off-by: Nirmoy Das <nirmoy.das@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240223143043.22779-1-nirmoy.das@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2024-03-06 09:38:06 -05:00
Francois Dugast	84a1ed5e67	drm/xe/uapi: Remove unused flags Those cases missed in previous uAPI cleanups were mostly accidentally brought in from i915 or created to exercise the possibilities of gpuvm but they are not used by userspace yet, so let's remove them. They can still be brought back later if needed. v2: - Fix XE_VM_FLAG_FAULT_MODE support in xe_lrc.c (Brian Welty) - Leave DRM_XE_VM_BIND_OP_UNMAP_ALL (José Roberto de Souza) - Ensure invalid flag values are rejected (Rodrigo Vivi) v3: Rebase after removal of persistent exec_queues (Francois Dugast) v4: Rodrigo: Rebase after the new dumpable flag. Fixes: `dd08ebf6c3` ("drm/xe: Introduce a new DRM driver for Intel GPUs") Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Francois Dugast <francois.dugast@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240222232356.175431-1-rodrigo.vivi@intel.com	2024-02-23 08:33:01 -05:00
Erick Archer	a7a3d73686	drm/xe: Prefer struct_size over open coded arithmetic This is an effort to get rid of all multiplications from allocation functions in order to prevent integer overflows [1]. As the "q" variable is a pointer to "struct xe_exec_queue" and this structure ends in a flexible array: struct xe_exec_queue { [...] struct xe_lrc lrc[]; }; the preferred way in the kernel is to use the struct_size() helper to do the arithmetic instead of the argument "size + size * count" in the kzalloc() function. This way, the code is more readable and more safer. Link: https://www.kernel.org/doc/html/latest/process/deprecated.html#open-coded-arithmetic-in-allocator-arguments [1] Link: https://github.com/KSPP/linux/issues/160 [2] Signed-off-by: Erick Archer <erick.archer@gmx.com> Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240210141913.6611-1-erick.archer@gmx.com	2024-02-22 20:58:20 -08:00
Thomas Hellström	f1a9abc0cf	drm/xe/uapi: Remove support for persistent exec_queues Persistent exec_queues delays explicit destruction of exec_queues until they are done executing, but destruction on process exit is still immediate. It turns out no UMD is relying on this functionality, so remove it. If there turns out to be a use-case in the future, let's re-add. Persistent exec_queues were never used for LR VMs v2: - Don't add an "UNUSED" define for the missing property (Lucas, Rodrigo) v3: - Remove the remaining struct xe_exec_queue::persistent state (Niranjana, Lucas) Fixes: `dd08ebf6c3` ("drm/xe: Introduce a new DRM driver for Intel GPUs") Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: David Airlie <airlied@gmail.com> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Francois Dugast <francois.dugast@intel.com> Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Acked-by: José Roberto de Souza <jose.souza@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240209113444.8396-1-thomas.hellstrom@linux.intel.com	2024-02-19 12:54:48 +01:00
Matthew Brost	761b333718	drm/xe: Remove exec queue bind.fence_* struct xe_exec_queue bind.fence_* members are unused. Remove these. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240213043251.3482928-1-matthew.brost@intel.com	2024-02-14 09:42:47 -08:00
Matthew Brost	a856b67a84	drm/xe: Take a reference in xe_exec_queue_last_fence_get() Take a reference in xe_exec_queue_last_fence_get(). Also fix a reference counting underflow bug VM bind and unbind. Fixes: `dd08ebf6c3` ("drm/xe: Introduce a new DRM driver for Intel GPUs") Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240201004849.2219558-2-matthew.brost@intel.com	2024-02-02 06:45:57 -08:00
Brian Welty	25ce7c5063	drm/xe: Finish refactoring of exec_queue_create Setting of exec_queue user extensions is moved from the end of the ioctl function earlier, into __xe_exec_queue_alloc(). This fixes bug in that the USM attributes for access counters were being applied too late, and effectively were ignored. However, in order to apply user extensions this early, we can no longer call q->ops functions. Instead, make it more efficient. The user extension functions can simply update the q->sched_props values and they will be applied by the backend during q->ops->init(). v2: minor changes for readability (Matt) Signed-off-by: Brian Welty <brian.welty@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com>	2024-01-10 15:01:53 -08:00
Brian Welty	6ae24344e2	drm/xe: Add exec_queue.sched_props.job_timeout_ms The purpose here is to allow to optimize exec_queue_set_job_timeout() in follow-on patch. Currently it does q->ops->set_job_timeout(...). But we'd like to apply exec_queue_user_extensions much earlier and q->ops cannot be called before __xe_exec_queue_init(). It will be much more efficient to instead only have to set q->sched_props.job_timeout_ms when applying user extensions. That value will then be used during q->ops->init(). Signed-off-by: Brian Welty <brian.welty@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com>	2024-01-10 15:01:48 -08:00
Brian Welty	6e144a7d6f	drm/xe: Refactor __xe_exec_queue_create() Split __xe_exec_queue_create() into two functions, alloc and init. We have an issue in that exec_queue_user_extensions are applied too late. In the case of USM properties, these need to be set prior to xe_lrc_init(). Refactor the logic here, so we can resolve this in follow-on. We only need the xe_vm_lock held during the exec_queue_init function. Signed-off-by: Brian Welty <brian.welty@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com>	2024-01-10 15:01:27 -08:00
Brian Welty	a8004af338	drm/xe: Fix modifying exec_queue priority in xe_migrate_init After exec_queue has been created, we cannot simply modify q->priority. This needs to be done by the backend via q->ops. However in this case, it would be more efficient to simply pass a flag when creating the exec_queue and set the desired priority upfront during queue creation. To that end: new flag EXEC_QUEUE_FLAG_HIGH_PRIORITY is introduced. The priority field is moved to be with other scheduling properties and is now exec_queue.sched_props.priority. This is no longer set to initial value by the backend, but is now set within __xe_exec_queue_create(). Fixes: `b4eecedc75` ("drm/xe: Fix potential deadlock handling page faults") Signed-off-by: Brian Welty <brian.welty@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com>	2024-01-09 14:11:58 -08:00
Matthew Brost	d3d767396a	drm/xe/uapi: Remove sync binds Remove concept of async vs sync VM bind queues, rather make all binds async. The following bits have dropped from the uAPI: DRM_XE_ENGINE_CLASS_VM_BIND_ASYNC DRM_XE_ENGINE_CLASS_VM_BIND_SYNC DRM_XE_VM_CREATE_FLAG_ASYNC_DEFAULT DRM_XE_VM_BIND_FLAG_ASYNC To implement sync binds the UMD is expected to use the out-fence interface. v2: Send correct version v3: Drop drm_xe_syncs Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: Francois Dugast <francois.dugast@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Acked-by: José Roberto de Souza <jose.souza@intel.com> Acked-by: Mateusz Naklicki <mateusz.naklicki@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2023-12-21 11:46:59 -05:00
Rodrigo Vivi	7e9337c29f	drm/xe/uapi: Ensure every uapi struct has drm_xe prefix To ensure consistency and avoid possible later conflicts, let's add drm_xe prefix to xe_user_extension struct. Cc: Francois Dugast <francois.dugast@intel.com> Suggested-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Acked-by: José Roberto de Souza <jose.souza@intel.com> Acked-by: Mateusz Naklicki <mateusz.naklicki@intel.com>	2023-12-21 11:46:59 -05:00
Matthew Brost	eb9702ad29	drm/xe: Allow num_batch_buffer / num_binds == 0 in IOCTLs The idea being out-syncs can signal indicating all previous operations on the bind queue are complete. An example use case of this would be support for implementing vkQueueWaitIdle easily. All in-syncs are waited on before signaling out-syncs. This is implemented by forming a composite software fence of in-syncs and installing this fence in the out-syncs and exec queue last fence slot. The last fence must be added as a dependency for jobs on user exec queues as it is possible for the last fence to be a composite software fence (unordered, ioctl with zero bb or binds) rather than hardware fence (ordered, previous job on queue). Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2023-12-21 11:46:09 -05:00
Lucas De Marchi	5a92da34dd	drm/xe: Rename info.supports_* to info.has_* Rename supports_mmio_ext and supports_usm to use a has_ prefix so the flags are grouped together. This settles on just one variant for positive info matching ("has_") and one for negative ("skip_"). Also make sure the has_* flags are grouped together in xe_pci.c. Reviewed-by: Koby Elbaz <kelbaz@habana.ai> Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com> Link: https://lore.kernel.org/r/20231205145235.2114761-1-lucas.demarchi@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2023-12-21 11:45:27 -05:00
Rodrigo Vivi	0f1d88f278	drm/xe/uapi: Kill exec_queue_set_property All the properties should be immutable and set upon exec_queue creation using the existent extension. So, let's kill this useless and dangerous uapi. Cc: Francois Dugast <francois.dugast@intel.com> Cc: José Roberto de Souza <jose.souza@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Signed-off-by: Francois Dugast <francois.dugast@intel.com>	2023-12-21 11:45:23 -05:00
Matthew Auld	fcf98d68c0	drm/xe: fix mem_access for early lrc generation We spawn some hw queues during device probe to generate the default LRC for every engine type, however the queue destruction step is typically async. Queue destruction needs to do stuff like GuC context deregister which requires GuC CT, which in turn requires an active mem_access ref. The caller during probe is meant to hold the mem_access token, however due to the async destruction it might have already been dropped if we are unlucky. Similar to how we already handle migrate VMs for which there is no mem_access ref, fix this by keeping the callers token alive, releasing it only when destroying the queue. We can treat a NULL vm as indication that we need to grab our own extra ref. Fixes the following splat sometimes seen during load: [ 1682.899930] WARNING: CPU: 1 PID: 8642 at drivers/gpu/drm/xe/xe_device.c:537 xe_device_assert_mem_access+0x27/0x30 [xe] [ 1682.900209] CPU: 1 PID: 8642 Comm: kworker/u24:97 Tainted: G U W E N 6.6.0-rc3+ #6 [ 1682.900214] Workqueue: submit_wq xe_sched_process_msg_work [xe] [ 1682.900303] RIP: 0010:xe_device_assert_mem_access+0x27/0x30 [xe] [ 1682.900388] Code: 90 90 90 66 0f 1f 00 0f 1f 44 00 00 53 48 89 fb e8 1e 6c 03 00 48 85 c0 74 06 5b c3 cc cc cc cc 8b 83 28 23 00 00 85 c0 75 f0 <0f> 0b 5b c3 cc cc cc cc 90 90 90 90 90 90 90 90 90 90 90 90 90 90 [ 1682.900390] RSP: 0018:ffffc900021cfb68 EFLAGS: 00010246 [ 1682.900394] RAX: 0000000000000000 RBX: ffff8886a96d8000 RCX: 0000000000000000 [ 1682.900396] RDX: 0000000000000001 RSI: ffff8886a6311a00 RDI: ffff8886a96d8000 [ 1682.900398] RBP: ffffc900021cfcc0 R08: 0000000000000001 R09: 0000000000000000 [ 1682.900400] R10: ffffc900021cfcd0 R11: 0000000000000002 R12: 0000000000000004 [ 1682.900402] R13: 0000000000000000 R14: ffff8886a6311990 R15: ffffc900021cfd74 [ 1682.900405] FS: 0000000000000000(0000) GS:ffff888829880000(0000) knlGS:0000000000000000 [ 1682.900407] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 1682.900409] CR2: 000055f70bad3fb0 CR3: 000000025243a004 CR4: 00000000003706e0 [ 1682.900412] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 1682.900413] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 1682.900415] Call Trace: [ 1682.900418] <TASK> [ 1682.900420] ? xe_device_assert_mem_access+0x27/0x30 [xe] [ 1682.900504] ? __warn+0x85/0x170 [ 1682.900510] ? xe_device_assert_mem_access+0x27/0x30 [xe] [ 1682.900596] ? report_bug+0x171/0x1a0 [ 1682.900604] ? handle_bug+0x3c/0x80 [ 1682.900608] ? exc_invalid_op+0x17/0x70 [ 1682.900612] ? asm_exc_invalid_op+0x1a/0x20 [ 1682.900621] ? xe_device_assert_mem_access+0x27/0x30 [xe] [ 1682.900706] ? xe_device_assert_mem_access+0x12/0x30 [xe] [ 1682.900790] guc_ct_send_locked+0xb9/0x1550 [xe] [ 1682.900882] ? lock_acquire+0xca/0x2b0 [ 1682.900885] ? guc_ct_send+0x3c/0x1a0 [xe] [ 1682.900977] ? lock_is_held_type+0x9b/0x110 [ 1682.900984] ? __mutex_lock+0xc0/0xb90 [ 1682.900989] ? __pfx___drm_printfn_info+0x10/0x10 [ 1682.900999] guc_ct_send+0x53/0x1a0 [xe] [ 1682.901090] ? __lock_acquire+0xf22/0x21b0 [ 1682.901097] ? process_one_work+0x1a0/0x500 [ 1682.901109] xe_guc_ct_send+0x19/0x50 [xe] [ 1682.901202] set_min_preemption_timeout+0x75/0xa0 [xe] [ 1682.901294] disable_scheduling_deregister+0x55/0x250 [xe] [ 1682.901383] ? xe_sched_process_msg_work+0x76/0xd0 [xe] [ 1682.901467] ? lock_release+0xc9/0x260 [ 1682.901474] xe_sched_process_msg_work+0x82/0xd0 [xe] [ 1682.901559] process_one_work+0x20a/0x500 v2: Add the splat Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Vinay Belgaumkar <vinay.belgaumkar@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2023-12-21 11:45:07 -05:00
Thomas Hellström	fdb6a05383	drm/xe: Internally change the compute_mode and no_dma_fence mode naming The name "compute_mode" can be confusing since compute uses either this mode or fault_mode to achieve the long-running semantics, and compute_mode can, moving forward, enable fault_mode under the hood to work around hardware limitations. Also the name no_dma_fence_mode really refers to what we elsewhere call long-running mode and the mode contrary to what its name suggests allows dma-fences as in-fences. So in an attempt to be more consistent, rename no_dma_fence_mode -> lr_mode compute_mode -> preempt_fence_mode And adjust flags so that preempt_fence_mode sets XE_VM_FLAG_LR_MODE fault_mode sets XE_VM_FLAG_LR_MODE \| XE_VM_FLAG_FAULT_MODE v2: - Fix a typo in the commit message (Oak Zeng) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Oak Zeng <oak.zeng@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231127123349.23698-1-thomas.hellstrom@linux.intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2023-12-21 11:44:58 -05:00
Francois Dugast	d5dc73dbd1	drm/xe/uapi: Add missing DRM_ prefix in uAPI constants Most constants defined in xe_drm.h use DRM_XE_ as prefix which is helpful to identify the name space. Make this systematic and add this prefix where it was missing. v2: - fix vertical alignment of define values - remove double DRM_ in some variables (José Roberto de Souza) v3: Rebase Signed-off-by: Francois Dugast <francois.dugast@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2023-12-21 11:44:33 -05:00
Priyanka Dandamudi	b8d70702de	drm/xe/xe_exec_queue: Add check for access counter granularity Add conditional check for access counter granularity. This check will return -EINVAL if granularity is beyond 64M which is a hardware limitation. v2: Defined XE_ACC_GRANULARITY_128K 0 XE_ACC_GRANULARITY_2M 1 XE_ACC_GRANULARITY_16M 2 XE_ACC_GRANULARITY_64M 3 as part of uAPI. So, that user can also use it.(Oak) v3: Move uAPI to proper location and give proper documentation.(Brian, Oak) Cc: Oak Zeng <oak.zeng@intel.com> Cc: Janga Rahul Kumar <janga.rahul.kumar@intel.com> Cc: Brian Welty <brian.welty@intel.com> Signed-off-by: Priyanka Dandamudi <priyanka.dandamudi@intel.com> Reviewed-by: Oak Zeng <oak.zeng@intel.com> Reviewed-by: Oak Zeng <oak.zeng@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2023-12-21 11:43:31 -05:00
Matthew Brost	e669f10cd3	drm/xe: Fix VM bind out-sync signaling ordering A case existed where an out-sync of a later VM bind operation could signal before a previous one if the later operation results in a NOP (e.g. a unbind or prefetch to a VA range without any mappings). This breaks the ordering rules, fix this. This patch also lays the groundwork for users to pass in num_binds == 0 and out-syncs. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2023-12-21 11:43:18 -05:00
Matthew Brost	f3e9b1f434	drm/xe: Remove async worker and rework sync binds Async worker is gone. All jobs and memory allocations done in IOCTL to align with dma fencing rules. Async vs. sync now means when do bind operations complete relative to the IOCTL. Async completes when out-syncs signal while sync completes when the IOCTL returns. In-syncs and out-syncs are only allowed in async mode. If memory allocations fail in the job creation step the VM is killed. This is temporary, eventually a proper unwind will be done and VM will be usable. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2023-12-21 11:43:17 -05:00
Ashutosh Dixit	5dc079d1a8	drm/xe/uapi: Use common drm_xe_ext_set_property extension There really is no difference between 'struct drm_xe_ext_vm_set_property' and 'struct drm_xe_ext_exec_queue_set_property', they are extensions which specify a <property, value> pair. Replace the two extensions with a single common 'struct drm_xe_ext_set_property' extension. The rationale is that rather than have each XE module (including future modules) invent their own property/value extensions, all XE modules use a common set_property extension when possible. Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Signed-off-by: Francois Dugast <francois.dugast@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com>	2023-12-21 11:43:17 -05:00
Matthew Brost	bffb257372	drm/xe: Remove XE_EXEC_QUEUE_SET_PROPERTY_COMPUTE_MODE from uAPI Functionality of XE_EXEC_QUEUE_SET_PROPERTY_COMPUTE_MODE deprecated in a previous patch, drop from uAPI. The property is just simply inherented from the VM. v2: - Update commit message (Niranjana) Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2023-12-21 11:43:13 -05:00

1 2

63 Commits