linux

mirror of https://github.com/raspberrypi/linux.git synced 2025-12-11 20:39:55 +00:00

Author	SHA1	Message	Date
Simona Vetter	b615b9c36c	Merge v6.11-rc7 into drm-next Thomas needs `5a498d4d06` ("drm/fbdev-dma: Only install deferred I/O if necessary") in drm-misc, so start the backmerge cascade. Signed-off-by: Simona Vetter <simona.vetter@ffwll.ch>	2024-09-11 09:18:15 +02:00
Tvrtko Ursulin	9d824c7fce	drm/v3d: Disable preemption while updating GPU stats We forgot to disable preemption around the write_seqcount_begin/end() pair while updating GPU stats: [ ] WARNING: CPU: 2 PID: 12 at include/linux/seqlock.h:221 __seqprop_assert.isra.0+0x128/0x150 [v3d] [ ] Workqueue: v3d_bin drm_sched_run_job_work [gpu_sched] <...snip...> [ ] Call trace: [ ] __seqprop_assert.isra.0+0x128/0x150 [v3d] [ ] v3d_job_start_stats.isra.0+0x90/0x218 [v3d] [ ] v3d_bin_job_run+0x23c/0x388 [v3d] [ ] drm_sched_run_job_work+0x520/0x6d0 [gpu_sched] [ ] process_one_work+0x62c/0xb48 [ ] worker_thread+0x468/0x5b0 [ ] kthread+0x1c4/0x1e0 [ ] ret_from_fork+0x10/0x20 Fix it. Cc: Maíra Canal <mcanal@igalia.com> Cc: stable@vger.kernel.org # v6.10+ Fixes: `6abe93b621` ("drm/v3d: Fix race-condition between sysfs/fdinfo and interrupt handler") Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com> Acked-by: Maíra Canal <mcanal@igalia.com> Signed-off-by: Maíra Canal <mcanal@igalia.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240813102505.80512-1-tursulin@igalia.com	2024-08-28 11:36:53 -03:00
Daniel Vetter	4461e9e5c3	Merge v6.11-rc5 into drm-next amdgpu pr conflicts due to patches cherry-picked to -fixes, I might as well catch up with a backmerge and handle them all. Plus both misc and intel maintainers asked for a backmerge anyway. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2024-08-27 14:09:45 +02:00
Maíra Canal	497d370a64	drm/v3d: Fix out-of-bounds read in `v3d_csd_job_run()` When enabling UBSAN on Raspberry Pi 5, we get the following warning: [ 387.894977] UBSAN: array-index-out-of-bounds in drivers/gpu/drm/v3d/v3d_sched.c:320:3 [ 387.903868] index 7 is out of range for type '__u32 [7]' [ 387.909692] CPU: 0 PID: 1207 Comm: kworker/u16:2 Tainted: G WC 6.10.3-v8-16k-numa #151 [ 387.919166] Hardware name: Raspberry Pi 5 Model B Rev 1.0 (DT) [ 387.925961] Workqueue: v3d_csd drm_sched_run_job_work [gpu_sched] [ 387.932525] Call trace: [ 387.935296] dump_backtrace+0x170/0x1b8 [ 387.939403] show_stack+0x20/0x38 [ 387.942907] dump_stack_lvl+0x90/0xd0 [ 387.946785] dump_stack+0x18/0x28 [ 387.950301] __ubsan_handle_out_of_bounds+0x98/0xd0 [ 387.955383] v3d_csd_job_run+0x3a8/0x438 [v3d] [ 387.960707] drm_sched_run_job_work+0x520/0x6d0 [gpu_sched] [ 387.966862] process_one_work+0x62c/0xb48 [ 387.971296] worker_thread+0x468/0x5b0 [ 387.975317] kthread+0x1c4/0x1e0 [ 387.978818] ret_from_fork+0x10/0x20 [ 387.983014] ---[ end trace ]--- This happens because the UAPI provides only seven configuration registers and we are reading the eighth position of this u32 array. Therefore, fix the out-of-bounds read in `v3d_csd_job_run()` by accessing only seven positions on the '__u32 [7]' array. The eighth register exists indeed on V3D 7.1, but it isn't currently used. That being so, let's guarantee that it remains unused and add a note that it could be set in a future patch. Fixes: `0ad5bc1ce4` ("drm/v3d: fix up register addresses for V3D 7.x") Reported-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com> Signed-off-by: Maíra Canal <mcanal@igalia.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240809152001.668314-1-mcanal@igalia.com	2024-08-12 11:14:21 -03:00
Daniel Vetter	91dae758bd	Merge tag 'drm-misc-next-2024-08-01' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-next drm-misc-next for v6.12: UAPI Changes: virtio: - Define DRM capset Cross-subsystem Changes: dma-buf: - heaps: Clean up documentation printk: - Pass description to kmsg_dump() Core Changes: CI: - Update IGT tests - Point upstream repo to GitLab instance modesetting: - Introduce Power Saving Policy property for connectors - Add might_fault() to drm_modeset_lock priming - Add dynamic per-crtc vblank configuration support panic: - Avoid build-time interference with framebuffer console docs: - Document Colorspace property scheduler: - Remove full_recover from drm_sched_start TTM: - Make LRU walk restartable after dropping locks - Allow direct reclaim to allocate local memory Driver Changes: amdgpu: - Support Power Saving Policy connector property ast: - astdp: Support AST2600 with VGA; Clean up HPD bridge: - Silence error message on -EPROBE_DEFER - analogix: Clean up - bridge-connector: Fix double free - lt6505: Disable interrupt when powered off - tc358767: Make default DP port preemphasis configurable gma500: - Update i2c terminology ivpu: - Add MODULE_FIRMWARE() lcdif: - Fix pixel clock loongson: - Use GEM refcount over TTM's mgag200: - Improve BMC handling - Support VBLANK interrupts nouveau: - Refactor and clean up internals - Use GEM refcount over TTM's panel: - Shutdown fixes plus documentation - Refactor several drivers for better code sharing - boe-th101mb31ig002: Support for starry-er88577 MIPI-DSI panel plus DT; Fix porch parameter - edp: Support AOU B116XTN02.3, AUO B116XAN06.1, AOU B116XAT04.1, BOE NV140WUM-N41, BOE NV133WUM-N63, BOE NV116WHM-A4D, CMN N116BCA-EA2, CMN N116BCP-EA2, CSW MNB601LS1-4 - himax-hx8394: Support Microchip AC40T08A MIPI Display panel plus DT - ilitek-ili9806e: Support Densitron DMT028VGHMCMI-1D TFT plus DT - jd9365da: Support Melfas lmfbx101117480 MIPI-DSI panel plus DT; Refactor for code sharing sti: - Fix module owner stm: - Avoid UAF with managed plane and CRTC helpers - Fix module owner - Fix error handling in probe - Depend on COMMON_CLK - ltdc: Fix transparency after disabling plane; Remove unused interrupt tegra: - Call drm_atomic_helper_shutdown() v3d: - Clean up perfmon vkms: - Clean up Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> From: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patchwork.freedesktop.org/patch/msgid/20240801121406.GA102996@linux.fritz.box	2024-08-08 18:58:46 +02:00
Maxime Ripard	a1ff5a7d78	Merge drm/drm-fixes into drm-misc-fixes Let's start the new drm-misc-fixes cycle by bringing in 6.11-rc1. Signed-off-by: Maxime Ripard <mripard@kernel.org>	2024-07-30 09:09:23 +02:00
Thomas Zimmermann	0e8655b4e8	Merge drm/drm-next into drm-misc-next Backmerging to get a late RC of v6.10 before moving into v6.11. Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>	2024-07-29 09:35:54 +02:00
Christian König	83b501c179	drm/scheduler: remove full_recover from drm_sched_start This was basically just another one of amdgpus hacks. The parameter allowed to restart the scheduler without turning fence signaling on again. That this is absolutely not a good idea should be obvious by now since the fences will then just sit there and never signal. While at it cleanup the code a bit. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240722083816.99685-1-christian.koenig@amd.com	2024-07-25 14:05:12 +02:00
Tvrtko Ursulin	32df4abc44	drm/v3d: Fix potential memory leak in the performance extension If fetching of userspace memory fails during the main loop, all drm sync objs looked up until that point will be leaked because of the missing drm_syncobj_put. Fix it by exporting and using a common cleanup helper. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com> Fixes: `bae7cb5d68` ("drm/v3d: Create a CPU job extension for the reset performance query job") Cc: Maíra Canal <mcanal@igalia.com> Cc: Iago Toral Quiroga <itoral@igalia.com> Cc: stable@vger.kernel.org # v6.8+ Signed-off-by: Maíra Canal <mcanal@igalia.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240711135340.84617-4-tursulin@igalia.com (cherry picked from commit `484de39fa5`) Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>	2024-07-18 15:49:28 +02:00
Tvrtko Ursulin	0e50fcc20b	drm/v3d: Fix potential memory leak in the timestamp extension If fetching of userspace memory fails during the main loop, all drm sync objs looked up until that point will be leaked because of the missing drm_syncobj_put. Fix it by exporting and using a common cleanup helper. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com> Fixes: `9ba0ff3e08` ("drm/v3d: Create a CPU job extension for the timestamp query job") Cc: Maíra Canal <mcanal@igalia.com> Cc: Iago Toral Quiroga <itoral@igalia.com> Cc: stable@vger.kernel.org # v6.8+ Reviewed-by: Maíra Canal <mcanal@igalia.com> Signed-off-by: Maíra Canal <mcanal@igalia.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240711135340.84617-3-tursulin@igalia.com (cherry picked from commit `753ce4fea6`) Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>	2024-07-18 15:49:08 +02:00
Maíra Canal	1fe1c66274	drm/v3d: Fix Indirect Dispatch configuration for V3D 7.1.6 and later `args->cfg[4]` is configured in Indirect Dispatch using the number of batches. Currently, for all V3D tech versions, `args->cfg[4]` equals the number of batches subtracted by 1. But, for V3D 7.1.6 and later, we must not subtract 1 from the number of batches. Implement the fix by checking the V3D tech version and revision. Fixes several `dEQP-VK.synchronization*` CTS tests related to Indirect Dispatch. Fixes: `18b8413b25` ("drm/v3d: Create a CPU job extension for a indirect CSD job") Signed-off-by: Maíra Canal <mcanal@igalia.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240714145243.1223131-2-mcanal@igalia.com	2024-07-15 12:49:52 -03:00
Tvrtko Ursulin	1be825c5c0	drm/v3d: Do not use intermediate storage when copying performance query results Removing the intermediate buffer removes the last use of the V3D_MAX_COUNTERS define, which will enable further driver cleanup. While at it pull the 32 vs 64 bit copying decision outside the loop in order to reduce the number of conditional instructions. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Maíra Canal <mcanal@igalia.com> Signed-off-by: Maíra Canal <mcanal@igalia.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240711135340.84617-9-tursulin@igalia.com	2024-07-13 11:00:32 -03:00
Tvrtko Ursulin	c9d6630f7c	drm/v3d: Size the kperfmon_ids array at runtime Instead of statically reserving pessimistic space for the kperfmon_ids array, make the userspace extension code allocate the exactly required amount of space. Apart from saving some memory at runtime, this also removes the need for the V3D_MAX_PERFMONS macro whose removal will benefit further driver cleanup. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com> Reviewed-by: Maíra Canal <mcanal@igalia.com> Signed-off-by: Maíra Canal <mcanal@igalia.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240711135340.84617-8-tursulin@igalia.com	2024-07-13 11:00:31 -03:00
Tvrtko Ursulin	484de39fa5	drm/v3d: Fix potential memory leak in the performance extension If fetching of userspace memory fails during the main loop, all drm sync objs looked up until that point will be leaked because of the missing drm_syncobj_put. Fix it by exporting and using a common cleanup helper. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com> Fixes: `bae7cb5d68` ("drm/v3d: Create a CPU job extension for the reset performance query job") Cc: Maíra Canal <mcanal@igalia.com> Cc: Iago Toral Quiroga <itoral@igalia.com> Cc: stable@vger.kernel.org # v6.8+ Signed-off-by: Maíra Canal <mcanal@igalia.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240711135340.84617-4-tursulin@igalia.com	2024-07-13 11:00:31 -03:00
Tvrtko Ursulin	753ce4fea6	drm/v3d: Fix potential memory leak in the timestamp extension If fetching of userspace memory fails during the main loop, all drm sync objs looked up until that point will be leaked because of the missing drm_syncobj_put. Fix it by exporting and using a common cleanup helper. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com> Fixes: `9ba0ff3e08` ("drm/v3d: Create a CPU job extension for the timestamp query job") Cc: Maíra Canal <mcanal@igalia.com> Cc: Iago Toral Quiroga <itoral@igalia.com> Cc: stable@vger.kernel.org # v6.8+ Reviewed-by: Maíra Canal <mcanal@igalia.com> Signed-off-by: Maíra Canal <mcanal@igalia.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240711135340.84617-3-tursulin@igalia.com	2024-07-13 11:00:31 -03:00
Maíra Canal	f5b798bdc9	drm/v3d: Use V3D_MAX_COUNTERS instead of V3D_PERFCNT_NUM V3D_PERFCNT_NUM represents the maximum number of performance counters for V3D 4.2, but not for V3D 7.1. This means that, if we use V3D_PERFCNT_NUM, we might go out-of-bounds on V3D 7.1. Therefore, use the number of performance counters on V3D 7.1 as the maximum number of counters. This will allow us to create arrays on the stack with reasonable size. Note that userspace must use the value provided by DRM_V3D_PARAM_MAX_PERF_COUNTERS. Signed-off-by: Maíra Canal <mcanal@igalia.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240512222655.2792754-6-mcanal@igalia.com	2024-05-20 16:38:03 -03:00
Maíra Canal	6abe93b621	drm/v3d: Fix race-condition between sysfs/fdinfo and interrupt handler In V3D, the conclusion of a job is indicated by a IRQ. When a job finishes, then we update the local and the global GPU stats of that queue. But, while the GPU stats are being updated, a user might be reading the stats from sysfs or fdinfo. For example, on `gpu_stats_show()`, we could think about a scenario where `v3d->queue[queue].start_ns != 0`, then an interrupt happens, we update the value of `v3d->queue[queue].start_ns` to 0, we come back to `gpu_stats_show()` to calculate `active_runtime` and now, `active_runtime = timestamp`. In this simple example, the user would see a spike in the queue usage, that didn't match reality. In order to address this issue properly, use a seqcount to protect read and write sections of the code. Fixes: `09a93cc4f7` ("drm/v3d: Implement show_fdinfo() callback for GPU usage stats") Reported-by: Tvrtko Ursulin <tursulin@igalia.com> Signed-off-by: Maíra Canal <mcanal@igalia.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240420213632.339941-7-mcanal@igalia.com	2024-04-23 19:32:49 -03:00
Maíra Canal	da483d079b	drm/v3d: Create function to update a set of GPU stats Given a set of GPU stats, that is, a `struct v3d_stats` related to a queue in a given context, create a function that can update this set of GPU stats. Signed-off-by: Maíra Canal <mcanal@igalia.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com> Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240420213632.339941-5-mcanal@igalia.com	2024-04-23 19:32:47 -03:00
Maíra Canal	b136b1953f	drm/v3d: Create a struct to store the GPU stats This will make it easier to instantiate the GPU stats variables and it will create a structure where we can store all the variables that refer to GPU stats. Note that, when we created the struct `v3d_stats`, we renamed `jobs_sent` to `jobs_completed`. This better express the semantics of the variable, as we are only accounting jobs that have been completed. Signed-off-by: Maíra Canal <mcanal@igalia.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com> Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240420213632.339941-4-mcanal@igalia.com	2024-04-23 19:32:46 -03:00
Maíra Canal	52ce97765c	drm/v3d: Create two functions to update all GPU stats variables Currently, we manually perform all operations to update the GPU stats variables. Apart from the code repetition, this is very prone to errors, as we can see on commit `35f4f8c9fc` ("drm/v3d: Don't increment `enabled_ns` twice"). Therefore, create two functions to manage updating all GPU stats variables. Now, the jobs only need to call for `v3d_job_update_stats()` when the job is done and `v3d_job_start_stats()` when starting the job. Co-developed-by: Tvrtko Ursulin <tursulin@igalia.com> Signed-off-by: Tvrtko Ursulin <tursulin@igalia.com> Signed-off-by: Maíra Canal <mcanal@igalia.com> Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240420213632.339941-3-mcanal@igalia.com	2024-04-23 19:32:45 -03:00
Maíra Canal	209e8d2695	drm/v3d: Create a CPU job extension for the copy performance query job A CPU job is a type of job that performs operations that requires CPU intervention. A copy performance query job is a job that copy the complete or partial result of a query to a buffer. In order to copy the result of a performance query to a buffer, we need to get the values from the performance monitors. So, create a user extension for the CPU job that enables the creation of a copy performance query job. This user extension will allow the creation of a CPU job that copy the results of a performance query to a BO with the possibility to indicate the availability with a availability bit. Signed-off-by: Maíra Canal <mcanal@igalia.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231130164420.932823-19-mcanal@igalia.com	2023-12-01 09:47:36 -03:00
Maíra Canal	bae7cb5d68	drm/v3d: Create a CPU job extension for the reset performance query job A CPU job is a type of job that performs operations that requires CPU intervention. A reset performance query job is a job that resets the performance queries by resetting the values of the perfmons. Moreover, we also reset the syncobjs related to the availability of the query. So, create a user extension for the CPU job that enables the creation of a reset performance job. This user extension will allow the creation of a CPU job that resets the perfmons values and resets the availability syncobj. Signed-off-by: Maíra Canal <mcanal@igalia.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231130164420.932823-18-mcanal@igalia.com	2023-12-01 09:47:35 -03:00
Maíra Canal	6745f3e44a	drm/v3d: Create a CPU job extension to copy timestamp query to a buffer A CPU job is a type of job that performs operations that requires CPU intervention. A copy timestamp query job is a job that copy the complete or partial result of a query to a buffer. As V3D doesn't provide any mechanism to obtain a timestamp from the GPU, it is a job that needs CPU intervention. So, create a user extension for the CPU job that enables the creation of a copy timestamp query job. This user extension will allow the creation of a CPU job that copy the results of a timestamp query to a BO with the possibility to indicate the timestamp availability with a availability bit. Signed-off-by: Maíra Canal <mcanal@igalia.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231130164420.932823-17-mcanal@igalia.com	2023-12-01 09:47:31 -03:00
Maíra Canal	34a101e642	drm/v3d: Create a CPU job extension for the reset timestamp job A CPU job is a type of job that performs operations that requires CPU intervention. A reset timestamp job is a job that resets the timestamp queries based on the value offset of the first query. As V3D doesn't provide any mechanism to obtain a timestamp from the GPU, it is a job that needs CPU intervention. So, create a user extension for the CPU job that enables the creation of a reset timestamp job. This user extension will allow the creation of a CPU job that resets the timestamp value in the timestamp BO and resets the availability syncobj. Signed-off-by: Maíra Canal <mcanal@igalia.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231130164420.932823-16-mcanal@igalia.com	2023-12-01 09:42:47 -03:00
Maíra Canal	9ba0ff3e08	drm/v3d: Create a CPU job extension for the timestamp query job A CPU job is a type of job that performs operations that requires CPU intervention. A timestamp query job is a job that calculates the query timestamp and updates the query availability by signaling a syncobj. As V3D doesn't provide any mechanism to obtain a timestamp from the GPU, it is a job that needs CPU intervention. So, create a user extension for the CPU job that enables the creation of a timestamp query job. This user extension will allow the creation of a CPU job that performs the timestamp query calculation and updates the timestamp BO with the proper value. Signed-off-by: Maíra Canal <mcanal@igalia.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231130164420.932823-15-mcanal@igalia.com	2023-12-01 09:41:37 -03:00
Maíra Canal	18b8413b25	drm/v3d: Create a CPU job extension for a indirect CSD job A CPU job is a type of job that performs operations that requires CPU intervention. An indirect CSD job is a job that, when executed in the queue, will map the indirect buffer, read the dispatch parameters, and submit a regular dispatch. Therefore, it is a job that needs CPU intervention. So, create a user extension for the CPU job that enables the creation of an indirect CSD. This user extension will allow the creation of a CSD job linked to a CPU job. The CPU job will wait for the indirect CSD job dependencies and, once they are signaled, it will update the CSD job parameters. Co-developed-by: Melissa Wen <mwen@igalia.com> Signed-off-by: Melissa Wen <mwen@igalia.com> Signed-off-by: Maíra Canal <mcanal@igalia.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231130164420.932823-14-mcanal@igalia.com	2023-12-01 09:40:15 -03:00
Maíra Canal	1fe0879efc	drm/v3d: Create tracepoints to track the CPU job Create tracepoints to track the three major events of a CPU job lifetime: 1. Submission of a `v3d_submit_cpu` IOCTL 2. Beginning of the execution of a CPU job 3. Ending of the execution of a CPU job Signed-off-by: Maíra Canal <mcanal@igalia.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231130164420.932823-11-mcanal@igalia.com	2023-12-01 09:37:48 -03:00
Melissa Wen	aafc1a2bea	drm/v3d: Add a CPU job submission Create a new type of job, a CPU job. A CPU job is a type of job that performs operations that requires CPU intervention. The overall idea is to use user extensions to enable different types of CPU job, allowing the CPU job to perform different operations according to the type of user extension. The user extension ID identify the type of CPU job that must be dealt. Having a CPU job is interesting for synchronization purposes as a CPU job has a queue like any other V3D job and can be synchronized by the multisync extension. Signed-off-by: Melissa Wen <mwen@igalia.com> Co-developed-by: Maíra Canal <mcanal@igalia.com> Signed-off-by: Maíra Canal <mcanal@igalia.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231130164420.932823-9-mcanal@igalia.com	2023-12-01 09:34:19 -03:00
Maíra Canal	509433d814	drm/v3d: Expose the total GPU usage stats on sysfs The previous patch exposed the accumulated amount of active time per client for each V3D queue. But this doesn't provide a global notion of the GPU usage. Therefore, provide the accumulated amount of active time for each V3D queue (BIN, RENDER, CSD, TFU and CACHE_CLEAN), considering all the jobs submitted to the queue, independent of the client. This data is exposed through the sysfs interface, so that if the interface is queried at two different points of time the usage percentage of each of the queues can be calculated. Co-developed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com> Signed-off-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com> Signed-off-by: Maíra Canal <mcanal@igalia.com> Acked-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com> Reviewed-by: Melissa Wen <mwen@igalia.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230905213416.1290219-3-mcanal@igalia.com	2023-11-06 10:09:29 -03:00
Maíra Canal	09a93cc4f7	drm/v3d: Implement show_fdinfo() callback for GPU usage stats This patch exposes the accumulated amount of active time per client through the fdinfo infrastructure. The amount of active time is exposed for each V3D queue: BIN, RENDER, CSD, TFU and CACHE_CLEAN. In order to calculate the amount of active time per client, a CPU clock is used through the function local_clock(). The point where the jobs has started is marked and is finally compared with the time that the job had finished. Moreover, the number of jobs submitted to each queue is also exposed on fdinfo through the identifier "v3d-jobs-<queue>". Co-developed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com> Signed-off-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com> Signed-off-by: Maíra Canal <mcanal@igalia.com> Acked-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com> Reviewed-by: Melissa Wen <mwen@igalia.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230905213416.1290219-3-mcanal@igalia.com	2023-11-06 10:09:23 -03:00
Iago Toral Quiroga	0ad5bc1ce4	drm/v3d: fix up register addresses for V3D 7.x This patch updates a number of register addresses that have been changed in Raspberry Pi 5 (V3D 7.1) and updates the code to use the corresponding registers and addresses based on the actual V3D version. Signed-off-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Maíra Canal <mcanal@igalia.com> Signed-off-by: Maíra Canal <mcanal@igalia.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231031073859.25298-3-itoral@igalia.com	2023-11-02 08:54:39 -03:00
Matthew Brost	a6149f0393	drm/sched: Convert drm scheduler to use a work queue rather than kthread In Xe, the new Intel GPU driver, a choice has made to have a 1 to 1 mapping between a drm_gpu_scheduler and drm_sched_entity. At first this seems a bit odd but let us explain the reasoning below. 1. In Xe the submission order from multiple drm_sched_entity is not guaranteed to be the same completion even if targeting the same hardware engine. This is because in Xe we have a firmware scheduler, the GuC, which allowed to reorder, timeslice, and preempt submissions. If a using shared drm_gpu_scheduler across multiple drm_sched_entity, the TDR falls apart as the TDR expects submission order == completion order. Using a dedicated drm_gpu_scheduler per drm_sched_entity solve this problem. 2. In Xe submissions are done via programming a ring buffer (circular buffer), a drm_gpu_scheduler provides a limit on number of jobs, if the limit of number jobs is set to RING_SIZE / MAX_SIZE_PER_JOB we get flow control on the ring for free. A problem with this design is currently a drm_gpu_scheduler uses a kthread for submission / job cleanup. This doesn't scale if a large number of drm_gpu_scheduler are used. To work around the scaling issue, use a worker rather than kthread for submission / job cleanup. v2: - (Rob Clark) Fix msm build - Pass in run work queue v3: - (Boris) don't have loop in worker v4: - (Tvrtko) break out submit ready, stop, start helpers into own patch v5: - (Boris) default to ordered work queue v6: - (Luben / checkpatch) fix alignment in msm_ringbuffer.c - (Luben) s/drm_sched_submit_queue/drm_sched_wqueue_enqueue - (Luben) Update comment for drm_sched_wqueue_enqueue - (Luben) Positive check for submit_wq in drm_sched_init - (Luben) s/alloc_submit_wq/own_submit_wq v7: - (Luben) s/drm_sched_wqueue_enqueue/drm_sched_run_job_queue v8: - (Luben) Adjust var names / comments Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Luben Tuikov <luben.tuikov@amd.com> Link: https://lore.kernel.org/r/20231031032439.1558703-3-matthew.brost@intel.com Signed-off-by: Luben Tuikov <ltuikov89@gmail.com>	2023-11-01 17:29:21 -04:00
Luben Tuikov	56e449603f	drm/sched: Convert the GPU scheduler to variable number of run-queues The GPU scheduler has now a variable number of run-queues, which are set up at drm_sched_init() time. This way, each driver announces how many run-queues it requires (supports) per each GPU scheduler it creates. Note, that run-queues correspond to scheduler "priorities", thus if the number of run-queues is set to 1 at drm_sched_init(), then that scheduler supports a single run-queue, i.e. single "priority". If a driver further sets a single entity per run-queue, then this creates a 1-to-1 correspondence between a scheduler and a scheduled entity. Cc: Lucas Stach <l.stach@pengutronix.de> Cc: Russell King <linux+etnaviv@armlinux.org.uk> Cc: Qiang Yu <yuq825@gmail.com> Cc: Rob Clark <robdclark@gmail.com> Cc: Abhinav Kumar <quic_abhinavk@quicinc.com> Cc: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Cc: Danilo Krummrich <dakr@redhat.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Boris Brezillon <boris.brezillon@collabora.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Emma Anholt <emma@anholt.net> Cc: etnaviv@lists.freedesktop.org Cc: lima@lists.freedesktop.org Cc: linux-arm-msm@vger.kernel.org Cc: freedreno@lists.freedesktop.org Cc: nouveau@lists.freedesktop.org Cc: dri-devel@lists.freedesktop.org Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Link: https://lore.kernel.org/r/20231023032251.164775-1-luben.tuikov@amd.com	2023-10-26 12:03:47 -04:00
Melissa Wen	7e302637ba	drm/v3d: centralize error handling when init scheduler fails Remove redundant error message (since now it is very similar to what we do in drm_sched_init) and centralize all error handling in a unique place, as we follow the same steps in any case of failure. Signed-off-by: Melissa Wen <mwen@igalia.com> Acked-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: Melissa Wen <melissa.srw@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220228181647.3794298-1-mwen@igalia.com	2022-03-01 22:37:13 -01:00
Jiawei Gu	8ab62eda17	drm/sched: Add device pointer to drm_gpu_scheduler Add device pointer so scheduler's printing can use DRM_DEV_ERROR() instead, which makes life easier under multiple GPU scenario. v2: amend all calls of drm_sched_init() v3: fill dev pointer for all drm_sched_init() calls Signed-off-by: Jiawei Gu <Jiawei.Gu@amd.com> Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220221095705.5290-1-Jiawei.Gu@amd.com	2022-02-23 10:04:14 +01:00
Daniel Vetter	da3208e863	drm/v3d: Use scheduler dependency handling With the prep work out of the way this isn't tricky anymore. Aside: The chaining of the various jobs is a bit awkward, with the possibility of failure in bad places. I think with the drm_sched_job_init/arm split and maybe preloading the job->dependencies xarray this should be fixable. v2: Rebase over renamed function names for adding dependencies. Reviewed-by: Melissa Wen <mwen@igalia.com> (v1) Acked-by: Emma Anholt <emma@anholt.net> Cc: Melissa Wen <melissa.srw@gmail.com> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Cc: Emma Anholt <emma@anholt.net> Link: https://patchwork.freedesktop.org/patch/msgid/20210805104705.862416-11-daniel.vetter@ffwll.ch	2021-08-30 10:58:47 +02:00
Daniel Vetter	916044fac8	drm/v3d: Move drm_sched_job_init to v3d_job_init Prep work for using the scheduler dependency handling. We need to call drm_sched_job_init earlier so we can use the new drm_sched_job_await* functions for dependency handling here. v2: Slightly better commit message and rebase to include the drm_sched_job_arm() call (Emma). v3: Cleanup jobs under construction correctly (Emma) v4: Rebase over perfmon patch Reviewed-by: Melissa Wen <mwen@igalia.com> (v3) Acked-by: Emma Anholt <emma@anholt.net> Cc: Melissa Wen <melissa.srw@gmail.com> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Cc: Emma Anholt <emma@anholt.net> Link: https://patchwork.freedesktop.org/patch/msgid/20210805104705.862416-10-daniel.vetter@ffwll.ch	2021-08-30 10:58:40 +02:00
Juan A. Suarez Romero	26a4dc29b7	drm/v3d: Expose performance counters to userspace The V3D engine has several hardware performance counters that can be of interest for userspace performance analysis tools. This exposes new ioctls to create and destroy performance monitor objects, as well as to query the counter values. Each created performance monitor object has an ID that can be attached to CL/CSD submissions, so the driver enables the requested counters when the job is submitted, and updates the performance monitor values when the job is done. It is up to the user to ensure all the jobs have been finished before getting the performance monitor values. It is also up to the user to properly synchronize BCL jobs when submitting jobs with different performance monitors attached. Cc: Daniel Vetter <daniel@ffwll.ch> Cc: David Airlie <airlied@linux.ie> Cc: Emma Anholt <emma@anholt.net> To: dri-devel@lists.freedesktop.org Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Acked-by: Melissa Wen <mwen@igalia.com> Signed-off-by: Melissa Wen <melissa.srw@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210608111541.461991-1-jasuarez@igalia.com	2021-07-21 00:19:59 +01:00
Boris Brezillon	78efe21b6f	drm/sched: Allow using a dedicated workqueue for the timeout/fault tdr Mali Midgard/Bifrost GPUs have 3 hardware queues but only a global GPU reset. This leads to extra complexity when we need to synchronize timeout works with the reset work. One solution to address that is to have an ordered workqueue at the driver level that will be used by the different schedulers to queue their timeout work. Thanks to the serialization provided by the ordered workqueue we are guaranteed that timeout handlers are executed sequentially, and can thus easily reset the GPU from the timeout handler without extra synchronization. v5: * Add a new paragraph to the timedout_job() method v3: * New patch v4: * Actually use the timeout_wq to queue the timeout work Suggested-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Steven Price <steven.price@arm.com> Reviewed-by: Lucas Stach <l.stach@pengutronix.de> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Acked-by: Christian König <christian.koenig@amd.com> Cc: Qiang Yu <yuq825@gmail.com> Cc: Emma Anholt <emma@anholt.net> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: "Christian König" <christian.koenig@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210630062751.2832545-3-boris.brezillon@collabora.com	2021-07-01 08:53:25 +02:00
Christian König	f2f12eb9c3	drm/scheduler: provide scheduler score externally Allow multiple schedulers to share the load balancing score. This is useful when one engine has different hw rings. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-and-Tested-by: Leo Liu <leo.liu@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210204144405.2737-1-christian.koenig@amd.com	2021-02-05 10:47:11 +01:00
Christian König	576a08e008	drm/v3d/v3d_sched: fix scheduler callbacks return status Looks like this was not correctly adjusted. Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Fixes: `a6a1f036c7` ("drm/scheduler: Job timeout handler returns status (v3)") Link: https://patchwork.freedesktop.org/patch/msgid/20210201091159.177853-1-christian.koenig@amd.com	2021-02-02 11:10:14 +01:00
Luben Tuikov	a6a1f036c7	drm/scheduler: Job timeout handler returns status (v3) This patch does not change current behaviour. The driver's job timeout handler now returns status indicating back to the DRM layer whether the device (GPU) is no longer available, such as after it's been unplugged, or whether all is normal, i.e. current behaviour. All drivers which make use of the drm_sched_backend_ops' .timedout_job() callback have been accordingly renamed and return the would've-been default value of DRM_GPU_SCHED_STAT_NOMINAL to restart the task's timeout timer--this is the old behaviour, and is preserved by this patch. v2: Use enum as the status of a driver's job timeout callback method. v3: Return scheduler/device information, rather than task information. Cc: Alexander Deucher <Alexander.Deucher@amd.com> Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Lucas Stach <l.stach@pengutronix.de> Cc: Russell King <linux+etnaviv@armlinux.org.uk> Cc: Christian Gmeiner <christian.gmeiner@gmail.com> Cc: Qiang Yu <yuq825@gmail.com> Cc: Rob Herring <robh@kernel.org> Cc: Tomeu Vizoso <tomeu.vizoso@collabora.com> Cc: Steven Price <steven.price@arm.com> Cc: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Cc: Eric Anholt <eric@anholt.net> Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Acked-by: Christian König <christian.koenig@amd.com> Acked-by: Steven Price <steven.price@arm.com> Signed-off-by: Christian König <christian.koenig@amd.com> Link: https://patchwork.freedesktop.org/patch/415095/	2021-01-29 11:30:22 +01:00
Lee Jones	d49c4b2c07	drm/v3d/v3d_sched: Demote non-conformant kernel-doc header Fixes the following W=1 kernel build warning(s): drivers/gpu/drm/v3d/v3d_sched.c:75: warning: Function parameter or member 'sched_job' not described in 'v3d_job_dependency' drivers/gpu/drm/v3d/v3d_sched.c:75: warning: Function parameter or member 's_entity' not described in 'v3d_job_dependency' Cc: Eric Anholt <eric@anholt.net> Cc: David Airlie <airlied@linux.ie> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: Sumit Semwal <sumit.semwal@linaro.org> Cc: "Christian König" <christian.koenig@amd.com> Cc: dri-devel@lists.freedesktop.org Cc: linux-media@vger.kernel.org Cc: linaro-mm-sig@lists.linaro.org Signed-off-by: Lee Jones <lee.jones@linaro.org> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20201116174112.1833368-37-lee.jones@linaro.org	2020-11-18 11:51:27 +01:00
Mauro Carvalho Chehab	e9d2871f69	drm: fix some kernel-doc markups Some identifiers have different names between their prototypes and the kernel-doc markup. Others need to be fixed, as kernel-doc markups should use this format: identifier - description Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Acked-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/12d4ca26f6843618200529ce5445063734d38c04.1605521731.git.mchehab+huawei@kernel.org	2020-11-16 20:48:20 +01:00
Daniel Vetter	bc662528e2	drm/v3d: Delete v3d_dev->dev We already have it in v3d_dev->drm.dev with zero additional pointer chasing. Personally I don't like duplicated pointers like this because: - reviewers need to check whether the pointer is for the same or different objects if there's multiple - compilers have an easier time too But also a bit of a bikeshed, so feel free to ignore. Acked-by: Eric Anholt <eric@anholt.net> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Cc: Eric Anholt <eric@anholt.net> Link: https://patchwork.freedesktop.org/patch/msgid/20200415074034.175360-10-daniel.vetter@ffwll.ch	2020-04-28 15:15:52 +02:00
Christian König	5918045c4e	drm/scheduler: rework job destruction We now destroy finished jobs from the worker thread to make sure that we never destroy a job currently in timeout processing. By this we avoid holding lock around ring mirror list in drm_sched_stop which should solve a deadlock reported by a user. v2: Remove unused variable. v4: Move guilty job free into sched code. v5: Move sched->hw_rq_count to drm_sched_start to account for counter decrement in drm_sched_stop even when we don't call resubmit jobs if guilty job did signal. v6: remove unused variable Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109692 Acked-by: Chunming Zhou <david1.zhou@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/1555599624-12285-3-git-send-email-andrey.grodzovsky@amd.com	2019-05-02 15:45:48 -05:00
Eric Anholt	dffa9b7a78	drm/v3d: Add missing implicit synchronization. It is the expectation of existing userspace (X11 + Mesa, in particular) that jobs submitted to the kernel against a shared BO will get implicitly synchronized by their submission order. If we want to allow clever userspace to disable implicit synchronization, we should do that under its own submit flag (as amdgpu and lima do). Note that we currently only implicitly sync for the rendering pass, not binning -- if you texture-from-pixmap in the binning vertex shader (vertex coordinate generation), you'll miss out on synchronization. Fixes flickering when multiple clients are running in parallel, particularly GL apps and compositors. v2: Fix a missing refcount on the CSD done fence for L2 cleaning. Signed-off-by: Eric Anholt <eric@anholt.net> Link: https://patchwork.freedesktop.org/patch/msgid/20190416225856.20264-6-eric@anholt.net Acked-by: Rob Clark <robdclark@gmail.com>	2019-04-18 09:54:16 -07:00
Eric Anholt	d223f98f02	drm/v3d: Add support for compute shader dispatch. The compute shader dispatch interface is pretty simple -- just pass in the regs that userspace has passed us, with no CLs to run. However, with no CL to run it means that we need to do manual cache flushing of the L2 after the HW execution completes (for SSBO, atomic, and image_load_store writes that are the output of compute shaders). This doesn't yet expose the L2 cache's ability to have a region of the address space not write back to memory (which could be used for shared_var storage). So far, the Mesa side has been tested on V3D v4.2 simpenrose (passing the ES31 tests), and on the kernel side on 7278 (failing atomic compswap tests in a way that doesn't reproduce on simpenrose). v2: Fix excessive allocation for the clean_job (reported by Dan Carpenter). Keep refs on jobs until clean_job is finished, to avoid spurious MMU errors if the output BOs are freed by userspace before L2 cleaning is finished. Signed-off-by: Eric Anholt <eric@anholt.net> Link: https://patchwork.freedesktop.org/patch/msgid/20190416225856.20264-4-eric@anholt.net Acked-by: Rob Clark <robdclark@gmail.com>	2019-04-18 09:54:10 -07:00
Eric Anholt	a783a09ee7	drm/v3d: Refactor job management. The CL submission had two jobs embedded in an exec struct. When I added TFU support, I had to replicate some of the exec stuff and some of the job stuff. As I went to add CSD, it became clear that actually what was in exec should just be in the two CL jobs, and it would let us share a lot more code between the 4 queues. v2: Fix missing error path in TFU ioctl's bo[] allocation. Signed-off-by: Eric Anholt <eric@anholt.net> Link: https://patchwork.freedesktop.org/patch/msgid/20190416225856.20264-3-eric@anholt.net Acked-by: Rob Clark <robdclark@gmail.com>	2019-04-18 09:54:07 -07:00
Eric Anholt	3f0b646e1a	drm/v3d: Rename the fence signaled from IRQs to "irq_fence". We have another thing called the "done fence" that tracks when the scheduler considers the job done, and having the shared name was confusing. Signed-off-by: Eric Anholt <eric@anholt.net> Link: https://patchwork.freedesktop.org/patch/msgid/20190313235211.28995-2-eric@anholt.net Reviewed-by: Dave Emett <david.emett@broadcom.com>	2019-04-01 10:44:34 -07:00

62 Commits