To ensure that we don't unintentionally perform any unclocked and/or
unpowered R/W operation on GPU registers, before turning off clocks and
regulators we must make sure that no GPU, JOB or MMU ISR execution is
pending: doing that requires adding a mechanism to synchronize the
interrupts on suspend.
Add functions panfrost_{gpu,job,mmu}_suspend_irq() which perform
interrupt masking and ISR execution synchronization, and then call
those in the panfrost_device_runtime_suspend() handler in the exact
sequence of job (may require mmu!) -> mmu -> gpu.
As a side note, the JOB and MMU suspend_irq functions needed some special
treatment: as their interrupt handlers will unmask interrupts, it was
necessary to add an `is_suspended` bitmap which is used to address the
possible corner case of unintentional IRQ unmasking because of ISR
execution after a call to synchronize_irq().
At resume, clear each is_suspended bit in the reset path of JOB/MMU
to allow unmasking the interrupts.
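A rough sketch of the JOB variant (bit, register and field names are
simplified and may not match the driver exactly):

void panfrost_job_suspend_irq(struct panfrost_device *pfdev)
{
	set_bit(PANFROST_COMP_BIT_JOB, pfdev->is_suspended);

	/* Mask all JOB interrupts... */
	job_write(pfdev, JOB_INT_MASK, 0);

	/* ...and wait for any in-flight ISR to finish. The ISR itself
	 * checks is_suspended before re-unmasking, which closes the race
	 * where an ISR running after synchronize_irq() would re-enable
	 * the IRQ we just masked. */
	synchronize_irq(pfdev->js->irq);
}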
Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
Reviewed-by: Steven Price <steven.price@arm.com>
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231204114215.54575-4-angelogioacchino.delregno@collabora.com
Currently, job flow control is implemented simply by limiting the number
of jobs in flight. Therefore, a scheduler is initialized with a credit
limit that corresponds to the number of jobs which can be sent to the
hardware.
This implies that for each job, drivers need to account for the maximum
job size possible in order to not overflow the ring buffer.
However, there are drivers, such as Nouveau, where the job size has a
rather large range. For such drivers it can easily happen that job
submissions not even filling the ring by 1% can block subsequent
submissions, which, in the worst case, can lead to the ring running dry.
In order to overcome this issue, allow for tracking the actual job size
instead of the number of jobs. Therefore, add a field to track a job's
credit count, which represents the number of credits a job contributes
to the scheduler's credit limit.
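A hypothetical driver-side sketch of the idea (RING_UNIT, my_job and the
exact drm_sched_job_init() parameters are assumptions and may differ
between kernel versions):

static int my_submit_job(struct my_job *job, struct drm_sched_entity *entity,
			 void *owner)
{
	/* Express the job's size in ring-space units, so a few huge jobs
	 * can't starve many small ones behind an artificial job-count limit. */
	u32 credits = DIV_ROUND_UP(job->cmd_bytes, RING_UNIT);

	return drm_sched_job_init(&job->base, entity, credits, owner);
}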
Signed-off-by: Danilo Krummrich <dakr@redhat.com>
Reviewed-by: Luben Tuikov <ltuikov89@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231110001638.71750-1-dakr@redhat.com
In Xe, the new Intel GPU driver, a choice was made to have a 1 to 1
mapping between a drm_gpu_scheduler and drm_sched_entity. At first this
seems a bit odd but let us explain the reasoning below.
1. In Xe the submission order from multiple drm_sched_entity is not
guaranteed to match the completion order, even when targeting the same
hardware engine. This is because Xe has a firmware scheduler, the GuC,
which is allowed to reorder, timeslice, and preempt submissions. If a
shared drm_gpu_scheduler is used across multiple drm_sched_entity, the
TDR falls apart because the TDR expects submission order == completion
order. Using a dedicated drm_gpu_scheduler per drm_sched_entity solves
this problem.
2. In Xe submissions are done by programming a ring buffer (circular
buffer), and a drm_gpu_scheduler provides a limit on the number of jobs
in flight; if that limit is set to RING_SIZE / MAX_SIZE_PER_JOB, we get
flow control on the ring for free.
A problem with this design is that currently a drm_gpu_scheduler uses a
kthread for submission / job cleanup. This doesn't scale if a large
number of drm_gpu_schedulers are used. To work around the scaling issue,
use a worker rather than a kthread for submission / job cleanup.
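A minimal sketch of the pattern, with illustrative names rather than the
actual drm_gpu_scheduler internals:

struct my_sched {
	struct workqueue_struct *submit_wq;
	struct work_struct	 run_job_work;
};

static void my_run_job_work(struct work_struct *w)
{
	struct my_sched *s = container_of(w, struct my_sched, run_job_work);

	/* Pick one ready job, hand it to the hardware, then requeue this
	 * work if more jobs are pending (no loop inside the worker). */
}

static int my_sched_init(struct my_sched *s)
{
	/* Ordered queue: at most one submission worker runs at a time. */
	s->submit_wq = alloc_ordered_workqueue("my-sched", 0);
	if (!s->submit_wq)
		return -ENOMEM;

	INIT_WORK(&s->run_job_work, my_run_job_work);
	return 0;
}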
v2:
- (Rob Clark) Fix msm build
- Pass in run work queue
v3:
- (Boris) don't have loop in worker
v4:
- (Tvrtko) break out submit ready, stop, start helpers into own patch
v5:
- (Boris) default to ordered work queue
v6:
- (Luben / checkpatch) fix alignment in msm_ringbuffer.c
- (Luben) s/drm_sched_submit_queue/drm_sched_wqueue_enqueue
- (Luben) Update comment for drm_sched_wqueue_enqueue
- (Luben) Positive check for submit_wq in drm_sched_init
- (Luben) s/alloc_submit_wq/own_submit_wq
v7:
- (Luben) s/drm_sched_wqueue_enqueue/drm_sched_run_job_queue
v8:
- (Luben) Adjust var names / comments
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Link: https://lore.kernel.org/r/20231031032439.1558703-3-matthew.brost@intel.com
Signed-off-by: Luben Tuikov <ltuikov89@gmail.com>
The drm-stats fdinfo tags made available to user space are drm-engine,
drm-cycles, drm-max-freq and drm-curfreq, one per job slot.
This deviates from standard practice in other DRM drivers, where a single
set of key:value pairs is provided for the whole render engine. However,
Panfrost has separate queues for fragment and vertex/tiler jobs, so a
decision was made to calculate bus cycles and workload times separately.
Maximum operating frequency is calculated at devfreq initialisation time.
Current frequency is made available to user space because nvtop uses it
when performing engine usage calculations.
It is important to bear in mind that both the GPU cycle and kernel time
numbers provided are at best rough estimations, and are always reported
in excess of the actual figure, for two reasons:
- Excess time because of the delay between the end of a job processing,
the subsequent job IRQ and the actual time of the sample.
- Time spent in the engine queue waiting for the GPU to pick up the next
job.
To avoid race conditions during enablement/disabling, a reference counting
mechanism was introduced, and a job flag that tells us whether a given job
increased the refcount. This is necessary, because user space can toggle
cycle counting through a debugfs file, and a given job might have been in
flight by the time cycle counting was disabled.
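A simplified sketch of the idea; field, helper and register names here are
illustrative, and the real code also guards the enable/disable transition
with a lock:

static void my_cycle_counter_get(struct panfrost_device *pfdev)
{
	if (atomic_inc_return(&pfdev->cycle_counter_refs) == 1)
		gpu_write(pfdev, GPU_CMD, GPU_CMD_CYCLE_COUNT_START);
}

static void my_cycle_counter_put(struct panfrost_job *job)
{
	/* Only jobs that actually took a reference drop one, so toggling
	 * the debugfs knob while a job is in flight cannot unbalance the
	 * counter. */
	if (!job->is_profiled)
		return;

	if (atomic_dec_return(&job->pfdev->cycle_counter_refs) == 0)
		gpu_write(job->pfdev, GPU_CMD, GPU_CMD_CYCLE_COUNT_STOP);
}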
The main goal of the debugfs cycle counter knob is letting tools like nvtop
or IGT's gputop switch it at any time, to avoid power waste in case no
engine usage measuring is necessary.
Also add a documentation file explaining the possible values for fdinfo's
engine keystrings and Panfrost-specific drm-curfreq-<keystr> pairs.
Signed-off-by: Adrián Larumbe <adrian.larumbe@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Steven Price <steven.price@arm.com>
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230929181616.2769345-3-adrian.larumbe@collabora.com
For a while now it's been allowed for an MMU context to outlive its
corresponding panfrost_priv, however the job structure still references
panfrost_priv to get hold of the MMU context. If panfrost_priv has been
freed, this is a use-after-free which I've been able to trigger,
resulting in a splat.
To fix this, drop the reference to panfrost_priv in the job structure
and add a direct reference to the MMU structure which is what's actually
needed.
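A sketch of the shape of the fix, assuming the existing
panfrost_mmu_ctx_get()/panfrost_mmu_ctx_put() refcount helpers; the
wrapper names and job->mmu field are illustrative:

static void my_job_bind_mmu(struct panfrost_job *job,
			    struct panfrost_file_priv *priv)
{
	/* Hold our own reference so the MMU context survives the file. */
	job->mmu = panfrost_mmu_ctx_get(priv->mmu);
}

static void my_job_unbind_mmu(struct panfrost_job *job)
{
	/* Safe even if panfrost_priv is long gone by now. */
	panfrost_mmu_ctx_put(job->mmu);
}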
Fixes: 7fdc48cc63 ("drm/panfrost: Make sure MMU context lifetime is not bound to panfrost_priv")
Signed-off-by: Steven Price <steven.price@arm.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220519152003.81081-1-steven.price@arm.com
Instead of distinguishing between shared and exclusive fences, specify
the fence usage while adding fences.
Rework all drivers to use this interface instead and deprecate the old one.
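A sketch of the new calling style (helper names as in current kernels;
exact enum values and locking requirements may differ):

static int my_attach_fence(struct dma_resv *obj, struct dma_fence *fence,
			   bool is_write)
{
	int ret;

	ret = dma_resv_lock(obj, NULL);
	if (ret)
		return ret;

	/* Reserve a slot, then state how the fence is used instead of
	 * choosing between the old shared/exclusive entry points. */
	ret = dma_resv_reserve_fences(obj, 1);
	if (!ret)
		dma_resv_add_fence(obj, fence,
				   is_write ? DMA_RESV_USAGE_WRITE
					    : DMA_RESV_USAGE_READ);

	dma_resv_unlock(obj);
	return ret;
}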
v2: some kerneldoc comments suggested by Daniel
v3: fix a missing case in radeon
v4: rebase on nouveau changes, fix lockdep and temporary disable warning
v5: more documentation updates
v6: separate internal dma_resv changes from this patch, avoids
temporarily disabling the warning, rebase on upstream changes
v7: fix missed case in lima driver, minimize changes to i915_gem_busy_ioctl
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20220407085946.744568-3-christian.koenig@amd.com
If the process that submitted these jobs decided to close the FD before
the jobs are done, it probably means it doesn't care about the result.
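Illustrative sketch only (field names assumed): a job whose owner is gone,
or which reports a TERMINATED exception, has its fence failed with
-ECANCELED instead of being waited on:

static void my_cancel_job(struct panfrost_job *job)
{
	/* Record the cancellation before signalling, so waiters see it. */
	dma_fence_set_error(job->done_fence, -ECANCELED);
	dma_fence_signal(job->done_fence);
}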
v5:
* Add a panfrost_exception_is_fault() helper and the
DRM_PANFROST_EXCEPTION_MAX_NON_FAULT value
v4:
* Don't disable/restore irqs when taking the job_lock (not needed since
this lock is never taken from an interrupt context)
v3:
* Set fence error to ECANCELED when a TERMINATED exception is received
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Steven Price <steven.price@arm.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210630062751.2832545-15-boris.brezillon@collabora.com
This is not yet needed because we let active jobs be killed by
the reset and we don't really bother making sure they can be restarted.
But once we start adding soft-stop support, controlling when we deal
with the remaining interrupts and making sure those are handled before
the reset is issued gets tricky if we keep job interrupts active.
Let's prepare for that and mask+flush job IRQs before issuing a reset.
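A sketch of the mask+flush step, with simplified register and field names:

static void my_job_irq_quiesce(struct panfrost_device *pfdev)
{
	/* Stop new job interrupts from being raised... */
	job_write(pfdev, JOB_INT_MASK, 0);

	/* ...and wait until any handler that already fired has returned,
	 * so no job ISR can race with the reset sequence that follows. */
	synchronize_irq(pfdev->js->irq);
}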
v4:
* Add a comment explaining why we WARN_ON(!job) in the irq handler
* Keep taking the job_lock when evicting stalled jobs
v3:
* New patch
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Steven Price <steven.price@arm.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210630062751.2832545-11-boris.brezillon@collabora.com
Now that we can pass our own workqueue to drm_sched_init(), we can use
an ordered workqueue for both the scheduler timeout (TDR) and our own
reset work (which we use when the reset is not caused by a fault/timeout
on a specific job, like when the AS_ACTIVE bit is stuck). This
guarantees that the timeout handlers and reset handler can't run
concurrently, which drastically simplifies the locking.
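A sketch of the setup, with illustrative names; the exact drm_sched_init()
parameter taking the timeout workqueue varies between kernel versions:

static int my_reset_wq_init(struct panfrost_device *pfdev)
{
	pfdev->reset.wq = alloc_ordered_workqueue("panfrost-reset", 0);
	if (!pfdev->reset.wq)
		return -ENOMEM;

	INIT_WORK(&pfdev->reset.work, my_reset_work_handler);

	/* The same ordered queue is later passed to drm_sched_init() as
	 * the timeout workqueue, so TDR and driver-initiated resets are
	 * serialised by construction. */
	return 0;
}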
v5:
* Don't call cancel_delayed_timeout() in the reset path (those works
are canceled in drm_sched_stop())
v4:
* Actually pass the reset workqueue to drm_sched_init()
* Don't call cancel_work_sync() in panfrost_reset(). It will deadlock
since it might be called from the reset work, which is executing and
cancel_work_sync() will wait for the handler to return. Checking the
reset pending status should avoid spurious resets
v3:
* New patch
Suggested-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Steven Price <steven.price@arm.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210630062751.2832545-10-boris.brezillon@collabora.com
We've fixed many races in panfrost_job_timedout() but some remain.
Instead of trying to fix it again, let's simplify the logic and move
the reset bits to a separate work item scheduled when one of the queues
reports a timeout.
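A sketch of the scheduling side (flag and field names illustrative):

static void my_queue_reset(struct panfrost_device *pfdev)
{
	/* Only the first queue to report a timeout actually schedules the
	 * reset work; concurrent reports become no-ops. */
	if (!atomic_xchg(&pfdev->reset.pending, 1))
		queue_work(pfdev->reset.wq, &pfdev->reset.work);
}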
v5:
- Simplify panfrost_scheduler_stop() (Steven Price)
- Always restart the queue in panfrost_scheduler_start() even if
the status is corrupted (Steven Price)
v4:
- Rework the logic to prevent a race between drm_sched_start()
(reset work) and drm_sched_job_timedout() (timeout work)
- Drop Steven's R-b
- Add dma_fence annotation to the panfrost_reset() function (Daniel Vetter)
v3:
- Replace the atomic_cmpxchg() by an atomic_xchg() (Robin Murphy)
- Add Steven's R-b
v2:
- Use atomic_cmpxchg() to conditionally schedule the reset work
(Steven Price)
Fixes: 1a11a88cfd ("drm/panfrost: Fix job timeout handling")
Cc: <stable@vger.kernel.org>
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Steven Price <steven.price@arm.com>
Signed-off-by: Steven Price <steven.price@arm.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201105151704.2010667-1-boris.brezillon@collabora.com
If more than two jobs end up timing out concurrently, only one of them
(the one attached to the scheduler acquiring the lock) is fully handled.
The others remain in a dangling state where they are no longer part of
the scheduling queue, but still block something in the scheduler, leading
to repetitive timeouts when new jobs are queued.
Let's make sure all bad jobs are properly handled by the thread
acquiring the lock.
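A rough sketch of the approach, with the structure simplified from the
driver:

static void my_handle_timeouts(struct panfrost_device *pfdev,
			       struct drm_sched_job *bad)
{
	int i;

	/* Whichever timeout handler wins the race stops every queue, so
	 * concurrently stalled jobs on other slots are cleaned up too
	 * instead of being left dangling. */
	for (i = 0; i < NUM_JOB_SLOTS; i++)
		drm_sched_stop(&pfdev->js->queue[i].sched, bad);
}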
v3:
- Add Steven's R-b
- Don't take the sched_lock when stopping the schedulers
v2:
- Fix the subject prefix
- Stop the scheduler before returning from panfrost_job_timedout()
- Call cancel_delayed_work_sync() after drm_sched_stop() to make sure
no timeout handlers are in flight when we reset the GPU (Steven Price)
- Make sure we release the reset lock before restarting the
schedulers (Steven Price)
Fixes: f3ba91228e ("drm/panfrost: Add initial panfrost driver")
Cc: <stable@vger.kernel.org>
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Steven Price <steven.price@arm.com>
Signed-off-by: Steven Price <steven.price@arm.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201002122506.1374183-1-boris.brezillon@collabora.com
The calls to panfrost_devfreq_record_busy() and
panfrost_devfreq_record_idle() must be balanced to ensure that the
devfreq utilisation is correctly reported. But there are two cases where
this doesn't work correctly.
In panfrost_job_hw_submit() if pm_runtime_get_sync() fails or the
WARN_ON() fires then no call to panfrost_devfreq_record_busy() is made,
but when the job times out the corresponding _record_idle() call is
still made in panfrost_job_timedout(). Move the call up to ensure that
it always happens.
Secondly panfrost_job_timedout() only makes a single call to
panfrost_devfreq_record_idle() even if it is cleaning up multiple jobs.
Move the call inside the loop to ensure that the number of
_record_idle() calls matches the number of _record_busy() calls.
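A sketch of the balanced teardown (signatures simplified): one
_record_idle() per in-flight job, matching the _record_busy() made when
that job was submitted:

static void my_drain_slots(struct panfrost_device *pfdev)
{
	int i;

	for (i = 0; i < NUM_JOB_SLOTS; i++) {
		if (!pfdev->jobs[i])
			continue;

		pfdev->jobs[i] = NULL;
		panfrost_devfreq_record_idle(pfdev);
	}
}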
Fixes: 9e62b885f7 ("drm/panfrost: Simplify devfreq utilisation tracking")
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Signed-off-by: Steven Price <steven.price@arm.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200522153653.40754-1-steven.price@arm.com
drm-misc-next for 5.7:
UAPI Changes:
- lima: Add support for heap buffers
Cross-subsystem Changes:
Core Changes:
- Implement mode_config mode_valid for memory constrained drivers
- Bus format negotiation between bridges
- Consolidate fake vblank events for drivers without vblank interrupts
- drm/bufs: dma_alloc related cleanups
- drm/dp_mst: Various fixes
- drm/print: New drm_device based print helpers
- Thomas is a drm-misc maintainer now!
Driver Changes:
- DPMS cleanups for atomic drivers
- Removal of owner field in SPI tinydrm drivers
- Removal of explicit dependency on DT for tinydrm drivers
- Conversion to YAML schemas for DT bindings
- tidss: New driver
- virtio: various reworks and fixes
- Our usual dozen or so new panels or bridges
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Maxime Ripard <maxime@cerno.tech>
Link: https://patchwork.freedesktop.org/patch/msgid/20200210093421.xu4sofldm6wm6xq6@gilmour.lan