Extend the timeout for recovering vram bos from shadows on sr-iov
to cover the worst case scenario for timeslices and VFs
Under runtime, the wait fence time could be quite long when
other VFs are in exclusive mode. For example, for 4 VF, every
VF's exclusive timeout time is set to 3s, then the worst case is
9s. If the VF number is more than 4,then the worst case time will
be longer.
The 8s is the test data, with setting to 8s, it will pass the TDR
test for 1000 times.
SWDEV-161490
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Signed-off-by: Emily Deng <Emily.Deng@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
idx can be indirectly controlled by user-space, hence leading to a
potential exploitation of the Spectre variant 1 vulnerability.
This issue was detected with the help of Smatch:
drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c:408 amdgpu_set_pp_force_state()
warn: potential spectre issue 'data.states'
Fix this by sanitizing idx before using it to index data.states
Notice that given that speculation windows are large, the policy is
to kill the speculation on the first load and not worry if it can be
completed with a dependent load/store [1].
[1] https://marc.info/?l=linux-kernel&m=152449131114778&w=2
Cc: stable@vger.kernel.org
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The function ttm_bo_put releases a reference to a TTM buffer object. The
function's name is more aligned to the Linux kernel convention of naming
ref-counting function _get and _put.
A call to ttm_bo_unref takes the address of the TTM BO object's pointer and
clears the pointer's value to NULL. This is not necessary in most cases and
sometimes even worked around by the calling code. A call to ttm_bo_put only
releases the reference without clearing the pointer.
The current behaviour of cleaning the pointer is kept in the calling code,
but should be removed if not required in a later patch.
v2:
* set prefix to drm/amdgpu
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The function ttm_bo_get acquires a reference on a TTM buffer object. The
function's name is more aligned to the Linux kernel convention of naming
ref-counting function _get and _put.
v2:
* changed prefix to drm/amdgpu
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The bo_list handle is allocated by OP_CREATE, so in OP_UPDATE here we just
re-create the bo_list object and replace the handle. This way we don't
need locking to protect the bo_list because it's always re-created when
changed.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Properly handle cases where one or more instance of the IP
block may be harvested.
v2: make sure ip_num_rings is initialized amdgpu_queue_mgr.c
v3: rebase on Christian's UVD changes, drop unused var
Reviewed-by: James Zhu <James.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
We are going to need that for the second UVD instance on Vega20.
v2: rename to patch_cs_in_place
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-and-tested-by: James Zhu <James.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
cik_pcie_gen3_enable() is only called by cik_common_hw_init(), which is
never called in atomic context.
cik_pcie_gen3_enable() calls mdelay() to busily wait, which is not
necessary.
mdelay() can be replaced with msleep().
This is found by a static analysis tool named DCNS written by myself.
Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The scheduler of the entity is decided by the run queue on which
it is queued. This patch avoids us the effort required to maintain
a sync between rq and sched field when we start shifting entites
among different rqs.
Signed-off-by: Nayan Deshmukh <nayan26deshmukh@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Going to completely rework the context to ring mapping with Nayan's GSoC
work, but for now just stopping to expose the second UVD instance should
do it.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This patch moves amdgpu_fbdev_set_suspend() to the beginning
of suspend sequence.
This is to ensure fbcon does not to write to the VRAM
after GPU is powerd down.
Signed-off-by: Shirish S <shirish.s@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The warning turned out to be not so useful, as BO destruction tends to
be deferred to a workqueue.
Also, we should be preventing any damage from this now, so not really
important anymore to fix code doing this.
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Tested-by: Mike Lothian <mike@fireburn.co.uk>
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Use the newly split ip suspend functions to do suspend displays
first (to deal with atomic so that FBs can be unpinned before
attempting to evict vram), then evict vram, then suspend the
other IPs. Also move the non-DC pinning code to only be
called in the non-DC cases since atomic should take care of
DC.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107065
Fixes: e00fb85 drm: Stop updating plane->crtc/fb/old_fb on atomic drivers
Acked-by: Harry Wentland <harry.wentland@amd.com>
Reviewed-and-tested-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
We need to do some IPs earlier to deal with ordering issues
similar to how resume is split into two phases. Do DCE first
to deal with atomic, then do the rest.
Acked-by: Harry Wentland <harry.wentland@amd.com>
Reviewed-and-tested-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Check the supported functions mask before calling the bios
requests method.
Reviewed-by: Jim Qu <Jim.Qu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Problem:
amdgpu_ttm_set_buffer_funcs_status destroys adev->mman.entity on suspend
without releasing adev->mman.bdev.man[TTM_PL_VRAM].move fence
so on resume the new drm_sched_entity.fence_context causes
the warning against the old fence context which is different.
Fix:
When destroying sched_entity in amdgpu_ttm_set_buffer_funcs_status
release man->move and set the pointer to NULL.
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This patch removes the usage of console_(un)lock
by replacing drm_fb_helper_set_suspend() to
drm_fb_helper_set_suspend_unlocked() which locks and
unlocks the console instead of locking ourselves.
Signed-off-by: Shirish S <shirish.s@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[Why]
While the console_lock is held, console output will be buffered, till
its unlocked it wont be emitted, hence its ideal to unlock sooner to enable
debugging/detecting/fixing of any issue in the remaining sequence of events
in resume path.
The concern here is about consoles other than fbcon on the device,
e.g. a serial console
[How]
This patch restructures the console_lock, console_unlock around
amdgpu_fbdev_set_suspend() and moves this new block appropriately.
V2: Kept amdgpu_fbdev_set_suspend after pci_set_power_state
V3: Updated the commit message to clarify the real concern that this patch
addresses.
V4: code clean-up.
V5: fixed return value
Signed-off-by: Shirish S <shirish.s@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
More features for 4.19:
- Map processes to vmids for debugging GPUVM faults
- Raven gfxoff fixes
- Initial gfxoff support for vega12
- Use defines for interrupt sources rather than magic numbers
- DC aux fixes
- Finish DC logging TODO
- Add more DC debugfs interfaces for conformance testing
- Add CRC support for DCN
- Scheduler rework in preparation for load balancing
- Unify common smu9 code
- Clean up UVD instancing support
- ttm cleanups
- Misc fixes and cleanups
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180719194001.3488-1-alexander.deucher@amd.com