Commit Graph

24759 Commits

Author SHA1 Message Date
Ville Syrjälä
f7be2c2150 drm/i915: Disable CLKOUT_DP bending on LPT/WPT as needed
When we want to use SPLL for FDI we want SSC, which means we have to
disable clock bending for the PCH SSC reference (bend and spread are
mutually exclusive). So let's turn off bending when we want spread.
In case the BIOS enabled clock bending for some reason we'll just turn
it off and enable the spread mode instead.

Not sure what happens if the BIOS is actually using the bend source for
HDMI at this time, but I suppose it should be no worse than what already
happens when we simply turn on the spread.

We don't currently use the bend source for anything, and only use the
PCH SSC reference for the SPLL to drive FDI (always with spread).

v2: Fix the %5 vs %10 fumble for SSCDITHPHASE (Paulo)
    Add 'WARN_ON(steps % 5 != 0)' sanity check (Paulo)
    Fix typos in commit message (Paulo)

Cc: Paulo Zanoni <przanoni@gmail.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1449260379-14093-1-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
2015-12-08 16:30:20 +02:00
Mika Kuoppala
6704d45528 drm/i915/skl: Double RC6 WRL always on
WaRsDoubleRc6WrlWithCoarsePowerGating should
be enabled for all Skylakes. Make it so.

Cc: Sagar Arun Kamble <sagar.a.kamble@intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Sagar Arun Kamble <sagar.a.kamble@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1449505785-20812-2-git-send-email-mika.kuoppala@intel.com
(cherry picked from commit e7674b8c31)
Cc: stable@vger.kernel.org # v4.3+
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2015-12-08 16:18:45 +02:00
Mika Kuoppala
344df9809f drm/i915/skl: Disable coarse power gating up until F0
There is conflicting info between E0 and F0 steppings
for this workarounds. Trust more authoritative source and
be conservative and extend also for F0.

This prevents numerous (>50) gpu hangs with SKL GT4e
during piglit run.

References: HSD: gen9lp/2134184
Cc: Sagar Arun Kamble <sagar.a.kamble@intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Sagar Arun Kamble <sagar.a.kamble@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1449505785-20812-1-git-send-email-mika.kuoppala@intel.com
(cherry picked from commit 6686ece19f)
Cc: stable@vger.kernel.org # v4.3+
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2015-12-08 16:18:29 +02:00
Ashley Towns
8e3911178e exynos: fixes an incorrect header guard
in the exynos gpu driver where the preprocessor #ifndef/#define
variables were mismatched.

Signed-off-by: Ashley Towns <mail@ashleytowns.id.au>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2015-12-08 15:07:49 +01:00
Mika Kuoppala
e7674b8c31 drm/i915/skl: Double RC6 WRL always on
WaRsDoubleRc6WrlWithCoarsePowerGating should
be enabled for all Skylakes. Make it so.

Cc: Sagar Arun Kamble <sagar.a.kamble@intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Sagar Arun Kamble <sagar.a.kamble@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1449505785-20812-2-git-send-email-mika.kuoppala@intel.com
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2015-12-08 16:03:06 +02:00
Mika Kuoppala
6686ece19f drm/i915/skl: Disable coarse power gating up until F0
There is conflicting info between E0 and F0 steppings
for this workarounds. Trust more authoritative source and
be conservative and extend also for F0.

This prevents numerous (>50) gpu hangs with SKL GT4e
during piglit run.

References: HSD: gen9lp/2134184
Cc: Sagar Arun Kamble <sagar.a.kamble@intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Sagar Arun Kamble <sagar.a.kamble@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1449505785-20812-1-git-send-email-mika.kuoppala@intel.com
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2015-12-08 15:55:57 +02:00
Thomas Hellstrom
8fbf9d92a7 drm/vmwgfx: Implement the cursor_set2 callback v2
Fixes native drm clients like Fedora 23 Wayland which now appears to
be able to use cursor hotspots without strange cursor offsets.
Also fixes a couple of ignored error paths.

Since the core drm cursor hotspot is incompatible with the legacy vmwgfx
hotspot (the core drm hotspot is reset when the drm_mode_cursor ioctl
is used), we need to keep track of both and add them when the device
hotspot is set. We assume that either is always zero.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
2015-12-08 12:55:46 +01:00
Tvrtko Ursulin
4a1e1d055b drm/i915: Remove incorrect warning in context cleanup
Commit e9f24d5fb7
Author: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Date:   Mon Oct 5 13:26:36 2015 +0100

    drm/i915: Clean up associated VMAs on context destruction

Added a warning based on an incorrect assumption that all VMAs
in a VM will be on the inactive list at the point last reference
to a context and VM is dropped.

This is not true because i915_gem_object_retire__read will not
put VMA on the inactive list until all activities on the object
in question (in all VMs) have been retired.

As a consequence, whether or not a context/VM will be destroyed
with its VMAs still on the active list, can depend on completely
unrelated activities using the same object from a different
context or engine.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92638
Testcase: igt/gem_request_retire/retire-vma-not-inactive
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michel Thierry <michel.thierry@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1448025816-25584-1-git-send-email-tvrtko.ursulin@linux.intel.com
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
(cherry picked from commit 408952d43b)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2015-12-08 12:26:08 +02:00
Eric Anholt
214613656b drm/vc4: Add an interface for capturing the GPU state after a hang.
This can be parsed with vc4-gpu-tools tools for trying to figure out
what was going on.

v2: Use __u32-style types.

Signed-off-by: Eric Anholt <eric@anholt.net>
2015-12-07 20:49:49 -08:00
Eric Anholt
b501bacc60 drm/vc4: Add support for async pageflips.
An async pageflip stores the modeset to be done and executes it once
the BOs are ready to be displayed.  This gets us about 3x performance
in full screen rendering with pageflipping.

Signed-off-by: Eric Anholt <eric@anholt.net>
2015-12-07 20:10:03 -08:00
Eric Anholt
d5b1a78a77 drm/vc4: Add support for drawing 3D frames.
The user submission is basically a pointer to a command list and a
pointer to uniforms.  We copy those in to the kernel, validate and
relocate them, and store the result in a GPU BO which we queue for
execution.

v2: Drop support for NV shader recs (not necessary for GL), simplify
    vc4_use_bo(), improve bin flush/semaphore checks, use __u32 style
    types.

Signed-off-by: Eric Anholt <eric@anholt.net>
2015-12-07 20:05:10 -08:00
Eric Anholt
d3f5168a08 drm/vc4: Bind and initialize the V3D engine.
This is the component of the GPU that does 3D rendering.

Signed-off-by: Eric Anholt <eric@anholt.net>
2015-12-07 20:05:10 -08:00
Eric Anholt
1fa81589bb drm/vc4: Fix a typo in a V3D debug register.
Signed-off-by: Eric Anholt <eric@anholt.net>
2015-12-07 20:05:09 -08:00
Eric Anholt
463873d570 drm/vc4: Add an API for creating GPU shaders in GEM BOs.
Since we have no MMU, the kernel needs to validate that the submitted
shader code won't make any accesses to memory that the user doesn't
control, which involves banning some operations (general purpose DMA
writes), and tracking where we need to write out pointers for other
operations (texture sampling).  Once it's validated, we return a GEM
BO containing the shader, which doesn't allow mapping for write or
exporting to other subsystems.

v2: Use __u32-style types.

Signed-off-by: Eric Anholt <eric@anholt.net>
2015-12-07 20:05:09 -08:00
Eric Anholt
d5bc60f6ad drm/vc4: Add create and map BO ioctls.
While there exist dumb APIs for creating and mapping BOs, one of the
rules is that drivers doing 3D acceleration have to provide their own
APIs for buffer allocation (besides, the pitch/height parameters of
the dumb alloc don't really make sense for a lot of 3D allocations).

v2: Use __u32-style types, use "drm.h" instead of <drm/drm.h>.

Signed-off-by: Eric Anholt <eric@anholt.net>
2015-12-07 20:04:57 -08:00
Eric Anholt
c826a6e106 drm/vc4: Add a BO cache.
We need to allocate new BOs in the kernel as part of each frame, but
the CMA allocator is way too slow for that.  As an optimization, keep
track of recently-freed BOs and reuse them, with a 1 second timeout to
fully free them back to the system.

This improves 3D performance by about 15%.

Signed-off-by: Eric Anholt <eric@anholt.net>
2015-12-07 20:01:56 -08:00
Eric Anholt
10028c5ab1 drm: Create a driver hook for allocating GEM object structs.
The CMA helpers had no way for a driver to extend the struct with its
own fields.  Since the CMA helpers are mostly "Allocate a
drm_gem_cma_object, then fill in a few fields", it's hard to write as
pure helpers without passing in a driver callback for the allocate
step.

Signed-off-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-12-07 20:01:48 -08:00
Dave Airlie
e876b41ab0 Back merge tag 'v4.4-rc4' into drm-next
We've picked up a few conflicts and it would be nice
to resolve them before we move onwards.
2015-12-08 11:04:26 +10:00
Rodrigo Vivi
dfaf37baa0 drm/i915: Fix idle_frames counter.
'commit 97173eaf5 ("drm/i915: PSR: Increase idle_frames")' was a mistake.
The special case it tried to cover was already being covered by
the DP_PSR_NO_TRAIN_ON_EXIT. So this ended up duplicated.

So, instead of reverting that let's take this opportunity and unify
the idle_frame definition in a single place so we standardize the access
and avoid room for that same mistake again.

Few changes with this patch:

1. Instead of just respecting the VBT we set a
global minumum with max(). So we are sure that we will avoid corner cases
in case VBT is doing something we don't understand.

2. Instead of minimum 5 we use 6. When introducing the idle_frames += 4 case
we considered that minimum was 2. All because the off-by-one issue.

v2: Unified idle_frame definition.

Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1449528320-27655-1-git-send-email-rodrigo.vivi@intel.com
2015-12-07 14:48:11 -08:00
Maarten Lankhorst
0d014ff344 drm/i915: Remove double wait_for_vblank on broadwell.
wait_vblank is already set in intel_plane_atomic_calc_changes
for broadwell, waiting for a double vblank is overkill.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1447945645-32005-5-git-send-email-maarten.lankhorst@linux.intel.com
2015-12-07 10:59:10 +01:00
Maarten Lankhorst
b900111459 drm/i915/skl: Update watermarks before the crtc is disabled.
On skylake some of the registers are only writable when the correct
power wells are enabled. Because of this watermarks have to be updated
before the crtc turns off, or you get unclaimed register read and write
warnings.

This patch needs to be modified slightly to apply to -fixes.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92181
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: stable@vger.kernel.org
Cc: Matt Roper <matthew.d.roper@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1447945645-32005-4-git-send-email-maarten.lankhorst@linux.intel.com
Reviewed-by: Ander Conselvan de Oliveira <conselvan2@gmail.com>
2015-12-07 10:58:33 +01:00
Maarten Lankhorst
92826fcdfc drm/i915: Calculate watermark related members in the crtc_state, v4.
This removes pre/post_wm_update from intel_crtc->atomic, and
creates atomic state for it in intel_crtc.

Changes since v1:
- Rebase on top of wm changes.
Changes since v2:
- Split disable_cxsr into a separate patch.
Changes since v3:
- Move some of the changes to intel_wm_need_update.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/56603A49.5000507@linux.intel.com
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-12-07 10:56:08 +01:00
Maarten Lankhorst
ab1d3a0e5a drm/i915: Move disable_cxsr to the crtc_state.
intel_crtc->atomic will be removed later on, move this member
to intel_crtc_state.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1447945645-32005-2-git-send-email-maarten.lankhorst@linux.intel.com
Reviewed-by: Ander Conselvan de Oliveira <conselvan2@gmail.com>
2015-12-07 10:55:47 +01:00
Dave Airlie
47c0fd7282 Merge tag 'topic/drm-misc-2015-12-04' of git://anongit.freedesktop.org/drm-intel into drm-next
New -misc pull. Big thing is Thierry's atomic helpers for system suspend
resume, which I'd like to use in i915 too. Hence the pull.

* tag 'topic/drm-misc-2015-12-04' of git://anongit.freedesktop.org/drm-intel:
  drm: keep connector status change logging human readable
  drm/atomic-helper: Reject attempts at re-stealing encoders
  drm/atomic-helper: Implement subsystem-level suspend/resume
  drm: Implement drm_modeset_lock_all_ctx()
  drm/gma500: Add driver private mutex for the fault handler
  drm/gma500: Drop dev->struct_mutex from mmap offset function
  drm/gma500: Drop dev->struct_mutex from fbdev init/teardown code
  drm/gma500: Drop dev->struct_mutex from modeset code
  drm/gma500: Use correct unref in the gem bo create function
  drm/edid: Make the detailed timing CEA/HDMI mode fixup accept up to 5kHz clock difference
  drm/atomic_helper: Add drm_atomic_helper_disable_planes_on_crtc()
  drm: Serialise multiple event readers
  drm: Drop dev->event_lock spinlock around faulting copy_to_user()
2015-12-07 18:17:09 +10:00
Zeng Zhaoxiu
a4d8a0fe45 i915: Replace "hweight8(dev_priv->info.subslice_7eu[i]) != 1" with "!is_power_of_2(dev_priv->info.subslice_7eu[i])"
Signed-off-by: Zeng Zhaoxiu <zhaoxiu.zeng@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1449397590-14292-1-git-send-email-zhaoxiu.zeng@gmail.com
2015-12-07 08:44:20 +01:00
Al Viro
dfbf53ed54 vgaarb: remove bogus checks
neither ->release() nor ->poll() can be called unless ->open()
has succeeded on the same struct file, so checking for "has
open() failed" is pointless.  What's more, ->poll() doesn't
return -E... - it always returns a bitmap of POLL... values,
so the dead code in that one had been actively bogus.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-12-06 21:17:17 -05:00
Daniel Vetter
0b8ebeacf5 drm/armada: use a private mutex to protect priv->linear
Reusing the Big DRM Lock just leaks, and the few things left that
dev->struct_mutex protected are very well contained - it's just the
linear drm_mm manager.

With this armada is completely struct_mutex free!

v2: Convert things properly and also take the lock in
armada_gem_free_object, and remove the stale comment (Russell).

Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2015-12-05 21:44:07 +00:00
Daniel Vetter
4bd3fd443a drm/armada: drop struct_mutex from cursor paths
The kms state itself is already protected by the modeset locks
acquired by the drm core. The only thing left is gem bo state, and
since the cursor code expects small objects which are statically
mapped at create time and then invariant over the lifetime of the gem
bo there's nothing to protect.

See armada_gem_dumb_create -> armada_gem_linear_back which assigns
obj->addr which is the only thing used by the cursor code.

Only tricky bit is to switch to the _unlocked unreference function.

Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2015-12-05 21:44:07 +00:00
Dave Airlie
df4d4aa96d Merge branch 'drm-fixes-4.4' of git://people.freedesktop.org/~agd5f/linux into drm-next
A few more last minute fixes for 4.4 on top of my pull request from
earlier this week.  The big change here is a vblank regression fix due to
commit 4dfd6486 "drm: Use vblank timestamps to guesstimate how many vblanks
were missed".  Beyond that, a hotplug fix and a few VM fixes.

* 'drm-fixes-4.4' of git://people.freedesktop.org/~agd5f/linux:
  drm/amdgpu: Fixup hw vblank counter/ts for new drm_update_vblank_count() (v3)
  drm/radeon: Fixup hw vblank counter/ts for new drm_update_vblank_count() (v2)
  drm/radeon: Retry DDC probing on DVI on failure if we got an HPD interrupt
  drm/amdgpu: add spin lock to protect freed list in vm (v2)
  drm/amdgpu: partially revert "drm/amdgpu: fix VM_CONTEXT*_PAGE_TABLE_END_ADDR" v2
  drm/amdgpu: take a BO reference for the user fence
  drm/amdgpu: take a BO reference in the display code
  drm/amdgpu: set snooped flags only on system addresses v2
  drm/amdgpu: fix race condition in amd_sched_entity_push_job
  drm/amdgpu: add err check for pin userptr
  add blacklist for thinkpad T40p
  drm/amdgpu: fix VM page table reference counting
  drm/amdgpu: fix userptr flags check
2015-12-05 16:15:38 +10:00
Daniel Vetter
03a97d8255 drm/i915: Update DRIVER_DATE to 20151204
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-12-04 21:56:02 +01:00
Alex Deucher
8e36f9d33c drm/amdgpu: Fixup hw vblank counter/ts for new drm_update_vblank_count() (v3)
commit 4dfd6486 "drm: Use vblank timestamps to guesstimate how many
vblanks were missed" introduced in Linux 4.4-rc1 makes the drm core
more fragile to drivers which don't update hw vblank counters and
vblank timestamps in sync with firing of the vblank irq and
essentially at leading edge of vblank.

This exposed a problem with radeon-kms/amdgpu-kms which do not
satisfy above requirements:

The vblank irq fires a few scanlines before start of vblank, but
programmed pageflips complete at start of vblank and
vblank timestamps update at start of vblank, whereas the
hw vblank counter increments only later, at start of vsync.

This leads to problems like off by one errors for vblank counter
updates, vblank counters apparently going backwards or vblank
timestamps apparently having time going backwards. The net result
is stuttering of graphics in games, or little hangs, as well as
total failure of timing sensitive applications.

See bug #93147 for an example of the regression on Linux 4.4-rc:

https://bugs.freedesktop.org/show_bug.cgi?id=93147

This patch tries to align all above events better from the
viewpoint of the drm core / of external callers to fix the problem:

1. The apparent start of vblank is shifted a few scanlines earlier,
so the vblank irq now always happens after start of this extended
vblank interval and thereby drm_update_vblank_count() always samples
the updated vblank count and timestamp of the new vblank interval.

To achieve this, the reporting of scanout positions by
radeon_get_crtc_scanoutpos() now operates as if the vblank starts
radeon_crtc->lb_vblank_lead_lines before the real start of the hw
vblank interval. This means that the vblank timestamps which are based
on these scanout positions will now update at this earlier start of
vblank.

2. The driver->get_vblank_counter() function will bump the returned
vblank count as read from the hw by +1 if the query happens after
the shifted earlier start of the vblank, but before the real hw increment
at start of vsync, so the counter appears to increment at start of vblank
in sync with the timestamp update.

3. Calls from vblank irq-context and regular non-irq calls are now
treated identical, always simulating the shifted vblank start, to
avoid inconsistent results for queries happening from vblank irq vs.
happening from drm_vblank_enable() or vblank_disable_fn().

4. The radeon_flip_work_func will delay mmio programming a pageflip until
the start of the real vblank iff it happens to execute inside the shifted
earlier start of the vblank, so pageflips now also appear to execute at
start of the shifted vblank, in sync with vblank counter and timestamp
updates. This to avoid some races between updates of vblank count and
timestamps that are used for swap scheduling and pageflip execution which
could cause pageflips to execute before the scheduled target vblank.

The lb_vblank_lead_lines "fudge" value is calculated as the size of
the display controllers line buffer in scanlines for the given video
mode: Vblank irq's are triggered by the line buffer logic when the line
buffer refill for a video frame ends, ie. when the line buffer source read
position enters the hw vblank. This means that a vblank irq could fire at
most as many scanlines before the current reported scanout position of the
crtc timing generator as the number of scanlines the line buffer can
maximally hold for a given video mode.

This patch has been successfully tested on a RV730 card with DCE-3 display
engine and on a evergreen card with DCE-4 display engine, in single-display
and dual-display configuration, with different video modes.

A similar patch is needed for amdgpu-kms to fix the same problem.

Limitations:

- Maybe replace the udelay() in the flip_work_func() by a suitable
  usleep_range() for a bit better efficiency? Will try that.

- Line buffer sizes in pixels are hard-coded on < DCE-4 to a value
  i just guessed to be high enough to work ok, lacking info on the true
  sizes atm.

Probably fixes: fdo#93147

Port of Mario's radeon fix to amdgpu.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(v1) Reviewed-by: Mario Kleiner <mario.kleiner.de@gmail.com>

(v2) Refine amdgpu_flip_work_func() for better efficiency.

     In amdgpu_flip_work_func, replace the busy waiting udelay(5)
     with event lock held by a more performance and energy efficient
     usleep_range() until at least predicted true start of hw vblank,
     with some slack for scheduler happiness. Release the event lock
     during waits to not delay other outputs in doing their stuff, as
     the waiting can last up to 200 usecs in some cases.

     Also small fix to code comment and formatting in that function.

(v2) Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>

(v3) Fix crash in crtc disabled case
2015-12-04 15:15:07 -05:00
Mika Kuoppala
15620206ae drm/i915/skl: Add SKL GT4 PCI IDs
Add Skylake Intel Graphics GT4 PCI IDs

v2: Rebase

Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1446811876-303-1-git-send-email-mika.kuoppala@intel.com
2015-12-04 19:03:50 +00:00
Mario Kleiner
5b5561b366 drm/radeon: Fixup hw vblank counter/ts for new drm_update_vblank_count() (v2)
commit 4dfd6486 "drm: Use vblank timestamps to guesstimate how many
vblanks were missed" introduced in Linux 4.4-rc1 makes the drm core
more fragile to drivers which don't update hw vblank counters and
vblank timestamps in sync with firing of the vblank irq and
essentially at leading edge of vblank.

This exposed a problem with radeon-kms/amdgpu-kms which do not
satisfy above requirements:

The vblank irq fires a few scanlines before start of vblank, but
programmed pageflips complete at start of vblank and
vblank timestamps update at start of vblank, whereas the
hw vblank counter increments only later, at start of vsync.

This leads to problems like off by one errors for vblank counter
updates, vblank counters apparently going backwards or vblank
timestamps apparently having time going backwards. The net result
is stuttering of graphics in games, or little hangs, as well as
total failure of timing sensitive applications.

See bug #93147 for an example of the regression on Linux 4.4-rc:

https://bugs.freedesktop.org/show_bug.cgi?id=93147

This patch tries to align all above events better from the
viewpoint of the drm core / of external callers to fix the problem:

1. The apparent start of vblank is shifted a few scanlines earlier,
so the vblank irq now always happens after start of this extended
vblank interval and thereby drm_update_vblank_count() always samples
the updated vblank count and timestamp of the new vblank interval.

To achieve this, the reporting of scanout positions by
radeon_get_crtc_scanoutpos() now operates as if the vblank starts
radeon_crtc->lb_vblank_lead_lines before the real start of the hw
vblank interval. This means that the vblank timestamps which are based
on these scanout positions will now update at this earlier start of
vblank.

2. The driver->get_vblank_counter() function will bump the returned
vblank count as read from the hw by +1 if the query happens after
the shifted earlier start of the vblank, but before the real hw increment
at start of vsync, so the counter appears to increment at start of vblank
in sync with the timestamp update.

3. Calls from vblank irq-context and regular non-irq calls are now
treated identical, always simulating the shifted vblank start, to
avoid inconsistent results for queries happening from vblank irq vs.
happening from drm_vblank_enable() or vblank_disable_fn().

4. The radeon_flip_work_func will delay mmio programming a pageflip until
the start of the real vblank iff it happens to execute inside the shifted
earlier start of the vblank, so pageflips now also appear to execute at
start of the shifted vblank, in sync with vblank counter and timestamp
updates. This to avoid some races between updates of vblank count and
timestamps that are used for swap scheduling and pageflip execution which
could cause pageflips to execute before the scheduled target vblank.

The lb_vblank_lead_lines "fudge" value is calculated as the size of
the display controllers line buffer in scanlines for the given video
mode: Vblank irq's are triggered by the line buffer logic when the line
buffer refill for a video frame ends, ie. when the line buffer source read
position enters the hw vblank. This means that a vblank irq could fire at
most as many scanlines before the current reported scanout position of the
crtc timing generator as the number of scanlines the line buffer can
maximally hold for a given video mode.

This patch has been successfully tested on a RV730 card with DCE-3 display
engine and on a evergreen card with DCE-4 display engine, in single-display
and dual-display configuration, with different video modes.

A similar patch is needed for amdgpu-kms to fix the same problem.

Limitations:

- Line buffer sizes in pixels are hard-coded on < DCE-4 to a value
  i just guessed to be high enough to work ok, lacking info on the true
  sizes atm.

Fixes: fdo#93147

Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Michel Dänzer <michel.daenzer@amd.com>
Cc: Harry Wentland <Harry.Wentland@amd.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>

(v1) Tested-by: Dave Witbrodt <dawitbro@sbcglobal.net>

(v2) Refine radeon_flip_work_func() for better efficiency:

     In radeon_flip_work_func, replace the busy waiting udelay(5)
     with event lock held by a more performance and energy efficient
     usleep_range() until at least predicted true start of hw vblank,
     with some slack for scheduler happiness. Release the event lock
     during waits to not delay other outputs in doing their stuff, as
     the waiting can last up to 200 usecs in some cases.

     Retested on DCE-3 and DCE-4 to verify it still works nicely.

(v2) Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2015-12-04 13:11:41 -05:00
Lyude
cb5d416643 drm/radeon: Retry DDC probing on DVI on failure if we got an HPD interrupt
HPD signals on DVI ports can be fired off before the pins required for
DDC probing actually make contact, due to the pins for HPD making
contact first. This results in a HPD signal being asserted but DDC
probing failing, resulting in hotplugging occasionally failing.

This is somewhat rare on most cards (depending on what angle you plug
the DVI connector in), but on some cards it happens constantly. The
Radeon R5 on the machine used for testing this patch for instance, runs
into this issue just about every time I try to hotplug a DVI monitor and
as a result hotplugging almost never works.

Rescheduling the hotplug work for a second when we run into an HPD
signal with a failing DDC probe usually gives enough time for the rest
of the connector's pins to make contact, and fixes this issue.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Lyude <cpaul@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2015-12-04 13:09:12 -05:00
jimqu
81d75a30c6 drm/amdgpu: add spin lock to protect freed list in vm (v2)
there is a protection fault about freed list when OCL test.
add a spin lock to protect it.

v2: drop changes in vm_fini

Signed-off-by: JimQu <jim.qu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2015-12-04 13:08:13 -05:00
Christian König
9c97b5ab4a drm/amdgpu: partially revert "drm/amdgpu: fix VM_CONTEXT*_PAGE_TABLE_END_ADDR" v2
The gtt_end is already inclusive, we don't need to subtract one here.

v2 (chk): keep the fix for the VM code, cause here it really applies.

Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Anatoli Antonovitch <anatoli.antonovitch@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2015-12-04 13:06:59 -05:00
Christian König
f3f1769283 drm/amdgpu: take a BO reference for the user fence
No need for a GEM reference here.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2015-12-04 13:05:44 -05:00
Christian König
e9d951a832 drm/amdgpu: take a BO reference in the display code
No need for the GEM reference here.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2015-12-04 12:32:47 -05:00
Christian König
6d99905a8c drm/amdgpu: set snooped flags only on system addresses v2
Not necessary for VRAM.

v2: no need to check if ttm is NULL.

Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2015-12-04 12:31:46 -05:00
jimqu
9c4153b1ee drm/amdgpu: add spin lock to protect freed list in vm (v2)
there is a protection fault about freed list when OCL test.
add a spin lock to protect it.

v2: drop changes in vm_fini

Signed-off-by: JimQu <jim.qu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2015-12-04 12:23:38 -05:00
Daniel Vetter
af3302b907 Revert "drm/i915: Extend LRC pinning to cover GPU context writeback"
This reverts commit 6d65ba943a.

Mika Kuoppala traced down a use-after-free crash in module unload to
this commit, because ring->last_context is leaked beyond when the
context gets destroyed. Mika submitted a quick fix to patch that up in
the context destruction code, but that's too much of a hack.

The right fix is instead for the ring to hold a full reference onto
it's last context, like we do for legacy contexts.

Since this is causing a regression in BAT it gets reverted before we
can close this.

Cc: Nick Hoath <nicholas.hoath@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: David Gordon <david.s.gordon@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Alex Dai <yu.dai@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93248
Acked-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
2015-12-04 17:34:40 +01:00
Tom St Denis
eb64526f5a amdgpu/gfxv8: Remove magic numbers from function gfx_v8_0_tiling_mode_table_init()
Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2015-12-04 11:26:52 -05:00
Nicolai Hähnle
786b521908 drm/amdgpu: fix race condition in amd_sched_entity_push_job
As soon as we leave the spinlock after the job has been added to the job
queue, we can no longer rely on the job's data to be available.

I have seen a null-pointer dereference due to sched == NULL in
amd_sched_wakeup via amd_sched_entity_push_job and
amd_sched_ib_submit_kernel_helper. Since the latter initializes
sched_job->sched with the address of the ring scheduler, which is
guaranteed to be non-NULL, this race appears to be a likely culprit.

Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Bugzilla: https://bugs.freedesktop.org/attachment.cgi?bugid=93079
Reviewed-by: Christian König <christian.koenig@amd.com>
2015-12-04 11:26:52 -05:00
Chunming Zhou
ba98f9e56c drm/amdgpu: add err check for pin userptr
Missing error check if the operation failed.

Signed-off-by: Chunming Zhou <David1.Zhou@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2015-12-04 11:26:51 -05:00
Tom St Denis
0d07db7e10 amdgpu/gfxv8: Simplification in gfx_v8_0_enable_gui_idle_interrupt()
Simplified the function by folding the two paths into one.

Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2015-12-04 11:26:50 -05:00
Tom St Denis
544b8a74c7 amdgpu/gfxv8: Simplification of gfx_v8_0_create_bitmask()
Simplification of the function gfx_v8_0_create_bitmask().

Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2015-12-04 11:26:50 -05:00
Tom St Denis
90bea0abf6 amdgpu/gfxv8: Cleanup of gfx_v8_0_tiling_mode_table_init() (v2)
Simplification and LOC reduction of function gfx_v8_0_tiling_mode_table_init()

v2: remove spurious break
bug: https://bugs.freedesktop.org/show_bug.cgi?id=93236

Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2015-12-04 11:26:49 -05:00
Deepak M
61ad992875 drm/i915: Correct the Ref clock value for BXT
The reference clock for BXT is 19.2 MHz not 19.5 MHz, updating the
correct value here.

Signed-off-by: Deepak M <m.deepak@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1449238659-12473-2-git-send-email-m.deepak@intel.com
2015-12-04 17:21:36 +01:00
Daniel Vetter
a9287dbc26 drm/i915: Restore skl_gt3 device info
This was broken in

commit 6a8beeffed
Author: Wayne Boyer <wayne.boyer@intel.com>
Date:   Wed Dec 2 13:28:14 2015 -0800

    drm/i915: Clean up device info structure definitions

and I didn't spot this while reviewing. We really need that CI farm up
asap!

Reported-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Wayne Boyer <wayne.boyer@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
2015-12-04 16:17:23 +01:00
Chris Wilson
b6aa087361 drm/i915: Fix RPS pointer passed from wait_ioctl to i915_wait_request
In commit 2e1b873072 [v4.2]
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Mon Apr 27 13:41:22 2015 +0100

    drm/i915: Convert RPS tracking to a intel_rps_client struct

we converted the __i915_wait_request() to take a new intel_rps_client
struct (rather than having to pass fake drm_i915_file_private structs).
However, due to use of passing a void pointer, I didn't spot one
callsite in wait-ioctl was passing the wrong pointer.

Fwiw, the impact of this bug is zero. Along the rps path, we always
first call list_empty(rps) which when we pass in the wrong pointer
always evaluates to false and we return early and never chase the
invalid pointers.

The user visible impact is then wait-ioctl doesn't get the same
waitboosting as the other interfaces (set-domain, throttle), which is a
performance concern for the *very* few users of the wait interface.
There is also a libdrm_intel patch to use the wait-ioctl for
drm_intel_bo_wait_rendering() if anyone feels inclined to review
libdrm_intel patches.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
[danvet: Add Chris' explanation for why the impact of this is pretty
close to 0.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-12-04 16:13:50 +01:00