Commit Graph

570 Commits

Author SHA1 Message Date
Christian König
96800ba9e5 drm/ttm: fix out-of-bounds read in ttm_put_pages() v2
commit a66477b0ef upstream.

When ttm_put_pages() tries to figure out whether it's dealing with
transparent hugepages, it just reads past the bounds of the pages array
without a check.

v2: simplify the test if enough pages are left in the array (Christian).

Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Fixes: 5c42c64f7d ("drm/ttm: fix the fix for huge compound pages")
Cc: stable@vger.kernel.org
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-04-27 09:36:37 +02:00
Trigger Huang
b10cc08374 drm/ttm: Fix bo_global and mem_global kfree error
[ Upstream commit 30f33126fe ]

ttm_bo_glob and ttm_mem_glob are defined as structure instance, while
not allocated by kzalloc, so kfree should not be invoked to release
them anymore. Otherwise, it will cause the following kernel BUG when
unloading amdgpu module

[   48.419294] kernel BUG at /build/linux-5s7Xkn/linux-4.15.0/mm/slub.c:3894!
[   48.419352] invalid opcode: 0000 [#1] SMP PTI
[   48.419387] Modules linked in: amdgpu(OE-) amdchash(OE) amdttm(OE) amd_sched(OE) amdkcl(OE) amd_iommu_v2 drm_kms_helper drm i2c_algo_bit fb_sys_fops syscopyarea sysfillrect sysimgblt snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel snd_pcm snd_seq_midi snd_seq_midi_event snd_rawmidi pcbc snd_seq snd_seq_device snd_timer aesni_intel snd soundcore joydev aes_x86_64 crypto_simd glue_helper cryptd input_leds mac_hid serio_raw binfmt_misc nfsd auth_rpcgss nfs_acl lockd grace sunrpc sch_fq_codel parport_pc ppdev lp parport ip_tables x_tables autofs4 8139too psmouse i2c_piix4 8139cp mii floppy pata_acpi
[   48.419782] CPU: 1 PID: 1281 Comm: modprobe Tainted: G           OE    4.15.0-20-generic #21-Ubuntu
[   48.419838] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
[   48.419901] RIP: 0010:kfree+0x137/0x180
[   48.419934] RSP: 0018:ffffb02101273bf8 EFLAGS: 00010246
[   48.419974] RAX: ffffeee1418ad7e0 RBX: ffffffffc075f100 RCX: ffff8fed7fca7ed0
[   48.420025] RDX: 0000000000000000 RSI: 000000000003440e RDI: 0000000022400000
[   48.420073] RBP: ffffb02101273c10 R08: 0000000000000010 R09: ffff8fed7ffd3680
[   48.420121] R10: ffffeee1418ad7c0 R11: ffff8fed7ffd3000 R12: ffffffffc075e2c0
[   48.420169] R13: ffffffffc074ec10 R14: ffff8fed73063900 R15: ffff8fed737428e8
[   48.420216] FS:  00007fdc912ec540(0000) GS:ffff8fed7fc80000(0000) knlGS:0000000000000000
[   48.420267] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   48.420308] CR2: 000055fa40c30060 CR3: 000000023470a006 CR4: 00000000003606e0
[   48.420358] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   48.420405] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[   48.420452] Call Trace:
[   48.420485]  ttm_bo_global_kobj_release+0x20/0x30 [amdttm]
[   48.420528]  kobject_release+0x6a/0x180
[   48.420562]  kobject_put+0x28/0x50
[   48.420595]  ttm_bo_global_release+0x36/0x50 [amdttm]
[   48.420636]  amdttm_bo_device_release+0x119/0x180 [amdttm]
[   48.420678]  ? amdttm_bo_clean_mm+0xa6/0xf0 [amdttm]
[   48.420760]  amdgpu_ttm_fini+0xc9/0x180 [amdgpu]
[   48.420821]  amdgpu_bo_fini+0x12/0x40 [amdgpu]
[   48.420889]  gmc_v9_0_sw_fini+0x40/0x50 [amdgpu]
[   48.420947]  amdgpu_device_fini+0x36f/0x4c0 [amdgpu]
[   48.421007]  amdgpu_driver_unload_kms+0xb4/0x150 [amdgpu]
[   48.421058]  drm_dev_unregister+0x46/0xf0 [drm]
[   48.421102]  drm_dev_unplug+0x12/0x70 [drm]

Signed-off-by: Trigger Huang <Trigger.Huang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-04-20 09:16:02 +02:00
Christian König
546486c5b1 drm/ttm: fix LRU handling in ttm_buffer_object_transfer
[ Upstream commit d6e820fcd4 ]

We need to set the NO_EVICT flag on the ghost object or otherwise we are
adding it to the LRU.

When it is added to the LRU we can run into a race between destroying
and evicting it again.

Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2018-12-21 14:15:15 +01:00
Huang Rui
df36b2fb83 drm/ttm: clean up non-x86 definitions on ttm_tt
All non-x86 definitions are moved to ttm_set_memory header, so remove it from
ttm_tt.c.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-08-01 17:23:56 -05:00
Huang Rui
fe710322b8 drm/ttm: fix missed conversion of set_pages_array_uc
This patch fixed the error when do not configure CONFIG_X86, otherwise, below
error will be encountered.

All errors (new ones prefixed by >>):

   drivers/gpu/drm/ttm/ttm_page_alloc_dma.c: In function 'ttm_set_pages_caching':
>> drivers/gpu/drm/ttm/ttm_page_alloc_dma.c:272:7: error: implicit declaration of function 'set_pages_array_uc'; did you mean
+'ttm_set_pages_array_uc'? [-Werror=implicit-function-declaration]
      r = set_pages_array_uc(pages, cpages);
          ^~~~~~~~~~~~~~~~~~
          ttm_set_pages_array_uc
   cc1: some warnings being treated as errors

Reported-by: kbuild test robot <lkp@intel.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-08-01 17:22:20 -05:00
Bas Nieuwenhuizen
610b399f1f drm/ttm: Merge hugepage attr changes in ttm_dma_page_put. (v2)
Every set_pages_array_wb call resulted in cross-core
interrupts and TLB flushes. Merge more of them for
less overhead.

This reduces the time needed to free a 1.6 GiB GTT WC
buffer as part of Vulkan CTS from  ~2 sec to < 0.25 sec.
(Allocation still takes more than 2 sec though)

(v2): use set_pages_wb instead of set_memory_wb.

Signed-off-by: Bas Nieuwenhuizen <basni@chromium.org>
Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-07-27 15:00:22 -05:00
Huang Rui
d55f9b8742 drm/ttm: clean up non-x86 definitions on ttm_page_alloc
All non-x86 definitions are moved to ttm_set_memory header, so remove it from
ttm_page_alloc.c.

Suggested-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Bas Nieuwenhuizen <basni@chromium.org>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-07-27 15:00:14 -05:00
Huang Rui
c7bb1e57e2 drm/ttm: clean up non-x86 definitions on ttm_page_alloc_dma
All non-x86 definitions are moved to ttm_set_memory header, so remove it from
ttm_page_alloc_dma.c.

Suggested-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Bas Nieuwenhuizen <basni@chromium.org>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-07-27 15:00:08 -05:00
Thomas Zimmermann
f44907593d drm/ttm: Replace ttm_bo_unref() with ttm_bo_put()
A call to ttm_bo_unref() clears the supplied pointer to NULL, while
ttm_bo_put() does not. None of the converted call sites requires the
pointer to become NULL, so the respective assign operations has been
left out from the patch.

Signed-off-by: Thomas Zimmermann <contact@tzimmermann.org>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-07-10 14:18:28 -05:00
Thomas Zimmermann
8129fdad38 drm/ttm: Replace ttm_bo_reference() with ttm_bo_get()
Signed-off-by: Thomas Zimmermann <contact@tzimmermann.org>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-07-10 14:18:22 -05:00
Thomas Zimmermann
89c815ef07 drm/ttm: Introduce ttm_bo_get() and ttm_bo_put() for ref counting
The TTM buffer-object interface provides ttm_bo_reference() and
ttm_bo_unref() for managing reference counts. Replacing them with
ttm_bo_get() and ttm_bo_put() aligns the API with conventions used
throughout the Linux kernel.

The implementation of ttm_bo_unref() clears the supplied pointer
to NULL. This leads to workarounds where the caller saves the
pointer's value before de-referencing the BO. ttm_bo_put() does
not clear the supplied pointer.

Signed-off-by: Thomas Zimmermann <contact@tzimmermann.org>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-07-10 14:18:15 -05:00
Gustavo A. R. Silva
31e1c59796 drm/ttm: use swap macro in ttm_bo_handle_move_mem
Make use of the swap macro and remove unnecessary variable *tmp_mem*.
This makes the code easier to read and maintain. Also, reduces the
stack usage.

This code was detected with the help of Coccinelle.

Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-07-10 14:18:08 -05:00
Dave Airlie
565c17b5f0 Merge branch 'drm-next-4.19' of git://people.freedesktop.org/~agd5f/linux into drm-next
First feature request for 4.19.  Highlights:
- Add initial amdgpu documentation
- Add initial GPU scheduler documention
- GPU scheduler fixes for dying processes
- Add support for the JPEG engine on VCN
- Switch CI to use powerplay by default
- EDC support for CZ
- More powerplay cleanups
- Misc DC fixes

Signed-off-by: Dave Airlie <airlied@redhat.com>

Link: https://patchwork.freedesktop.org/patch/msgid/20180621161138.3008-1-alexander.deucher@amd.com
2018-06-22 13:19:05 +10:00
Souptick Joarder
4daa4fba3a gpu: drm: ttm: Adding new return type vm_fault_t
Use new return type vm_fault_t for fault handler. For
now, this is just documenting that the function returns
a VM_FAULT value rather than an errno. Once all instances
are converted, vm_fault_t will become a distinct type.

Ref-> commit 1c8f422059 ("mm: change return type to vm_fault_t")

Previously vm_insert_{mixed,pfn} returns err which driver
mapped into VM_FAULT_* type. The new function
vmf_insert_{mixed,pfn} will replace this inefficiency by
returning VM_FAULT_* type.

Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-06-19 13:17:38 -05:00
Kees Cook
6da2ec5605 treewide: kmalloc() -> kmalloc_array()
The kmalloc() function has a 2-factor argument form, kmalloc_array(). This
patch replaces cases of:

        kmalloc(a * b, gfp)

with:
        kmalloc_array(a * b, gfp)

as well as handling cases of:

        kmalloc(a * b * c, gfp)

with:

        kmalloc(array3_size(a, b, c), gfp)

as it's slightly less ugly than:

        kmalloc_array(array_size(a, b), c, gfp)

This does, however, attempt to ignore constant size factors like:

        kmalloc(4 * 1024, gfp)

though any constants defined via macros get caught up in the conversion.

Any factors with a sizeof() of "unsigned char", "char", and "u8" were
dropped, since they're redundant.

The tools/ directory was manually excluded, since it has its own
implementation of kmalloc().

The Coccinelle script used for this was:

// Fix redundant parens around sizeof().
@@
type TYPE;
expression THING, E;
@@

(
  kmalloc(
-	(sizeof(TYPE)) * E
+	sizeof(TYPE) * E
  , ...)
|
  kmalloc(
-	(sizeof(THING)) * E
+	sizeof(THING) * E
  , ...)
)

// Drop single-byte sizes and redundant parens.
@@
expression COUNT;
typedef u8;
typedef __u8;
@@

(
  kmalloc(
-	sizeof(u8) * (COUNT)
+	COUNT
  , ...)
|
  kmalloc(
-	sizeof(__u8) * (COUNT)
+	COUNT
  , ...)
|
  kmalloc(
-	sizeof(char) * (COUNT)
+	COUNT
  , ...)
|
  kmalloc(
-	sizeof(unsigned char) * (COUNT)
+	COUNT
  , ...)
|
  kmalloc(
-	sizeof(u8) * COUNT
+	COUNT
  , ...)
|
  kmalloc(
-	sizeof(__u8) * COUNT
+	COUNT
  , ...)
|
  kmalloc(
-	sizeof(char) * COUNT
+	COUNT
  , ...)
|
  kmalloc(
-	sizeof(unsigned char) * COUNT
+	COUNT
  , ...)
)

// 2-factor product with sizeof(type/expression) and identifier or constant.
@@
type TYPE;
expression THING;
identifier COUNT_ID;
constant COUNT_CONST;
@@

(
- kmalloc
+ kmalloc_array
  (
-	sizeof(TYPE) * (COUNT_ID)
+	COUNT_ID, sizeof(TYPE)
  , ...)
|
- kmalloc
+ kmalloc_array
  (
-	sizeof(TYPE) * COUNT_ID
+	COUNT_ID, sizeof(TYPE)
  , ...)
|
- kmalloc
+ kmalloc_array
  (
-	sizeof(TYPE) * (COUNT_CONST)
+	COUNT_CONST, sizeof(TYPE)
  , ...)
|
- kmalloc
+ kmalloc_array
  (
-	sizeof(TYPE) * COUNT_CONST
+	COUNT_CONST, sizeof(TYPE)
  , ...)
|
- kmalloc
+ kmalloc_array
  (
-	sizeof(THING) * (COUNT_ID)
+	COUNT_ID, sizeof(THING)
  , ...)
|
- kmalloc
+ kmalloc_array
  (
-	sizeof(THING) * COUNT_ID
+	COUNT_ID, sizeof(THING)
  , ...)
|
- kmalloc
+ kmalloc_array
  (
-	sizeof(THING) * (COUNT_CONST)
+	COUNT_CONST, sizeof(THING)
  , ...)
|
- kmalloc
+ kmalloc_array
  (
-	sizeof(THING) * COUNT_CONST
+	COUNT_CONST, sizeof(THING)
  , ...)
)

// 2-factor product, only identifiers.
@@
identifier SIZE, COUNT;
@@

- kmalloc
+ kmalloc_array
  (
-	SIZE * COUNT
+	COUNT, SIZE
  , ...)

// 3-factor product with 1 sizeof(type) or sizeof(expression), with
// redundant parens removed.
@@
expression THING;
identifier STRIDE, COUNT;
type TYPE;
@@

(
  kmalloc(
-	sizeof(TYPE) * (COUNT) * (STRIDE)
+	array3_size(COUNT, STRIDE, sizeof(TYPE))
  , ...)
|
  kmalloc(
-	sizeof(TYPE) * (COUNT) * STRIDE
+	array3_size(COUNT, STRIDE, sizeof(TYPE))
  , ...)
|
  kmalloc(
-	sizeof(TYPE) * COUNT * (STRIDE)
+	array3_size(COUNT, STRIDE, sizeof(TYPE))
  , ...)
|
  kmalloc(
-	sizeof(TYPE) * COUNT * STRIDE
+	array3_size(COUNT, STRIDE, sizeof(TYPE))
  , ...)
|
  kmalloc(
-	sizeof(THING) * (COUNT) * (STRIDE)
+	array3_size(COUNT, STRIDE, sizeof(THING))
  , ...)
|
  kmalloc(
-	sizeof(THING) * (COUNT) * STRIDE
+	array3_size(COUNT, STRIDE, sizeof(THING))
  , ...)
|
  kmalloc(
-	sizeof(THING) * COUNT * (STRIDE)
+	array3_size(COUNT, STRIDE, sizeof(THING))
  , ...)
|
  kmalloc(
-	sizeof(THING) * COUNT * STRIDE
+	array3_size(COUNT, STRIDE, sizeof(THING))
  , ...)
)

// 3-factor product with 2 sizeof(variable), with redundant parens removed.
@@
expression THING1, THING2;
identifier COUNT;
type TYPE1, TYPE2;
@@

(
  kmalloc(
-	sizeof(TYPE1) * sizeof(TYPE2) * COUNT
+	array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2))
  , ...)
|
  kmalloc(
-	sizeof(TYPE1) * sizeof(THING2) * (COUNT)
+	array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2))
  , ...)
|
  kmalloc(
-	sizeof(THING1) * sizeof(THING2) * COUNT
+	array3_size(COUNT, sizeof(THING1), sizeof(THING2))
  , ...)
|
  kmalloc(
-	sizeof(THING1) * sizeof(THING2) * (COUNT)
+	array3_size(COUNT, sizeof(THING1), sizeof(THING2))
  , ...)
|
  kmalloc(
-	sizeof(TYPE1) * sizeof(THING2) * COUNT
+	array3_size(COUNT, sizeof(TYPE1), sizeof(THING2))
  , ...)
|
  kmalloc(
-	sizeof(TYPE1) * sizeof(THING2) * (COUNT)
+	array3_size(COUNT, sizeof(TYPE1), sizeof(THING2))
  , ...)
)

// 3-factor product, only identifiers, with redundant parens removed.
@@
identifier STRIDE, SIZE, COUNT;
@@

(
  kmalloc(
-	(COUNT) * STRIDE * SIZE
+	array3_size(COUNT, STRIDE, SIZE)
  , ...)
|
  kmalloc(
-	COUNT * (STRIDE) * SIZE
+	array3_size(COUNT, STRIDE, SIZE)
  , ...)
|
  kmalloc(
-	COUNT * STRIDE * (SIZE)
+	array3_size(COUNT, STRIDE, SIZE)
  , ...)
|
  kmalloc(
-	(COUNT) * (STRIDE) * SIZE
+	array3_size(COUNT, STRIDE, SIZE)
  , ...)
|
  kmalloc(
-	COUNT * (STRIDE) * (SIZE)
+	array3_size(COUNT, STRIDE, SIZE)
  , ...)
|
  kmalloc(
-	(COUNT) * STRIDE * (SIZE)
+	array3_size(COUNT, STRIDE, SIZE)
  , ...)
|
  kmalloc(
-	(COUNT) * (STRIDE) * (SIZE)
+	array3_size(COUNT, STRIDE, SIZE)
  , ...)
|
  kmalloc(
-	COUNT * STRIDE * SIZE
+	array3_size(COUNT, STRIDE, SIZE)
  , ...)
)

// Any remaining multi-factor products, first at least 3-factor products,
// when they're not all constants...
@@
expression E1, E2, E3;
constant C1, C2, C3;
@@

(
  kmalloc(C1 * C2 * C3, ...)
|
  kmalloc(
-	(E1) * E2 * E3
+	array3_size(E1, E2, E3)
  , ...)
|
  kmalloc(
-	(E1) * (E2) * E3
+	array3_size(E1, E2, E3)
  , ...)
|
  kmalloc(
-	(E1) * (E2) * (E3)
+	array3_size(E1, E2, E3)
  , ...)
|
  kmalloc(
-	E1 * E2 * E3
+	array3_size(E1, E2, E3)
  , ...)
)

// And then all remaining 2 factors products when they're not all constants,
// keeping sizeof() as the second factor argument.
@@
expression THING, E1, E2;
type TYPE;
constant C1, C2, C3;
@@

(
  kmalloc(sizeof(THING) * C2, ...)
|
  kmalloc(sizeof(TYPE) * C2, ...)
|
  kmalloc(C1 * C2 * C3, ...)
|
  kmalloc(C1 * C2, ...)
|
- kmalloc
+ kmalloc_array
  (
-	sizeof(TYPE) * (E2)
+	E2, sizeof(TYPE)
  , ...)
|
- kmalloc
+ kmalloc_array
  (
-	sizeof(TYPE) * E2
+	E2, sizeof(TYPE)
  , ...)
|
- kmalloc
+ kmalloc_array
  (
-	sizeof(THING) * (E2)
+	E2, sizeof(THING)
  , ...)
|
- kmalloc
+ kmalloc_array
  (
-	sizeof(THING) * E2
+	E2, sizeof(THING)
  , ...)
|
- kmalloc
+ kmalloc_array
  (
-	(E1) * E2
+	E1, E2
  , ...)
|
- kmalloc
+ kmalloc_array
  (
-	(E1) * (E2)
+	E1, E2
  , ...)
|
- kmalloc
+ kmalloc_array
  (
-	E1 * E2
+	E1, E2
  , ...)
)

Signed-off-by: Kees Cook <keescook@chromium.org>
2018-06-12 16:19:22 -07:00
Dirk Hohndel
1297bf2e91 Add SPDX idenitifier and clarify license
This is dual licensed under GPL-2.0 or MIT.

Signed-off-by: Dirk Hohndel (VMware) <dirk@hohndel.org>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-05-15 13:44:27 -05:00
Junwei Zhang
967c650d49 drm/ttm: remove priority hard code when initializing ttm bo
Then priority could be set before initialization.
By default, it requires to kzalloc ttm bo. In fact, we always do so.

Signed-off-by: Junwei Zhang <Jerry.Zhang@amd.com>
Reviewed-by: David Zhou <david1.zhou@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-05-15 13:44:23 -05:00
Michel Dänzer
00862edba1 drm/ttm: Use GFP_TRANSHUGE_LIGHT for allocating huge pages
GFP_TRANSHUGE tries very hard to allocate huge pages, which can result
in long delays with high memory pressure. I have observed firefox
freezing for up to around a minute due to this while restic was taking
a full system backup.

Since we don't really need huge pages, use GFP_TRANSHUGE_LIGHT |
__GFP_NORETRY instead, in order to fail quickly when there are no huge
pages available.

Set __GFP_KSWAPD_RECLAIM as well, in order for huge pages to be freed
up in the background if necessary.

With these changes, I'm no longer seeing freezes during a restic backup.

Cc: stable@vger.kernel.org
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-05-15 13:44:17 -05:00
Christian König
5452cf44d6 drm/ttm: keep a reference to transfer pipelined BOs
Make sure the transfered BO is never destroy before the transfer BO.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Roger He <Hongbo.He@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-05-15 13:43:11 -05:00
Thomas Hellstrom
9c11fcf1a7 drm/ttm: Export the ttm_k[un]map_atomic_prot API.
It will be used by vmwgfx cpu blit.

Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
2018-03-22 11:10:06 +01:00
Thomas Hellstrom
403c1826a4 drm/ttm: Clean up kmap_atomic_prot selection code
Use helpers to perform the kmap_atomic_prot() functionality to
a) Avoid in-function ifdefs that violate the kernel coding policy,
b) Facilitate exporting the functionality.

This commit should not change any functionality.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
2018-03-22 11:09:37 +01:00
Christian König
536bbeba9b drm/ttm: move initializing ttm->sg into ttm_tt_init_fields
Better to set this with all other fields as well.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Roger He <Hongbo.He@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-03-14 14:38:28 -05:00
Christian König
dde5da2379 drm/ttm: add bo as parameter to the ttm_tt_create callback
Instead of calculating the size in bytes just to recalculate the number
of pages from it pass the BO directly to the function.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Roger He <Hongbo.He@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-03-14 14:38:27 -05:00
Christian König
5d95109815 drm/ttm: add ttm_bo_pipeline_gutting
Allows us to gut a BO of it's backing store when the driver says that it
isn't needed any more.

Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Roger He <Hongbo.He@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-03-14 14:38:27 -05:00
Christian König
75a57669cb drm/ttm: add ttm_sg_tt_init
This allows drivers to only allocate dma addresses, but not a page
array.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Roger He <Hongbo.He@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-03-14 14:38:24 -05:00
Christian König
81f5ec0255 drm/ttm: move ttm_tt defines into ttm_tt.h
Let's stop mangling everything in a single header and create one header
per object instead.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Roger He <Hongbo.He@amd.com>
Acked-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-03-14 14:38:24 -05:00
Christian König
45a9d154f6 drm/ttm: cleanup ttm_tt_create
Cleanup ttm_tt_create a bit.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Roger He <Hongbo.He@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-02-26 23:09:46 -05:00
Christian König
97b7e1b8b5 drm/ttm: move ttm_tt_create into ttm_tt.c v2
Rename ttm_bo_add_ttm to ttm_tt_create and move it into ttm_tt.c.

v2: separate the cleanup.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Roger He <Hongbo.He@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-02-26 23:09:46 -05:00
Roger He
ec3fe391bd drm/ttm: check if free mem space is under the lower limit
the free mem space and the lower limit both include two parts:
system memory and swap space.

For the OOM triggered by TTM, that is the case as below:
first swap space is full of swapped out pages and soon
system memory also is filled up with ttm pages. and then
any memory allocation request will run into OOM.

to cover two cases:
a. if no swap disk at all or free swap space is under swap mem
   limit but available system mem is bigger than sys mem limit,
   allow TTM allocation;

b. if the available system mem is less than sys mem limit but
   free swap space is bigger than swap mem limit, allow TTM
   allocation.

v2: merge two memory limit(swap and system) into one
v3: keep original behavior except ttm_opt_ctx->flags with
    TTM_OPT_FLAG_FORCE_ALLOC
v4: always set force_alloc as tx->flags & TTM_OPT_FLAG_FORCE_ALLOC
v5: add an attribute for lower_mem_limit
v6: set lower_mem_limit as 0 to keep original behavior

Signed-off-by: Roger He <Hongbo.He@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-02-26 23:09:46 -05:00
Christian König
724daa4fd6 drm/ttm: drop persistent_swap_storage from ttm_bo_init and co
Never used as parameter, the only driver actually using this is nouveau
and there it is initialized after the BO is initialized.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-02-26 23:09:45 -05:00
Christian König
231cdafc75 drm/ttm: drop ttm->dummy_read_page
Only used by the AGP backend and there it can be easily accessed using
ttm->bdev->glob.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-02-26 23:09:45 -05:00
Christian König
3231a7696e drm/ttm: drop ttm->glob
The pointer is available as ttm->bdev->glob as well.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-02-26 23:09:45 -05:00
Christian König
3839263362 drm/ttm: drop bo->glob
The pointer is available as bo->bdev->glob as well.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-02-26 23:09:44 -05:00
Christian König
e44fcf71f4 drm/ttm: add default implementations for ttm_tt_(un)populate
Use ttm_pool_populate/ttm_pool_unpopulate if the driver doesn't provide
a function.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-02-26 23:09:41 -05:00
Roger He
40d5250dbb drm/ttm: set TTM_OPT_FLAG_FORCE_ALLOC in ttm_bo_force_list_clean
Because ttm_bo_force_list_clean() is only called on two occasions:
1. By ttm_bo_evict_mm() during suspend.
2. By ttm_bo_clean_mm() when the driver unloads.
On both cases we absolutely don't want any memory allocation failure.

Signed-off-by: Roger He <Hongbo.He@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-02-26 23:09:34 -05:00
Roger He
aa7662b67b drm/ttm: add bit flag TTM_OPT_FLAG_FORCE_ALLOC
set TTM_OPT_FLAG_FORCE_ALLOC when we are servicing for page
fault routine.

for ttm_mem_global_reserve if in page fault routine, allow the gtt
pages reservation always. because page fault routing already grabbed
system memory and the allowance of this exception is harmless.
Otherwise, it will trigger OOM killer.

will be used later.

v2: set the FORCE_ALLOC always
v3: minor refine

Signed-off-by: Roger He <Hongbo.He@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-02-26 23:09:34 -05:00
Roger He
d330fca115 drm/ttm: use bit flag to replace allow_reserved_eviction in ttm_operation_ctx
for saving memory and more bit flag can be used in future

Signed-off-by: Roger He <Hongbo.He@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-02-26 23:09:34 -05:00
Christian König
ec92937056 drm/ttm: set page mapping during allocation
To aid debugging set the page mapping during allocation instead of
during VM faults.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Roger He <Hongbo.He@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-02-26 23:09:33 -05:00
Christian König
25893a14c9 drm/ttm: add ttm_tt_populate wrapper
Stop calling the driver callback directly.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Roger He <Hongbo.He@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-02-19 14:19:52 -05:00
Tom St Denis
7ca34ddc78 drm/ttm: Simplify ttm_dma_page_put()
Remove redundant store of return code.

Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-02-19 14:19:02 -05:00
Tom St Denis
42bdbb6e0f drm/ttm: Fix coding style in ttm_dma_pool_alloc_new_pages()
Add missing {} braces.

Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-02-19 14:19:01 -05:00
Tom St Denis
ddde985c41 drm/ttm: Fix coding style in ttm_tt_swapout()
Add missing {} braces.

Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-02-19 14:19:01 -05:00
Tom St Denis
07d48da4f4 drm/ttm: Simplify ttm_eu_reserve_buffers()
Hoist the comparison of the ret to -EDEADLK above
the two code paths to simplify the function.

Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-02-19 14:19:00 -05:00
Tom St Denis
de8dfb8e34 drm/ttm: Remove unncessary retval from ttm_bo_vm_fault()
The dual ret/retval was more complex than need be.  Now
we drop the retval variable and assign the appropriate VM
codes to ret instead.

Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-02-19 14:18:59 -05:00
Tom St Denis
449f797a94 drm/ttm: Fix coding style in ttm_bo_move_memcpy()
Add missing {} braces.

Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-02-19 14:18:59 -05:00
Tom St Denis
add3d95d73 drm/ttm: Simplify ttm_dma_find_pool() (v2)
Flip the logic of the comparison and remove
the redudant variable for the pool address.

(v2): Remove {} bracing.

Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-02-19 14:18:58 -05:00
Tom St Denis
c68edaa0c1 drm/ttm: Fix coding style in ttm_pool_store()
Correct missing {} style.

Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-02-19 14:18:58 -05:00
Tom St Denis
5b4262d7a2 drm/ttm: Change ttm_tt page allocations to return errors
Explicitly return errors in ttm_tt_alloc_page_directory() and
ttm_dma_tt_alloc_page_directory() instead of relying on
further logic to detect errors.

Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-02-19 14:18:57 -05:00
Tom St Denis
420457acfb drm/ttm: Add a default BO destructor to simplify code (v2)
(v2): Remove stray ; noticed by Felix

Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-02-19 14:18:56 -05:00
Tom St Denis
43c7c41b25 drm/ttm: Fix coding style in ttm_bo.c
Correct indentation and {} brace style.

Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-02-19 14:18:56 -05:00