mirror of
https://github.com/raspberrypi/linux.git
synced 2025-12-23 02:04:02 +00:00
Pull MM updates from Andrew Morton:
- The series "zram: optimal post-processing target selection" from
Sergey Senozhatsky improves zram's post-processing selection
algorithm. This leads to improved memory savings.
- Wei Yang has gone to town on the mapletree code, contributing several
series which clean up the implementation:
- "refine mas_mab_cp()"
- "Reduce the space to be cleared for maple_big_node"
- "maple_tree: simplify mas_push_node()"
- "Following cleanup after introduce mas_wr_store_type()"
- "refine storing null"
- The series "selftests/mm: hugetlb_fault_after_madv improvements" from
David Hildenbrand fixes this selftest for s390.
- The series "introduce pte_offset_map_{ro|rw}_nolock()" from Qi Zheng
implements some rationalizations and cleanups in the page mapping
code.
- The series "mm: optimize shadow entries removal" from Shakeel Butt
optimizes the file truncation code by speeding up the handling of
shadow entries.
- The series "Remove PageKsm()" from Matthew Wilcox completes the
migration of this flag over to being a folio-based flag.
- The series "Unify hugetlb into arch_get_unmapped_area functions" from
Oscar Salvador implements a bunch of consolidations and cleanups in
the hugetlb code.
- The series "Do not shatter hugezeropage on wp-fault" from Dev Jain
takes away the wp-fault time practice of turning a huge zero page
into small pages. Instead we replace the whole thing with a THP. This is
more consistent and cleaner, and potentially saves a large number of page faults.
- The series "percpu: Add a test case and fix for clang" from Andy
Shevchenko enhances and fixes the kernel's built-in percpu test code.
- The series "mm/mremap: Remove extra vma tree walk" from Liam Howlett
optimizes mremap() by avoiding doing things which we didn't need to
do.
- The series "Improve the tmpfs large folio read performance" from
Baolin Wang teaches tmpfs to copy data into userspace at the folio
size rather than as individual pages. A 20% speedup was observed.
- The series "mm/damon/vaddr: Fix issue in
damon_va_evenly_split_region()" from Zheng Yejian fixes DAMON
splitting.
- The series "memcg-v1: fully deprecate charge moving" from Shakeel
Butt removes the long-deprecated memcg-v1 charge moving feature.
- The series "fix error handling in mmap_region() and refactor" from
Lorenzo Stoakes cleans up some of the mmap() error handling and
addresses some potential performance issues.
- The series "x86/module: use large ROX pages for text allocations"
from Mike Rapoport teaches x86 to use large pages for
read-only-execute module text.
- The series "page allocation tag compression" from Suren Baghdasaryan
is follow-on maintenance work for the new page allocation profiling
feature.
- The series "page->index removals in mm" from Matthew Wilcox removes
most references to page->index in mm/. A slow march towards shrinking
struct page.
- The series "damon/{self,kunit}tests: minor fixups for DAMON debugfs
interface tests" from Andrew Paniakin performs maintenance work for
DAMON's self testing code.
- The series "mm: zswap swap-out of large folios" from Kanchana Sridhar
improves zswap's batching of compression and decompression. It is a
step along the way towards using Intel IAA hardware acceleration for
this zswap operation.
- The series "kasan: migrate the last module test to kunit" from
Sabyrzhan Tasbolatov completes the migration of the KASAN built-in
tests over to the KUnit framework.
- The series "implement lightweight guard pages" from Lorenzo Stoakes
permits userspace to place fault-generating guard pages within a
single VMA, rather than requiring that multiple VMAs be created for
this. Improved efficiencies for userspace memory allocators are
expected.
- The series "memcg: tracepoint for flushing stats" from JP Kobryn uses
tracepoints to provide increased visibility into memcg stats flushing
activity.
- The series "zram: IDLE flag handling fixes" from Sergey Senozhatsky
fixes a zram buglet which potentially affected performance.
- The series "mm: add more kernel parameters to control mTHP" from
Maíra Canal enhances our ability to control/configure multisize THP
from the kernel boot command line.
- The series "kasan: few improvements on kunit tests" from Sabyrzhan
Tasbolatov has a couple of fixups for the KASAN KUnit tests.
- The series "mm/list_lru: Split list_lru lock into per-cgroup scope"
from Kairui Song optimizes list_lru memory utilization when lockdep
is enabled.
* tag 'mm-stable-2024-11-18-19-27' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (215 commits)
cma: enforce non-zero pageblock_order during cma_init_reserved_mem()
mm/kfence: add a new kunit test test_use_after_free_read_nofault()
zram: fix NULL pointer in comp_algorithm_show()
memcg/hugetlb: add hugeTLB counters to memcg
vmstat: call fold_vm_zone_numa_events() before show per zone NUMA event
mm: mmap_lock: check trace_mmap_lock_$type_enabled() instead of regcount
zram: ZRAM_DEF_COMP should depend on ZRAM
MAINTAINERS/MEMORY MANAGEMENT: add document files for mm
Docs/mm/damon: recommend academic papers to read and/or cite
mm: define general function pXd_init()
kmemleak: iommu/iova: fix transient kmemleak false positive
mm/list_lru: simplify the list_lru walk callback function
mm/list_lru: split the lock to per-cgroup scope
mm/list_lru: simplify reparenting and initial allocation
mm/list_lru: code clean up for reparenting
mm/list_lru: don't export list_lru_add
mm/list_lru: don't pass unnecessary key parameters
kasan: add kunit tests for kmalloc_track_caller, kmalloc_node_track_caller
kasan: change kasan_atomics kunit test as KUNIT_CASE_SLOW
kasan: use EXPORT_SYMBOL_IF_KUNIT to export symbols
...
235 lines
5.8 KiB
C
// SPDX-License-Identifier: GPL-2.0
/*
 * Copyright (C) 2013 Linaro Limited
 * Author: AKASHI Takahiro <takahiro.akashi@linaro.org>
 * Copyright (C) 2017 Andes Technology Corporation
 */

#include <linux/ftrace.h>
#include <linux/uaccess.h>
#include <linux/memory.h>
#include <linux/stop_machine.h>
#include <asm/cacheflush.h>
#include <asm/text-patching.h>

#ifdef CONFIG_DYNAMIC_FTRACE
void ftrace_arch_code_modify_prepare(void) __acquires(&text_mutex)
{
	mutex_lock(&text_mutex);

	/*
	 * The code sequences we use for ftrace can't be patched while the
	 * kernel is running, so we need to use stop_machine() to modify them
	 * for now. This doesn't play nice with text_mutex, we use this flag
	 * to elide the check.
	 */
	riscv_patch_in_stop_machine = true;
}

void ftrace_arch_code_modify_post_process(void) __releases(&text_mutex)
{
	riscv_patch_in_stop_machine = false;
	mutex_unlock(&text_mutex);
}

static int ftrace_check_current_call(unsigned long hook_pos,
				     unsigned int *expected)
{
	unsigned int replaced[2];
	unsigned int nops[2] = {NOP4, NOP4};

	/* we expect nops at the hook position */
	if (!expected)
		expected = nops;

	/*
	 * Read the text we want to modify;
	 * return must be -EFAULT on read error
	 */
	if (copy_from_kernel_nofault(replaced, (void *)hook_pos,
				     MCOUNT_INSN_SIZE))
		return -EFAULT;

	/*
	 * Make sure it is what we expect it to be;
	 * return must be -EINVAL on failed comparison
	 */
	if (memcmp(expected, replaced, sizeof(replaced))) {
		pr_err("%p: expected (%08x %08x) but got (%08x %08x)\n",
		       (void *)hook_pos, expected[0], expected[1], replaced[0],
		       replaced[1]);
		return -EINVAL;
	}

	return 0;
}

static int __ftrace_modify_call(unsigned long hook_pos, unsigned long target,
				bool enable, bool ra)
{
	unsigned int call[2];
	unsigned int nops[2] = {NOP4, NOP4};

	if (ra)
		make_call_ra(hook_pos, target, call);
	else
		make_call_t0(hook_pos, target, call);

	/* Replace the auipc-jalr pair at once. Return -EPERM on write error. */
	if (patch_insn_write((void *)hook_pos, enable ? call : nops, MCOUNT_INSN_SIZE))
		return -EPERM;

	return 0;
}

int ftrace_make_call(struct dyn_ftrace *rec, unsigned long addr)
{
	unsigned int call[2];

	make_call_t0(rec->ip, addr, call);

	if (patch_insn_write((void *)rec->ip, call, MCOUNT_INSN_SIZE))
		return -EPERM;

	return 0;
}

int ftrace_make_nop(struct module *mod, struct dyn_ftrace *rec,
		    unsigned long addr)
{
	unsigned int nops[2] = {NOP4, NOP4};

	if (patch_insn_write((void *)rec->ip, nops, MCOUNT_INSN_SIZE))
		return -EPERM;

	return 0;
}

/*
 * This is called early on, and isn't wrapped by
 * ftrace_arch_code_modify_{prepare,post_process}() and therefore doesn't hold
 * text_mutex, which triggers a lockdep failure. SMP isn't running so we could
 * just directly poke the text, but it's simpler to just take the lock
 * ourselves.
 */
int ftrace_init_nop(struct module *mod, struct dyn_ftrace *rec)
{
	int out;

	mutex_lock(&text_mutex);
	out = ftrace_make_nop(mod, rec, MCOUNT_ADDR);
	mutex_unlock(&text_mutex);

	return out;
}

int ftrace_update_ftrace_func(ftrace_func_t func)
{
	int ret = __ftrace_modify_call((unsigned long)&ftrace_call,
				       (unsigned long)func, true, true);

	return ret;
}

struct ftrace_modify_param {
	int command;
	atomic_t cpu_count;
};

static int __ftrace_modify_code(void *data)
{
	struct ftrace_modify_param *param = data;

	if (atomic_inc_return(&param->cpu_count) == num_online_cpus()) {
		ftrace_modify_all_code(param->command);
		/*
		 * Make sure the patching store is effective *before* we
		 * increment the counter which releases all waiting CPUs
		 * by using the release variant of atomic increment. The
		 * release pairs with the call to local_flush_icache_all()
		 * on the waiting CPU.
		 */
		atomic_inc_return_release(&param->cpu_count);
	} else {
		while (atomic_read(&param->cpu_count) <= num_online_cpus())
			cpu_relax();

		local_flush_icache_all();
	}

	return 0;
}

void arch_ftrace_update_code(int command)
{
	struct ftrace_modify_param param = { command, ATOMIC_INIT(0) };

	stop_machine(__ftrace_modify_code, &param, cpu_online_mask);
}
#endif

#ifdef CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
int ftrace_modify_call(struct dyn_ftrace *rec, unsigned long old_addr,
		       unsigned long addr)
{
	unsigned int call[2];
	unsigned long caller = rec->ip;
	int ret;

	make_call_t0(caller, old_addr, call);
	ret = ftrace_check_current_call(caller, call);

	if (ret)
		return ret;

	return __ftrace_modify_call(caller, addr, true, false);
}
#endif

#ifdef CONFIG_FUNCTION_GRAPH_TRACER
/*
 * Most of this function is copied from arm64.
 */
void prepare_ftrace_return(unsigned long *parent, unsigned long self_addr,
			   unsigned long frame_pointer)
{
	unsigned long return_hooker = (unsigned long)&return_to_handler;
	unsigned long old;

	if (unlikely(atomic_read(&current->tracing_graph_pause)))
		return;

	/*
	 * We don't suffer access faults, so no extra fault-recovery assembly
	 * is needed here.
	 */
	old = *parent;

	if (!function_graph_enter(old, self_addr, frame_pointer, parent))
		*parent = return_hooker;
}

#ifdef CONFIG_DYNAMIC_FTRACE
#ifdef CONFIG_DYNAMIC_FTRACE_WITH_ARGS
void ftrace_graph_func(unsigned long ip, unsigned long parent_ip,
		       struct ftrace_ops *op, struct ftrace_regs *fregs)
{
	prepare_ftrace_return(&arch_ftrace_regs(fregs)->ra, ip, arch_ftrace_regs(fregs)->s0);
}
#else /* CONFIG_DYNAMIC_FTRACE_WITH_ARGS */
extern void ftrace_graph_call(void);
int ftrace_enable_ftrace_graph_caller(void)
{
	return __ftrace_modify_call((unsigned long)&ftrace_graph_call,
				    (unsigned long)&prepare_ftrace_return, true, true);
}

int ftrace_disable_ftrace_graph_caller(void)
{
	return __ftrace_modify_call((unsigned long)&ftrace_graph_call,
				    (unsigned long)&prepare_ftrace_return, false, true);
}
#endif /* CONFIG_DYNAMIC_FTRACE_WITH_ARGS */
#endif /* CONFIG_DYNAMIC_FTRACE */
#endif /* CONFIG_FUNCTION_GRAPH_TRACER */