Commit Graph

46384 Commits

Author SHA1 Message Date
Kohei Enju
c03bb2fa32 bpf: Fix out-of-bounds read in check_atomic_load/store()
syzbot reported the following splat [0].

In check_atomic_load/store(), register validity is not checked before
atomic_ptr_type_ok(). This causes the out-of-bounds read in is_ctx_reg()
called from atomic_ptr_type_ok() when the register number is MAX_BPF_REG
or greater.

Call check_load_mem()/check_store_reg() before atomic_ptr_type_ok()
to avoid the OOB read.
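
The ordering issue can be sketched in isolation. This is a toy model, not the verifier code: TOY_MAX_REG, struct toy_env and the helper names are hypothetical stand-ins, and the only point is that the range check has to run before anything indexes the register array.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_MAX_REG 11 /* hypothetical stand-in for MAX_BPF_REG */

enum toy_reg_type { TOY_NOT_INIT, TOY_SCALAR, TOY_PTR_TO_CTX };

struct toy_env {
	enum toy_reg_type regs[TOY_MAX_REG];
};

/* Validity check: must run before any regs[] lookup. */
static int toy_check_reg(unsigned int regno)
{
	return regno < TOY_MAX_REG ? 0 : -1;
}

/* Indexes regs[] directly, so it is only safe on a validated regno. */
static bool toy_is_ctx_reg(const struct toy_env *env, unsigned int regno)
{
	return env->regs[regno] == TOY_PTR_TO_CTX;
}

/* Returns 0 if the access is accepted, a negative value if rejected. */
static int toy_check_atomic(const struct toy_env *env, unsigned int regno)
{
	if (toy_check_reg(regno))
		return -1; /* bad register number, regs[] never touched */
	if (toy_is_ctx_reg(env, regno))
		return -2; /* pointer type not allowed for atomics */
	return 0;
}
```

With the two checks swapped, toy_is_ctx_reg() would read past the end of regs[] for any regno >= TOY_MAX_REG, which is the shape of the bug described above.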

However, some tests introduced by commit ff3afe5da9 ("selftests/bpf: Add
selftests for load-acquire and store-release instructions") assume
calling atomic_ptr_type_ok() before checking register validity.
Therefore the swapping of order unintentionally changes verifier messages
of these tests.

For example in the test load_acquire_from_pkt_pointer(), expected message
is 'BPF_ATOMIC loads from R2 pkt is not allowed' although actual messages
are different.

  validate_msgs:FAIL:754 expect_msg
  VERIFIER LOG:
  =============
  Global function load_acquire_from_pkt_pointer() doesn't return scalar. Only those are supported.
  0: R1=ctx() R10=fp0
  ; asm volatile ( @ verifier_load_acquire.c:140
  0: (61) r2 = *(u32 *)(r1 +0)          ; R1=ctx() R2_w=pkt(r=0)
  1: (d3) r0 = load_acquire((u8 *)(r2 +0))
  invalid access to packet, off=0 size=1, R2(id=0,off=0,r=0)
  R2 offset is outside of the packet
  processed 2 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0
  =============
  EXPECTED   SUBSTR: 'BPF_ATOMIC loads from R2 pkt is not allowed'
  #505/19  verifier_load_acquire/load-acquire from pkt pointer:FAIL

This is because instructions in the test don't pass check_load_mem() and
therefore don't enter the atomic_ptr_type_ok() path.
In this case, we have to modify instructions so that they pass the
check_load_mem() and trigger atomic_ptr_type_ok().
Similarly for store-release tests, we need to modify instructions so that
they pass check_store_reg().

Like load_acquire_from_pkt_pointer(), modify instructions in:
  load_acquire_from_sock_pointer()
  store_release_to_ctx_pointer()
  store_release_to_pkt_pointer()

Also in store_release_to_sock_pointer(), check_store_reg() returns error
early and atomic_ptr_type_ok() is not triggered, since write to sock
pointer is not possible in general.
We might be able to remove the test, but for now let's leave it and just
change the expected message.

[0]
 BUG: KASAN: slab-out-of-bounds in is_ctx_reg kernel/bpf/verifier.c:6185 [inline]
 BUG: KASAN: slab-out-of-bounds in atomic_ptr_type_ok+0x3d7/0x550 kernel/bpf/verifier.c:6223
 Read of size 4 at addr ffff888141b0d690 by task syz-executor143/5842

 CPU: 1 UID: 0 PID: 5842 Comm: syz-executor143 Not tainted 6.14.0-rc3-syzkaller-gf28214603dc6 #0
 Call Trace:
  <TASK>
  __dump_stack lib/dump_stack.c:94 [inline]
  dump_stack_lvl+0x241/0x360 lib/dump_stack.c:120
  print_address_description mm/kasan/report.c:408 [inline]
  print_report+0x16e/0x5b0 mm/kasan/report.c:521
  kasan_report+0x143/0x180 mm/kasan/report.c:634
  is_ctx_reg kernel/bpf/verifier.c:6185 [inline]
  atomic_ptr_type_ok+0x3d7/0x550 kernel/bpf/verifier.c:6223
  check_atomic_store kernel/bpf/verifier.c:7804 [inline]
  check_atomic kernel/bpf/verifier.c:7841 [inline]
  do_check+0x89dd/0xedd0 kernel/bpf/verifier.c:19334
  do_check_common+0x1678/0x2080 kernel/bpf/verifier.c:22600
  do_check_main kernel/bpf/verifier.c:22691 [inline]
  bpf_check+0x165c8/0x1cca0 kernel/bpf/verifier.c:23821

Reported-by: syzbot+a5964227adc0f904549c@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=a5964227adc0f904549c
Tested-by: syzbot+a5964227adc0f904549c@syzkaller.appspotmail.com
Fixes: e24bbad29a8d ("bpf: Introduce load-acquire and store-release instructions")
Fixes: ff3afe5da9 ("selftests/bpf: Add selftests for load-acquire and store-release instructions")
Signed-off-by: Kohei Enju <enjuk@amazon.com>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20250322045340.18010-5-enjuk@amazon.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-03-22 06:15:57 -07:00
Ryan Roberts
a2c6f9c3ca selftests/mm: speed up split_huge_page_test
create_pagecache_thp_and_fd() was previously writing a file sized at twice
the PMD size by making a per-byte write syscall.  This was quite slow when
the PMD size is 4M, but completely intolerable for 32M (PMD size for
arm64's 16K page size), and 512M (PMD size for arm64's 64K page size).

The byte pattern has a 256 byte period, so let's create a 1K buffer and
fill it with exactly 4 periods.  Then we can write the buffer as many
times as is required to fill the file.  This makes things much more
tolerable.
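
A minimal sketch of the buffered fill, assuming the pattern is byte[i] = i % 256 (the message only states the 256 byte period). fill_with_pattern() is a hypothetical name, not the helper in the test.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

#define PATTERN_BUF_SIZE 1024 /* exactly four 256-byte periods */

/* Write `size` pattern bytes to fd. Returns 0 on success, -1 on failure. */
static int fill_with_pattern(int fd, size_t size)
{
	unsigned char buf[PATTERN_BUF_SIZE];
	size_t done = 0;

	for (size_t i = 0; i < sizeof(buf); i++)
		buf[i] = (unsigned char)i; /* wraps every 256 bytes */

	while (done < size) {
		/* Resume at the right phase after a short write. */
		size_t off = done % sizeof(buf);
		size_t chunk = sizeof(buf) - off;
		ssize_t n;

		if (chunk > size - done)
			chunk = size - done;
		n = write(fd, buf + off, chunk);
		if (n <= 0)
			return -1;
		done += (size_t)n;
	}
	return 0;
}
```

Because the buffer length is a multiple of the period, repeating it keeps the pattern continuous across buffer boundaries, and the syscall count drops by roughly a factor of 1024.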

The test now passes for 16K page size.  It still fails for 64K page size
because MAX_PAGECACHE_ORDER is too small for 512M folio size (I think).

Link: https://lkml.kernel.org/r/20250318174343.243631-3-ryan.roberts@arm.com
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
Acked-by: Peter Xu <peterx@redhat.com>
Acked-by: Rafael Aquini <raquini@redhat.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-03-21 22:03:16 -07:00
Ryan Roberts
735b3f7e77 selftests/mm: uffd-unit-tests support for hugepages > 2M
uffd-unit-tests uses a memory area with a fixed 32M size.  Then it
calculates the number of pages by dividing by page_size, which itself is
either the base page size or the PMD huge page size depending on the test
config.  For the latter, we end up with nr_pages=1 for arm64 16K base
pages, and nr_pages=0 for 64K base pages.  This doesn't end well.

So let's make the 32M size a floor and also ensure that we have at least 2
pages given the PMD size.  With this change, the tests pass on arm64 64K
base page size configuration.
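
The arithmetic can be sketched as follows. TEST_MEM_FLOOR and test_nr_pages() are hypothetical names; the values come from the description above.

```c
#include <assert.h>
#include <stdint.h>

#define TEST_MEM_FLOOR (32ULL << 20) /* the 32M floor */

/* Number of pages in the test area for a given (huge) page size. */
static uint64_t test_nr_pages(uint64_t page_size)
{
	uint64_t mem_size = TEST_MEM_FLOOR;

	if (mem_size < 2 * page_size)
		mem_size = 2 * page_size; /* guarantee at least two pages */
	return mem_size / page_size;
}
```

For a 2M PMD this still gives 16 pages, while the 32M and 512M PMD sizes now give 2 pages instead of 1 and 0.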

Link: https://lkml.kernel.org/r/20250318174343.243631-2-ryan.roberts@arm.com
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
Acked-by: Peter Xu <peterx@redhat.com>
Acked-by: Rafael Aquini <raquini@redhat.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-03-21 22:03:15 -07:00
Brendan Jackman
d8a866c766 selftests/mm: add commentary about 9pfs bugs
As discussed here:

https://lore.kernel.org/lkml/Z9RRkL1hom48z3Tt@google.com/

This code could benefit from some more commentary.

To avoid needing to comment the same thing in multiple places (I guess
more of these SKIPs will need to be added over time, for now I am only
like 20% of the way through Project Run run_vmtests.sh Successfully), add
a dummy "skip tests for this specific reason" function that basically just
serves as a hook to hang comments on.

Link: https://lkml.kernel.org/r/20250317-9pfs-comments-v1-1-9ac96043e146@google.com
Signed-off-by: Brendan Jackman <jackmanb@google.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-03-21 22:03:14 -07:00
Ian Rogers
307ef667e9 libbpf: Add namespace for errstr making it libbpf_errstr
When statically linking, symbols can be replaced with those from other
statically linked libraries depending on the link order, and the hoped
for "multiple definition" error may not appear. To avoid conflicts it
is good practice to namespace symbols, so this change renames errstr to
libbpf_errstr. To avoid churn, a #define is used to turn uses of
errstr(err) into libbpf_errstr(err).
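
The rename-plus-macro pattern looks roughly like this. mylib_errstr() is a hypothetical library function built on strerror(), not the libbpf implementation.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/* Prefixed name: the only symbol the linker ever sees. */
const char *mylib_errstr(int err)
{
	return strerror(err < 0 ? -err : err);
}

/* Keeps existing call sites compiling without touching them. */
#define errstr(err) mylib_errstr(err)

/* An unchanged caller: this expands to mylib_errstr(-ENOENT). */
const char *describe_missing_file(void)
{
	return errstr(-ENOENT);
}
```

The macro only affects translation units that include the header defining it, so another library's errstr symbol can no longer silently win at link time.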

Fixes: 1633a83bf9 ("libbpf: Introduce errstr() for stringifying errno")
Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20250320222439.1350187-1-irogers@google.com
2025-03-21 13:44:54 -07:00
Ming Lei
ffde32a49a selftests: ublk: fix starting ublk device
First, the ublk char device node may not have been created by udev yet,
so wait until it can be opened or a timeout expires.

Second, delete the created ublk device if starting it fails; otherwise
the device becomes a zombie.
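
The wait for the device node can be sketched as a bounded retry loop. open_with_timeout() and the 10 ms polling step are assumptions for illustration, not the selftest code.

```c
#include <assert.h>
#include <fcntl.h>
#include <time.h>
#include <unistd.h>

/* Retry open() every 10 ms until it succeeds or timeout_ms has elapsed.
 * Returns the fd, or -1 on timeout. */
static int open_with_timeout(const char *path, int flags, int timeout_ms)
{
	const struct timespec step = { .tv_sec = 0, .tv_nsec = 10 * 1000 * 1000 };
	int waited_ms = 0;

	for (;;) {
		int fd = open(path, flags);

		if (fd >= 0)
			return fd;
		if (waited_ms >= timeout_ms)
			return -1;
		nanosleep(&step, NULL);
		waited_ms += 10;
	}
}
```

On a -1 return the caller would then delete the device it created, which is the second half of the fix.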

Signed-off-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20250321135324.259677-1-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-03-21 14:09:24 -06:00
John Stultz
e40d3709c0 selftests/timers: Improve skew_consistency by testing with other clockids
Lei Chen reported a bug with CLOCK_MONOTONIC_COARSE having inconsistencies
when NTP is adjusting the clock frequency.

This has gone seemingly undetected for ~15 years, illustrating a clear gap
in our testing.

The skew_consistency test is intended to catch this sort of problem, but
was focused on only evaluating CLOCK_MONOTONIC, and thus missed the problem
on CLOCK_MONOTONIC_COARSE.

So adjust the test to run with all clockids for 60 seconds each instead of
10 minutes with just CLOCK_MONOTONIC.

Reported-by: Lei Chen <lei.chen@smartx.com>
Signed-off-by: John Stultz <jstultz@google.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20250320200306.1712599-2-jstultz@google.com
Closes: https://lore.kernel.org/lkml/20250310030004.3705801-1-lei.chen@smartx.com/
2025-03-21 19:16:18 +01:00
Breno Leitao
4b73dc83ed selftests: netconsole: Add tests for 'release' feature in sysdata
Expand the selftests to cover the 'release' feature in
sysdata.

Verify that enabling the 'release' feature appends the
correct data and that disabling it works as expected.

When enabled, the message should contain an item similar to the
following in the userdata: `release=$(uname -r)`

Signed-off-by: Breno Leitao <leitao@debian.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250314-netcons_release-v1-5-07979c4b86af@debian.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-21 18:59:25 +01:00
Mickaël Salaün
15383a0d63 landlock: Add the errata interface
Some fixes may require user space to check if they are applied on the
running kernel before using a specific feature.  For instance, this
applies when a restriction was previously too restrictive and is now
getting relaxed (e.g. for compatibility reasons).  However, non-visible
changes for legitimate use (e.g. security fixes) do not require an
erratum.

Because fixes are backported down to a specific Landlock ABI, we need a
way to avoid cherry-pick conflicts.  The solution is to only update a
file related to the lower ABI impacted by this issue.  All the ABI files
are then used to create a bitmask of fixes.

The new errata interface is similar to the one used to get the supported
Landlock ABI version, but it returns a bitmask instead because the order
of fixes may not match the order of versions, and not all fixes may
apply to all versions.
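
How per-ABI entries could fold into one bitmask is sketched below. This is an illustration under stated assumptions, not the kernel code: struct erratum, both function names and the mapping of erratum N to bit N - 1 are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct erratum {
	int abi;    /* lowest ABI version the fix applies to (hypothetical) */
	int number; /* erratum number, starting at 1 (hypothetical) */
};

/* Assumption: erratum N maps to bit N - 1. Entries declared for an ABI
 * newer than the running one are skipped as inconsistent. */
static uint32_t build_errata_mask(const struct erratum *list, size_t n,
				  int running_abi)
{
	uint32_t mask = 0;

	for (size_t i = 0; i < n; i++) {
		if (list[i].abi > running_abi)
			continue;
		if (list[i].number < 1 || list[i].number > 32)
			continue;
		mask |= UINT32_C(1) << (list[i].number - 1);
	}
	return mask;
}

/* What user space would test before relying on a fixed behavior. */
static int erratum_fixed(uint32_t mask, int number)
{
	return number >= 1 && number <= 32 && ((mask >> (number - 1)) & 1);
}
```

A bitmask lets a backport advertise fix 3 without implying fixes 1 and 2, which a single version number cannot express.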

The actual errata will come with dedicated commits.  The description is
not actually used in the code but serves as documentation.

Create the landlock_abi_version symbol and use its value to check errata
consistency.

Update test_base's create_ruleset_checks_ordering tests and add errata
tests.

This commit is backportable down to the first version of Landlock.

Fixes: 3532b0b435 ("landlock: Enable user space to infer supported features")
Cc: Günther Noack <gnoack@google.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20250318161443.279194-3-mic@digikod.net
Signed-off-by: Mickaël Salaün <mic@digikod.net>
2025-03-21 12:12:19 +01:00
Eric Biggers
ca17aa6640 crypto: lib/chacha - remove unused arch-specific init support
All implementations of chacha_init_arch() just call
chacha_init_generic(), so it is pointless.  Just delete it, and replace
chacha_init() with what was previously chacha_init_generic().
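
For reference, a user-space sketch of what the generic initializer does: four constant words, eight key words, then the 16-byte IV loaded as four little-endian words. The types are adapted to stdint and load_le32() is a local helper, so this is not a copy of the kernel function.

```c
#include <assert.h>
#include <stdint.h>

static uint32_t load_le32(const uint8_t *p)
{
	return (uint32_t)p[0] | (uint32_t)p[1] << 8 |
	       (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}

/* key: eight 32-bit words; iv: 16 bytes. */
static void chacha_init(uint32_t state[16], const uint32_t key[8],
			const uint8_t iv[16])
{
	state[0] = 0x61707865; /* "expa" */
	state[1] = 0x3320646e; /* "nd 3" */
	state[2] = 0x79622d32; /* "2-by" */
	state[3] = 0x6b206574; /* "te k" */
	for (int i = 0; i < 8; i++)
		state[4 + i] = key[i];
	for (int i = 0; i < 4; i++)
		state[12 + i] = load_le32(iv + 4 * i);
}
```

Nothing here benefits from architecture-specific code, which is why a per-arch hook that only forwards to the generic version adds nothing.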

Signed-off-by: Eric Biggers <ebiggers@google.com>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-03-21 17:39:06 +08:00
Ilkka Koskinen
182f12f319 perf vendor events arm64 AmpereOneX: Fix frontend_bound calculation
frontend_bound metrics was miscalculated due to different scaling in
a couple of metrics it depends on. Change the scaling to match with
AmpereOne.

Fixes: 16438b652b ("perf vendor events arm64 AmpereOneX: Add core PMU events and metrics")
Signed-off-by: Ilkka Koskinen <ilkka@os.amperecomputing.com>
Reviewed-by: James Clark <james.clark@linaro.org>
Link: https://lore.kernel.org/r/20250313201559.11332-3-ilkka@os.amperecomputing.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-20 22:58:57 -07:00
Ilkka Koskinen
c0b60ce461 perf vendor events arm64: AmpereOne/AmpereOneX: Mark LD_RETIRED impacted by errata
Atomic instructions are both memory-reading and memory-writing
instructions and so should be counted by both LD_RETIRED and ST_RETIRED
performance monitoring events. However LD_RETIRED does not count atomic
instructions.

Signed-off-by: Ilkka Koskinen <ilkka@os.amperecomputing.com>
Reviewed-by: James Clark <james.clark@linaro.org>
Link: https://lore.kernel.org/r/20250313201559.11332-2-ilkka@os.amperecomputing.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-20 22:58:57 -07:00
Ian Rogers
7b172b92c1 perf trace: Fix evlist memory leak
Leak sanitizer was reporting a memory leak in the "perf record and
replay" test. Add evlist__delete to trace__exit, also ensure
trace__exit is called after trace__record.

Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Link: https://lore.kernel.org/r/20250319050741.269828-15-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-20 22:58:35 -07:00
Ian Rogers
874fa827df perf trace: Fix BTF memory leak
Add missing btf__free in trace__exit.

Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Link: https://lore.kernel.org/r/20250319050741.269828-14-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-20 22:58:31 -07:00
Ian Rogers
ccc60dce3e perf trace: Make syscall table stable
Namhyung fixed the syscall table being reallocated and moving by
reloading the system call pointer after a move:
https://lore.kernel.org/lkml/Z9YHCzINiu4uBQ8B@google.com/
This could be brittle, so this patch changes the syscall table to be an
array of pointers to "struct syscall" objects that don't move. Remove
unnecessary copies and searches with this change.
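
The stability argument can be shown with a small sketch. struct entry and struct table are hypothetical stand-ins for "struct syscall" and the table in builtin-trace; only the pointer array is reallocated, so pointers handed out earlier stay valid.

```c
#include <assert.h>
#include <stdlib.h>

struct entry {
	int id;
};

struct table {
	struct entry **items; /* array of pointers; this array may move */
	size_t nr;
};

/* Append an entry. Returns a pointer that stays valid until table_free(). */
static struct entry *table_add(struct table *t, int id)
{
	struct entry *e = malloc(sizeof(*e));
	struct entry **grown;

	if (!e)
		return NULL;
	grown = realloc(t->items, (t->nr + 1) * sizeof(*grown));
	if (!grown) {
		free(e);
		return NULL;
	}
	e->id = id;
	t->items = grown;
	t->items[t->nr++] = e;
	return e;
}

static void table_free(struct table *t)
{
	for (size_t i = 0; i < t->nr; i++)
		free(t->items[i]);
	free(t->items);
	t->items = NULL;
	t->nr = 0;
}
```

With an array of structs instead, every realloc() could move the entries and callers would have to reload their pointers, which is the brittleness mentioned above.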

Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Link: https://lore.kernel.org/r/20250319050741.269828-13-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-20 22:58:27 -07:00
Ian Rogers
95b802ca9d perf syscalltbl: Mask off ABI type for MIPS system calls
Arnd Bergmann described that MIPS system calls don't necessarily start
from 0 as an ABI prefix is applied:
https://lore.kernel.org/lkml/8ed7dfb2-1e4d-4aa4-a04b-0397a89365d1@app.fastmail.com/
When decoding the "id" (aka system call number) for MIPS, ignore values
greater than 1000.
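
One plausible reading of this, sketched as a helper: MIPS numbers its o32, n64 and n32 system calls from 4000, 5000 and 6000, so reducing the id modulo 1000 drops the ABI prefix. normalize_syscall_id() is a hypothetical name and the modulo is an assumption about how the masking is done.

```c
#include <assert.h>
#include <elf.h>

/* Strip the MIPS ABI prefix from a syscall number; other machines pass
 * through unchanged. */
static int normalize_syscall_id(int e_machine, int id)
{
	if (e_machine == EM_MIPS && id > 1000)
		id %= 1000;
	return id;
}
```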

Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Link: https://lore.kernel.org/r/20250319050741.269828-12-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-20 22:58:23 -07:00
Ian Rogers
16ab5c708d perf build: Remove Makefile.syscalls
Now a single beauty file is generated and used by all architectures,
remove the per-architecture Makefiles, Kbuild files and previous
generator script.

Note: there was conversation with Charlie Jenkins
<charlie@rivosinc.com> and they'd written an alternate approach to
support multiple architectures:
https://lore.kernel.org/all/20250114-perf_syscall_arch_runtime-v1-1-5b304e408e11@rivosinc.com/
It would have been better to have helped Charlie fix their series (my
apologies) but they agreed that the approach taken here was likely
best for longer term maintainability:
https://lore.kernel.org/lkml/Z6Jk_UN9i69QGqUj@ghost/

Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Howard Chu <howardchu95@gmail.com>
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Link: https://lore.kernel.org/r/20250319050741.269828-11-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-20 22:58:20 -07:00
Ian Rogers
1470eaa574 perf syscalltbl: Use lookup table containing multiple architectures
Switch to use the lookup table containing all architectures rather
than tables matching the perf binary.

This fixes perf trace when executed on a 32-bit i386 binary on an
x86-64 machine. Note in the following the system call names of the
32-bit i386 binary as seen by an x86-64 perf.

Before:
```
         ? (         ): a.out/447296  ... [continued]: munmap())                                           = 0
     0.024 ( 0.001 ms): a.out/447296 recvfrom(ubuf: 0x2, size: 4160585708, flags: DONTROUTE|CTRUNC|TRUNC|DONTWAIT|EOR|WAITALL|FIN|SYN|CONFIRM|RST|ERRQUEUE|NOSIGNAL|WAITFORONE|BATCH|SOCK_DEVMEM|ZEROCOPY|FASTOPEN|CMSG_CLOEXEC|0x91f80000, addr: 0xe30, addr_len: 0xffce438c) = 1475198976
     0.042 ( 0.003 ms): a.out/447296 lgetxattr(name: "", value: 0x3, size: 34)                             = 4160344064
     0.054 ( 0.003 ms): a.out/447296 dup2(oldfd: -134422744, newfd: 4)                                     = -1 ENOENT (No such file or directory)
     0.060 ( 0.009 ms): a.out/447296 preadv(fd: 4294967196, vec: (struct iovec){.iov_base = (void *)0x2e646c2f6374652f,.iov_len = (__kernel_size_t)7307199665335594867,}, vlen: 557056, pos_h: 4160585708) = 3
     0.074 ( 0.004 ms): a.out/447296 lgetxattr(name: "", value: 0x1, size: 2)                              = 4160237568
     0.080 ( 0.001 ms): a.out/447296 lstat(filename: "", statbuf: 0x193f6)                                 = 0
     0.089 ( 0.007 ms): a.out/447296 preadv(fd: 4294967196, vec: (struct iovec){.iov_base = (void *)0x3833692f62696c2f,.iov_len = (__kernel_size_t)3276497845987585334,}, vlen: 557056, pos_h: 4160585708) = 3
     0.097 ( 0.002 ms): a.out/447296 close(fd: 3</proc/447296/status>)                                     = 512
     0.103 ( 0.002 ms): a.out/447296 lgetxattr(name: "", value: 0x1, size: 2050)                           = 4157935616
     0.107 ( 0.007 ms): a.out/447296 lgetxattr(pathname: "", name: "", value: 0x5, size: 2066)             = 4158078976
     0.116 ( 0.003 ms): a.out/447296 lgetxattr(pathname: "", name: "", value: 0x1, size: 2066)             = 4159639552
     0.121 ( 0.003 ms): a.out/447296 lgetxattr(pathname: "", name: "", value: 0x3, size: 2066)             = 4160184320
     0.129 ( 0.002 ms): a.out/447296 lgetxattr(pathname: "", name: "", value: 0x3, size: 50)               = 4160196608
     0.138 ( 0.001 ms): a.out/447296 lstat(filename: "")                                                   = 0
     0.145 ( 0.002 ms): a.out/447296 mq_timedreceive(mqdes: 4291706800, u_msg_ptr: 0xf7f9ea48, msg_len: 134616640, u_msg_prio: 0xf7fd7fec, u_abs_timeout: (struct __kernel_timespec){.tv_sec = (__kernel_time64_t)-578174027777317696,.tv_nsec = (long long int)4160349376,}) = 0
     0.148 ( 0.001 ms): a.out/447296 mkdirat(dfd: -134617816, pathname: " ��� ���▒���▒���", mode: IFREG|ISUID|IRUSR|IWGRP|0xf7fd0000) = 447296
     0.150 ( 0.001 ms): a.out/447296 process_vm_writev(pid: -134617812, lvec: (struct iovec){.iov_base = (void *)0xf7f9e9c8f7f9e4c0,.iov_len = (__kernel_size_t)4160349376,}, liovcnt: 4160588048, rvec: (struct iovec){}, riovcnt: 4160585708, flags: 4291707352) = 0
     0.197 ( 0.004 ms): a.out/447296 capget(header: 4160184320, dataptr: 8192)                             = 0
     0.202 ( 0.002 ms): a.out/447296 capget(header: 1448669184, dataptr: 4096)                             = 0
     0.208 ( 0.002 ms): a.out/447296 capget(header: 4160577536, dataptr: 8192)                             = 0
     0.220 ( 0.001 ms): a.out/447296 getxattr(pathname: "", name: "c������", value: 0xf7f77e34, size: 1)  = 0
     0.228 ( 0.005 ms): a.out/447296 fchmod(fd: -134729728, mode: IRUGO|IWUGO|IFREG|IFIFO|ISVTX|IXUSR|0x10000) = 0
     0.240 ( 0.009 ms): a.out/447296 preadv(fd: 4294967196, vec: 0x5658e008, pos_h: 4160192052)            = 3
     0.250 ( 0.008 ms): a.out/447296 close(fd: 3</proc/447296/status>)                                     = 1436
     0.260 ( 0.018 ms): a.out/447296 stat(filename: "", statbuf: 0xffce32ac)                               = 1436
     0.288 (1000.213 ms): a.out/447296 readlinkat(buf: 0xffce31d4, bufsiz: 4291703244)                       = 0
```

After:
```
         ? (         ): a.out/442930  ... [continued]: execve())                                           = 0
     0.023 ( 0.002 ms): a.out/442930 brk()                                                                 = 0x57760000
     0.052 ( 0.003 ms): a.out/442930 access(filename: 0xf7f5af28, mode: R)                                 = -1 ENOENT (No such file or directory)
     0.059 ( 0.009 ms): a.out/442930 openat(dfd: CWD, filename: "/etc/ld.so.cache", flags: RDONLY|CLOEXEC|LARGEFILE) = 3
     0.078 ( 0.001 ms): a.out/442930 close(fd: 3</proc/442930/status>)                                     = 0
     0.087 ( 0.007 ms): a.out/442930 openat(dfd: CWD, filename: "/lib/i386-linux-", flags: RDONLY|CLOEXEC|LARGEFILE) = 3
     0.095 ( 0.002 ms): a.out/442930 read(fd: 3</proc/442930/status>, buf: 0xffbdbb70, count: 512)         = 512
     0.135 ( 0.001 ms): a.out/442930 close(fd: 3</proc/442930/status>)                                     = 0
     0.148 ( 0.001 ms): a.out/442930 set_tid_address(tidptr: 0xf7f2b528)                                   = 442930 (a.out)
     0.150 ( 0.001 ms): a.out/442930 set_robust_list(head: 0xf7f2b52c, len: 12)                            =
     0.196 ( 0.004 ms): a.out/442930 mprotect(start: 0xf7f03000, len: 8192, prot: READ)                    = 0
     0.202 ( 0.002 ms): a.out/442930 mprotect(start: 0x5658e000, len: 4096, prot: READ)                    = 0
     0.207 ( 0.002 ms): a.out/442930 mprotect(start: 0xf7f63000, len: 8192, prot: READ)                    = 0
     0.230 ( 0.005 ms): a.out/442930 munmap(addr: 0xf7f10000, len: 103414)                                 = 0
     0.244 ( 0.010 ms): a.out/442930 openat(dfd: CWD, filename: 0x5658d008)                                = 3
     0.255 ( 0.007 ms): a.out/442930 read(fd: 3</proc/442930/status>, buf: 0xffbdb67c, count: 4096)        = 1436
     0.264 ( 0.018 ms): a.out/442930 write(fd: 1</dev/pts/4>, buf: , count: 1436)                          = 1436
     0.292 (1000.173 ms): a.out/442930 clock_nanosleep(rqtp: { .tv_sec: 17866546940376776704, .tv_nsec: 4159878336 }, rmtp: 0xffbdb59c) = 0
  1000.478 (         ): a.out/442930 exit_group()                                                          = ?
```

Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Howard Chu <howardchu95@gmail.com>
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Link: https://lore.kernel.org/r/20250319050741.269828-10-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-20 22:58:13 -07:00
Ian Rogers
0fb641f0a1 perf trace beauty: Add syscalltbl.sh generating all system call tables
Rather than generating individual syscall header files, generate a
single trace/beauty/generated/syscalltbl.c. In a syscalltbls array
have references to each architecture's tables along with the
corresponding e_machine. When the 32-bit or 64-bit table is ambiguous,
match the perf binary's type. For ARM32 don't use the arm64 32-bit
table, which is smaller. EM_NONE is present for when no machine matches.
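
A minimal sketch of such a lookup, with tiny excerpts of the i386 and x86-64 tables. The struct layout and function names are hypothetical and do not reflect the generated file.

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>
#include <string.h>

struct toy_syscalltbl {
	int e_machine;
	const char *const *names; /* indexed by syscall number */
	int nr;
};

/* Excerpts only: the first few entries of each table. */
static const char *const i386_names[] = {
	[1] = "exit", [2] = "fork", [3] = "read", [4] = "write",
};
static const char *const x86_64_names[] = {
	[0] = "read", [1] = "write", [2] = "open", [3] = "close",
};

static const struct toy_syscalltbl toy_tables[] = {
	{ EM_386, i386_names, 5 },
	{ EM_X86_64, x86_64_names, 4 },
	{ EM_NONE, NULL, 0 }, /* fallback when no machine matches */
};

static const struct toy_syscalltbl *find_table(int e_machine)
{
	size_t n = sizeof(toy_tables) / sizeof(toy_tables[0]);

	for (size_t i = 0; i < n; i++)
		if (toy_tables[i].e_machine == e_machine)
			return &toy_tables[i];
	return &toy_tables[n - 1];
}

/* Returns the syscall name, or NULL if unknown for this machine. */
static const char *toy_syscall_name(int e_machine, int id)
{
	const struct toy_syscalltbl *t = find_table(e_machine);

	if (id < 0 || id >= t->nr)
		return NULL;
	return t->names[id];
}
```

The same number resolves differently per machine: 3 is read on i386 but close on x86-64, so decoding a 32-bit process with the 64-bit table gives wrong names.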

Conditionally compile the tables, only having the appropriate 32 and
64-bit table. If ALL_SYSCALLTBL is defined all tables can be
compiled.

Add comment for noreturn column suggested by Arnd Bergmann:
https://lore.kernel.org/lkml/d47c35dd-9c52-48e7-a00d-135572f11fbb@app.fastmail.com/
and added in commit 9142be9e64 ("x86/syscall: Mark exit[_group]
syscall handlers __noreturn").

Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Howard Chu <howardchu95@gmail.com>
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Link: https://lore.kernel.org/r/20250319050741.269828-9-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-20 22:58:08 -07:00
Ian Rogers
70351029b5 perf thread: Add support for reading the e_machine type for a thread
First try to read the e_machine from the dsos associated with the
thread's maps. If live, use the executable from /proc/pid/exe and read
the e_machine from the ELF header. On failure use EM_HOST. Change
builtin-trace syscall functions to pass e_machine from the thread
rather than EM_HOST, so that in later patches, once syscalltbl can use
the e_machine, the system calls are specific to the architecture.
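
Reading e_machine from an ELF header can be sketched as below. parse_e_machine() and read_e_machine() are hypothetical helpers; they return EM_NONE on failure and leave the fallback (EM_HOST in perf) to the caller.

```c
#include <assert.h>
#include <elf.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>

/* e_machine follows e_ident (EI_NIDENT bytes) and the 2-byte e_type. */
#define E_MACHINE_OFFSET (EI_NIDENT + 2)

static uint16_t parse_e_machine(const unsigned char *hdr, size_t len)
{
	if (len < E_MACHINE_OFFSET + 2 || memcmp(hdr, ELFMAG, SELFMAG) != 0)
		return EM_NONE;
	/* Honor the file's byte order rather than the host's. */
	if (hdr[EI_DATA] == ELFDATA2MSB)
		return (uint16_t)(hdr[E_MACHINE_OFFSET] << 8 |
				  hdr[E_MACHINE_OFFSET + 1]);
	return (uint16_t)(hdr[E_MACHINE_OFFSET] |
			  hdr[E_MACHINE_OFFSET + 1] << 8);
}

/* path would be e.g. /proc/<pid>/exe for a live thread. */
static uint16_t read_e_machine(const char *path)
{
	unsigned char hdr[E_MACHINE_OFFSET + 2];
	ssize_t n;
	int fd = open(path, O_RDONLY);

	if (fd < 0)
		return EM_NONE;
	n = read(fd, hdr, sizeof(hdr));
	close(fd);
	return n < 0 ? EM_NONE : parse_e_machine(hdr, (size_t)n);
}
```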

Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Link: https://lore.kernel.org/r/20250319050741.269828-8-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-20 22:58:05 -07:00
Ian Rogers
afffec6f03 perf dso: Add support for reading the e_machine type for a dso
For ELF file dsos read the e_machine from the ELF header. For kernel
types assume the e_machine matches the perf tool. In other cases
return EM_NONE.

When reading from the ELF header use DSO__SWAP that may need
dso->needs_swap initializing. Factor out dso__swap_init to allow this.

Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Link: https://lore.kernel.org/r/20250319050741.269828-7-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-20 22:58:02 -07:00
Ian Rogers
5c2938fe78 perf syscalltbl: Remove struct syscalltbl
The syscalltbl held entries of system call name and number pairs,
generated from a native syscalltbl at start up. As there are gaps in
the system call number there is a notion of index into the
table. Going forward we want the system call table to be identifiable
by a machine type, for example, i386 vs x86-64. Change the interface
to the syscalltbl so that (1) a machine type (currently unused, always
EM_HOST) is passed and (2) the index to syscall number and system call
name mapping is computed at build time.

Two tables are used for this: an array mapping system call number to
name, and an array of system call numbers sorted by system call name. The
sorted array doesn't store strings in part to save memory and
relocations. The index notion is carried forward and is an index into
the sorted array of system call numbers, the data structures are
opaque (held only in syscalltbl.c), and so the number of indices for a
machine type is exposed as a new API.

The arrays are computed in the syscalltbl.sh script and so no start-up
time computation and storage is necessary.
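
The two-table layout can be sketched with a four-entry x86-64 excerpt. Array and function names are hypothetical; in the real tool both arrays come out of the generator script rather than being written by hand.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Table 1: syscall number -> name (x86-64 excerpt). */
static const char *const num_to_name[] = {
	[0] = "read", [1] = "write", [2] = "open", [3] = "close",
};

/* Table 2: syscall numbers ordered by name (close, open, read, write).
 * No strings are stored here, only numbers. */
static const unsigned short sorted_by_name[] = { 3, 2, 0, 1 };

#define NR_SORTED (sizeof(sorted_by_name) / sizeof(sorted_by_name[0]))

/* Binary search by name. Returns the syscall number, or -1 if not found. */
static int syscall_id_by_name(const char *name)
{
	size_t lo = 0, hi = NR_SORTED;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;
		int cmp = strcmp(name, num_to_name[sorted_by_name[mid]]);

		if (cmp == 0)
			return sorted_by_name[mid];
		if (cmp < 0)
			hi = mid;
		else
			lo = mid + 1;
	}
	return -1;
}

/* The "index" notion: a position in the sorted array. */
static int syscall_id_at_index(size_t idx)
{
	return idx < NR_SORTED ? sorted_by_name[idx] : -1;
}
```

Both arrays are constant data, so nothing has to be built or sorted when the tool starts.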

Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Howard Chu <howardchu95@gmail.com>
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Link: https://lore.kernel.org/r/20250319050741.269828-6-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-20 22:57:57 -07:00
Ian Rogers
3d94b8441c perf trace: Reorganize syscalls
Identify struct syscall information in the syscalls table by a machine
type and syscall number, not just system call number. Having the
machine type means that 32-bit system calls can be differentiated from
64-bit ones on a machine capable of both. Having a table for all
machine types and all system call numbers would be too large, so
maintain a sorted array of system calls as they are encountered.

Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Howard Chu <howardchu95@gmail.com>
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Link: https://lore.kernel.org/r/20250319050741.269828-5-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-20 22:57:53 -07:00
Ian Rogers
af472d3c44 perf syscalltbl: Remove syscall_table.h
The definition of "static const char *const syscalltbl[] = {" is done
in a generated syscalls_32.h or syscalls_64.h that is architecture
dependent. In order to include the appropriate file a syscall_table.h
is found via the perf include path and it includes the syscalls_32.h
or syscalls_64.h as appropriate.

To support having multiple syscall tables, one for 32-bit and one for
64-bit, or for different architectures, an include path cannot be
used. Remove syscall_table.h because of this and inline what it does
into syscalltbl.c.

For architectures without a syscall_table.h this will cause a failure
to include either syscalls_32.h or syscalls_64.h rather than a failure
to include syscall_table.h. For architectures that only included one
or other, the behavior matches BITS_PER_LONG as previously done on
architectures supporting both syscalls_32.h and syscalls_64.h.

Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Howard Chu <howardchu95@gmail.com>
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Link: https://lore.kernel.org/r/20250319050741.269828-4-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-20 22:57:35 -07:00
Ian Rogers
4773175c9d perf dso: kernel-doc for enum dso_binary_type
The dso_binary_type enum values have many non-obvious meanings. Add
kernel-doc to speed up interpreting them.

Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20250319050741.269828-3-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-20 22:57:25 -07:00
Ian Rogers
f1794ecb0c perf dso: Move libunwind dso_data variables into ifdef
The variables elf_base_addr, debug_frame_offset, eh_frame_hdr_addr and
eh_frame_hdr_offset are only accessed in unwind-libunwind-local.c
which is conditionally built on having libunwind support. Make the
variables conditional on libunwind support too.

Reviewed-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Link: https://lore.kernel.org/r/20250319050741.269828-2-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-20 22:56:29 -07:00
Ming Lei
96af5af47b selftests: ublk: fix write cache implementation
For the loop target, write cache isn't enabled, and writes aren't
marked as DSYNC either.

Fix it by enabling write cache. Also fix the FLUSH implementation so
that it no longer takes an LBA range into account, since the FLUSH
command carries no such info.

Signed-off-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20250321004758.152572-1-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-03-20 20:01:03 -06:00
Ming Lei
beb31982ad selftests: ublk: add variable for user to not show test result
Some users decide the test result by exit code only and don't want to be
bothered by the printed test result.

Signed-off-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20250320013743.4167489-4-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-03-20 17:18:55 -06:00
Ming Lei
fe2230d921 selftests: ublk: don't show modprobe failure
ublk_drv may be built-in, so don't show modprobe failure, and we
do check `/dev/ublk-control` for skipping test if ublk_drv isn't
enabled.

Reported-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20250320013743.4167489-3-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-03-20 17:18:55 -06:00
Ming Lei
8764c1a72b selftests: ublk: add one dependency header
Add one dependency header which can carry new uapi definitions that
haven't been synced from the kernel yet.

This also helps a lot with downstream test deployment.

Signed-off-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20250320013743.4167489-2-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-03-20 17:18:55 -06:00
Paolo Abeni
6f13bec53a Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Martin KaFai Lau says:

====================
pull-request: bpf-next 2025-03-13

The following pull-request contains BPF updates for your *net-next* tree.

We've added 4 non-merge commits during the last 3 day(s) which contain
a total of 2 files changed, 35 insertions(+), 12 deletions(-).

The main changes are:

1) bpf_getsockopt support for TCP_BPF_RTO_MIN and TCP_BPF_DELACK_MAX,
   from Jason Xing

bpf-next-for-netdev

* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next:
  selftests/bpf: Add bpf_getsockopt() for TCP_BPF_DELACK_MAX and TCP_BPF_RTO_MIN
  tcp: bpf: Support bpf_getsockopt for TCP_BPF_DELACK_MAX
  tcp: bpf: Support bpf_getsockopt for TCP_BPF_RTO_MIN
  tcp: bpf: Introduce bpf_sol_tcp_getsockopt to support TCP_BPF flags
====================

Link: https://patch.msgid.link/20250313221620.2512684-1-martin.lau@linux.dev
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-20 21:48:14 +01:00
Paolo Abeni
f491593394 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR (net-6.14-rc8).

Conflict:

tools/testing/selftests/net/Makefile
  03544faad7 ("selftest: net: add proc_net_pktgen")
  3ed61b8938 ("selftests: net: test for lwtunnel dst ref loops")

tools/testing/selftests/net/config:
  85cb3711ac ("selftests: net: Add test cases for link and peer netns")
  3ed61b8938 ("selftests: net: test for lwtunnel dst ref loops")

Adjacent commits:

tools/testing/selftests/net/Makefile
  c935af429e ("selftests: net: add support for testing SO_RCVMARK and SO_RCVPRIORITY")
  355d940f4d ("Revert "selftests: Add IPv6 link-local address generation tests for GRE devices."")

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-20 21:38:01 +01:00
Björn Töpel
e16e64f9e0 selftests/bpf: Sanitize pointer prior fclose()
There are scenarios where env.{sub,}test_state->stdout_saved can be
NULL, e.g. sometimes when the watchdog timeout kicks in, or if the
open_memstream syscall is not available.

Avoid crashing test_progs by adding an explicit NULL check prior to the
fclose() call.
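
The guard can be sketched in C as follows. This is only an illustration:
close_saved_stream() is a hypothetical helper, not the function touched by
the patch.

```c
#include <stdio.h>

/*
 * Hypothetical helper: close a saved stream only if one was opened.
 * Returns 0 when there was nothing to close, else fclose()'s result.
 */
static int close_saved_stream(FILE **saved)
{
	int ret = 0;

	if (*saved) {
		ret = fclose(*saved);
		*saved = NULL;	/* avoid a second close later on */
	}
	return ret;
}
```

Passing NULL to fclose() is undefined behavior, which is why the check has
to happen before the call rather than relying on its return value.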

Signed-off-by: Björn Töpel <bjorn@rivosinc.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/bpf/20250318081648.122523-1-bjorn@kernel.org
2025-03-20 10:35:07 -07:00
Paolo Bonzini
0afd104fb3 Merge tag 'kvmarm-6.15' of https://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD
KVM/arm64 updates for 6.15

 - Nested virtualization support for VGICv3, giving the nested
   hypervisor control of the VGIC hardware when running an L2 VM

 - Removal of 'late' nested virtualization feature register masking,
   making the supported feature set directly visible to userspace

 - Support for emulating FEAT_PMUv3 on Apple silicon, taking advantage
   of an IMPLEMENTATION DEFINED trap that covers all PMUv3 registers

 - Paravirtual interface for discovering the set of CPU implementations
   where a VM may run, addressing a longstanding issue of guest CPU
   errata awareness in big-little systems and cross-implementation VM
   migration

 - Userspace control of the registers responsible for identifying a
   particular CPU implementation (MIDR_EL1, REVIDR_EL1, AIDR_EL1),
   allowing VMs to be migrated cross-implementation

 - pKVM updates, including support for tracking stage-2 page table
   allocations in the protected hypervisor in the 'SecPageTable' stat

 - Fixes to vPMU, ensuring that userspace updates to the vPMU after
   KVM_RUN are reflected into the backing perf events
2025-03-20 12:54:12 -04:00
Paolo Bonzini
c0f99fb4e5 Merge tag 'kvm-riscv-6.15-1' of https://github.com/kvm-riscv/linux into HEAD
KVM/riscv changes for 6.15

- Disable the kernel perf counter during configure
- KVM selftests improvements for PMU
- Fix warning at the time of KVM module removal
2025-03-20 12:53:34 -04:00
Linus Torvalds
5fc3193608 Merge tag 'net-6.14-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Paolo Abeni:
 "Including fixes from can, bluetooth and ipsec.

  This contains a last-minute revert of a recent GRE patch, mostly to
  allow me to state that there are no known regressions outstanding.

  Current release - regressions:

   - revert "gre: Fix IPv6 link-local address generation."

   - eth: ti: am65-cpsw: fix NAPI registration sequence

  Previous releases - regressions:

   - ipv6: fix memleak of nhc_pcpu_rth_output in fib_check_nh_v6_gw().

   - mptcp: fix data stream corruption in the address announcement

   - bluetooth: fix connection regression between LE and non-LE adapters

   - can:
       - flexcan: only change CAN state when link up in system PM
       - ucan: fix out of bound read in strscpy() source

  Previous releases - always broken:

   - lwtunnel: fix reentry loops

   - ipv6: fix TCP GSO segmentation with NAT

   - xfrm: force software GSO only in tunnel mode

   - eth: ti: icssg-prueth: add lock to stats

  Misc:

   - add Andrea Mayer as a maintainer of SRv6"

* tag 'net-6.14-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (33 commits)
  MAINTAINERS: Add Andrea Mayer as a maintainer of SRv6
  Revert "gre: Fix IPv6 link-local address generation."
  Revert "selftests: Add IPv6 link-local address generation tests for GRE devices."
  net/neighbor: add missing policy for NDTPA_QUEUE_LENBYTES
  tools headers: Sync uapi/asm-generic/socket.h with the kernel sources
  mptcp: Fix data stream corruption in the address announcement
  selftests: net: test for lwtunnel dst ref loops
  net: ipv6: ioam6: fix lwtunnel_output() loop
  net: lwtunnel: fix recursion loops
  net: ti: icssg-prueth: Add lock to stats
  net: atm: fix use after free in lec_send()
  xsk: fix an integer overflow in xp_create_and_assign_umem()
  net: stmmac: dwc-qos-eth: use devm_kzalloc() for AXI data
  selftests: drv-net: use defer in the ping test
  phy: fix xa_alloc_cyclic() error handling
  dpll: fix xa_alloc_cyclic() error handling
  devlink: fix xa_alloc_cyclic() error handling
  ipv6: Set errno after ip_fib_metrics_init() in ip6_route_info_create().
  ipv6: Fix memleak of nhc_pcpu_rth_output in fib_check_nh_v6_gw().
  net: ipv6: fix TCP GSO segmentation with NAT
  ...
2025-03-20 09:39:15 -07:00
Namhyung Kim
d10a7aaaf8 perf report: Disable children column for data type profiling
I've realized that it doesn't make sense to accumulate the samples to the
parent in the callchain when data type profiling is enabled, because the
parent won't have the same data type access.  Otherwise it'd show
something like this:

  $ perf report -s type --stdio -g none
  # To display the perf.data header info, please use --header/--header-only options.
  #
  #
  # Total Lost Samples: 0
  #
  # Samples: 2K of event 'cycles:Pu'
  # Event count (approx.): 8266456478
  #
  # Children  Latency      Self   Latency  Data Type
  # ........  .......  ........  ........  .........
  #
     698.97%   697.72%    99.80%    99.61%  (unknown)
       0.09%    0.18%     0.09%     0.18%  Elf64_Rela
       0.05%    0.10%     0.05%     0.10%  unsigned char
       0.05%    0.10%     0.05%     0.10%  struct exit_function_list
       0.00%    0.01%     0.00%     0.01%  struct rtld_global

Link: https://lore.kernel.org/r/20250307080829.354947-3-namhyung@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-20 09:17:56 -07:00
Namhyung Kim
6df71c7237 perf report: Allow hierarchy mode for --children
It was prohibited because the output fields in the children mode were
not handled properly with hierarchy.  Now that the output fields can be
kept in the same level, the two modes can be used together.

For example, latency mode adds more output fields by default and now
they are displayed properly.

  $ perf record --latency -g -- perf test -w thloop

  $ perf report -H --stdio
  # To display the perf.data header info, please use --header/--header-only options.
  #
  #
  # Total Lost Samples: 0
  #
  # Samples: 2K of event 'cycles:Pu'
  # Event count (approx.): 8266456478
  #
  #       Children  Latency  Overhead   Latency  Command / Shared Object / Symbol
  # ...........................................  ........................................................
  #
       0.08%    0.16%   100.00%   100.00%        perf
          0.08%    0.16%     0.24%     0.47%        ld-linux-x86-64.so.2
             0.12%    0.24%     0.12%     0.24%        [.] _dl_relocate_object
             0.08%    0.16%     0.08%     0.16%        [.] _dl_lookup_symbol_x
             0.03%    0.06%     0.03%     0.06%        [.] strcmp
             0.00%    0.01%     0.00%     0.01%        [.] _dl_start
             0.00%    0.00%     0.00%     0.00%        [.] _dl_start_user
             0.00%    0.00%     0.00%     0.00%        [.] _dl_sysdep_start
             0.00%    0.00%     0.00%     0.00%        [.] _start
             0.00%    0.00%     0.00%     0.00%        [.] dl_main
          0.03%    0.06%     0.03%     0.06%        libLLVM-16.so.1
             0.03%    0.06%     0.03%     0.06%        [.] llvm::StringMapImpl::RehashTable(unsigned int)
             0.00%    0.00%     0.00%     0.00%        [.] 0x00007f137ccd18e8
          0.00%    0.00%    99.66%    99.31%        perf
            99.66%   99.31%    99.66%    99.31%        [.] test_loop
              |
              |--49.86%--0x7f137b633d68
              |          0x55dbdbbb7d2c
              ...

Link: https://lore.kernel.org/r/20250307080829.354947-2-namhyung@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-20 09:17:56 -07:00
Namhyung Kim
a1bbd66627 perf sort: Keep output fields in the same level
This is useful for hierarchy output mode where the first level is
considered as output fields.  We want them in the same level so that it
can show only the remaining groups in the hierarchy.

Before:
  $ perf report -s overhead,sample,period,comm,dso -H --stdio
  ...
  #          Overhead  Samples / Period / Command / Shared Object
  # .................  ..........................................
  #
     100.00%           4035
        100.00%           3835883066
           100.00%           perf
               99.37%           perf
                0.50%           ld-linux-x86-64.so.2
                0.06%           [unknown]
                0.04%           libc.so.6
                0.02%           libLLVM-16.so.1

After:
  $ perf report -s overhead,sample,period,comm,dso -H --stdio
  ...
  #    Overhead       Samples        Period  Command / Shared Object
  # .......................................  .......................
  #
     100.00%          4035    3835883066     perf
         99.37%          4005    3811826223     perf
          0.50%            19      19210014     ld-linux-x86-64.so.2
          0.06%             8       2367089     [unknown]
          0.04%             2       1720336     libc.so.6
          0.02%             1        759404     libLLVM-16.so.1

Acked-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20250307080829.354947-1-namhyung@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-20 09:17:56 -07:00
Guillaume Nault
355d940f4d Revert "selftests: Add IPv6 link-local address generation tests for GRE devices."
This reverts commit 6f50175cca.

Commit 183185a18f ("gre: Fix IPv6 link-local address generation.") is
going to be reverted. So let's revert the corresponding kselftest
first.

Signed-off-by: Guillaume Nault <gnault@redhat.com>
Link: https://patch.msgid.link/259a9e98f7f1be7ce02b53d0b4afb7c18a8ff747.1742418408.git.gnault@redhat.com
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-20 15:46:16 +01:00
Christian Brauner
4d5483a42c selftests/pidfd: third test for multi-threaded exec polling
Ensure that during a multi-threaded exec and premature thread-group
leader exit no exit notification is generated.

Link: https://lore.kernel.org/r/20250320-work-pidfs-thread_group-v4-4-da678ce805bf@kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-03-20 15:32:43 +01:00
Christian Brauner
9b6f723db5 selftests/pidfd: second test for multi-threaded exec polling
Ensure that during a multi-threaded exec and premature thread-group
leader exit no exit notification is generated.

Link: https://lore.kernel.org/r/20250320-work-pidfs-thread_group-v4-3-da678ce805bf@kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-03-20 15:32:43 +01:00
Christian Brauner
db7ce91e22 selftests/pidfd: first test for multi-threaded exec polling
Add first test for premature thread-group leader exit.

Link: https://lore.kernel.org/r/20250320-work-pidfs-thread_group-v4-2-da678ce805bf@kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-03-20 15:32:43 +01:00
Alexander Mikhalitsyn
23b763302c tools headers: Sync uapi/asm-generic/socket.h with the kernel sources
This also fixes wrong definitions for SCM_TS_OPT_ID & SO_RCVPRIORITY.

Accidentally found while working on another patchset.

Cc: linux-kernel@vger.kernel.org
Cc: netdev@vger.kernel.org
Cc: Eric Dumazet <edumazet@google.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Vadim Fedorenko <vadim.fedorenko@linux.dev>
Cc: Willem de Bruijn <willemb@google.com>
Cc: Jason Xing <kerneljasonxing@gmail.com>
Cc: Anna Emese Nyiri <annaemesenyiri@gmail.com>
Cc: Kuniyuki Iwashima <kuniyu@amazon.com>
Cc: Paolo Abeni <pabeni@redhat.com>
Fixes: a89568e9be ("selftests: txtimestamp: add SCM_TS_OPT_ID test")
Fixes: e45469e594 ("sock: Introduce SO_RCVPRIORITY socket option")
Link: https://lore.kernel.org/netdev/20250314195257.34854-1-kuniyu@amazon.com/
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Jason Xing <kerneljasonxing@gmail.com>
Link: https://patch.msgid.link/20250314214155.16046-1-aleksandr.mikhalitsyn@canonical.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-20 15:14:46 +01:00
Justin Iurman
3ed61b8938 selftests: net: test for lwtunnel dst ref loops
As recently specified by commit 0ea09cbf83 ("docs: netdev: add a note
on selftest posting") in net-next, the selftest is therefore shipped in
this series. However, this selftest does not really test this series. It
needs this series to avoid crashing the kernel. What it really tests,
thanks to kmemleak, is what was fixed by the following commits:
- commit c71a192976 ("net: ipv6: fix dst refleaks in rpl, seg6 and
ioam6 lwtunnels")
- commit 92191dd107 ("net: ipv6: fix dst ref loops in rpl, seg6 and
ioam6 lwtunnels")
- commit c64a0727f9 ("net: ipv6: fix dst ref loop on input in seg6
lwt")
- commit 13e55fbaec ("net: ipv6: fix dst ref loop on input in rpl
lwt")
- commit 0e7633d7b9 ("net: ipv6: fix dst ref loop in ila lwtunnel")
- commit 5da15a9c11 ("net: ipv6: fix missing dst ref drop in ila
lwtunnel")

Signed-off-by: Justin Iurman <justin.iurman@uliege.be>
Link: https://patch.msgid.link/20250314120048.12569-4-justin.iurman@uliege.be
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-20 11:25:52 +01:00
Geliang Tang
9cf0128e64 selftests: mptcp: add pm sysctl mapping tests
This patch checks if the newly added net.mptcp.path_manager is mapped
successfully from or to the old net.mptcp.pm_type in userspace_pm.sh.

Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250313-net-next-mptcp-pm-ops-intro-v1-12-f4e4a88efc50@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-20 10:14:49 +01:00
James Clark
f5b07010c1 libperf: Don't remove -g when EXTRA_CFLAGS are used
When using EXTRA_CFLAGS, for example "EXTRA_CFLAGS=-DREFCNT_CHECKING=1",
this construct stops setting -g which you'd expect would not be affected
by adding extra flags. Additionally, EXTRA_CFLAGS should be the last
thing to be appended so that it can be used to undo any defaults. And no
condition is required, just += appends to any existing CFLAGS and also
appends or doesn't append EXTRA_CFLAGS if they are or aren't set.

It's not clear why DEBUG=1 is required for -g in Perf when in libperf
it's always on, but I don't think we need to change that behavior now
because someone may be depending on it.

Signed-off-by: James Clark <james.clark@linaro.org>
Reviewed-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250319114009.417865-1-james.clark@linaro.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-19 17:00:39 -07:00
Thomas Richter
431db90a73 perf pmu: Handle memory failure in tool_pmu__new()
On linux-next,
commit 72c6f57a41 ("perf pmu: Dynamically allocate tool PMU")
allocates the PMU named "tool" dynamically. However, that allocation
can fail and return a NULL pointer. That case is currently
not handled and would result in an invalid address reference.
Add a check for the NULL pointer.
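
A minimal C sketch of the pattern, under stated assumptions: the type
tool_pmu_sketch and the function tool_pmu_sketch__new() are hypothetical
and only mirror the shape of an allocating constructor, not perf's real
tool_pmu__new().

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the dynamically allocated "tool" PMU. */
struct tool_pmu_sketch {
	char *name;
};

/* Returns a new PMU, or NULL if any allocation fails. */
static struct tool_pmu_sketch *tool_pmu_sketch__new(void)
{
	static const char pmu_name[] = "tool";
	struct tool_pmu_sketch *pmu = calloc(1, sizeof(*pmu));

	if (!pmu)
		return NULL;	/* the case that was previously unhandled */

	pmu->name = malloc(sizeof(pmu_name));
	if (!pmu->name) {
		free(pmu);
		return NULL;
	}
	memcpy(pmu->name, pmu_name, sizeof(pmu_name));
	return pmu;
}
```

Callers then have to test the returned pointer before dereferencing it.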

Fixes: 72c6f57a41 ("perf pmu: Dynamically allocate tool PMU")
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Reviewed-by: James Clark <james.clark@linaro.org>
Link: https://lore.kernel.org/r/20250319122820.2898333-1-tmricht@linux.ibm.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-19 17:00:16 -07:00
James Clark
6d2dcd6352 perf: intel-tpebs: Fix incorrect usage of zfree()
zfree() requires an address; otherwise it frees what's in name rather
than name itself. Pass the address of name to fix it.
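
To illustrate why the address matters, here is a small C sketch. The macro
zfree_sketch() and the struct tpebs_entry_sketch are hypothetical and only
model the "free, then reset to NULL" contract; they are not the tools/
implementation.

```c
#include <stdlib.h>

/* Sketch of a zfree()-style helper: frees *ptr and resets *ptr to NULL. */
#define zfree_sketch(ptr)		\
	do {				\
		free(*(ptr));		\
		*(ptr) = NULL;		\
	} while (0)

/* Hypothetical owner of a heap-allocated name. */
struct tpebs_entry_sketch {
	char *name;
};

static void tpebs_entry_sketch__exit(struct tpebs_entry_sketch *e)
{
	/*
	 * Pass the address of the member. Passing e->name itself would
	 * make the helper dereference the string and treat its first
	 * bytes as the pointer to free.
	 */
	zfree_sketch(&e->name);
}
```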

This was the only incorrect occurrence in Perf found using a search.

Fixes: 8db5cabcf1 ("perf stat: Fork and launch 'perf record' when 'perf stat' needs to get retire latency value for a metric.")
Signed-off-by: James Clark <james.clark@linaro.org>
Link: https://lore.kernel.org/r/20250319101614.190922-1-james.clark@linaro.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-19 16:56:56 -07:00
Ian Rogers
58b8b5d142 perf cpumap: Increment reference count for online cpumap
Thomas Richter <tmricht@linux.ibm.com> reported a double put on the
cpumap for the placeholder core PMU:
https://lore.kernel.org/lkml/20250318095132.1502654-3-tmricht@linux.ibm.com/
Requiring the caller to get the cpumap is not how these things are
usually done, so switch cpu_map__online to do the get and then fix up any
callers where a put is needed.
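
The ownership convention can be sketched in C like this. All names ending
in _sketch are hypothetical, and the plain int counter is a simplification
(it is not thread-safe and is not perf's refcount implementation).

```c
#include <stdlib.h>

/* Hypothetical reference-counted map. */
struct cpu_map_sketch {
	int refcnt;
};

/* Cached singleton; the cache itself holds one reference. */
static struct cpu_map_sketch *online_map;

static struct cpu_map_sketch *cpu_map_sketch__get(struct cpu_map_sketch *m)
{
	if (m)
		m->refcnt++;
	return m;
}

static void cpu_map_sketch__put(struct cpu_map_sketch *m)
{
	if (m && --m->refcnt == 0) {
		if (m == online_map)
			online_map = NULL;
		free(m);
	}
}

/* The accessor takes the reference, so every caller owns what it gets. */
static struct cpu_map_sketch *cpu_map_sketch__online(void)
{
	if (!online_map) {
		online_map = calloc(1, sizeof(*online_map));
		if (!online_map)
			return NULL;
		online_map->refcnt = 1;	/* reference held by the cache */
	}
	return cpu_map_sketch__get(online_map);
}
```

With this shape, each call to the accessor is balanced by exactly one put in
the caller, which makes a double put easier to spot in review.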

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Richter <tmricht@linux.ibm.com>
Link: https://lore.kernel.org/r/20250318171914.145616-1-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-19 16:56:33 -07:00