Commit Graph

1398081 Commits

Author SHA1 Message Date
Ravi Bangoria
3c723f4497 perf test: Fix lock contention test
Couple of independent fixes:

1. Wire in SIGSEGV handler that terminates the test with a failure code.

2. Use "--lock-cgroup" instead of "-g"; "-g" was proposed but never
   merged. See commit 4d1792d0a2 ("perf lock contention: Add
   --lock-cgroup option")

3. Call cleanup() on every normal exit so trap_cleanup() doesn't mistake
   it for an unexpected signal and emit a false-negative "Unexpected
   signal in main" message.

Before patch:

  # ./perf test -vv "lock contention"
   85: kernel lock contention analysis test:
  --- start ---
  test child forked, pid 610711
  Testing perf lock record and perf lock contention
  Testing perf lock contention --use-bpf
  Testing perf lock record and perf lock contention at the same time
  Testing perf lock contention --threads
  Testing perf lock contention --lock-addr
  Testing perf lock contention --lock-cgroup
  Unexpected signal in test_aggr_cgroup
  ---- end(0) ----
   85: kernel lock contention analysis test                            : Ok

After patch:

  # ./perf test -vv "lock contention"
   85: kernel lock contention analysis test:
  --- start ---
  test child forked, pid 602637
  Testing perf lock record and perf lock contention
  Testing perf lock contention --use-bpf
  Testing perf lock record and perf lock contention at the same time
  Testing perf lock contention --threads
  Testing perf lock contention --lock-addr
  Testing perf lock contention --lock-cgroup
  Testing perf lock contention --type-filter (w/ spinlock)
  Testing perf lock contention --lock-filter (w/ tasklist_lock)
  Testing perf lock contention --callstack-filter (w/ unix_stream)
  [Skip] Could not find 'unix_stream'
  Testing perf lock contention --callstack-filter with task aggregation
  [Skip] Could not find 'unix_stream'
  Testing perf lock contention --cgroup-filter
  Testing perf lock contention CSV output
  ---- end(0) ----
   85: kernel lock contention analysis test                            : Ok

Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ananth Narayan <ananth.narayan@amd.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Santosh Shukla <santosh.shukla@amd.com>
Cc: Tycho Andersen <tycho@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-11-13 17:29:00 -03:00
Ravi Bangoria
d0206db94b perf lock: Fix segfault due to missing kernel map
Kernel maps are encoded in PERF_RECORD_MMAP2 samples but "perf lock
report" and "perf lock contention" do not process MMAP2 samples.

Because of that, machine->vmlinux_map stays NULL and any later access
triggers a segmentation fault.

Fix it by adding ->mmap2() callbacks.

Fixes: 53b00ff358 ("perf record: Make --buildid-mmap the default")
Reported-by: Tycho Andersen (AMD) <tycho@kernel.org>
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
Tested-by: Tycho Andersen (AMD) <tycho@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ananth Narayan <ananth.narayan@amd.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Santosh Shukla <santosh.shukla@amd.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-11-13 17:17:41 -03:00
Arnaldo Carvalho de Melo
84003ab3d0 tools headers UAPI: Sync KVM's vmx.h with the kernel to pick SEAMCALL exit reason
To pick the changes in:

  9d7dfb95da ("KVM: VMX: Inject #UD if guest tries to execute SEAMCALL or TDCALL")

The 'perf kvm-stat' tool uses the exit reasons that are included in the
VMX_EXIT_REASONS define. The new SEAMCALL exit reason isn't included there
(TDCALL is), so this shouldn't cause any change in behaviour; this patch
just addresses the following perf build warning:

  Warning: Kernel ABI header differences:
    diff -u tools/arch/x86/include/uapi/asm/vmx.h arch/x86/include/uapi/asm/vmx.h

Please see tools/include/uapi/README for further details.

Cc: Sean Christopherson <seanjc@google.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-11-13 17:16:34 -03:00
Arnaldo Carvalho de Melo
a09e5967ad perf build: Don't fail fast path feature detection when binutils-devel is not available
This is one more remnant of the BUILD_NONDISTRO series to make building
with binutils-devel opt-in due to license incompatibility.

In this case just the references at link time were still in place, which
made building the test-all.bin file fail. This wasn't detected before,
probably because the last test was done with binutils-devel available,
doh.

Now:

  $ rpm -q binutils-devel
  package binutils-devel is not installed
  $ file /tmp/build/perf-tools/feature/test-all.bin
  /tmp/build/perf-tools/feature/test-all.bin: ELF 64-bit LSB executable, x86-64, version 1 (SYSV),
  dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2,
  BuildID[sha1]=4b5388a346b51f1b993f0b0dbd49f4570769b03c, for GNU/Linux 3.2.0, not stripped
  $

Fixes: 970ae86307 ("perf build: The bfd features are opt-in, stop testing for them by default")
Reviewed-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-11-13 17:16:34 -03:00
Thomas Falcon
85c894a80a perf header: Write bpf_prog (infos|btfs)_cnt to data file
With commit f0d0f978f3 ("perf header: Don't write empty BPF/BTF
info"), the write_bpf_( prog_info() | btf() ) functions exit without
writing anything if env->bpf_prog.(infos| btfs)_cnt is zero.

process_bpf_( prog_info() | btf() ), however, still expect a "count"
value to exist in the data file. If btf information is empty, for
example, process_bpf_btf will read garbage or some other data as the
number of btf nodes in the data file. As a result, the data file will
not be processed correctly.

Instead, write the count to the data file and exit if it is zero.
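
The writer/reader mismatch described above can be illustrated with a toy
model. This Python sketch is not perf code: the function names, the 4-byte
little-endian count layout, and the bytes of the "next section" are all
assumptions made purely for illustration.

```python
import struct

def write_section_buggy(entries):
    """Models the behaviour after f0d0f978f3: write nothing when empty."""
    if not entries:
        return b""
    return struct.pack("<I", len(entries)) + b"".join(entries)

def write_section_fixed(entries):
    """Models the fix: always write the count, then stop if it is zero."""
    out = struct.pack("<I", len(entries))
    if not entries:
        return out
    return out + b"".join(entries)

def read_count(data):
    """The reader always expects a 4-byte count at the start of the section."""
    (count,) = struct.unpack_from("<I", data, 0)
    return count

# Hypothetical bytes belonging to whatever follows the section in the file.
next_section = struct.pack("<I", 0xDEADBEEF)

buggy_file = write_section_buggy([]) + next_section
fixed_file = write_section_fixed([]) + next_section

misread = read_count(buggy_file)  # unrelated bytes interpreted as a count
correct = read_count(fixed_file)  # the real count, zero
```

In the buggy case the reader treats unrelated data as the number of entries,
which matches the "garbage or some other data" symptom in the message.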

Fixes: f0d0f978f3 ("perf header: Don't write empty BPF/BTF info")
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Thomas Falcon <thomas.falcon@intel.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-11-13 17:16:33 -03:00
Linus Torvalds
9b9e43704d Merge tag 'slab-for-6.18-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab
Pull slab fix from Vlastimil Babka:

 - Fix memory leak of objects from remote NUMA node when bulk freeing to
   a cache with sheaves (Harry Yoo)

* tag 'slab-for-6.18-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab:
  mm/slub: fix memory leak in free_to_pcs_bulk()
2025-11-13 11:42:44 -08:00
Linus Torvalds
8b4a014e28 Merge tag 'linux_kselftest-fixes-6.18-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
Pull kselftest fix from Shuah Khan:
 "Fixes event-filter-function.tc tracing test failure caused when a
  first run to sample events triggers kmem_cache_free which interferes
  with the rest of the test.

  Fix this by calling sample_events twice to eliminate the
  kmem_cache_free related noise from the sampling"

* tag 'linux_kselftest-fixes-6.18-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
  selftests/tracing: Run sample events to clear page cache events
2025-11-13 11:37:40 -08:00
Linus Torvalds
d0309c0543 Merge tag 'net-6.18-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Paolo Abeni:
 "Including fixes from Bluetooth and Wireless. No known outstanding
  regressions.

  Current release - regressions:

   - eth:
      - bonding: fix mii_status when slave is down
      - mlx5e: fix missing error assignment in mlx5e_xfrm_add_state()

  Previous releases - regressions:

   - sched: limit try_bulk_dequeue_skb() batches

   - ipv4: route: prevent rt_bind_exception() from rebinding stale fnhe

   - af_unix: initialise scc_index in unix_add_edge()

   - netpoll: fix incorrect refcount handling causing incorrect cleanup

   - bluetooth: don't hold spin lock over sleeping functions

   - hsr: Fix supervision frame sending on HSRv0

   - sctp: prevent possible shift out-of-bounds

   - tipc: fix use-after-free in tipc_mon_reinit_self()

   - dsa: tag_brcm: do not mark link local traffic as offloaded

   - eth: virtio-net: fix incorrect flags recording in big mode

  Previous releases - always broken:

   - sched: initialize struct tc_ife to fix kernel-infoleak

   - wifi:
      - mac80211: reject address change while connecting
      - iwlwifi: avoid toggling links due to wrong element use

   - bluetooth: cancel mesh send timer when hdev removed

   - strparser: fix signed/unsigned mismatch bug

   - handshake: fix memory leak in tls_handshake_accept()

  Misc:

   - selftests: mptcp: fix some flaky tests"

* tag 'net-6.18-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (60 commits)
  hsr: Follow standard for HSRv0 supervision frames
  hsr: Fix supervision frame sending on HSRv0
  virtio-net: fix incorrect flags recording in big mode
  ipv4: route: Prevent rt_bind_exception() from rebinding stale fnhe
  wifi: iwlwifi: mld: always take beacon ies in link grading
  wifi: iwlwifi: mvm: fix beacon template/fixed rate
  wifi: iwlwifi: fix aux ROC time event iterator usage
  net_sched: limit try_bulk_dequeue_skb() batches
  selftests: mptcp: join: properly kill background tasks
  selftests: mptcp: connect: trunc: read all recv data
  selftests: mptcp: join: userspace: longer transfer
  selftests: mptcp: join: endpoints: longer transfer
  selftests: mptcp: join: rm: set backup flag
  selftests: mptcp: connect: fix fallback note due to OoO
  ethtool: fix incorrect kernel-doc style comment in ethtool.h
  mlx5: Fix default values in create CQ
  Bluetooth: btrtl: Avoid loading the config file on security chips
  net/mlx5e: Fix potentially misleading debug message
  net/mlx5e: Fix wraparound in rate limiting for values above 255 Gbps
  net/mlx5e: Fix maxrate wraparound in threshold between units
  ...
2025-11-13 11:20:25 -08:00
Harry Yoo
cbcff934fa mm/slub: fix memory leak in free_to_pcs_bulk()
The commit 989b09b739 ("slab: skip percpu sheaves for remote object
freeing") introduced the remote_objects array in free_to_pcs_bulk() to
skip sheaves when objects from a remote node are freed.

However, the array is flushed only when:
  1) the array becomes full (++remote_nr >= PCS_BATCH_MAX), or
  2) slab_free_hook() returns false and size becomes zero.

When neither of the conditions is met, objects in the array are leaked.
This resulted in a memory leak [1], where 82 GiB of memory was allocated
for the maple_node cache.

Flush the array after successfully freeing objects to sheaves
in the do_free: path.

In the meantime, move the snippet if (!size) goto flush_remote; outside
the while loop for readability. Let's say all objects in the array are
from a remote node: then we acquire s->cpu_sheaves->lock and try to free
an object even when size is zero. This doesn't appear to be harmful,
but isn't really readable.
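
The leak pattern can be sketched with a toy model. This Python code is not
the kernel implementation: `BATCH_MAX` is a made-up stand-in for
PCS_BATCH_MAX, the slab_free_hook() condition is left out, and objects are
plain (name, node) tuples chosen for illustration.

```python
BATCH_MAX = 4  # hypothetical stand-in for PCS_BATCH_MAX

def bulk_free(objects, local_node, flush_at_end):
    """Return (freed_local, freed_remote, leaked) for (name, node) tuples."""
    remote = []                      # models the remote_objects array
    freed_local, freed_remote = [], []

    def flush():
        freed_remote.extend(remote)
        remote.clear()

    for name, node in objects:
        if node != local_node:
            remote.append(name)
            if len(remote) >= BATCH_MAX:   # condition 1: array is full
                flush()
        else:
            freed_local.append(name)       # models freeing to the sheaf
    if flush_at_end:                       # models the added flush
        flush()
    return freed_local, freed_remote, list(remote)

objs = [("a", 0), ("b", 1), ("c", 0), ("d", 1)]
before_fix = bulk_free(objs, local_node=0, flush_at_end=False)
after_fix = bulk_free(objs, local_node=0, flush_at_end=True)
```

With fewer remote objects than the batch limit, the model without the final
flush leaves "b" and "d" stranded in the array, while the version with the
final flush frees them.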

Reported-by: Tytus Rogalewski <admin@simplepod.ai>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=220765 [1]
Closes: https://lore.kernel.org/linux-mm/20251107094809.12e9d705b7bf4815783eb184@linux-foundation.org
Closes: https://lore.kernel.org/all/aRGDTwbt2EIz2CYn@hyeyoo
Fixes: 989b09b739 ("slab: skip percpu sheaves for remote object freeing")
Signed-off-by: Harry Yoo <harry.yoo@oracle.com>
Link: https://patch.msgid.link/20251111125331.12246-1-harry.yoo@oracle.com
Acked-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Tested-by: Darrick J. Wong <djwong@kernel.org>
Tested-by: Tytus Rogalewski <admin@simplepod.ai>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
2025-11-13 19:56:46 +01:00
Paolo Abeni
94909c53e4 Merge branch 'hsr-send-correct-hsrv0-supervision-frames'
Felix Maurer says:

====================
hsr: Send correct HSRv0 supervision frames

Hangbin recently reported that the hsr selftests were failing and noted
that the entries in the node table were not merged, i.e., had
00:00:00:00:00:00 as MacAddressB forever [1].

This failure only occurred with HSRv0 because it was not sending
supervision frames anymore. While debugging this I found that we were
not really following the HSRv0 standard for the supervision frames we
sent, so I additionally made a few changes to get closer to the standard
and restore a more correct behavior we had a while ago.

The selftests can still fail because they take a while and run into the
timeout. I did not include a change of the timeout because I have more
improvements to the selftests mostly ready that change the test duration
but are net-next material.

[1]: https://lore.kernel.org/netdev/aMONxDXkzBZZRfE5@fedora/
====================

Link: https://patch.msgid.link/cover.1762876095.git.fmaurer@redhat.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-11-13 15:55:06 +01:00
Felix Maurer
b2c26c82f7 hsr: Follow standard for HSRv0 supervision frames
For HSRv0, the path_id has the following meaning:
- 0000: PRP supervision frame
- 0001-1001: HSR ring identifier
- 1010-1011: Frames from PRP network (A/B, with RedBoxes)
- 1111: HSR supervision frame

Follow the IEC 62439-3:2010 standard more closely by setting the right
path_id for HSRv0 supervision frames (actually, it is correctly set when
the frame is constructed, but hsr_set_path_id() overwrites it) and set a
fixed HSR ring identifier of 1. The ring identifier seems to be generally
unused and we ignore it anyways on reception, but some fixed identifier is
definitely better than using one identifier in one direction and a wrong
identifier in the other.

This was also the behavior before commit f266a683a4 ("net/hsr: Better
frame dispatch") which introduced the alternating path_id. This was later
moved to hsr_set_path_id() in commit 451d8123f8 ("net: prp: add packet
handling support").

The IEC 62439-3:2010 also contains 6 unused bytes after the MacAddressA in
the HSRv0 supervision frames. Adjust a TODO comment accordingly.
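
The path_id table at the top of this message can be expressed as a small
lookup. This is an illustrative Python sketch, not kernel code; the function
name is hypothetical, the field is assumed to be 4 bits wide as the binary
notation suggests, and values the message does not list are reported as
"not listed" rather than guessed.

```python
def classify_path_id(path_id):
    """Map an HSRv0 path_id to the meaning given in the list above."""
    if not 0 <= path_id <= 0b1111:
        raise ValueError("path_id is assumed to be a 4-bit field")
    if path_id == 0b0000:
        return "PRP supervision frame"
    if 0b0001 <= path_id <= 0b1001:
        return "HSR ring identifier"
    if path_id in (0b1010, 0b1011):
        return "frame from PRP network"
    if path_id == 0b1111:
        return "HSR supervision frame"
    return "not listed"

supervision = classify_path_id(0b1111)
ring_one = classify_path_id(1)  # the fixed ring identifier chosen by the patch
```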

Fixes: f266a683a4 ("net/hsr: Better frame dispatch")
Fixes: 451d8123f8 ("net: prp: add packet handling support")
Signed-off-by: Felix Maurer <fmaurer@redhat.com>
Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Link: https://patch.msgid.link/ea0d5133cd593856b2fa673d6e2067bf1d4d1794.1762876095.git.fmaurer@redhat.com
Tested-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-11-13 15:55:04 +01:00
Felix Maurer
96a3a03abf hsr: Fix supervision frame sending on HSRv0
On HSRv0, no supervision frames were sent. The supervision frames were
generated successfully, but failed the check for a sufficiently long mac
header, i.e., at least sizeof(struct hsr_ethhdr), in hsr_fill_frame_info()
because the mac header only contained the ethernet header.

Fix this by including the HSR header in the mac header when generating HSR
supervision frames. Note that the mac header now also includes the TLV
fields. This matches how we set the headers on rx and also the size of
struct hsrv0_ethhdr_sp.

Reported-by: Hangbin Liu <liuhangbin@gmail.com>
Closes: https://lore.kernel.org/netdev/aMONxDXkzBZZRfE5@fedora/
Fixes: 9cfb5e7f0d ("net: hsr: fix hsr_init_sk() vs network/transport headers.")
Signed-off-by: Felix Maurer <fmaurer@redhat.com>
Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Tested-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Link: https://patch.msgid.link/4354114fea9a642fe71f49aeeb6c6159d1d61840.1762876095.git.fmaurer@redhat.com
Tested-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-11-13 15:55:04 +01:00
Linus Torvalds
2ccec59446 Merge tag 'erofs-for-6.18-rc6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs
Pull erofs fixes from Gao Xiang:

 - Add Chunhai Guo as an EROFS reviewer to get more eyes from interested
   industry vendors

 - Fix infinite loop caused by incomplete crafted zstd-compressed data
   (thanks to Robert again!)

* tag 'erofs-for-6.18-rc6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs:
  erofs: avoid infinite loop due to incomplete zstd-compressed data
  MAINTAINERS: erofs: add myself as reviewer
2025-11-13 05:02:59 -08:00
Linus Torvalds
967a72fa7f Merge tag 'v6.18-rc5-smb-server-fixes' of git://git.samba.org/ksmbd
Pull smb server fixes from Steve French:

 - Fix smbdirect (RDMA) disconnect hang bug

 - Fix potential Denial of Service when connection limit exceeded

 - Fix smbdirect (RDMA) connection (potentially accessing freed memory)
   bug

* tag 'v6.18-rc5-smb-server-fixes' of git://git.samba.org/ksmbd:
  smb: server: let smb_direct_disconnect_rdma_connection() turn CREATED into DISCONNECTED
  ksmbd: close accepted socket when per-IP limit rejects connection
  smb: server: rdma: avoid unmapping posted recv on accept failure
2025-11-13 04:57:38 -08:00
Xuan Zhuo
0eff2eaa53 virtio-net: fix incorrect flags recording in big mode
The purpose of commit 703eec1b24 ("virtio_net: fixing XDP for fully
checksummed packets handling") is to record the flags in advance, as
their value may be overwritten in the XDP case. However, the flags
recorded under big mode are incorrect, because in big mode, the passed
buf does not point to the rx buffer, but rather to the page of the
submitted buffer. This commit fixes this issue.

For the small mode, the commit c11a49d58a ("virtio_net: Fix mismatched
buf address when unmapping for small packets") fixed it.

Tested-by: Alyssa Ross <hi@alyssa.is>
Fixes: 703eec1b24 ("virtio_net: fixing XDP for fully checksummed packets handling")
Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Link: https://patch.msgid.link/20251111090828.23186-1-xuanzhuo@linux.alibaba.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-11-13 13:16:30 +01:00
Linus Torvalds
6fa9041b71 Merge tag 'nfsd-6.18-3' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux
Pull nfsd fixes from Chuck Lever:
 "Address recently reported issues or issues found at the recent NFS
  bake-a-thon held in Raleigh, NC.

  Issues reported with v6.18-rc:
   - Address a kernel build issue
   - Reorder SEQUENCE processing to avoid spurious NFS4ERR_SEQ_MISORDERED

  Issues that need expedient stable backports:
   - Close a refcount leak exposure
   - Report support for NFSv4.2 CLONE correctly
   - Fix oops during COPY_NOTIFY processing
   - Prevent rare crash after XDR encoding failure
   - Prevent crash due to confused or malicious NFSv4.1 client"

* tag 'nfsd-6.18-3' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux:
  Revert "SUNRPC: Make RPCSEC_GSS_KRB5 select CRYPTO instead of depending on it"
  nfsd: ensure SEQUENCE replay sends a valid reply.
  NFSD: Never cache a COMPOUND when the SEQUENCE operation fails
  NFSD: Skip close replay processing if XDR encoding fails
  NFSD: free copynotify stateid in nfs4_free_ol_stateid()
  nfsd: add missing FATTR4_WORD2_CLONE_BLKSIZE from supported attributes
  nfsd: fix refcount leak in nfsd_set_fh_dentry()
2025-11-12 18:41:01 -08:00
Linus Torvalds
92385a075a Merge tag 'dma-mapping-6.18-2025-11-12' of git://git.kernel.org/pub/scm/linux/kernel/git/mszyprowski/linux
Pull dma-mapping fixes from Marek Szyprowski:

 - two minor fixes for DMA API infrastructure: restoring proper
   structure padding used in benchmark tests (Qinxin Xia) and global
   DMA_BIT_MASK macro rework to make it a bit more clang friendly (James
   Clark)

* tag 'dma-mapping-6.18-2025-11-12' of git://git.kernel.org/pub/scm/linux/kernel/git/mszyprowski/linux:
  dma-mapping: Allow use of DMA_BIT_MASK(64) in global scope
  dma-mapping: benchmark: Restore padding to ensure uABI remained consistent
2025-11-12 18:31:22 -08:00
Linus Torvalds
e927c520e1 Merge tag 'loongarch-fixes-6.18-1' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson
Pull LoongArch fixes from Huacai Chen:

 - Fix a Rust build error

 - Fix exception/interrupt, memory management, perf event, hardware
   breakpoint, kexec and KVM bugs

* tag 'loongarch-fixes-6.18-1' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson:
  LoongArch: KVM: Fix max supported vCPUs set with EIOINTC
  LoongArch: KVM: Skip PMU checking on vCPU context switch
  LoongArch: KVM: Restore guest PMU if it is enabled
  LoongArch: KVM: Add delay until timer interrupt injected
  LoongArch: KVM: Set page with write attribute if dirty track disabled
  LoongArch: kexec: Print out debugging message if required
  LoongArch: kexec: Initialize the kexec_buf structure
  LoongArch: Use correct accessor to read FWPC/MWPC
  LoongArch: Refine the init_hw_perf_events() function
  LoongArch: Remove __GFP_HIGHMEM masking in pud_alloc_one()
  LoongArch: Let {pte,pmd}_modify() record the status of _PAGE_DIRTY
  LoongArch: Consolidate max_pfn & max_low_pfn calculation
  LoongArch: Consolidate early_ioremap()/ioremap_prot()
  LoongArch: Use physical addresses for CSR_MERRENTRY/CSR_TLBRENTRY
  LoongArch: Clarify 3 MSG interrupt features
  rust: Add -fno-isolate-erroneous-paths-dereference to bindgen_skip_c_flags
2025-11-12 18:21:30 -08:00
Linus Torvalds
89ee862a4d Merge tag 'alpha-fixes-v6.18-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/mattst88/alpha
Pull alpha fix from Matt Turner:
 "Add Magnus as a maintainer of the alpha port"

* tag 'alpha-fixes-v6.18-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/mattst88/alpha:
  MAINTAINERS: Add Magnus Lindholm as maintainer for alpha port
2025-11-12 18:18:12 -08:00
Jakub Kicinski
fe82c4f8a2 Merge tag 'wireless-2025-11-12' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless
Johannes Berg says:

====================
Couple more fixes:
 - mwl8k: work around FW expecting a DSSS element in beacons
 - ath11k: report correct TX status
 - iwlwifi: avoid toggling links due to wrong element use
 - iwlwifi: fix beacon template rate on older devices
 - iwlwifi: fix loop iterator being used after loop
 - mac80211: disallow address changes while using the address
 - mac80211: avoid bad rate warning in monitor/sniffer mode
 - hwsim: fix potential NULL deref (on monitor injection)

* tag 'wireless-2025-11-12' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless:
  wifi: iwlwifi: mld: always take beacon ies in link grading
  wifi: iwlwifi: mvm: fix beacon template/fixed rate
  wifi: iwlwifi: fix aux ROC time event iterator usage
  wifi: mwl8k: inject DSSS Parameter Set element into beacons if missing
  wifi: mac80211_hwsim: Fix possible NULL dereference
  wifi: mac80211: skip rate verification for not captured PSDUs
  wifi: mac80211: reject address change while connecting
  wifi: ath11k: zero init info->status in wmi_process_mgmt_tx_comp()
====================

Link: https://patch.msgid.link/20251112114621.15716-5-johannes@sipsolutions.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-12 09:33:09 -08:00
Chuang Wang
ac1499fcd4 ipv4: route: Prevent rt_bind_exception() from rebinding stale fnhe
The sit driver's packet transmission path calls: sit_tunnel_xmit() ->
update_or_create_fnhe(), which leads to fnhe_remove_oldest() being called
to delete entries exceeding FNHE_RECLAIM_DEPTH+random.

The race window is between fnhe_remove_oldest() selecting fnheX for
deletion and the subsequent kfree_rcu(). During this time, the
concurrent path's __mkroute_output() -> find_exception() can fetch the
soon-to-be-deleted fnheX, and rt_bind_exception() then binds it with a
new dst using a dst_hold(). When the original fnheX is freed via RCU,
the dst reference remains permanently leaked.

CPU 0                             CPU 1
__mkroute_output()
  find_exception() [fnheX]
                                  update_or_create_fnhe()
                                    fnhe_remove_oldest() [fnheX]
  rt_bind_exception() [bind dst]
                                  RCU callback [fnheX freed, dst leak]

This issue manifests as a device reference count leak and a warning in
dmesg when unregistering the net device:

  unregister_netdevice: waiting for sitX to become free. Usage count = N

Ido Schimmel provided the simple test validation method [1].

The fix clears 'oldest->fnhe_daddr' before calling fnhe_flush_routes().
Since rt_bind_exception() checks this field, setting it to zero prevents
the stale fnhe from being reused and bound to a new dst just before it
is freed.
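
The interleaving and the effect of clearing the address can be replayed in a
toy model. This Python sketch is not the kernel code: the class and the
`*_model` functions are hypothetical, a list of bound dsts stands in for the
dst reference, and the two CPUs are simulated by calling the steps in the
order shown in the diagram.

```python
DADDR = 0xC0000202  # 192.0.2.2 as an integer, chosen for illustration

class Fnhe:
    def __init__(self, daddr):
        self.fnhe_daddr = daddr
        self.bound_dsts = []

def rt_bind_exception_model(fnhe, daddr, dst):
    """Bind dst only if the exception still matches the destination."""
    if fnhe.fnhe_daddr != daddr:
        return False
    fnhe.bound_dsts.append(dst)  # models holding a reference on dst
    return True

def remove_oldest_model(fnhe, clear_daddr):
    """Models fnhe_remove_oldest(): optionally clear daddr, then flush."""
    if clear_daddr:
        fnhe.fnhe_daddr = 0
    fnhe.bound_dsts.clear()      # models fnhe_flush_routes()

def race(clear_daddr):
    fnhe = Fnhe(DADDR)
    found = fnhe                                  # CPU 0: find_exception()
    remove_oldest_model(fnhe, clear_daddr)        # CPU 1: selects fnheX
    bound = rt_bind_exception_model(found, DADDR, "dst")  # CPU 0: bind
    return bound, len(fnhe.bound_dsts)            # refs left on doomed entry

without_fix = race(clear_daddr=False)
with_fix = race(clear_daddr=True)
```

Without the clear, the doomed entry ends up holding one reference that
nothing will release; with the clear, the bind is refused.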

[1]
ip netns add ns1
ip -n ns1 link set dev lo up
ip -n ns1 address add 192.0.2.1/32 dev lo
ip -n ns1 link add name dummy1 up type dummy
ip -n ns1 route add 192.0.2.2/32 dev dummy1
ip -n ns1 link add name gretap1 up arp off type gretap \
    local 192.0.2.1 remote 192.0.2.2
ip -n ns1 route add 198.51.0.0/16 dev gretap1
taskset -c 0 ip netns exec ns1 mausezahn gretap1 \
    -A 198.51.100.1 -B 198.51.0.0/16 -t udp -p 1000 -c 0 -q &
taskset -c 2 ip netns exec ns1 mausezahn gretap1 \
    -A 198.51.100.1 -B 198.51.0.0/16 -t udp -p 1000 -c 0 -q &
sleep 10
ip netns pids ns1 | xargs kill
ip netns del ns1

Cc: stable@vger.kernel.org
Fixes: 67d6d681e1 ("ipv4: make exception cache less predictible")
Signed-off-by: Chuang Wang <nashuiliang@gmail.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20251111064328.24440-1-nashuiliang@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-12 06:46:36 -08:00
Johannes Berg
a35f64a216 Merge tag 'iwlwifi-fixes-2025-11-12' of https://git.kernel.org/pub/scm/linux/kernel/git/iwlwifi/iwlwifi-next
Miri Korenblit says:
====================
iwlwifi fixes:

- avoid link toggling
- fix beacon template rate
- don't use iterator outside the loop
====================

Link: https://patch.msgid.link/DM3PPF63A6024A9E52FF4A7B23F283B7FC7A3CCA@DM3PPF63A6024A9.namprd11.prod.outlook.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-11-12 09:51:05 +01:00
Miri Korenblit
1a222625b4 wifi: iwlwifi: mld: always take beacon ies in link grading
One of the factors of a link's grade is the channel load, which is
calculated from the AP's bss load element.
The current code takes this element from the beacon for an active link,
and from bss->ies for an inactive link.

bss->ies is set to either the beacon's ies or to the probe response
ones, with preference to the probe response (meaning that if there was
even one probe response, the ies of it will be stored in bss->ies and
won't be overridden by the beacon ies).

The probe response can be very old, i.e. from the connection time,
whereas a beacon is updated before each link selection (which is
triggered only after a passive scan).

In such a case, the bss load element in the probe response will not
include the channel load caused by the STA, whereas the beacon will.

This will cause the inactive link to always have a lower channel
load, and therefore a higher grade than the active link's one.

This causes repeated link switches, causing the throughput to drop.

Fix this by always taking the ies from the beacon, as those are for
sure new.

Fixes: d1e879ec60 ("wifi: iwlwifi: add iwlmld sub-driver")
Reviewed-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://patch.msgid.link/20251110145652.b493dbb1853a.I058ba7309c84159f640cc9682d1bda56dd56a536@changeid
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
2025-11-12 09:54:46 +02:00
Johannes Berg
3592c0083f wifi: iwlwifi: mvm: fix beacon template/fixed rate
During the development of the rate changes, I evidently made
some changes that shouldn't have been there; beacon templates
with rate_n_flags are only in old versions, so no changes to
them should have been necessary, and evidently broke on some
devices. This also would have broken fixed (injection) rates,
it would seem. Restore the old handling of this.

Fixes: dabc88cb3b ("wifi: iwlwifi: handle v3 rates")
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=220558
Reviewed-by: Benjamin Berg <benjamin.berg@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Link: https://patch.msgid.link/20251008112044.3bb8ea849d8d.I90f4d2b2c1f62eaedaf304a61d2ab9e50c491c2d@changeid
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
2025-11-12 09:54:46 +02:00
Junjie Cao
f4c737d449 wifi: iwlwifi: fix aux ROC time event iterator usage
The list_for_each_entry() iterator must not be used outside the loop.
Even though we break and check for NULL, doing so still violates kernel
iteration rules and triggers Coccinelle's use_after_iter.cocci warning.

Cache the matched entry in aux_roc_te and use it consistently after the
loop. This follows iterator best practices, resolves the warning, and
makes the code more maintainable.

Signed-off-by: Junjie Cao <junjie.cao@intel.com>
Link: https://patch.msgid.link/20251016014919.383565-1-junjie.cao@intel.com
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
2025-11-12 09:54:46 +02:00
Eric Dumazet
0345552a65 net_sched: limit try_bulk_dequeue_skb() batches
After commit 100dfa74cad9 ("inet: dev_queue_xmit() llist adoption")
I started seeing many qdisc requeues on IDPF under high TX workload.

$ tc -s qd sh dev eth1 handle 1: ; sleep 1; tc -s qd sh dev eth1 handle 1:
qdisc mq 1: root
 Sent 43534617319319 bytes 268186451819 pkt (dropped 0, overlimits 0 requeues 3532840114)
 backlog 1056Kb 6675p requeues 3532840114
qdisc mq 1: root
 Sent 43554665866695 bytes 268309964788 pkt (dropped 0, overlimits 0 requeues 3537737653)
 backlog 781164b 4822p requeues 3537737653
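
For scale, the difference between the two snapshots above (taken roughly one
second apart, per the `sleep 1`) can be worked out directly. The counters
below are copied from the output; the variable names are only for
illustration.

```python
# Counters copied from the two `tc -s qd sh` snapshots above.
requeues_before, requeues_after = 3532840114, 3537737653
packets_before, packets_after = 268186451819, 268309964788

requeues_delta = requeues_after - requeues_before   # requeues in ~1 second
packets_delta = packets_after - packets_before      # packets sent in ~1 second
requeue_ratio = requeues_delta / packets_delta      # requeues per packet sent
```

That is about 4.9 million requeues against about 123.5 million packets sent,
or roughly 4 requeues per 100 packets.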

This is caused by try_bulk_dequeue_skb() being only limited by BQL budget.

perf record -C120-239 -e qdisc:qdisc_dequeue sleep 1 ; perf script
...
 netperf 75332 [146]  2711.138269: qdisc:qdisc_dequeue: dequeue ifindex=5 qdisc handle=0x80150000 parent=0x10013 txq_state=0x0 packets=1292 skbaddr=0xff378005a1e9f200
 netperf 75332 [146]  2711.138953: qdisc:qdisc_dequeue: dequeue ifindex=5 qdisc handle=0x80150000 parent=0x10013 txq_state=0x0 packets=1213 skbaddr=0xff378004d607a500
 netperf 75330 [144]  2711.139631: qdisc:qdisc_dequeue: dequeue ifindex=5 qdisc handle=0x80150000 parent=0x10013 txq_state=0x0 packets=1233 skbaddr=0xff3780046be20100
 netperf 75333 [147]  2711.140356: qdisc:qdisc_dequeue: dequeue ifindex=5 qdisc handle=0x80150000 parent=0x10013 txq_state=0x0 packets=1093 skbaddr=0xff37800514845b00
 netperf 75337 [151]  2711.141037: qdisc:qdisc_dequeue: dequeue ifindex=5 qdisc handle=0x80150000 parent=0x10013 txq_state=0x0 packets=1353 skbaddr=0xff37800460753300
 netperf 75337 [151]  2711.141877: qdisc:qdisc_dequeue: dequeue ifindex=5 qdisc handle=0x80150000 parent=0x10013 txq_state=0x0 packets=1367 skbaddr=0xff378004e72c7b00
 netperf 75330 [144]  2711.142643: qdisc:qdisc_dequeue: dequeue ifindex=5 qdisc handle=0x80150000 parent=0x10013 txq_state=0x0 packets=1202 skbaddr=0xff3780045bd60000
...

This is bad because:

1) Large batches hold one victim cpu for a very long time.

2) Drivers often hit their own TX ring limit (all slots are used).

3) We call dev_requeue_skb()

4) Requeues are using a FIFO (q->gso_skb), breaking qdisc ability to
   implement FQ or priority scheduling.

5) dequeue_skb() gets packets from q->gso_skb one skb at a time
   with no xmit_more support. This is causing many spinlock games
   between the qdisc and the device driver.

Requeues were supposed to be very rare, let's keep them this way.

Limit batch sizes to /proc/sys/net/core/dev_weight (default 64) as
__qdisc_run() was designed to use.
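
The effect of the limit can be sketched in plain user-space C. This is a
minimal model, not the kernel code: struct pkt_queue, bulk_dequeue() and
its parameters are hypothetical names. Only the idea comes from the text
above: a batch is capped by a byte budget (standing in for BQL) and also
by a packet budget (standing in for dev_weight).

```c
#include <stddef.h>

/* Hypothetical model of a queue: an array of packet lengths in bytes. */
struct pkt_queue {
	const unsigned int *len;	/* length of each queued packet */
	size_t count;			/* number of queued packets */
	size_t head;			/* index of the next packet to dequeue */
};

/*
 * Dequeue packets until the queue is empty, the byte budget is used up,
 * or the packet budget is reached. Returns the number of packets taken.
 */
static size_t bulk_dequeue(struct pkt_queue *q, long byte_budget,
			   size_t pkt_budget)
{
	size_t taken = 0;

	while (q->head < q->count && byte_budget > 0 && taken < pkt_budget) {
		byte_budget -= (long)q->len[q->head];
		q->head++;
		taken++;
	}
	return taken;
}
```

With a large byte budget and no packet budget, one call could drain more
than a thousand packets, as in the trace above. With a packet budget of
64, the same call stops after 64 packets and the remainder stays in the
qdisc.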

Fixes: 5772e9a346 ("qdisc: bulk dequeue support for qdiscs with TCQ_F_ONETXQUEUE")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
Acked-by: Jesper Dangaard Brouer <hawk@kernel.org>
Link: https://patch.msgid.link/20251109161215.2574081-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-11 17:56:50 -08:00
Magnus Lindholm
d58041d2c6 MAINTAINERS: Add Magnus Lindholm as maintainer for alpha port
Acked-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Acked-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Magnus Lindholm <linmag7@gmail.com>
Signed-off-by: Matt Turner <mattst88@gmail.com>
2025-11-11 20:52:04 -05:00
Jakub Kicinski
7a6fa4f89e Merge branch 'selftests-mptcp-join-fix-some-flaky-tests'
Matthieu Baerts says:

====================
selftests: mptcp: join: fix some flaky tests

When looking at the recent CI results on NIPA and MPTCP CIs, a few MPTCP
Join tests are marked as unstable. Here are some fixes for that.

- Patch 1: a small fix for mptcp_connect.sh, printing a note as
  initially intended. For >=v5.13.

- Patch 2: avoid unexpected reset when closing subflows. For >= 5.13.

- Patches 3-4: longer transfer when not waiting for the end. For >=5.18.

- Patch 5: read all received data when expecting a reset. For >= v6.1.

- Patch 6: a fix to properly kill background tasks. For >= v6.5.
====================

Link: https://patch.msgid.link/20251110-net-mptcp-sft-join-unstable-v1-0-a4332c714e10@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-11 17:49:53 -08:00
Matthieu Baerts (NGI0)
852b644acb selftests: mptcp: join: properly kill background tasks
The 'run_tests' function is executed in the background, but killing its
associated PID would not kill the children tasks running in the
background.

To properly kill all background tasks, 'kill -- -PID' could be used, but
this requires kill from procps-ng. Instead, all children tasks are
listed using 'ps', and 'kill' is called with all PIDs of this group.

Fixes: 31ee4ad86a ("selftests: mptcp: join: stop transfer when check is done (part 1)")
Cc: stable@vger.kernel.org
Fixes: 04b57c9e09 ("selftests: mptcp: join: stop transfer when check is done (part 2)")
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20251110-net-mptcp-sft-join-unstable-v1-6-a4332c714e10@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-11 17:49:49 -08:00
Matthieu Baerts (NGI0)
ee79980f7a selftests: mptcp: connect: trunc: read all recv data
MPTCP Join "fastclose server" selftest is sometimes failing because the
client output file doesn't have the expected size, e.g. 296B instead of
1024B.

When looking at a packet trace when this happens, the server sent the
expected 1024B in two parts -- 100B, then 924B -- then the MP_FASTCLOSE.
It is then strange to see the client only receiving 296B, which would
mean it only got a part of the second packet. The problem is then not on
the networking side, but rather on the data reception side.

When mptcp_connect is launched with '-f -1', it means the connection
might stop before having sent everything, because a reset has been
received. When this happens, the program was directly stopped. But it is
also possible there are still some data to read, simply because the
previous 'read' step was done with a buffer smaller than the pending
data, see do_rnd_read(). In this case, it is important to read what's
left in the kernel buffers before stopping without error like before.

SIGPIPE is now ignored, not to quit the app before having read
everything.
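
The "read what's left" step can be sketched as follows. This is not the
code from mptcp_connect: drain_fd() and ignore_sigpipe() are hypothetical
helpers, and the 64-byte buffer is chosen only to show that several
read() calls may be needed before the kernel buffer is empty.

```c
#include <errno.h>
#include <signal.h>
#include <unistd.h>

/*
 * Read everything still pending on fd with a deliberately small buffer,
 * until EOF or a hard error. Returns the number of bytes read, or -1 on
 * error.
 */
static long drain_fd(int fd)
{
	char buf[64];
	long total = 0;

	for (;;) {
		ssize_t n = read(fd, buf, sizeof(buf));

		if (n > 0) {
			total += n;
			continue;
		}
		if (n == 0)		/* EOF: nothing left to read */
			return total;
		if (errno == EINTR)	/* interrupted by a signal, retry */
			continue;
		return -1;
	}
}

/*
 * Ignore SIGPIPE so that writing to a closed peer fails with EPIPE
 * instead of terminating the process before the remaining data is read.
 */
static int ignore_sigpipe(void)
{
	return signal(SIGPIPE, SIG_IGN) == SIG_ERR ? -1 : 0;
}
```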

Fixes: 6bf41020b7 ("selftests: mptcp: update and extend fastclose test-cases")
Cc: stable@vger.kernel.org
Reviewed-by: Geliang Tang <geliang@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20251110-net-mptcp-sft-join-unstable-v1-5-a4332c714e10@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-11 17:49:49 -08:00
Matthieu Baerts (NGI0)
290493078b selftests: mptcp: join: userspace: longer transfer
In rare cases, when the test environment is very slow, some userspace
tests can fail because some expected events have not been seen.

Because the tests are expecting a long on-going connection, and they are
not waiting for the end of the transfer, it is fine to make the
connection longer. This connection will be killed at the end, after the
verifications, so making it longer doesn't change anything, apart from
preventing it from ending before the end of the verifications.

To play it safe, all userspace tests not waiting for the end of the
transfer are now sharing a longer file (128KB) at slow speed.

Fixes: 4369c198e5 ("selftests: mptcp: test userspace pm out of transfer")
Cc: stable@vger.kernel.org
Fixes: b2e2248f36 ("selftests: mptcp: userspace pm create id 0 subflow")
Fixes: e3b47e460b ("selftests: mptcp: userspace pm remove initial subflow")
Fixes: b9fb176081 ("selftests: mptcp: userspace pm send RM_ADDR for ID 0")
Reviewed-by: Geliang Tang <geliang@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20251110-net-mptcp-sft-join-unstable-v1-4-a4332c714e10@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-11 17:49:49 -08:00
Matthieu Baerts (NGI0)
6457595db9 selftests: mptcp: join: endpoints: longer transfer
In rare cases, when the test environment is very slow, some userspace
tests can fail because some expected events have not been seen.

Because the tests are expecting a long on-going connection, and they are
not waiting for the end of the transfer, it is fine to make the
connection longer. This connection will be killed at the end, after the
verifications, so making it longer doesn't change anything, apart from
preventing it from ending before the end of the verifications.

To play it safe, all endpoints tests not waiting for the end of the
transfer are now sharing a longer file (128KB) at slow speed.

Fixes: 69c6ce7b6e ("selftests: mptcp: add implicit endpoint test case")
Cc: stable@vger.kernel.org
Fixes: e274f71540 ("selftests: mptcp: add subflow limits test-cases")
Fixes: b5e2fb832f ("selftests: mptcp: add explicit test case for remove/readd")
Fixes: e06959e9ee ("selftests: mptcp: join: test for flush/re-add endpoints")
Reviewed-by: Geliang Tang <geliang@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20251110-net-mptcp-sft-join-unstable-v1-3-a4332c714e10@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-11 17:49:48 -08:00
Matthieu Baerts (NGI0)
aea73bae66 selftests: mptcp: join: rm: set backup flag
Some of these 'remove' tests rarely fail because a subflow has been
reset instead of cleanly removed. This can happen when one extra subflow
which has never carried data is being closed (FIN) on one side, while
the other is sending data for the first time.

To prevent such subflows from being used right at the end, the backup
flag has been added. With that, data will only be carried on the initial
subflow.

Fixes: d2c4333a80 ("selftests: mptcp: add testcases for removing addrs")
Cc: stable@vger.kernel.org
Reviewed-by: Geliang Tang <geliang@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20251110-net-mptcp-sft-join-unstable-v1-2-a4332c714e10@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-11 17:49:48 -08:00
Matthieu Baerts (NGI0)
63c643aa7b selftests: mptcp: connect: fix fallback note due to OoO
The "fallback due to TCP OoO" was never printed because the stat_ooo_now
variable was checked twice: once in the parent if-statement, and once in
the child one. The second condition was therefore always true, and the
'else' branch was never taken.

The idea is that when there are more ACK + MP_CAPABLE than expected, the
test either fails if there was no out of order packets, or a notice is
printed.
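
The shape of the bug can be shown with a small C model. The selftest
itself is a shell script, so this is only an illustration: the function
names, the parameters and the exact form of the buggy condition are
assumptions reconstructed from the description above.

```c
enum verdict { VERDICT_OK, VERDICT_FAIL, VERDICT_NOTE };

/*
 * Buggy shape: the outer test already requires ooo == 0, so the inner
 * test is always true and the "note" branch is dead code.
 */
static enum verdict check_buggy(int ackrx, int expected, int ooo)
{
	if (ackrx > expected && ooo == 0) {
		if (ooo == 0)
			return VERDICT_FAIL;
		else
			return VERDICT_NOTE;	/* unreachable */
	}
	return VERDICT_OK;
}

/*
 * Fixed shape: the out-of-order counter is checked only once, so extra
 * ACKs with out-of-order packets produce a note instead of silence.
 */
static enum verdict check_fixed(int ackrx, int expected, int ooo)
{
	if (ackrx > expected) {
		if (ooo == 0)
			return VERDICT_FAIL;
		return VERDICT_NOTE;
	}
	return VERDICT_OK;
}
```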

Fixes: 69ca3d29a7 ("mptcp: update selftest for fallback due to OoO")
Cc: stable@vger.kernel.org
Reviewed-by: Geliang Tang <geliang@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20251110-net-mptcp-sft-join-unstable-v1-1-a4332c714e10@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-11 17:49:47 -08:00
Jakub Kicinski
27bcc05b88 Merge tag 'for-net-2025-11-11' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth
Luiz Augusto von Dentz says:

====================
bluetooth pull request for net:

 - hci_conn: Fix not cleaning up PA_LINK connections
 - hci_event: Fix not handling PA Sync Lost event
 - MGMT: cancel mesh send timer when hdev removed
 - 6lowpan: reset link-local header on ipv6 recv path
 - 6lowpan: fix BDADDR_LE vs ADDR_LE_DEV address type confusion
 - L2CAP: export l2cap_chan_hold for modules
 - 6lowpan: Don't hold spin lock over sleeping functions
 - 6lowpan: add missing l2cap_chan_lock()
 - btusb: reorder cleanup in btusb_disconnect to avoid UAF
 - btrtl: Avoid loading the config file on security chips

* tag 'for-net-2025-11-11' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth:
  Bluetooth: btrtl: Avoid loading the config file on security chips
  Bluetooth: hci_event: Fix not handling PA Sync Lost event
  Bluetooth: hci_conn: Fix not cleaning up PA_LINK connections
  Bluetooth: 6lowpan: add missing l2cap_chan_lock()
  Bluetooth: 6lowpan: Don't hold spin lock over sleeping functions
  Bluetooth: L2CAP: export l2cap_chan_hold for modules
  Bluetooth: 6lowpan: fix BDADDR_LE vs ADDR_LE_DEV address type confusion
  Bluetooth: 6lowpan: reset link-local header on ipv6 recv path
  Bluetooth: btusb: reorder cleanup in btusb_disconnect to avoid UAF
  Bluetooth: MGMT: cancel mesh send timer when hdev removed
====================

Link: https://patch.msgid.link/20251111141357.1983153-1-luiz.dentz@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-11 17:43:32 -08:00
Kriish Sharma
bb8336a516 ethtool: fix incorrect kernel-doc style comment in ethtool.h
Building documentation produced the following warning:

  WARNING: ./include/linux/ethtool.h:495 This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst
 * IEEE 802.3ck/df defines 16 bins for FEC histogram plus one more for

This comment was not intended to be parsed as kernel-doc, so replace
the '/**' with '/*' to silence the warning and align with normal
comment style in header files.

No functional changes.

Signed-off-by: Kriish Sharma <kriish.sharma2006@gmail.com>
Link: https://patch.msgid.link/20251110182545.2112596-1-kriish.sharma2006@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-11 17:38:48 -08:00
Linus Torvalds
24172e0d79 Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fixes from Will Deacon:
 "There's more here than I would ideally like at this stage, but there's
  been a steady trickle of fixes and some of them took a few rounds of
  review.

  The bulk of the changes are fixing some fallout from the recent BBM
  level two support which allows the linear map to be split from block
  to page mappings at runtime, but inadvertently led to sleeping in
  atomic context on some paths where the linear map was already mapped
  with page granularity. The fix is simply to avoid splitting in those
  cases but the implementation of that is a little involved.

  The other interesting fix is addressing a catastrophic performance
  issue with our per-cpu atomics discovered by Paul in the SRCU locking
  code but which took some interactions with the hardware folks to
  resolve.

  Summary:

   - Avoid sleeping in atomic context when changing linear map
     permissions for DEBUG_PAGEALLOC or KFENCE

   - Rework printing of Spectre mitigation status to avoid hardlockup
     when enabling per-task mitigations on the context-switch path

   - Reject kernel modules when instruction patching fails either due to
     the DWARF-based SCS patching or because of an alternatives callback
     residing outside of the core kernel text

   - Propagate error when updating kernel memory permissions in kprobes

   - Drop pointless, incorrect message when enabling the ACPI SPCR
     console

   - Use value-returning LSE instructions for per-cpu atomics to reduce
     latency in SRCU locking routines"

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64: Reject modules with internal alternative callbacks
  arm64: Fail module loading if dynamic SCS patching fails
  arm64: proton-pack: Fix hard lockup due to print in scheduler context
  arm64: proton-pack: Drop print when !CONFIG_MITIGATE_SPECTRE_BRANCH_HISTORY
  arm64: mm: Tidy up force_pte_mapping()
  arm64: mm: Optimize range_split_to_ptes()
  arm64: mm: Don't sleep in split_kernel_leaf_mapping() when in atomic context
  arm64: kprobes: check the return value of set_memory_rox()
  arm64: acpi: Drop message logging SPCR default console
  Revert "ACPI: Suppress misleading SPCR console message when SPCR table is absent"
  arm64: Use load LSE atomics for the non-return per-CPU atomic operations
2025-11-11 10:31:17 -08:00
Linus Torvalds
8341374f67 Merge tag 'for-6.18-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull btrfs fixes from David Sterba:

 - fix new inode name tracking in tree-log

 - fix conventional zone and stripe calculations in zoned mode

 - fix bio reference counts on error paths in relocation and scrub

* tag 'for-6.18-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
  btrfs: release root after error in data_reloc_print_warning_inode()
  btrfs: scrub: put bio after errors in scrub_raid56_parity_stripe()
  btrfs: do not update last_log_commit when logging inode due to a new name
  btrfs: zoned: fix stripe width calculation
  btrfs: zoned: fix conventional zone capacity calculation
2025-11-11 10:13:17 -08:00
Linus Torvalds
537d196186 Merge tag 'mm-hotfixes-stable-2025-11-10-19-30' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull misc fixes from Andrew Morton:
 "26 hotfixes.  22(!) are cc:stable, 22 are MM.

   - address some Kexec Handover issues (Pasha Tatashin)

   - fix handling of large folios which are mapped outside i_size (Kiryl
     Shutsemau)

   - fix some DAMON time issues on 32-bit machines (Quanmin Yan)

  Plus the usual shower of singletons"

* tag 'mm-hotfixes-stable-2025-11-10-19-30' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (26 commits)
  kho: warn and exit when unpreserved page wasn't preserved
  kho: fix unpreservation of higher-order vmalloc preservations
  kho: fix out-of-bounds access of vmalloc chunk
  MAINTAINERS: add Chris and Kairui as the swap maintainer
  mm/secretmem: fix use-after-free race in fault handler
  mm/huge_memory: initialise the tags of the huge zero folio
  nilfs2: avoid having an active sc_timer before freeing sci
  scripts/decode_stacktrace.sh: fix build ID and PC source parsing
  mm/damon/sysfs: change next_update_jiffies to a global variable
  mm/damon/stat: change last_refresh_jiffies to a global variable
  maple_tree: fix tracepoint string pointers
  codetag: debug: handle existing CODETAG_EMPTY in mark_objexts_empty for slabobj_ext
  mm/mremap: honour writable bit in mremap pte batching
  gcov: add support for GCC 15
  mm/mm_init: fix hash table order logging in alloc_large_system_hash()
  mm/truncate: unmap large folio on split failure
  mm/memory: do not populate page table entries beyond i_size
  fs/proc: fix uaf in proc_readdir_de()
  mm/huge_memory: preserve PG_has_hwpoisoned if a folio is split to >0 order
  ksm: use range-walk function to jump over holes in scan_get_next_rmap_item
  ...
2025-11-11 09:49:56 -08:00
Stefan Metzmacher
55286b1e1b smb: server: let smb_direct_disconnect_rdma_connection() turn CREATED into DISCONNECTED
When smb_direct_disconnect_rdma_connection() turns SMBDIRECT_SOCKET_CREATED
into SMBDIRECT_SOCKET_ERROR, we'll have the situation that
smb_direct_disconnect_rdma_work() will set SMBDIRECT_SOCKET_DISCONNECTING
and call rdma_disconnect(), which likely fails as we never reached
the RDMA_CM_EVENT_ESTABLISHED. It means that
wait_event(sc->status_wait, sc->status == SMBDIRECT_SOCKET_DISCONNECTED)
in free_transport() will hang forever in SMBDIRECT_SOCKET_DISCONNECTING
never reaching SMBDIRECT_SOCKET_DISCONNECTED.

So we directly go from SMBDIRECT_SOCKET_CREATED to
SMBDIRECT_SOCKET_DISCONNECTED.
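
The transition can be modelled with a much smaller state machine than
the real one. The enum and the function below are hypothetical and only
illustrate the rule described above: a socket that never got past
CREATED has nothing to tear down, so it must not be parked in a state
that waits for a disconnect event that will never arrive.

```c
/* Simplified, hypothetical subset of the socket states. */
enum sock_state {
	ST_CREATED,
	ST_CONNECTED,
	ST_ERROR,
	ST_DISCONNECTING,
	ST_DISCONNECTED,
};

/* State to move to when a disconnect is requested in state cur. */
static enum sock_state state_on_disconnect(enum sock_state cur)
{
	switch (cur) {
	case ST_CREATED:
		/* Never established: finish immediately. */
		return ST_DISCONNECTED;
	case ST_DISCONNECTING:
	case ST_DISCONNECTED:
		/* Teardown already in progress or done. */
		return cur;
	default:
		/* Established or failing: let the teardown work run. */
		return ST_ERROR;
	}
}
```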

Fixes: b3fd52a0d8 ("smb: server: let smb_direct_disconnect_rdma_connection() set SMBDIRECT_SOCKET_ERROR...")
Cc: Namjae Jeon <linkinjeon@kernel.org>
Cc: Steve French <smfrench@gmail.com>
Cc: Tom Talpey <tom@talpey.com>
Cc: linux-cifs@vger.kernel.org
Cc: samba-technical@lists.samba.org
Signed-off-by: Stefan Metzmacher <metze@samba.org>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
2025-11-11 09:50:35 -06:00
Akiva Goldberger
e5eba42f01 mlx5: Fix default values in create CQ
Currently, CQs without a completion function are assigned the
mlx5_add_cq_to_tasklet function by default. This is problematic since
only user CQs created through the mlx5_ib driver are intended to use
this function.

Additionally, all CQs that will use doorbells instead of polling for
completions must call mlx5_cq_arm. However, the default CQ creation flow
leaves a valid value in the CQ's arm_db field, allowing FW to send
interrupts to polling-only CQs in certain corner cases.

These two factors would allow a polling-only kernel CQ to be triggered
by an EQ interrupt and call a completion function intended only for user
CQs, causing a null pointer exception.

Some areas in the driver have prevented this issue with one-off fixes
but did not address the root cause.

This patch fixes the described issue by adding defaults to the create CQ
flow. It adds a default dummy completion function to protect against
null pointer exceptions, and it sets an invalid command sequence number
by default in kernel CQs to prevent the FW from sending an interrupt to
the CQ until it is armed. User CQs are responsible for their own
initialization values.

Callers of mlx5_core_create_cq are responsible for changing the
completion function and arming the CQ per their needs.
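
The "default dummy completion function" idea can be sketched in
user-space C. Nothing here is the mlx5 API: struct demo_cq, demo_cq_init()
and demo_cq_event() are hypothetical names used only to show why a no-op
default removes the null pointer dereference.

```c
#include <stddef.h>

struct demo_cq;
typedef void (*demo_comp_fn)(struct demo_cq *cq);

/* Hypothetical completion queue with a completion callback. */
struct demo_cq {
	demo_comp_fn comp;
	unsigned int events;	/* number of events delivered so far */
};

/* No-op default used when the creator supplies no callback. */
static void demo_dummy_comp(struct demo_cq *cq)
{
	(void)cq;
}

/* Create-time defaults: the callback pointer is never left NULL. */
static void demo_cq_init(struct demo_cq *cq, demo_comp_fn comp)
{
	cq->comp = comp ? comp : demo_dummy_comp;
	cq->events = 0;
}

/* Event path: safe to call even for a polling-only queue. */
static void demo_cq_event(struct demo_cq *cq)
{
	cq->events++;
	cq->comp(cq);
}
```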

Fixes: cdd04f4d4d ("net/mlx5: Add support to create SQ and CQ for ASO")
Signed-off-by: Akiva Goldberger <agoldberger@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Acked-by: Leon Romanovsky <leon@kernel.org>
Link: https://patch.msgid.link/1762681743-1084694-1-git-send-email-tariqt@nvidia.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-11-11 15:12:18 +01:00
Max Chou
cd8dbd9ef6 Bluetooth: btrtl: Avoid loading the config file on security chips
For chips with security enabled, it's only possible to load firmware
with a valid signature pattern.
If key_id is not zero, it indicates a security chip, and the driver will
not load the config file.

- Example log for a security chip.

Bluetooth: hci0: RTL: examining hci_ver=0c hci_rev=000a
  lmp_ver=0c lmp_subver=8922
Bluetooth: hci0: RTL: rom_version status=0 version=1
Bluetooth: hci0: RTL: btrtl_initialize: key id 1
Bluetooth: hci0: RTL: loading rtl_bt/rtl8922au_fw.bin
Bluetooth: hci0: RTL: cfg_sz 0, total sz 71301
Bluetooth: hci0: RTL: fw version 0x41c0c905

- Example log for a normal chip.

Bluetooth: hci0: RTL: examining hci_ver=0c hci_rev=000a
  lmp_ver=0c lmp_subver=8922
Bluetooth: hci0: RTL: rom_version status=0 version=1
Bluetooth: hci0: RTL: btrtl_initialize: key id 0
Bluetooth: hci0: RTL: loading rtl_bt/rtl8922au_fw.bin
Bluetooth: hci0: RTL: loading rtl_bt/rtl8922au_config.bin
Bluetooth: hci0: RTL: cfg_sz 6, total sz 71307
Bluetooth: hci0: RTL: fw version 0x41c0c905

Tested-by: Hilda Wu <hildawu@realtek.com>
Signed-off-by: Nial Ni <niall_ni@realsil.com.cn>
Signed-off-by: Max Chou <max.chou@realtek.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2025-11-11 09:06:57 -05:00
Paolo Abeni
ed6b5632e0 Merge branch 'mlx5e-misc-fixes-2025-11-09'
Tariq Toukan says:

====================
mlx5e misc fixes 2025-11-09

This patchset provides misc bug fixes from the team to the mlx5 Eth
driver.
====================

Link: https://patch.msgid.link/1762681073-1084058-1-git-send-email-tariqt@nvidia.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-11-11 15:05:46 +01:00
Gal Pressman
9fcc2b6c10 net/mlx5e: Fix potentially misleading debug message
Change the debug message to print the correct units instead of always
assuming Gbps, as the value can be in either 100 Mbps or 1 Gbps units.

Fixes: 5da8bc3eff ("net/mlx5e: DCBNL, Add debug messages log")
Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Nimrod Oren <noren@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/1762681073-1084058-6-git-send-email-tariqt@nvidia.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-11-11 15:05:44 +01:00
Gal Pressman
43b27d1bd8 net/mlx5e: Fix wraparound in rate limiting for values above 255 Gbps
Add validation to reject rates exceeding 255 Gbps that would overflow
the 8-bit max bandwidth field.

Fixes: d8880795da ("net/mlx5e: Implement DCBNL IEEE max rate")
Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Nimrod Oren <noren@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/1762681073-1084058-5-git-send-email-tariqt@nvidia.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-11-11 15:05:44 +01:00
Gal Pressman
a7bf4d5063 net/mlx5e: Fix maxrate wraparound in threshold between units
The previous calculation used roundup() which caused an overflow for
rates between 25.5Gbps and 26Gbps.
For example, a rate of 25.6Gbps would result in using 100Mbps units with
value of 256, which would overflow the 8-bit field.

Simplify the upper_limit_mbps calculation by removing the
unnecessary roundup, and adjust the comparison to use <= to correctly
handle the boundary condition.
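
The arithmetic can be checked with a small model. This is a sketch, not
the driver code: all names are hypothetical, rates are expressed in Mbps
for readability, and the driver's handling of very small rates is left
out.

```c
#include <stdint.h>

#define DEMO_100MBPS	100u	/* Mbps per "100 Mbps" unit */
#define DEMO_1GBPS	1000u	/* Mbps per "1 Gbps" unit */
#define DEMO_FIELD_MAX	255u	/* largest value an 8-bit field can hold */

enum demo_unit { DEMO_UNIT_100MBPS, DEMO_UNIT_GBPS };

struct demo_rate {
	enum demo_unit unit;
	uint64_t value;		/* must fit in 8 bits to be valid */
};

static uint64_t demo_roundup(uint64_t x, uint64_t step)
{
	return ((x + step - 1) / step) * step;
}

/* Old behaviour: threshold rounded up to 26000 Mbps, strict "<". */
static struct demo_rate encode_old(uint64_t mbps)
{
	uint64_t limit = demo_roundup(DEMO_FIELD_MAX * DEMO_100MBPS, DEMO_1GBPS);
	struct demo_rate r;

	if (mbps < limit) {
		r.unit = DEMO_UNIT_100MBPS;
		r.value = mbps / DEMO_100MBPS;
	} else {
		r.unit = DEMO_UNIT_GBPS;
		r.value = mbps / DEMO_1GBPS;
	}
	return r;
}

/* New behaviour: threshold is exactly 25500 Mbps, compared with "<=". */
static struct demo_rate encode_new(uint64_t mbps)
{
	uint64_t limit = DEMO_FIELD_MAX * DEMO_100MBPS;
	struct demo_rate r;

	if (mbps <= limit) {
		r.unit = DEMO_UNIT_100MBPS;
		r.value = mbps / DEMO_100MBPS;
	} else {
		r.unit = DEMO_UNIT_GBPS;
		r.value = mbps / DEMO_1GBPS;
	}
	return r;
}
```

For 25600 Mbps, encode_old() picks 100 Mbps units with a value of 256,
which does not fit in 8 bits, while encode_new() picks Gbps units with a
value of 25.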

Fixes: d8880795da ("net/mlx5e: Implement DCBNL IEEE max rate")
Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Nimrod Oren <noren@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/1762681073-1084058-4-git-send-email-tariqt@nvidia.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-11-11 15:05:44 +01:00
Cosmin Ratiu
2dc768c052 net/mlx5e: Trim the length of the num_doorbell error
When trying to set num_doorbells to a value greater than the max number
of channels, the error message was going over the netlink limit of 80
chars, truncating the most important part of the message, the number of
channels.

Fix that by trimming the length a bit.

Fixes: 11bbcfb766 ("net/mlx5e: Use the 'num_doorbells' devlink param")
Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com>
Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/1762681073-1084058-3-git-send-email-tariqt@nvidia.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-11-11 15:05:44 +01:00
Carolina Jubran
0bcd5b3b50 net/mlx5e: Fix missing error assignment in mlx5e_xfrm_add_state()
Assign the return value of mlx5_eswitch_block_mode() to 'err' before
checking it to avoid returning an uninitialized error code.
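
A minimal sketch of the pattern, with hypothetical function names. The
comment describes the presumed buggy shape; only the corrected form is
compiled, because reading an uninitialized variable is undefined
behaviour.

```c
#include <errno.h>

/* Hypothetical stand-in for a call that may refuse the request. */
static int demo_block_mode(int busy)
{
	return busy ? -EBUSY : 0;
}

static int demo_add_state(int busy)
{
	int err;

	/*
	 * Buggy shape (assumed): "if (demo_block_mode(busy)) goto out;"
	 * reaches "return err" without err ever being written.
	 */
	err = demo_block_mode(busy);
	if (err)
		goto out;

	/* ... the rest of the setup would go here ... */
	err = 0;
out:
	return err;
}
```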

Fixes: 22239eb258 ("net/mlx5e: Prevent tunnel reformat when tunnel mode not allowed")
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Closes: https://lore.kernel.org/r/202510271649.uwsIxD6O-lkp@intel.com/
Closes: http://lore.kernel.org/linux-rdma/aPIEK4rLB586FdDt@stanley.mountain/
Signed-off-by: Carolina Jubran <cjubran@nvidia.com>
Reviewed-by: Jianbo Liu <jianbol@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/1762681073-1084058-2-git-send-email-tariqt@nvidia.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-11-11 15:05:44 +01:00
Paolo Abeni
02e9578c3e Merge branch 'net-sched-initialize-struct-tc_ife-to-fix-kernel-infoleak'
Ranganath says:

====================
net: sched: initialize struct tc_ife to fix kernel-infoleak

This series addresses the uninitialization of the struct which has
2 bytes of padding. And copying this uninitialized data to userspace
can leak info from kernel memory.

This series ensures all members and padding are cleared prior to
being copied.

This change silences the KMSAN report and prevents potential information
leaks from the kernel memory.

v3: https://lore.kernel.org/lkml/20251106195635.2438-1-vnranganath.20@gmail.com/#t
v2: https://lore.kernel.org/r/20251101-infoleak-v2-0-01a501d41c09@gmail.com
v1: https://lore.kernel.org/r/20251031-infoleak-v1-1-9f7250ee33aa@gmail.com

Signed-off-by: Ranganath V N <vnranganath.20@gmail.com>
Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
====================

Link: https://patch.msgid.link/20251109091336.9277-1-vnranganath.20@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-11-11 15:00:11 +01:00
Ranganath V N
ce50039be4 net: sched: act_ife: initialize struct tc_ife to fix KMSAN kernel-infoleak
Fix a KMSAN kernel-infoleak detected by syzbot.

[net?] KMSAN: kernel-infoleak in __skb_datagram_iter

In tcf_ife_dump(), the variable 'opt' was partially initialized using a
designated initializer, while the padding bytes remained uninitialized.
Because nla_put() copies the entire structure into a netlink message,
these uninitialized bytes leaked to userspace.

Initialize the structure with memset before assigning its fields
to ensure all members and padding are cleared prior to being copied.

This change silences the KMSAN report and prevents potential information
leaks from the kernel memory.
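
The pattern can be sketched in user-space C. struct demo_opt is a
hypothetical stand-in, not struct tc_ife: it has a 32-bit field followed
by a 16-bit field, so typical ABIs add 2 bytes of tail padding. The
helper names are assumptions as well.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical structure with tail padding on typical ABIs. */
struct demo_opt {
	uint32_t index;
	uint16_t flags;
	/* usually 2 bytes of padding here */
};

/* Clear the whole object, padding included, before setting fields. */
static void demo_fill(struct demo_opt *opt, uint32_t index, uint16_t flags)
{
	memset(opt, 0, sizeof(*opt));
	opt->index = index;
	opt->flags = flags;
}

/* Returns 1 if every byte after the last named field is zero. */
static int demo_tail_is_zero(const struct demo_opt *opt)
{
	const unsigned char *p = (const unsigned char *)opt;
	size_t i = offsetof(struct demo_opt, flags) + sizeof(opt->flags);

	for (; i < sizeof(*opt); i++)
		if (p[i] != 0)
			return 0;
	return 1;
}
```

Copying sizeof(struct demo_opt) bytes out of an object prepared by
demo_fill() then exposes only zeros in the padding, whatever the memory
held before.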

This fix has been tested and validated by syzbot. This patch closes the
bug reported at the following syzkaller link and ensures no infoleak.

Reported-by: syzbot+0c85cae3350b7d486aee@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=0c85cae3350b7d486aee
Tested-by: syzbot+0c85cae3350b7d486aee@syzkaller.appspotmail.com
Fixes: ef6980b6be ("introduce IFE action")
Signed-off-by: Ranganath V N <vnranganath.20@gmail.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20251109091336.9277-3-vnranganath.20@gmail.com
Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-11-11 15:00:08 +01:00