Commit Graph

1397029 Commits

Author SHA1 Message Date
Darrick J. Wong
678e1cc2f4 xfs: fix out of bounds memory read error in symlink repair
xfs/286 produced this report on my test fleet:

 ==================================================================
 BUG: KFENCE: out-of-bounds read in memcpy_orig+0x54/0x110

 Out-of-bounds read at 0xffff88843fe9e038 (184B right of kfence-#184):
  memcpy_orig+0x54/0x110
  xrep_symlink_salvage_inline+0xb3/0xf0 [xfs]
  xrep_symlink_salvage+0x100/0x110 [xfs]
  xrep_symlink+0x2e/0x80 [xfs]
  xrep_attempt+0x61/0x1f0 [xfs]
  xfs_scrub_metadata+0x34f/0x5c0 [xfs]
  xfs_ioc_scrubv_metadata+0x387/0x560 [xfs]
  xfs_file_ioctl+0xe23/0x10e0 [xfs]
  __x64_sys_ioctl+0x76/0xc0
  do_syscall_64+0x4e/0x1e0
  entry_SYSCALL_64_after_hwframe+0x4b/0x53

 kfence-#184: 0xffff88843fe9df80-0xffff88843fe9dfea, size=107, cache=kmalloc-128

 allocated by task 3470 on cpu 1 at 263329.131592s (192823.508886s ago):
  xfs_init_local_fork+0x79/0xe0 [xfs]
  xfs_iformat_local+0xa4/0x170 [xfs]
  xfs_iformat_data_fork+0x148/0x180 [xfs]
  xfs_inode_from_disk+0x2cd/0x480 [xfs]
  xfs_iget+0x450/0xd60 [xfs]
  xfs_bulkstat_one_int+0x6b/0x510 [xfs]
  xfs_bulkstat_iwalk+0x1e/0x30 [xfs]
  xfs_iwalk_ag_recs+0xdf/0x150 [xfs]
  xfs_iwalk_run_callbacks+0xb9/0x190 [xfs]
  xfs_iwalk_ag+0x1dc/0x2f0 [xfs]
  xfs_iwalk_args.constprop.0+0x6a/0x120 [xfs]
  xfs_iwalk+0xa4/0xd0 [xfs]
  xfs_bulkstat+0xfa/0x170 [xfs]
  xfs_ioc_fsbulkstat.isra.0+0x13a/0x230 [xfs]
  xfs_file_ioctl+0xbf2/0x10e0 [xfs]
  __x64_sys_ioctl+0x76/0xc0
  do_syscall_64+0x4e/0x1e0
  entry_SYSCALL_64_after_hwframe+0x4b/0x53

 CPU: 1 UID: 0 PID: 1300113 Comm: xfs_scrub Not tainted 6.18.0-rc4-djwx #rc4 PREEMPT(lazy)  3d744dd94e92690f00a04398d2bd8631dcef1954
 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.0-4.module+el8.8.0+21164+ed375313 04/01/2014
 ==================================================================

On further analysis, I realized that the second parameter to min() is
not correct.  xfs_ifork::if_bytes is the size of the xfs_ifork::if_data
buffer.  if_bytes can be smaller than the data fork size because:

(a) the forkoff code tries to keep the data area as large as possible
(b) for symbolic links, if_bytes is the ondisk file size + 1
(c) forkoff is always a multiple of 8.

Case in point: for a single-byte symlink target, forkoff will be
8 but the buffer will only be 2 bytes long.

In other words, the logic here is wrong and we walk off the end of the
incore buffer.  Fix that.

Cc: stable@vger.kernel.org # v6.10
Fixes: 2651923d8d ("xfs: online repair of symbolic links")
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
2025-11-20 11:06:24 +01:00
Christoph Hellwig
d8a823c6f0 xfs: free xfs_busy_extents structure when no RT extents are queued
kmemleak occasionally reports leaking xfs_busy_extents structure
from xfs_scrub calls after running xfs/528 (but attributed to following
tests), which seems to be caused by not freeing the xfs_busy_extents
structure when tr.queued is 0 and xfs_trim_rtgroup_extents breaks out
of the main loop.  Free the structure in this case.

Fixes: a3315d1130 ("xfs: use rtgroup busy extent list for FITRIM")
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
2025-11-06 08:59:19 +01:00
Christoph Hellwig
21ab5179aa xfs: fix zone selection in xfs_select_open_zone_mru
xfs_select_open_zone_mru needs to pass XFS_ZONE_ALLOC_OK to
xfs_try_use_zone because we only want to tightly pack into zones of the
same or a compatible temperature instead of any available zone.

This got broken in commit 0301dae732 ("xfs: refactor hint based zone
allocation"), which failed to update this particular caller when
switching to an enum.  xfs/638 sometimes, but not reliably fails due to
this change.

Fixes: 0301dae732 ("xfs: refactor hint based zone allocation")
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hans Holmberg <hans.holmberg@wdc.com>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
2025-11-05 16:54:38 +01:00
Christoph Hellwig
f5714a3c1a xfs: fix a rtgroup leak when xfs_init_zone fails
Drop the rtgrop reference when xfs_init_zone fails for a conventional
device.

Fixes: 4e4d520755 ("xfs: add the zoned space allocator")
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hans Holmberg <hans.holmberg@wdc.com>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
2025-11-05 16:53:49 +01:00
Darrick J. Wong
8d7bba1e83 xfs: fix various problems in xfs_atomic_write_cow_iomap_begin
I think there are several things wrong with this function:

A) xfs_bmapi_write can return a much larger unwritten mapping than what
   the caller asked for.  We convert part of that range to written, but
   return the entire written mapping to iomap even though that's
   inaccurate.

B) The arguments to xfs_reflink_convert_cow_locked are wrong -- an
   unwritten mapping could be *smaller* than the write range (or even
   the hole range).  In this case, we convert too much file range to
   written state because we then return a smaller mapping to iomap.

C) It doesn't handle delalloc mappings.  This I covered in the patch
   that I already sent to the list.

D) Reassigning count_fsb to handle the hole means that if the second
   cmap lookup attempt succeeds (due to racing with someone else) we
   trim the mapping more than is strictly necessary.  The changing
   meaning of count_fsb makes this harder to notice.

E) The tracepoint is kinda wrong because @length is mutated.  That makes
   it harder to chase the data flows through this function because you
   can't just grep on the pos/bytecount strings.

F) We don't actually check that the br_state = XFS_EXT_NORM assignment
   is accurate, i.e that the cow fork actually contains a written
   mapping for the range we're interested in

G) Somewhat inadequate documentation of why we need to xfs_trim_extent
   so aggressively in this function.

H) Not sure why xfs_iomap_end_fsb is used here, the vfs already clamped
   the write range to s_maxbytes.

Fix these issues, and then the atomic writes regressions in generic/760,
generic/617, generic/091, generic/263, and generic/521 all go away for
me.

Cc: stable@vger.kernel.org # v6.16
Fixes: bd1d2c21d5 ("xfs: add xfs_atomic_write_cow_iomap_begin()")
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: John Garry <john.g.garry@oracle.com>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
2025-11-05 16:52:49 +01:00
Darrick J. Wong
8d54eacd82 xfs: fix delalloc write failures in software-provided atomic writes
With the 20 Oct 2025 release of fstests, generic/521 fails for me on
regular (aka non-block-atomic-writes) storage:

QA output created by 521
dowrite: write: Input/output error
LOG DUMP (8553 total operations):
1(  1 mod 256): SKIPPED (no operation)
2(  2 mod 256): WRITE    0x7e000 thru 0x8dfff	(0x10000 bytes) HOLE
3(  3 mod 256): READ     0x69000 thru 0x79fff	(0x11000 bytes)
4(  4 mod 256): FALLOC   0x53c38 thru 0x5e853	(0xac1b bytes) INTERIOR
5(  5 mod 256): COPY 0x55000 thru 0x59fff	(0x5000 bytes) to 0x25000 thru 0x29fff
6(  6 mod 256): WRITE    0x74000 thru 0x88fff	(0x15000 bytes)
7(  7 mod 256): ZERO     0xedb1 thru 0x11693	(0x28e3 bytes)

with a warning in dmesg from iomap about XFS trying to give it a
delalloc mapping for a directio write.  Fix the software atomic write
iomap_begin code to convert the reservation into a written mapping.
This doesn't fix the data corruption problems reported by generic/760,
but it's a start.

Cc: stable@vger.kernel.org # v6.16
Fixes: bd1d2c21d5 ("xfs: add xfs_atomic_write_cow_iomap_begin()")
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: John Garry <john.g.garry@oracle.com>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
2025-11-05 16:52:49 +01:00
Christoph Hellwig
0db22d7ee4 xfs: document another racy GC case in xfs_zoned_map_extent
Besides blocks being invalidated, there is another case when the original
mapping could have changed between querying the rmap for GC and calling
xfs_zoned_map_extent.  Document it there as it took us quite some time
to figure out what is going on while developing the multiple-GC
protection fix.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hans Holmberg <hans.holmberg@wdc.com>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
2025-10-31 12:06:03 +01:00
Christoph Hellwig
83bac569c7 xfs: prevent gc from picking the same zone twice
When we are picking a zone for gc it might already be in the pipeline
which can lead to us moving the same data twice resulting in in write
amplification and a very unfortunate case where we keep on garbage
collecting the zone we just filled with migrated data stopping all
forward progress.

Fix this by introducing a count of on-going GC operations on a zone, and
skip any zone with ongoing GC when picking a new victim.

Fixes: 080d01c41 ("xfs: implement zoned garbage collection")
Signed-off-by: Hans Holmberg <hans.holmberg@wdc.com>
Co-developed-by: Hans Holmberg <hans.holmberg@wdc.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Tested-by: Damien Le Moal <dlemoal@kernel.org>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
2025-10-31 12:06:03 +01:00
Darrick J. Wong
f477af0cfa xfs: fix locking in xchk_nlinks_collect_dir
On a filesystem with parent pointers, xchk_nlinks_collect_dir walks both
the directory entries (data fork) and the parent pointers (attr fork) to
determine the correct link count.  Unfortunately I forgot to update the
lock mode logic to handle the case of a directory whose attr fork is in
btree format and has not yet been loaded *and* whose data fork doesn't
need loading.

This leads to a bunch of assertions from xfs/286 in xfs_iread_extents
because we only took ILOCK_SHARED, not ILOCK_EXCL.  You'd need the rare
happenstance of a directory with a large number of non-pptr extended
attributes set and enough memory pressure to cause the directory to be
evicted and partially reloaded from disk.

I /think/ this only started in 6.18-rc1 because I've started seeing OOM
errors with the maple tree slab using 70% of memory, and this didn't
happen in 6.17.  Yay dynamic systems!

Cc: stable@vger.kernel.org # v6.10
Fixes: 77ede5f44b ("xfs: walk directory parent pointers to determine backref count")
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
2025-10-22 10:04:39 +02:00
Darrick J. Wong
3e7ec343f0 xfs: loudly complain about defunct mount options
Apparently we can never deprecate mount options in this project, because
it will invariably turn out that some foolish userspace depends on some
behavior and break.  From Oleksandr Natalenko:

  In v6.18, the attr2 XFS mount option is removed. This may silently
  break system boot if the attr2 option is still present in /etc/fstab
  for rootfs.

  Consider Arch Linux that is being set up from scratch with / being
  formatted as XFS. The genfstab command that is used to generate
  /etc/fstab produces something like this by default:

  /dev/sda2 on / type xfs (rw,relatime,attr2,discard,inode64,logbufs=8,logbsize=32k,noquota)

  Once the system is set up and rebooted, there's no deprecation warning
  seen in the kernel log:

  # cat /proc/cmdline
  root=UUID=77b42de2-397e-47ee-a1ef-4dfd430e47e9 rootflags=discard rd.luks.options=discard quiet

  # dmesg | grep -i xfs
  [    2.409818] SGI XFS with ACLs, security attributes, realtime, scrub, repair, quota, no debug enabled
  [    2.415341] XFS (sda2): Mounting V5 Filesystem 77b42de2-397e-47ee-a1ef-4dfd430e47e9
  [    2.442546] XFS (sda2): Ending clean mount

  Although as per the deprecation intention, it should be there.

  Vlastimil (in Cc) suggests this is because xfs_fs_warn_deprecated()
  doesn't produce any warning by design if the XFS FS is set to be
  rootfs and gets remounted read-write during boot. This imposes two
  problems:

  1) a user doesn't see the deprecation warning; and
  2) with v6.18 kernel, the read-write remount fails because of unknown
     attr2 option rendering system unusable:

  systemd[1]: Switching root.
  systemd-remount-fs[225]: /usr/bin/mount for / exited with exit status 32.

  # mount -o rw /
  mount: /: fsconfig() failed: xfs: Unknown parameter 'attr2'.

  Thorsten (in Cc) suggested reporting this as a user-visible regression.

  From my PoV, although the deprecation is in place for 5 years already,
  it may not be visible enough as the warning is not emitted for rootfs.
  Considering the amount of systems set up with XFS on /, this may
  impose a mass problem for users.

  Vlastimil suggested making attr2 option a complete noop instead of
  removing it.

IOWs, the initrd mounts the root fs with (I assume) no mount options,
and mount -a remounts with whatever options are in fstab.  However,
XFS doesn't complain about deprecated mount options during a remount, so
technically speaking we were not warning all users in all combinations
that they were heading for a cliff.

Gotcha!!

Now, how did 'attr2' get slurped up on so many systems?  The old code
would put that in /proc/mounts if the filesystem happened to be in attr2
mode, even if user hadn't mounted with any such option.  IOWs, this is
because someone thought it would be a good idea to advertise system
state via /proc/mounts.

The easy way to fix this is to reintroduce the four mount options but
map them to a no-op option that ignores them, and hope that nobody's
depending on attr2 to appear in /proc/mounts.  (Hint: use the fsgeometry
ioctl).  But we've learned our lesson, so complain as LOUDLY as possible
about the deprecation.

Lessons learned:

 1. Don't expose system state via /proc/mounts; the only strings that
    ought to be there are options *explicitly* provided by the user.
 2. Never tidy, it's not worth the stress and irritation.

Reported-by: Vlastimil Babka <vbabka@suse.cz>
Reported-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Cc: stable@vger.kernel.org # v6.18-rc1
Fixes: b9a176e541 ("xfs: remove deprecated mount options")
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
2025-10-22 10:04:39 +02:00
Darrick J. Wong
630785bfbe xfs: always warn about deprecated mount options
The deprecation of the 'attr2' mount option in 6.18 wasn't entirely
successful because nobody noticed that the kernel never printed a
warning about attr2 being set in fstab if the only xfs filesystem is the
root fs; the initramfs mounts the root fs with no mount options; and the
init scripts only conveyed the fstab options by remounting the root fs.

Fix this by making it complain all the time.

Cc: stable@vger.kernel.org # v5.13
Fixes: 92cf7d3638 ("xfs: Skip repetitive warnings about mount options")
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
2025-10-22 10:04:39 +02:00
Darrick J. Wong
bd721ec7de xfs: don't set bt_nr_sectors to a negative number
xfs_daddr_t is a signed type, which means that xfs_buf_map_verify is
using a signed comparison.  This causes problems if bt_nr_sectors is
never overridden (e.g. in the case of an xfbtree for rmap btree repairs)
because even daddr 0 can't pass the verifier test in that case.

Define an explicit max constant and set the initial bt_nr_sectors to a
positive value.

Found by xfs/422.

Cc: stable@vger.kernel.org # v6.18-rc1
Fixes: 42852fe57c ("xfs: track the number of blocks in each buftarg")
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
2025-10-22 10:04:39 +02:00
Christoph Hellwig
0f41997b1b xfs: don't use __GFP_NOFAIL in xfs_init_fs_context
With enough debug options enabled, struct xfs_mount is larger
than 4k and thus NOFAIL allocations won't work for it.

xfs_init_fs_context is early in the mount process, and if we really
are out of memory there we'd better give up ASAP anyway.

Fixes: 7b77b46a61 ("xfs: use kmem functions for struct xfs_mount")
Reported-by: syzbot+359a67b608de1ef72f65@syzkaller.appspotmail.com
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
2025-10-21 11:32:50 +02:00
Christoph Hellwig
ca3d643a97 xfs: cache open zone in inode->i_private
The MRU cache for open zones is unfortunately still not ideal, as it can
time out pretty easily when doing heavy I/O to hard disks using up most
or all open zones.  One option would be to just increase the timeout,
but while looking into that I realized we're just better off caching it
indefinitely as there is no real downside to that once we don't hold a
reference to the cache open zone.

So switch the open zone to RCU freeing, and then stash the last used
open zone into inode->i_private.  This helps to significantly reduce
fragmentation by keeping I/O localized to zones for workloads that
write using many open files to HDD.

Fixes: 4e4d520755 ("xfs: add the zoned space allocator")
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hans Holmberg <hans.holmberg@wdc.com>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Tested-by: Damien Le Moal <dlemoal@kernel.org>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
2025-10-21 11:32:50 +02:00
Christoph Hellwig
a8c861f401 xfs: avoid busy loops in GCD
When GCD has no new work to handle, but read, write or reset commands
are outstanding, it currently busy loops, which is a bit suboptimal,
and can lead to softlockup warnings in case of stuck commands.

Change the code so that the task state is only set to running when work
is performed, which looks a bit tricky due to the design of the
reading/writing/resetting lists that contain both in-flight and finished
commands.

Fixes: 080d01c41d ("xfs: implement zoned garbage collection")
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hans Holmberg <hans.holmberg@wdc.com>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
2025-10-21 11:32:50 +02:00
Geert Uytterhoeven
f5caeb3689 xfs: XFS_ONLINE_SCRUB_STATS should depend on DEBUG_FS
Currently, XFS_ONLINE_SCRUB_STATS selects DEBUG_FS.  However, DEBUG_FS
is meant for debugging, and people may want to disable it on production
systems.  Since commit 0ff51a1fd7 ("xfs: enable online fsck by
default in Kconfig")), XFS_ONLINE_SCRUB_STATS is enabled by default,
forcing DEBUG_FS enabled too.

Fix this by replacing the selection of DEBUG_FS by a dependency on
DEBUG_FS, which is what most other options controlling the gathering and
exposing of statistics do.

Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
2025-10-21 09:52:59 +02:00
Damien Le Moal
b00bcb190e xfs: do not tightly pack-write large files
When using a zoned realtime device, tightly packing of data blocks
belonging to multiple closed files into the same realtime group (RTG)
is very efficient at improving write performance. This is especially
true with SMR HDDs as this can reduce, and even suppress, disk head
seeks.

However, such tight packing does not make sense for large files that
require at least a full RTG. If tight packing placement is applied for
such files, the VM writeback thread switching between inodes result in
the large files to be fragmented, thus increasing the garbage collection
penalty later when the RTG needs to be reclaimed.

This problem can be avoided with a simple heuristic: if the size of the
inode being written back is at least equal to the RTG size, do not use
tight-packing. Modify xfs_zoned_pack_tight() to always return false in
this case.

With this change, a multi-writer workload writing files of 256 MB on a
file system backed by an SMR HDD with 256 MB zone size as a realtime
device sees all files occupying exactly one RTG (i.e. one device zone),
thus completely removing the heavy fragmentation observed without this
change.

Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
2025-10-21 09:49:39 +02:00
Damien Le Moal
914f377075 xfs: Improve CONFIG_XFS_RT Kconfig help
Improve the description of the XFS_RT configuration option to document
that this option is required for zoned block devices.

Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
2025-10-21 09:48:31 +02:00
Linus Torvalds
211ddde082 Linux 6.18-rc2 2025-10-19 15:19:16 -10:00
Linus Torvalds
d9043c79ba Merge tag 'sched_urgent_for_v6.18_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler fixes from Borislav Petkov:

 - Make sure the check for lost pelt idle time is done unconditionally
   to have correct lost idle time accounting

 - Stop the deadline server task before a CPU goes offline

* tag 'sched_urgent_for_v6.18_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched/fair: Fix pelt lost idle time detection
  sched/deadline: Stop dl_server before CPU goes offline
2025-10-19 04:59:43 -10:00
Linus Torvalds
343b4b44a1 Merge tag 'perf_urgent_for_v6.18_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Borislav Petkov:

 - Make sure perf reporting works correctly in setups using
   overlayfs or FUSE

 - Move the uprobe optimization to a better location logically

* tag 'perf_urgent_for_v6.18_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/core: Fix MMAP2 event device with backing files
  perf/core: Fix MMAP event path names with backing files
  perf/core: Fix address filter match with backing files
  uprobe: Move arch_uprobe_optimize right after handlers execution
2025-10-19 04:54:08 -10:00
Linus Torvalds
c7864eeaa4 Merge tag 'x86_urgent_for_v6.18_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Borislav Petkov:

 - Reset the why-the-system-rebooted register on AMD to avoid stale bits
   remaining from previous boots

 - Add a missing barrier in the TLB flushing code to prevent erroneously
   not flushing a TLB generation

 - Make sure cpa_flush() does not overshoot when computing the end range
   of a flush region

 - Fix resctrl bandwidth counting on AMD systems when the amount of
   monitoring groups created exceeds the number the hardware can track

* tag 'x86_urgent_for_v6.18_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/CPU/AMD: Prevent reset reasons from being retained across reboot
  x86/mm: Fix SMP ordering in switch_mm_irqs_off()
  x86/mm: Fix overflow in __cpa_addr()
  x86/resctrl: Fix miscount of bandwidth event when reactivating previously unavailable RMID
2025-10-19 04:41:27 -10:00
Linus Torvalds
1c64efcb08 Merge tag 'rust-rustfmt' of git://git.kernel.org/pub/scm/linux/kernel/git/ojeda/linux
Pull rustfmt fixes from Miguel Ojeda:
 "Rust 'rustfmt' cleanup

  'rustfmt', by default, formats imports in a way that is prone to
  conflicts while merging and rebasing, since in some cases it condenses
  several items into the same line.

  Document in our guidelines that we will handle this for the moment
  with the trailing empty comment workaround and make the tree
  'rustfmt'-clean again"

* tag 'rust-rustfmt' of git://git.kernel.org/pub/scm/linux/kernel/git/ojeda/linux:
  rust: bitmap: fix formatting
  rust: cpufreq: fix formatting
  rust: alloc: employ a trailing comment to keep vertical layout
  docs: rust: add section on imports formatting
2025-10-18 10:05:13 -10:00
Linus Torvalds
648937f64a Merge tag 'tpmdd-next-v6.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd
Pull tpm fix from Jarkko Sakkinen:
 "Correct the state transitions for ARM FF-A to match the spec and how
  tpm_crb behaves on other platforms"

* tag 'tpmdd-next-v6.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd:
  tpm_crb: Add idle support for the Arm FF-A start method
2025-10-18 08:38:28 -10:00
Linus Torvalds
e67bb0da33 Merge tag 'pci-v6.18-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci
Pull pci fixes from Bjorn Helgaas:

 - Search for MSI Capability with correct ID to fix an MSI regression on
   platforms with Cadence IP (Hans Zhang)

 - Revert early bridge resource set up to fix resource assignment
   failures that broke at least alpha boot and Snapdragon ath12k WiFi
   (Ilpo Järvinen)

 - Implement VMD .irq_startup()/.irq_shutdown() to fix IRQ issues that
   caused boot crashes and broken devices below VMD (Inochi Amaoto)

 - Select CONFIG_SCREEN_INFO on X86 to fix black screen on boot when
   SCREEN_INFO not selected (Mario Limonciello)

* tag 'pci-v6.18-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci:
  PCI/VGA: Select SCREEN_INFO on X86
  PCI: vmd: Override irq_startup()/irq_shutdown() in vmd_init_dev_msi_info()
  PCI: Revert early bridge resource set up
  PCI: cadence: Search for MSI Capability with correct ID
2025-10-18 08:35:09 -10:00
Linus Torvalds
ea0bdf2b94 Merge tag 'cxl-fixes-6.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl
Pull Compute Express Link fixes from Dave Jiang:
 "A small collection of CXL fixes. In addition to some misc fixes for
  the CXL subsystem, a number of fixes for CXL extended linear cache
  support are included to make it functional again.

   - Avoid missing port component registers setup due to dport
     enumeration failure

   - Add check for no entries in cxl_feature_info to address accessing
     invalid pointer.

   - Use %pa printk format to emit resource_size_t in
     validate_region_offset()

  CXL extended linear cache support fixes:

   - Fix setup of memory resource in cxl_acpi_set_cache_size()

   - Set range param for region_res_match_cxl_range() as const
     (addresses a compile warning for match_region_by_range() fix)

   - Fix match_region_by_range() to use region_res_match_cxl_range()

   - Subtract to find an hpa_alias0 in cxl_poison events to correct the
     alias math calculation"

* tag 'cxl-fixes-6.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl:
  cxl/trace: Subtract to find an hpa_alias0 in cxl_poison events
  cxl/region: Use %pa printk format to emit resource_size_t
  cxl: Fix match_region_by_range() to use region_res_match_cxl_range()
  cxl: Set range param for region_res_match_cxl_range() as const
  cxl/acpi: Fix setup of memory resource in cxl_acpi_set_cache_size()
  cxl/features: Add check for no entries in cxl_feature_info
  cxl/port: Avoid missing port component registers setup
2025-10-18 08:22:07 -10:00
Linus Torvalds
2953fb6548 Merge tag 'hid-for-linus-2025101701' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid
Pull HID fixes from Jiri Kosina:

 - fix for sticky fingers handling in hid-multitouch (Benjamin
   Tissoires)

 - fix for reporting of 0 battery levels (Dmitry Torokhov)

 - build fix for hid-haptic in certain configurations (Jonathan Denose)

 - improved probe and avoiding spamming kernel log by hid-nintendo
   (Vicki Pfau)

 - fix for OOB in hid-cp2112 (Deepak Sharma)

 - interrupt handling fix for intel-thc-hid (Even Xu)

 - a couple of new device IDs and device-specific quirks

* tag 'hid-for-linus-2025101701' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid:
  HID: logitech-hidpp: Add HIDPP_QUIRK_RESET_HI_RES_SCROLL
  selftests/hid: add tests for missing release on the Dell Synaptics
  HID: multitouch: fix sticky fingers
  HID: multitouch: fix name of Stylus input devices
  HID: hid-input: only ignore 0 battery events for digitizers
  HID: hid-debug: Fix spelling mistake "Rechargable" -> "Rechargeable"
  HID: Kconfig: Fix build error from CONFIG_HID_HAPTIC
  HID: nintendo: Rate limit IMU compensation message
  HID: nintendo: Wait longer for initial probe
  HID: core: Add printk_ratelimited variants to hid_warn() etc
  HID: quirks: Add ALWAYS_POLL quirk for VRS R295 steering wheel
  HID: quirks: avoid Cooler Master MM712 dongle wakeup bug
  HID: cp2112: Add parameter validation to data length
  HID: intel-thc-hid: intel-quickspi: Add ARL PCI Device Id's
  HID: intel-thc-hid: Intel-quickspi: switch first interrupt from level to edge detection
  HID: intel-thc-hid: intel-quicki2c: Fix wrong type casting
2025-10-18 08:18:18 -10:00
Linus Torvalds
d303caf5ca Merge tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Pull bpf fixes from Alexei Starovoitov:

 - Replace bpf_map_kmalloc_node() with kmalloc_nolock() to fix kmemleak
   imbalance in tracking of bpf_async_cb structures (Alexei Starovoitov)

 - Make selftests/bpf arg_parsing.c more robust to errors (Andrii
   Nakryiko)

 - Fix redefinition of 'off' as different kind of symbol when I40E
   driver is builtin (Brahmajit Das)

 - Do not disable preemption in bpf_test_run (Sahil Chandna)

 - Fix memory leak in __lookup_instance error path (Shardul Bankar)

 - Ensure test data is flushed to disk before reading it (Xing Guo)

* tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
  selftests/bpf: Fix redefinition of 'off' as different kind of symbol
  bpf: Do not disable preemption in bpf_test_run().
  bpf: Fix memory leak in __lookup_instance error path
  selftests: arg_parsing: Ensure data is flushed to disk before reading.
  bpf: Replace bpf_map_kmalloc_node() with kmalloc_nolock() to allocate bpf_async_cb structures.
  selftests/bpf: make arg_parsing.c more robust to crashes
  bpf: test_run: Fix ctx leak in bpf_prog_test_run_xdp error path
2025-10-18 08:00:43 -10:00
Linus Torvalds
847f242f7a Merge tag 'exfat-for-6.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/exfat
Pull exfat fixes from Namjae Jeon:

 - Fix out-of-bounds in FS_IOC_SETFSLABEL

 - Add validation for stream entry size to prevent infinite loop

* tag 'exfat-for-6.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/exfat:
  exfat: fix out-of-bounds in exfat_nls_to_ucs2()
  exfat: fix improper check of dentry.stream.valid_size
2025-10-18 07:23:59 -10:00
Linus Torvalds
2d07c6c209 Merge tag 'nfs-for-6.18-2' of git://git.linux-nfs.org/projects/anna/linux-nfs
Pull NFS client fixes from Anna Schumaker:

 - Fix for FlexFiles mirror->dss allocation

 - Apply delay_retrans to async operations

 - Check if suid/sgid is cleared after a write when needed

 - Fix setting the state renewal timer for early mounts after a reboot

* tag 'nfs-for-6.18-2' of git://git.linux-nfs.org/projects/anna/linux-nfs:
  NFS4: Fix state renewals missing after boot
  NFS: check if suid/sgid was cleared after a write as needed
  NFS4: Apply delay_retrans to async operations
  NFSv4/flexfiles: fix to allocate mirror->dss before use
2025-10-18 07:18:48 -10:00
Linus Torvalds
4ccb3a8000 Merge tag '6.18-rc1-smb-client-fixes' of git://git.samba.org/sfrench/cifs-2.6
Pull smb client fixes from Steve French:
 "smb client fixes, security and smbdirect improvements, and some minor cleanup:

   - Important OOB DFS fix

   - Fix various potential tcon refcount leaks

   - smbdirect (RDMA) fixes (following up from test event a few weeks
     ago):

      - Fixes to improve and simplify handling of memory lifetime of
        smbdirect_mr_io structures, when a connection gets disconnected

      - Make sure we really wait to reach SMBDIRECT_SOCKET_DISCONNECTED
        before destroying resources

      - Make sure the send/recv submission/completion queues are large
        enough to avoid ib_post_send() from failing under pressure

   - convert cifs.ko to use the recommended crypto libraries (instead of
     crypto_shash), this also can improve performance

   - Three small cleanup patches"

* tag '6.18-rc1-smb-client-fixes' of git://git.samba.org/sfrench/cifs-2.6: (24 commits)
  smb: client: Consolidate cmac(aes) shash allocation
  smb: client: Remove obsolete crypto_shash allocations
  smb: client: Use HMAC-MD5 library for NTLMv2
  smb: client: Use MD5 library for SMB1 signature calculation
  smb: client: Use MD5 library for M-F symlink hashing
  smb: client: Use HMAC-SHA256 library for SMB2 signature calculation
  smb: client: Use HMAC-SHA256 library for key generation
  smb: client: Use SHA-512 library for SMB3.1.1 preauth hash
  cifs: parse_dfs_referrals: prevent oob on malformed input
  smb: client: Fix refcount leak for cifs_sb_tlink
  smb: client: let smbd_destroy() wait for SMBDIRECT_SOCKET_DISCONNECTED
  smb: move some duplicate definitions to common/cifsglob.h
  smb: client: let destroy_mr_list() keep smbdirect_mr_io memory if registered
  smb: client: let destroy_mr_list() call ib_dereg_mr() before ib_dma_unmap_sg()
  smb: client: call ib_dma_unmap_sg if mr->sgt.nents is not 0
  smb: client: improve logic in smbd_deregister_mr()
  smb: client: improve logic in smbd_register_mr()
  smb: client: improve logic in allocate_mr_list()
  smb: client: let destroy_mr_list() remove locked from the list
  smb: client: let destroy_mr_list() call list_del(&mr->list)
  ...
2025-10-18 07:11:32 -10:00
Linus Torvalds
02e5f74ef0 Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm fixes from Paolo Bonzini:
 "ARM:

   - Fix the handling of ZCR_EL2 in NV VMs

   - Pick the correct translation regime when doing a PTW on the back of
     a SEA

   - Prevent userspace from injecting an event into a vcpu that isn't
     initialised yet

   - Move timer save/restore to the sysreg handling code, fixing EL2
     timer access in the process

   - Add FGT-based trapping of MDSCR_EL1 to reduce the overhead of debug

   - Fix trapping configuration when the host isn't GICv3

   - Improve the detection of HCR_EL2.E2H being RES1

   - Drop a spurious 'break' statement in the S1 PTW

   - Don't try to access SPE when owned by EL3

  Documentation updates:

   - Document the failure modes of event injection

   - Document that a GICv3 guest can be created on a GICv5 host with
     FEAT_GCIE_LEGACY

  Selftest improvements:

   - Add a selftest for the effective value of HCR_EL2.AMO

   - Address build warning in the timer selftest when building with
     clang

   - Teach irqfd selftests about non-x86 architectures

   - Add missing sysregs to the set_id_regs selftest

   - Fix vcpu allocation in the vgic_lpi_stress selftest

   - Correctly enable interrupts in the vgic_lpi_stress selftest

  x86:

   - Expand the KVM_PRE_FAULT_MEMORY selftest to add a regression test
     for the bug fixed by commit 3ccbf6f470 ("KVM: x86/mmu: Return
     -EAGAIN if userspace deletes/moves memslot during prefault")

   - Don't try to get PMU capabilities from perf when running a CPU with
     hybrid CPUs/PMUs, as perf will rightly WARN.

  guest_memfd:

   - Rework KVM_CAP_GUEST_MEMFD_MMAP (newly introduced in 6.18) into a
     more generic KVM_CAP_GUEST_MEMFD_FLAGS

   - Add a guest_memfd INIT_SHARED flag and require userspace to
     explicitly set said flag to initialize memory as SHARED,
     irrespective of MMAP.

     The behavior merged in 6.18 is that enabling mmap() implicitly
     initializes memory as SHARED, which would result in an ABI
     collision for x86 CoCo VMs as their memory is currently always
     initialized PRIVATE.

   - Allow mmap() on guest_memfd for x86 CoCo VMs, i.e. on VMs with
     private memory, to enable testing such setups, i.e. to hopefully
     flush out any other lurking ABI issues before 6.18 is officially
     released.

   - Add testcases to the guest_memfd selftest to cover guest_memfd
     without MMAP, and host userspace accesses to mmap()'d private
     memory"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (46 commits)
  arm64: Revamp HCR_EL2.E2H RES1 detection
  KVM: arm64: nv: Use FGT write trap of MDSCR_EL1 when available
  KVM: arm64: Compute per-vCPU FGTs at vcpu_load()
  KVM: arm64: selftests: Fix misleading comment about virtual timer encoding
  KVM: arm64: selftests: Add an E2H=0-specific configuration to get_reg_list
  KVM: arm64: selftests: Make dependencies on VHE-specific registers explicit
  KVM: arm64: Kill leftovers of ad-hoc timer userspace access
  KVM: arm64: Fix WFxT handling of nested virt
  KVM: arm64: Move CNT*CT_EL0 userspace accessors to generic infrastructure
  KVM: arm64: Move CNT*_CVAL_EL0 userspace accessors to generic infrastructure
  KVM: arm64: Move CNT*_CTL_EL0 userspace accessors to generic infrastructure
  KVM: arm64: Add timer UAPI workaround to sysreg infrastructure
  KVM: arm64: Make timer_set_offset() generally accessible
  KVM: arm64: Replace timer context vcpu pointer with timer_id
  KVM: arm64: Introduce timer_context_to_vcpu() helper
  KVM: arm64: Hide CNTHV_*_EL2 from userspace for nVHE guests
  Documentation: KVM: Update GICv3 docs for GICv5 hosts
  KVM: arm64: gic-v3: Only set ICH_HCR traps for v2-on-v3 or v3 guests
  KVM: arm64: selftests: Actually enable IRQs in vgic_lpi_stress
  KVM: arm64: selftests: Allocate vcpus with correct size
  ...
2025-10-18 07:07:14 -10:00
Linus Torvalds
0e622c4b0e Merge tag 'powerpc-6.18-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc fixes from Madhavan Srinivasan:

 - Fix to handle NULL pointer dereference at irq domain teardown

 - Fix for handling extraction of struct xive_irq_data

 - Fix to skip parameter area allocation when fadump disabled

Thanks to Ganesh Goudar, Hari Bathini, Nam Cao, Ritesh Harjani (IBM),
Sourabh Jain, and Venkat Rao Bagalkote,

* tag 'powerpc-6.18-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc/fadump: skip parameter area allocation when fadump is disabled
  powerpc, ocxl: Fix extraction of struct xive_irq_data
  powerpc/pseries/msi: Fix NULL pointer dereference at irq domain teardown
2025-10-18 07:02:28 -10:00
Linus Torvalds
959f018f97 Merge tag 'slab-for-6.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab
Pull slab fixes from Vlastimil Babka:

 - Fixes for two bugs that can be triggered when debugging options are
   enabled (Hao Ge, Vlastimil Babka)

* tag 'slab-for-6.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab:
  slab: reset slab->obj_ext when freeing and it is OBJEXTS_ALLOC_FAIL
  slab: fix clearing freelist in free_deferred_objects()
2025-10-18 06:59:25 -10:00
Stuart Yoder
dbfdaeb381 tpm_crb: Add idle support for the Arm FF-A start method
According to the CRB over FF-A specification [1], a TPM that implements
the ABI must comply with the TCG PTP specification. This requires support
for the Idle and Ready states.

This patch implements CRB control area requests for goIdle and
cmdReady on FF-A based TPMs.

The FF-A message used to notify the TPM of CRB updates includes a
locality parameter, which provides a hint to the TPM about which
locality modified the CRB.  This patch adds a locality parameter
to __crb_go_idle() and __crb_cmd_ready() to support this.

[1] https://developer.arm.com/documentation/den0138/latest/

Signed-off-by: Stuart Yoder <stuart.yoder@arm.com>
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
2025-10-18 14:33:22 +03:00
Paolo Bonzini
4361f5aa8b Merge tag 'kvm-x86-fixes-6.18-rc2' of https://github.com/kvm-x86/linux into HEAD
KVM x86 fixes for 6.18:

 - Expand the KVM_PRE_FAULT_MEMORY selftest to add a regression test for the
   bug fixed by commit 3ccbf6f470 ("KVM: x86/mmu: Return -EAGAIN if userspace
   deletes/moves memslot during prefault")

 - Don't try to get PMU capabbilities from perf when running a CPU with hybrid
   CPUs/PMUs, as perf will rightly WARN.

 - Rework KVM_CAP_GUEST_MEMFD_MMAP (newly introduced in 6.18) into a more
   generic KVM_CAP_GUEST_MEMFD_FLAGS

 - Add a guest_memfd INIT_SHARED flag and require userspace to explicitly set
   said flag to initialize memory as SHARED, irrespective of MMAP.  The
   behavior merged in 6.18 is that enabling mmap() implicitly initializes
   memory as SHARED, which would result in an ABI collision for x86 CoCo VMs
   as their memory is currently always initialized PRIVATE.

 - Allow mmap() on guest_memfd for x86 CoCo VMs, i.e. on VMs with private
   memory, to enable testing such setups, i.e. to hopefully flush out any
   other lurking ABI issues before 6.18 is officially released.

 - Add testcases to the guest_memfd selftest to cover guest_memfd without MMAP,
   and host userspace accesses to mmap()'d private memory.
2025-10-18 10:25:43 +02:00
Paolo Bonzini
5d26eaae15 Merge tag 'kvmarm-fixes-6.18-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD
KVM/arm64 fixes for 6.18, take #1

Improvements and bug fixes:

- Fix the handling of ZCR_EL2 in NV VMs
  (20250926194108.84093-1-oliver.upton@linux.dev)

- Pick the correct translation regime when doing a PTW on
  the back of a SEA (20250926224246.731748-1-oliver.upton@linux.dev)

- Prevent userspace from injecting an event into a vcpu that isn't
  initialised yet (20250930085237.108326-1-oliver.upton@linux.dev)

- Move timer save/restore to the sysreg handling code, fixing EL2 timer
  access in the process (20250929160458.3351788-1-maz@kernel.org)

- Add FGT-based trapping of MDSCR_EL1 to reduce the overhead of debug
  (20250924235150.617451-1-oliver.upton@linux.dev)

- Fix trapping configuration when the host isn't GICv3
  (20251007160704.1673584-1-sascha.bischoff@arm.com)

- Improve the detection of HCR_EL2.E2H being RES1
  (20251009121239.29370-1-maz@kernel.org)

- Drop a spurious 'break' statement in the S1 PTW
  (20250930135621.162050-1-osama.abdelkader@gmail.com)

- Don't try to access SPE when owned by EL3
  (20251010174707.1684200-1-mukesh.ojha@oss.qualcomm.com)

Documentation updates:

- Document the failure modes of event injection
  (20250930233620.124607-1-oliver.upton@linux.dev)

- Document that a GICv3 guest can be created on a GICv5 host
  with FEAT_GCIE_LEGACY (20251007154848.1640444-1-sascha.bischoff@arm.com)

Selftest improvements:

- Add a selftest for the effective value of HCR_EL2.AMO
  (20250926224454.734066-1-oliver.upton@linux.dev)

- Address build warning in the timer selftest when building
  with clang (20250926155838.2612205-1-seanjc@google.com)

- Teach irq_fd selftests about non-x86 architectures
  (20250930193301.119859-1-oliver.upton@linux.dev)

- Add missing sysregs to the set_id_regs selftest
  (20251012154352.61133-1-zenghui.yu@linux.dev)

- Fix vcpu allocation in the vgic_lpi_stress selftest
  (20251008154520.54801-1-zenghui.yu@linux.dev)

- Correctly enable interrupts in the vgic_lpi_stress selftest
  (20251007195254.260539-1-oliver.upton@linux.dev)
2025-10-18 10:25:31 +02:00
Linus Torvalds
f406055cb1 Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fixes from Catalin Marinas:

 - Explicitly encode the XZR register if the value passed to
   write_sysreg_s() is 0.

   The GIC CDEOI instruction is encoded as a system register write with
   XZR as the source register. However, clang does not honour the "Z"
   register constraint, leading to incorrect code generation

 - Ensure the interrupts (DAIF.IF) are unmasked when completing
   single-step of a suspended breakpoint before calling
   exit_to_user_mode().

   With pseudo-NMIs, interrupts are (additionally) masked at the PMR_EL1
   register, handled by local_irq_*()

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64: debug: always unmask interrupts in el0_softstp()
  arm64/sysreg: Fix GIC CDEOI instruction encoding
2025-10-17 13:04:21 -10:00
Linus Torvalds
fe69107ec7 Merge tag 'riscv-for-linux-6.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull RISC-V fixes from Paul Walmsley:

 - Disable CFI with Rust for any platform other than x86 and ARM64

 - Keep task mm_cpumasks up-to-date to avoid triggering M-mode firmware
   warnings if the kernel tries to send an IPI to an offline CPU

 - Improve kprobe address validation performance and avoid desyncs
   (following x86)

 - Avoid duplicate device probes by avoiding DT hardware probing when
   ACPI is enabled in early boot

 - Use the correct set of dependencies for
   CONFIG_ARCH_HAS_ELF_CORE_EFLAGS, avoiding an allnoconfig warning

 - Fix a few other minor issues

* tag 'riscv-for-linux-6.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
  riscv: kprobes: convert one final __ASSEMBLY__ to __ASSEMBLER__
  riscv: Respect dependencies of ARCH_HAS_ELF_CORE_EFLAGS
  riscv: acpi: avoid errors caused by probing DT devices when ACPI is used
  riscv: kprobes: Fix probe address validation
  riscv: entry: fix typo in comment 'instruciton' -> 'instruction'
  RISC-V: clear hot-unplugged cores from all task mm_cpumasks to avoid rfence errors
  riscv: kgdb: Ensure that BUFMAX > NUMREGBYTES
  rust: cfi: only 64-bit arm and x86 support CFI_CLANG
2025-10-17 12:59:31 -10:00
Brahmajit Das
a1e83d4c03 selftests/bpf: Fix redefinition of 'off' as different kind of symbol
This fixes the following build error

   CLNG-BPF [test_progs] verifier_global_ptr_args.bpf.o
progs/verifier_global_ptr_args.c:228:5: error: redefinition of 'off' as
different kind of symbol
   228 | u32 off;
       |     ^

The symbol 'off' was previously defined in
tools/testing/selftests/bpf/tools/include/vmlinux.h, which includes an
enum i40e_ptp_gpio_pin_state from
drivers/net/ethernet/intel/i40e/i40e_ptp.c:

	enum i40e_ptp_gpio_pin_state {
		end = -2,
		invalid = -1,
		off = 0,
		in_A = 1,
		in_B = 2,
		out_A = 3,
		out_B = 4,
	};

This enum is included when CONFIG_I40E is enabled. As of commit
032676ff82 ("LoongArch: Update Loongson-3 default config file"),
CONFIG_I40E is set in the defconfig, which leads to the conflict.

Renaming the local variable avoids the redefinition and allows the
build to succeed.

Suggested-by: Yonghong Song <yonghong.song@linux.dev>
Signed-off-by: Brahmajit Das <listout@listout.xyz>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/r/20251017171551.53142-1-listout@listout.xyz
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-10-17 11:33:23 -07:00
Sahil Chandna
7c33e97a6e bpf: Do not disable preemption in bpf_test_run().
The timer mode is initialized to NO_PREEMPT mode by default,
this disables preemption and force execution in atomic context
causing issue on PREEMPT_RT configurations when invoking
spin_lock_bh(), leading to the following warning:

BUG: sleeping function called from invalid context at kernel/locking/spinlock_rt.c:48
in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 6107, name: syz.0.17
preempt_count: 1, expected: 0
RCU nest depth: 1, expected: 1
Preemption disabled at:
[<ffffffff891fce58>] bpf_test_timer_enter+0xf8/0x140 net/bpf/test_run.c:42

Fix this, by removing NO_PREEMPT/NO_MIGRATE mode check.
Also, the test timer context no longer needs explicit calls to
migrate_disable()/migrate_enable() with rcu_read_lock()/rcu_read_unlock().
Use helpers rcu_read_lock_dont_migrate() and rcu_read_unlock_migrate()
instead.

Reported-by: syzbot+1f1fbecb9413cdbfbef8@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=1f1fbecb9413cdbfbef8
Suggested-by: Yonghong Song <yonghong.song@linux.dev>
Suggested-by: Menglong Dong <menglong.dong@linux.dev>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Tested-by: syzbot+1f1fbecb9413cdbfbef8@syzkaller.appspotmail.com
Co-developed-by: Brahmajit Das <listout@listout.xyz>
Signed-off-by: Brahmajit Das <listout@listout.xyz>
Signed-off-by: Sahil Chandna <chandna.sahil@gmail.com>
Link: https://lore.kernel.org/r/20251014185635.10300-1-chandna.sahil@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-10-17 11:29:35 -07:00
Ada Couprie Diaz
ea0d55ae4b arm64: debug: always unmask interrupts in el0_softstp()
We intend that EL0 exception handlers unmask all DAIF exceptions
before calling exit_to_user_mode().

When completing single-step of a suspended breakpoint, we do not call
local_daif_restore(DAIF_PROCCTX) before calling exit_to_user_mode(),
leaving all DAIF exceptions masked.

When pseudo-NMIs are not in use this is benign.

When pseudo-NMIs are in use, this is unsound. At this point interrupts
are masked by both DAIF.IF and PMR_EL1, and subsequent irq flag
manipulation may not work correctly. For example, a subsequent
local_irq_enable() within exit_to_user_mode_loop() will only unmask
interrupts via PMR_EL1 (leaving those masked via DAIF.IF), and
anything depending on interrupts being unmasked (e.g. delivery of
signals) will not work correctly.

This was detected by CONFIG_ARM64_DEBUG_PRIORITY_MASKING.

Move the call to `try_step_suspended_breakpoints()` outside of the check
so that interrupts can be unmasked even if we don't call the step handler.

Fixes: 0ac7584c08 ("arm64: debug: split single stepping exception entry")
Cc: <stable@vger.kernel.org> # 6.17
Signed-off-by: Ada Couprie Diaz <ada.coupriediaz@arm.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
[catalin.marinas@arm.com: added Mark's rewritten commit log and some whitespace]
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2025-10-17 18:08:05 +01:00
Lorenzo Pieralisi
e9ad390a48 arm64/sysreg: Fix GIC CDEOI instruction encoding
The GIC CDEOI system instruction requires the Rt field to be set to 0b11111
otherwise the instruction behaviour becomes CONSTRAINED UNPREDICTABLE.

Currenly, its usage is encoded as a system register write, with a constant
0 value:

write_sysreg_s(0, GICV5_OP_GIC_CDEOI)

While compiling with GCC, the 0 constant value, through these asm
constraints and modifiers ('x' modifier and 'Z' constraint combo):

asm volatile(__msr_s(r, "%x0") : : "rZ" (__val));

forces the compiler to issue the XZR register for the MSR operation (ie
that corresponds to Rt == 0b11111) issuing the right instruction encoding.

Unfortunately LLVM does not yet understand that modifier/constraint
combo so it ends up issuing a different register from XZR for the MSR
source, which in turns means that it encodes the GIC CDEOI instruction
wrongly and the instruction behaviour becomes CONSTRAINED UNPREDICTABLE
that we must prevent.

Add a conditional to write_sysreg_s() macro that detects whether it
is passed a constant 0 value and issues an MSR write with XZR as source
register - explicitly doing what the asm modifier/constraint is meant to
achieve through constraints/modifiers, fixing the LLVM compilation issue.

Fixes: 7ec80fb3f0 ("irqchip/gic-v5: Add GICv5 PPI support")
Suggested-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
Acked-by: Marc Zyngier <maz@kernel.org>
Cc: stable@vger.kernel.org
Cc: Sascha Bischoff <sascha.bischoff@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2025-10-17 17:55:01 +01:00
Linus Torvalds
6f3b6e91f7 Merge tag 'io_uring-6.18-20251016' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux
Pull io_uring fixes from Jens Axboe:

 - Revert of a change that went into an older kernel, and which has been
   reported to cause a regression for some write workloads on LVM while
   a snapshop is being created

 - Fix a regression from this merge window, where some compilers (and/or
   certain .config options) would cause an earlier evaluations of a
   dereference which would then cause a NULL pointer dereference.

   I was only able to reproduce this with OPTIMIZE_FOR_SIZE=y, but David
   Howells hit it with just KASAN enabled. Depending on how things
   inlined, this makes sense

 - Fix for a missing lock around a mem region unregistration

 - Fix for ring resizing with the same placement after resize

* tag 'io_uring-6.18-20251016' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux:
  io_uring/rw: check for NULL io_br_sel when putting a buffer
  io_uring: fix unexpected placement on same size resizing
  io_uring: protect mem region deregistration
  Revert "io_uring/rw: drop -EOPNOTSUPP check in __io_complete_rw_common()"
2025-10-17 08:45:54 -07:00
Linus Torvalds
0c8df15f75 Merge tag 'block-6.18-20251016' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux
Pull block fixes from Jens Axboe:

 - NVMe pull request via Keith:
     - iostats accounting fixed on multipath retries (Amit)
     - secure concatenation response fixup (Martin)
     - tls partial record fixup (Wilfred)

 - Fix for a lockdep reported issue with the elevator lock and
   blk group frozen operations

 - Fix for a regression in this merge window, where updating
   'nr_requests' would not do the right thing for queues with
   shared tags

* tag 'block-6.18-20251016' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux:
  nvme/tcp: handle tls partially sent records in write_space()
  block: Remove elevator_lock usage from blkg_conf frozen operations
  blk-mq: fix stale tag depth for shared sched tags in blk_mq_update_nr_requests()
  nvme-auth: update sc_c in host response
  nvme-multipath: Skip nr_active increments in RETRY disposition
2025-10-17 08:31:26 -07:00
Linus Torvalds
cf1ea8854e Merge tag 'mmc-v6.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc
Pull mmc cleanup from Ulf Hansson:
 "Move rpmb_frame struct and constants to rpmb common header

  This helps us to avoid sharing an immutable branch between our git
  trees. I was planning to send it before rc1, but I didn't make it"

* tag 'mmc-v6.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
  rpmb: move rpmb_frame struct and constants to common header
2025-10-17 08:22:20 -07:00
Linus Torvalds
1422424187 Merge tag 'sound-6.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
 "A collection of small fixes. All changes are rather boring
  device-specific fixes and quirks:

   - A few fixes for missing NULL checks

   - ASoC NAU8821 fixes for jack and irq handling

   - Various fixes for ASoC TAS2781, IDT821034, sc8280xp, max9809x,
     wcd938x, and SoundWire

   - Usual HD-audio and USB-audio quirks"

* tag 'sound-6.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (27 commits)
  ALSA: hda/realtek: Fix mute led for HP Omen 17-cb0xxx
  ALSA: usb-audio: fix vendor quirk for Logitech H390
  ALSA: usb-audio: add volume quirks for MS LifeChat LX-3000
  ASoC: amd/sdw_utils: avoid NULL deref when devm_kasprintf() fails
  ASoC: max98090/91: fixed max98091 ALSA widget powering up/down
  ASoC: dt-bindings: Add compatible string fsl,imx-audio-tlv320
  ASoC: codecs: wcd938x-sdw: remove redundant runtime pm calls
  ASoC: sdw_utils: add rt1321 part id to codec_info_list
  ALSA: usb-audio: Fix NULL pointer deference in try_to_register_card
  ALSA: firewire: amdtp-stream: fix enum kernel-doc warnings
  ALSA: usb-audio: add mixer_playback_min_mute quirk for Logitech H390
  ASoC: nau8821: Avoid unnecessary blocking in IRQ handler
  ASoC: nau8821: Add DMI quirk to bypass jack debounce circuit
  ASoC: nau8821: Consistently clear interrupts before unmasking
  ASoC: nau8821: Generalize helper to clear IRQ status
  ASoC: nau8821: Cancel jdet_work before handling jack ejection
  ASoC: codecs: Fix gain setting ranges for Renesas IDT821034 codec
  ASoC: tas2781: Update ti,tas2781.yaml for adding tas58xx
  ASoC: tas2781: Support more newly-released amplifiers tas58xx in the driver
  ASoC: qcom: sc8280xp: Add support for QCS615
  ...
2025-10-17 08:20:10 -07:00
Linus Torvalds
e96687c6d3 Merge tag 'drm-fixes-2025-10-17' of https://gitlab.freedesktop.org/drm/kernel
Pull drm fixes from Dave Airlie:
 "As per usual xe/amdgpu are the leaders, with some i915 and then a
  bunch of scattered fixes. There are a bunch of stability fixes for
  some older amdgpu cards.

  draw:
   - Avoid color truncation

  gpuvm:
   - Avoid kernel-doc warning

  sched:
   - Avoid double free

  i915:
   - Skip GuC communication warning if reset is in progress
   - Couple frontbuffer related fixes
   - Deactivate PSR only on LNL and when selective fetch enabled

  xe:
   - Increase global invalidation timeout to handle some workloads
   - Fix NPD while evicting BOs in an array of VM binds
   - Fix resizable BAR to account for possibly needing to move BARs
     other than the LMEMBAR
   - Fix error handling in xe_migrate_init()
   - Fix atomic fault handling with mixed mappings or if the page is
     already in VRAM
   - Enable media samplers power gating for platforms before Xe2
   - Fix de-registering exec queue from GuC when unbinding
   - Ensure data migration to system if indicated by madvise with SVM
   - Fix kerneldoc for kunit change
   - Always account for cacheline alignment on migration
   - Drop bogus assertion on eviction

  amdgpu:
   - Backlight fix
   - SI fixes
   - CIK fix
   - Make CE support debug only
   - IP discovery fix
   - Ring reset fixes
   - GPUVM fault memory barrier fix
   - Drop unused structures in amdgpu_drm.h
   - JPEG debugfs fix
   - VRAM handling fixes for GPUs without VRAM
   - GC 12 MES fixes

  amdkfd:
   - MES fix

  ast:
   - Fix display output after reboot

  bridge:
   - lt9211: Fix version check

  panthor:
   - Fix MCU suspend

  qaic:
   - Init bootlog in correct order
   - Treat remaining == 0 as error in find_and_map_user_pages()
   - Lock access to DBC request queue

  rockchip:
   - vop2: Fix destination size in atomic check"

* tag 'drm-fixes-2025-10-17' of https://gitlab.freedesktop.org/drm/kernel: (44 commits)
  drm/sched: Fix potential double free in drm_sched_job_add_resv_dependencies
  drm/xe/evict: drop bogus assert
  drm/xe/migrate: don't misalign current bytes
  drm/xe/kunit: Fix kerneldoc for parameterized tests
  drm/xe/svm: Ensure data will be migrated to system if indicated by madvise.
  drm/gpuvm: Fix kernel-doc warning for drm_gpuvm_map_req.map
  drm/i915/psr: Deactivate PSR only on LNL and when selective fetch enabled
  drm/ast: Blank with VGACR17 sync enable, always clear VGACRB6 sync off
  accel/qaic: Synchronize access to DBC request queue head & tail pointer
  accel/qaic: Treat remaining == 0 as error in find_and_map_user_pages()
  accel/qaic: Fix bootlog initialization ordering
  drm/rockchip: vop2: use correct destination rectangle height check
  drm/draw: fix color truncation in drm_draw_fill24
  drm/xe/guc: Check GuC running state before deregistering exec queue
  drm/xe: Enable media sampler power gating
  drm/xe: Handle mixed mappings and existing VRAM on atomic faults
  drm/xe/migrate: Fix an error path
  drm/xe: Move rebar to be done earlier
  drm/xe: Don't allow evicting of BOs in same VM in array of VM binds
  drm/xe: Increase global invalidation timeout to 1000us
  ...
2025-10-17 08:16:58 -07:00
Linus Torvalds
389dfd9db6 Merge tag 'i2c-for-6.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux
Pull i2c fixes from Wolfram Sang:

 - PM cleanup after all prerequisites are merged with rc1

 - usbio: missing addition after all dependencies are in

 - slimpro: DT binding schema conversion

* tag 'i2c-for-6.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
  dt-bindings: i2c: Convert apm,xgene-slimpro-i2c to DT schema
  i2c: usbio: Add ACPI device-id for MTL-CVF devices
  i2c: Remove redundant pm_runtime_mark_last_busy() calls
2025-10-17 08:12:19 -07:00
Dawn Gardner
2a78634800 ALSA: hda/realtek: Fix mute led for HP Omen 17-cb0xxx
This laptop uses the ALC285 codec, fixed by enabling
the ALC285_FIXUP_HP_MUTE_LED quirk

Signed-off-by: Dawn Gardner <dawn.auroali@gmail.com>
Link: https://patch.msgid.link/20251016184218.31508-3-dawn.auroali@gmail.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2025-10-17 16:37:21 +02:00