Commit Graph

9975 Commits

Author SHA1 Message Date
Linus Torvalds
b08494a8f7 Merge tag 'drm-next-2025-05-28' of https://gitlab.freedesktop.org/drm/kernel
Pull drm updates from Dave Airlie:
 "As part of building up nova-core/nova-drm pieces we've brought in some
  rust abstractions through this tree, aux bus being the main one, with
  devres changes also in the driver-core tree. Along with the drm core
  abstractions and enough nova-core/nova-drm to use them. This is still
  all stub work under construction, to build the nova driver upstream.

  The other big NVIDIA related one is nouveau adds support for
  Hopper/Blackwell GPUs, this required a new GSP firmware update to
  570.144, and a bunch of rework in order to support multiple fw
  interfaces.

  There is also the introduction of an asahi uapi header file as a
  precursor to getting the real driver in later, but to unblock
  userspace mesa packages while the driver is trapped behind rust
  enablement.

  Otherwise it's the usual mixture of stuff all over, amdgpu, i915/xe,
  and msm being the main ones, and some changes to vsprintf.

  new drivers:
   - bring in the asahi uapi header standalone
   - nova-drm: stub driver

  rust dependencies (for nova-core):
   - auxiliary
       - bus abstractions
       - driver registration
       - sample driver
   - devres changes from driver-core
   - revocable changes

  core:
   - add Apple fourcc modifiers
   - add virtio capset definitions
   - extend EXPORT_SYNC_FILE for timeline syncobjs
   - convert to devm_platform_ioremap_resource
   - refactor shmem helper page pinning
   - DP powerup/down link helpers
   - extended %p4cc in vsprintf.c to support fourcc prints
   - change vsprintf %p4cn to %p4chR, remove %p4cn
   - Add drm_file_err function
   - IN_FORMATS_ASYNC property
   - move sitronix from tiny to their own subdir

  rust:
   - add drm core infrastructure rust abstractions
     (device/driver, ioctl, file, gem)

  dma-buf:
   - adjust sg handling to not cache map on attach
   - allow setting dma-device for import
   - Add a helper to sort and deduplicate dma_fence arrays

  docs:
   - updated drm scheduler docs
   - fbdev todo update
   - fb rendering
   - actual brightness

  ttm:
   - fix delayed destroy resv object

  bridge:
   - add kunit tests
   - convert tc358775 to atomic
   - convert drivers to devm_drm_bridge_alloc
   - convert rk3066_hdmi to bridge driver

  scheduler:
   - add kunit tests

  panel:
   - refcount panels to improve lifetime handling
   - Powertip PH128800T004-ZZA01
   - NLT NL13676BC25-03F, Tianma TM070JDHG34-00
   - Himax HX8279/HX8279-D DDIC
   - Visionox G2647FB105
   - Sitronix ST7571
   - ZOTAC rotation quirk

  vkms:
   - allow attaching more displays

  i915:
   - xe3lpd display updates
   - vrr refactor
   - intel_display struct conversions
   - xe2hpd memory type identification
   - add link rate/count to i915_display_info
   - cleanup VGA plane handling
   - refactor HDCP GSC
   - fix SLPC wait boosting reference counting
   - add 20ms delay to engine reset
   - fix fence release on early probe errors

  xe:
   - SRIOV updates
   - BMG PCI ID update
   - support separate firmware for each GT
   - SVM fix, prelim SVM multi-device work
   - export fan speed
   - temp disable d3cold on BMG
   - backup VRAM in PM notifier instead of suspend/freeze
   - update xe_ttm_access_memory to use GPU for non-visible access
   - fix guc_info debugfs for VFs
   - use copy_from_user instead of __copy_from_user
   - append PCIe gen5 limitations to xe_firmware document

  amdgpu:
   - DSC cleanup
   - DC Scaling updates
   - Fused I2C-over-AUX updates
   - DMUB updates
   - Use drm_file_err in amdgpu
   - Enforce isolation updates
   - Use new dma_fence helpers
   - USERQ fixes
   - Documentation updates
   - SR-IOV updates
   - RAS updates
   - PSP 12 cleanups
   - GC 9.5 updates
   - SMU 13.x updates
   - VCN / JPEG SR-IOV updates

  amdkfd:
   - Update error messages for SDMA
   - Userptr updates
   - XNACK fixes

  radeon:
   - CIK doorbell cleanup

  nouveau:
   - add support for NVIDIA r570 GSP firmware
   - enable Hopper/Blackwell support

  nova-core:
   - fix task list
   - register definition infrastructure
   - move firmware into own rust module
   - register auxiliary device for nova-drm

  nova-drm:
   - initial driver skeleton

  msm:
   - GPU:
       - ACD (adaptive clock distribution) for X1-85
       - drop fictional address_space_size
       - improve GMU HFI response time out robustness
       - fix crash when throttling during boot
   - DPU:
       - use single CTL path for flushing on DPU 5.x+
       - improve SSPP allocation code for better sharing
       - Enabled SmartDMA on SM8150, SC8180X, SC8280XP, SM8550
       - Added SAR2130P support
       - Disabled DSC support on MSM8937, MSM8917, MSM8953, SDM660
   - DP:
       - switch to new audio helpers
       - better LTTPR handling
   - DSI:
       - Added support for SA8775P
       - Added SAR2130P support
   - HDMI:
       - Switched to use new helpers for ACR data
       - Fixed old standing issue of HPD not working in some cases

  amdxdna:
   - add dma-buf support
   - allow empty command submits

  renesas:
   - add dma-buf support
   - add zpos, alpha, blend support

  panthor:
   - fail properly for NO_MMAP bos
   - add SET_LABEL ioctl
   - debugfs BO dumping support

  imagination:
   - update DT bindings
   - support TI AM68 GPU

  hibmc:
   - improve interrupt handling and HPD support

  virtio:
   - add panic handler support

  rockchip:
   - add RK3588 support
   - add DP AUX bus panel support

  ivpu:
   - add heartbeat based hangcheck

  mediatek:
   - prepares support for MT8195/99 HDMIv2/DDCv2

  anx7625:
   - improve HPD

  tegra:
   - speed up firmware loading

* tag 'drm-next-2025-05-28' of https://gitlab.freedesktop.org/drm/kernel: (1627 commits)
  drm/nouveau/tegra: Fix error pointer vs NULL return in nvkm_device_tegra_resource_addr()
  drm/xe: Default auto_link_downgrade status to false
  drm/xe/guc: Make creation of SLPC debugfs files conditional
  drm/i915/display: Add check for alloc_ordered_workqueue() and alloc_workqueue()
  drm/i915/dp_mst: Work around Thunderbolt sink disconnect after SINK_COUNT_ESI read
  drm/i915/ptl: Use everywhere the correct DDI port clock select mask
  drm/nouveau/kms: add support for GB20x
  drm/dp: add option to disable zero sized address only transactions.
  drm/nouveau: add support for GB20x
  drm/nouveau/gsp: add hal for fifo.chan.doorbell_handle
  drm/nouveau: add support for GB10x
  drm/nouveau/gf100-: track chan progress with non-WFI semaphore release
  drm/nouveau/nv50-: separate CHANNEL_GPFIFO handling out from CHANNEL_DMA
  drm/nouveau: add helper functions for allocating pinned/cpu-mapped bos
  drm/nouveau: add support for GH100
  drm/nouveau: improve handling of 64-bit BARs
  drm/nouveau/gv100-: switch to volta semaphore methods
  drm/nouveau/gsp: support deeper page tables in COPY_SERVER_RESERVED_PDES
  drm/nouveau/gsp: init client VMMs with NV0080_CTRL_DMA_SET_PAGE_DIRECTORY
  drm/nouveau/gsp: fetch level shift and PDE from BAR2 VMM
  ...
2025-05-28 09:46:39 -07:00
Linus Torvalds
48cfc5791d Merge tag 'hardening-v6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull hardening updates from Kees Cook:

 - Update overflow helpers to ease refactoring of on-stack flex array
   instances (Gustavo A. R. Silva, Kees Cook)

 - lkdtm: Use SLAB_NO_MERGE instead of constructors (Harry Yoo)

 - Simplify CONFIG_CC_HAS_COUNTED_BY (Jan Hendrik Farr)

 - Disable u64 usercopy KUnit test on 32-bit SPARC (Thomas Weißschuh)

 - Add missed designated initializers now exposed by fixed randstruct
   (Nathan Chancellor, Kees Cook)

 - Document compilers versions for __builtin_dynamic_object_size

 - Remove ARM_SSP_PER_TASK GCC plugin

 - Fix GCC plugin randstruct, add selftests, and restore COMPILE_TEST
   builds

 - Kbuild: induce full rebuilds when dependencies change with GCC
   plugins, the Clang sanitizer .scl file, or the randstruct seed.

 - Kbuild: Switch from -Wvla to -Wvla-larger-than=1

 - Correct several __nonstring uses for -Wunterminated-string-initialization

* tag 'hardening-v6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: (23 commits)
  Revert "hardening: Disable GCC randstruct for COMPILE_TEST"
  lib/tests: randstruct: Add deep function pointer layout test
  lib/tests: Add randstruct KUnit test
  randstruct: gcc-plugin: Remove bogus void member
  net: qede: Initialize qede_ll_ops with designated initializer
  scsi: qedf: Use designated initializer for struct qed_fcoe_cb_ops
  md/bcache: Mark __nonstring look-up table
  integer-wrap: Force full rebuild when .scl file changes
  randstruct: Force full rebuild when seed changes
  gcc-plugins: Force full rebuild when plugins change
  kbuild: Switch from -Wvla to -Wvla-larger-than=1
  hardening: simplify CONFIG_CC_HAS_COUNTED_BY
  overflow: Fix direct struct member initialization in _DEFINE_FLEX()
  kunit/overflow: Add tests for STACK_FLEX_ARRAY_SIZE() helper
  overflow: Add STACK_FLEX_ARRAY_SIZE() helper
  input/joystick: magellan: Mark __nonstring look-up table const
  watchdog: exar: Shorten identity name to fit correctly
  mod_devicetable: Enlarge the maximum platform_device_id name length
  overflow: Clarify expectations for getting DEFINE_FLEX variable sizes
  compiler_types: Identify compiler versions for __builtin_dynamic_object_size
  ...
2025-05-28 07:47:10 -07:00
Linus Torvalds
f1975e4765 Merge tag 'sysctl-6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/sysctl/sysctl
Pull sysctl updates from Joel Granados:

 - Move kern_table members out of kernel/sysctl.c

   Moved a subset (tracing, panic, signal, stack_tracer and sparc) out
   of the kern_table array. The goal is for kern_table to only have
   sysctl elements. All this increases modularity by placing the
   ctl_tables closer to where they are used while reducing the chances
   of merge conflicts in kernel/sysctl.c.

 - Fixed sysctl unit test panic by relocating it to selftests

* tag 'sysctl-6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/sysctl/sysctl:
  sysctl: Close test ctl_headers with a for loop
  sysctl: call sysctl tests with a for loop
  sysctl: Add 0012 to test the u8 range check
  sysctl: move u8 register test to lib/test_sysctl.c
  sparc: mv sparc sysctls into their own file under arch/sparc/kernel
  stack_tracer: move sysctl registration to kernel/trace/trace_stack.c
  tracing: Move trace sysctls into trace.c
  signal: Move signal ctl tables into signal.c
  panic: Move panic ctl tables into panic.c
2025-05-27 20:43:35 -07:00
Jens Axboe
375700bab5 llist: make llist_add_batch() a static inline
The function is small enough that it should be, and it's a (very) hot path
for io_uring.  Doing this actually reduces my vmlinux text size for my
standard build/test box.

Before:
axboe@r7625 ~/g/linux (test)> size vmlinux
   text	   data	    bss	    dec	    hex		filename
19892174 5938310 2470432 28300916 1afd674	vmlinux

After:
axboe@r7625 ~/g/linux (test)> size vmlinux
   text	   data	    bss	    dec	    hex		filename
19891878 5938310 2470436 28300624 1afd550	vmlinux

Link: https://lkml.kernel.org/r/f1d104c6-7ac8-457a-a53d-6bb741421b2f@kernel.dk
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-27 19:40:34 -07:00
Linus Torvalds
049294830b Merge tag 'thermal-6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull thermal control updates from Rafael Wysocki:
 "These add support for a new feature, Platform Temperature Control
  (PTC), to the Intel int340x thermal driver, add support for the Airoha
  EN7581 thermal sensor and the IPQ5018 platform, fix up the ACPI
  thermal zones handling, fix other assorted issues and clean up code

  Specifics:

   - Add Platform Temperature Control (PTC) support to the Intel int340x
     thermal driver (Srinivas Pandruvada)

   - Make the Hisilicon thermal driver compile by default when ARCH_HISI
     is set (Krzysztof Kozlowski)

   - Clean up printk() format by using %pC instead of %pCn in the
     bcm2835 thermal driver (Luca Ceresoli)

   - Fix variable name coding style in the AmLogic thermal driver
     (Enrique Isidoro Vazquez Ramos)

   - Fix missing debugfs entry removal on failure by using the devm_
     variant in the LVTS thermal driver (AngeloGioacchino Del Regno)

   - Remove the unused lvts_debugfs_exit() function as the devm_ variant
     introduced before takes care of removing the debugfs entry in the
     LVTS driver (Arnd Bergmann)

   - Add the Airoha EN7581 thermal sensor support along with its DT
     bindings (Christian Marangi)

   - Add ipq5018 compatible string DT binding, cleanup and add its
     suppot to the QCom Tsens thermal driver (Sricharan Ramabadhran,
     George Moussalem)

   - Fix comments typos in the Airoha driver (Christian Marangi, Colin
     Ian King)

   - Address a sparse warning by making a local variable static in the
     QCom thermal driver (George Moussalem)

   - Fix the usage of the _SCP control method in the driver for ACPI
     thermal zones (Armin Wolf)"

* tag 'thermal-6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  thermal: qcom: ipq5018: make ops_ipq5018 struct static
  thermal/drivers/airoha: Fix spelling mistake "calibrarion" -> "calibration"
  ACPI: thermal: Execute _SCP before reading trip points
  ACPI: OSI: Stop advertising support for "3.0 _SCP Extensions"
  thermal/drivers/airoha: Fix spelling mistake
  thermal/drivers/qcom/tsens: Add support for IPQ5018 tsens
  thermal/drivers/qcom/tsens: Add support for tsens v1 without RPM
  thermal/drivers/qcom/tsens: Update conditions to strictly evaluate for IP v2+
  dt-bindings: thermal: qcom-tsens: Add ipq5018 compatible
  thermal/drivers: Add support for Airoha EN7581 thermal sensor
  dt-bindings: thermal: Add support for Airoha EN7581 thermal sensor
  thermal/drivers/mediatek/lvts: Remove unused lvts_debugfs_exit
  thermal/drivers/mediatek/lvts: Fix debugfs unregister on failure
  thermal/drivers/amlogic: Rename Uptat to uptat to follow kernel coding style
  vsprintf: remove redundant and unused %pCn format specifier
  thermal/drivers/bcm2835: Use %pC instead of %pCn
  thermal/drivers/hisi: Do not enable by default during compile testing
  thermal: int340x: processor_thermal: Platform temperature control documentation
  thermal: intel: int340x: Enable platform temperature control
  thermal: intel: int340x: Add platform temperature control interface
2025-05-27 16:28:02 -07:00
Linus Torvalds
a9e6060bb2 Merge tag 'sound-6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound updates from Takashi Iwai:
 "We've received a lot of activities in this cycle, mostly about leaf
  driver codes rather than the core part, but with a good mixture of
  code cleanups and new driver additions. Below are some highlights:

  ASoC:
   - Support for automatically enumerating DAIs from standards
     conforming SoundWire SDCA devices; not much used as of this
     writing, rather for future implementations
   - Conversion of quite a few drivers to newer GPIO APIs
   - Continued cleanups and helper usages in allover places
   - Support for a wider range of Intel AVS platforms
   - Support for AMD ACP 7.x platforms, Cirrus Logic CS35L63 and CS48L32
     Everest Semiconductor ES8375 and ES8389, Longsoon-1 AC'97
     controllers, nVidia Tegra264, Richtek ALC203 and RT9123 and
     Rockchip SAI controllers

  HD-audio:
   - Lots of cleanups of TAS2781 codec drivers
   - A new HD-audio control bound via ACPI for Nvidia
   - Support for Tegra264, Intel WCL, usual new codec quirks

  USB-audio:
   - Fix a race at removal of MIDI device
   - Pioneer DJM-V10 support, Scarlett2 driver cleanups

  Misc:
   - Cleanups of deprecated PCI functions
   - Removal of unused / dead function codes"

* tag 'sound-6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (364 commits)
  firmware: cs_dsp: Fix OOB memory read access in KUnit test
  ASoC: codecs: add support for ES8375
  ASoC: dt-bindings: Add Everest ES8375 audio CODEC
  ALSA: hda: acpi: Make driver's match data const static
  ALSA: hda: acpi: Use SYSTEM_SLEEP_PM_OPS()
  ALSA: atmel: Replace deprecated strcpy() with strscpy()
  ALSA: core: fix up bus match const issues.
  ASoC: wm_adsp: Make cirrus_dir const
  ASoC: tegra: Tegra264 support in isomgr_bw
  ASoC: tegra: AHUB: Add Tegra264 support
  ASoC: tegra: ADX: Add Tegra264 support
  ASoC: tegra: AMX: Add Tegra264 support
  ASoC: tegra: I2S: Add Tegra264 support
  ASoC: tegra: Update PLL rate for Tegra264
  ASoC: tegra: ASRC: Update ARAM address
  ASoC: tegra: ADMAIF: Add Tegra264 support
  ASoC: tegra: CIF: Add Tegra264 support
  dt-bindings: ASoC: Document Tegra264 APE support
  dt-bindings: ASoC: admaif: Add missing properties
  ASoC: dt-bindings: audio-graph-card2: reference audio-graph routing property
  ...
2025-05-27 15:05:18 -07:00
Linus Torvalds
97851c6016 Merge tag 'ratelimit.2025.05.25a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu
Pull rate-limit updates from Paul McKenney:
 "lib/ratelimit: Reduce false-positive and silent misses:

   - Reduce open-coded use of ratelimit_state structure fields.

   - Convert the ->missed field to atomic_t.

   - Count misses that are due to lock contention.

   - Eliminate jiffies=0 special case.

   - Reduce ___ratelimit() false-positive rate limiting (Petr Mladek).

   - Allow zero ->burst to hard-disable rate limiting.

   - Optimize away atomic operations when a miss is guaranteed.

   - Warn if ->interval or ->burst are negative (Petr Mladek).

   - Simplify the resulting code.

  A smoke test and stress test have been created, but they are not yet
  ready for mainline. With luck, we will offer them for the v6.17 merge
  window"

* tag 'ratelimit.2025.05.25a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu:
  ratelimit: Drop redundant accesses to burst
  ratelimit: Use nolock_ret restructuring to collapse common case code
  ratelimit: Use nolock_ret label to collapse lock-failure code
  ratelimit: Use nolock_ret label to save a couple of lines of code
  ratelimit: Simplify common-case exit path
  ratelimit: Warn if ->interval or ->burst are negative
  ratelimit: Avoid atomic decrement under lock if already rate-limited
  ratelimit: Avoid atomic decrement if already rate-limited
  ratelimit: Don't flush misses counter if RATELIMIT_MSG_ON_RELEASE
  ratelimit: Force re-initialization when rate-limiting re-enabled
  ratelimit: Allow zero ->burst to disable ratelimiting
  ratelimit: Reduce ___ratelimit() false-positive rate limiting
  ratelimit: Avoid jiffies=0 special case
  ratelimit: Count misses due to lock contention
  ratelimit: Convert the ->missed field to atomic_t
  drm/amd/pm: Avoid open-coded use of ratelimit_state structure's internals
  drm/i915: Avoid open-coded use of ratelimit_state structure's ->missed field
  random: Avoid open-coded use of ratelimit_state structure's ->missed field
  ratelimit: Create functions to handle ratelimit_state internals
2025-05-27 10:48:36 -07:00
Linus Torvalds
664a231d90 Merge tag 'x86_cache_for_v6.16_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 resource control updates from Borislav Petkov:
 "Carve out the resctrl filesystem-related code into fs/resctrl/ so that
  multiple architectures can share the fs API for manipulating their
  respective hw resource control implementation.

  This is the second step in the work towards sharing the resctrl
  filesystem interface, the next one being plugging ARM's MPAM into the
  aforementioned fs API"

* tag 'x86_cache_for_v6.16_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (25 commits)
  MAINTAINERS: Add reviewers for fs/resctrl
  x86,fs/resctrl: Move the resctrl filesystem code to live in /fs/resctrl
  x86/resctrl: Always initialise rid field in rdt_resources_all[]
  x86/resctrl: Relax some asm #includes
  x86/resctrl: Prefer alloc(sizeof(*foo)) idiom in rdt_init_fs_context()
  x86/resctrl: Squelch whitespace anomalies in resctrl core code
  x86/resctrl: Move pseudo lock prototypes to include/linux/resctrl.h
  x86/resctrl: Fix types in resctrl_arch_mon_ctx_{alloc,free}() stubs
  x86/resctrl: Move enum resctrl_event_id to resctrl.h
  x86/resctrl: Move the filesystem bits to headers visible to fs/resctrl
  fs/resctrl: Add boiler plate for external resctrl code
  x86/resctrl: Add 'resctrl' to the title of the resctrl documentation
  x86/resctrl: Split trace.h
  x86/resctrl: Expand the width of domid by replacing mon_data_bits
  x86/resctrl: Add end-marker to the resctrl_event_id enum
  x86/resctrl: Move is_mba_sc() out of core.c
  x86/resctrl: Drop __init/__exit on assorted symbols
  x86/resctrl: Resctrl_exit() teardown resctrl but leave the mount point
  x86/resctrl: Check all domains are offline in resctrl_exit()
  x86/resctrl: Rename resctrl_sched_in() to begin with "resctrl_arch_"
  ...
2025-05-27 09:53:02 -07:00
Linus Torvalds
ba45037098 Merge tag 'linux_kselftest-kunit-6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
Pull kunit updates from Shuah Khan:

 - Enable qemu_config for riscv32, sparc 64-bit, PowerPC 32-bit BE and
   64-bit LE

 - Enable CONFIG_SPARC32 to clearly differentiate between sparc 32-bit
   and 64-bit configurations

 - Enable CONFIG_CPU_BIG_ENDIAN to clearly differentiate between powerpc
   LE and BE configurations

 - Add feature to list available architectures to kunit tool

 - Fixes to bugs and changes to documentation

* tag 'linux_kselftest-kunit-6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
  kunit: Fix wrong parameter to kunit_deactivate_static_stub()
  kunit: tool: add test counts to JSON output
  Documentation: kunit: improve example on testing static functions
  kunit: executor: Remove const from kunit_filter_suites() allocation type
  kunit: qemu_configs: Disable faulting tests on 32-bit SPARC
  kunit: qemu_configs: Add 64-bit SPARC configuration
  kunit: qemu_configs: sparc: Explicitly enable CONFIG_SPARC32=y
  kunit: qemu_configs: Add PowerPC 32-bit BE and 64-bit LE
  kunit: qemu_configs: powerpc: Explicitly enable CONFIG_CPU_BIG_ENDIAN=y
  kunit: tool: Implement listing of available architectures
  kunit: qemu_configs: Add riscv32 config
  kunit: configs: Enable CONFIG_INIT_STACK_ALL_PATTERN in all_tests
2025-05-26 14:29:44 -07:00
Linus Torvalds
14418ddcc2 Merge tag 'v6.16-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto updates from Herbert Xu:
 "API:
   - Fix memcpy_sglist to handle partially overlapping SG lists
   - Use memcpy_sglist to replace null skcipher
   - Rename CRYPTO_TESTS to CRYPTO_BENCHMARK
   - Flip CRYPTO_MANAGER_DISABLE_TEST into CRYPTO_SELFTESTS
   - Hide CRYPTO_MANAGER
   - Add delayed freeing of driver crypto_alg structures

  Compression:
   - Allocate large buffers on first use instead of initialisation in scomp
   - Drop destination linearisation buffer in scomp
   - Move scomp stream allocation into acomp
   - Add acomp scatter-gather walker
   - Remove request chaining
   - Add optional async request allocation

  Hashing:
   - Remove request chaining
   - Add optional async request allocation
   - Move partial block handling into API
   - Add ahash support to hmac
   - Fix shash documentation to disallow usage in hard IRQs

  Algorithms:
   - Remove unnecessary SIMD fallback code on x86 and arm/arm64
   - Drop avx10_256 xts(aes)/ctr(aes) on x86
   - Improve avx-512 optimisations for xts(aes)
   - Move chacha arch implementations into lib/crypto
   - Move poly1305 into lib/crypto and drop unused Crypto API algorithm
   - Disable powerpc/poly1305 as it has no SIMD fallback
   - Move sha256 arch implementations into lib/crypto
   - Convert deflate to acomp
   - Set block size correctly in cbcmac

  Drivers:
   - Do not use sg_dma_len before mapping in sun8i-ss
   - Fix warm-reboot failure by making shutdown do more work in qat
   - Add locking in zynqmp-sha
   - Remove cavium/zip
   - Add support for PCI device 0x17D8 to ccp
   - Add qat_6xxx support in qat
   - Add support for RK3576 in rockchip-rng
   - Add support for i.MX8QM in caam

  Others:
   - Fix irq_fpu_usable/kernel_fpu_begin inconsistency during CPU bring-up
   - Add new SEV/SNP platform shutdown API in ccp"

* tag 'v6.16-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (382 commits)
  x86/fpu: Fix irq_fpu_usable() to return false during CPU onlining
  crypto: qat - add missing header inclusion
  crypto: api - Redo lookup on EEXIST
  Revert "crypto: testmgr - Add hash export format testing"
  crypto: marvell/cesa - Do not chain submitted requests
  crypto: powerpc/poly1305 - add depends on BROKEN for now
  Revert "crypto: powerpc/poly1305 - Add SIMD fallback"
  crypto: ccp - Add missing tee info reg for teev2
  crypto: ccp - Add missing bootloader info reg for pspv5
  crypto: sun8i-ce - move fallback ahash_request to the end of the struct
  crypto: octeontx2 - Use dynamic allocated memory region for lmtst
  crypto: octeontx2 - Initialize cptlfs device info once
  crypto: xts - Only add ecb if it is not already there
  crypto: lrw - Only add ecb if it is not already there
  crypto: testmgr - Add hash export format testing
  crypto: testmgr - Use ahash for generic tfm
  crypto: hmac - Add ahash support
  crypto: testmgr - Ignore EEXIST on shash allocation
  crypto: algapi - Add driver template support to crypto_inst_setname
  crypto: shash - Set reqsize in shash_alg
  ...
2025-05-26 13:47:28 -07:00
Linus Torvalds
15d90a5e55 Merge tag 'crc-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux
Pull CRC updates from Eric Biggers:
 "Cleanups for the kernel's CRC (cyclic redundancy check) code:

   - Use __ro_after_init where appropriate

   - Remove unnecessary static_key on s390

   - Rename some source code files

   - Rename the crc32 and crc32c crypto API modules

   - Use subsys_initcall instead of arch_initcall

   - Restore maintainers for crc_kunit.c

   - Fold crc16_byte() into crc16.c

   - Add some SPDX license identifiers"

* tag 'crc-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux:
  lib/crc32: add SPDX license identifier
  lib/crc16: unexport crc16_table and crc16_byte()
  w1: ds2406: use crc16() instead of crc16_byte() loop
  MAINTAINERS: add crc_kunit.c back to CRC LIBRARY
  lib/crc: make arch-optimized code use subsys_initcall
  crypto: crc32 - remove "generic" from file and module names
  x86/crc: drop "glue" from filenames
  sparc/crc: drop "glue" from filenames
  s390/crc: drop "glue" from filenames
  powerpc/crc: rename crc32-vpmsum_core.S to crc-vpmsum-template.S
  powerpc/crc: drop "glue" from filenames
  arm64/crc: drop "glue" from filenames
  arm/crc: drop "glue" from filenames
  s390/crc32: Remove no-op module init and exit functions
  s390/crc32: Remove have_vxrs static key
  lib/crc: make the CPU feature static keys __ro_after_init
2025-05-26 13:32:06 -07:00
Suren Baghdasaryan
12ca42c237 alloc_tag: allocate percpu counters for module tags dynamically
When a module gets unloaded it checks whether any of its tags are still in
use and if so, we keep the memory containing module's allocation tags
alive until all tags are unused.  However percpu counters referenced by
the tags are freed by free_module().  This will lead to UAF if the memory
allocated by a module is accessed after module was unloaded.

To fix this we allocate percpu counters for module allocation tags
dynamically and we keep it alive for tags which are still in use after
module unloading.  This also removes the requirement of a larger
PERCPU_MODULE_RESERVE when memory allocation profiling is enabled because
percpu memory for counters does not need to be reserved anymore.

Link: https://lkml.kernel.org/r/20250517000739.5930-1-surenb@google.com
Fixes: 0db6f8d782 ("alloc_tag: load module tags into separate contiguous memory")
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Reported-by: David Wang <00107082@163.com>
Closes: https://lore.kernel.org/all/20250516131246.6244-1-00107082@163.com/
Tested-by: David Wang <00107082@163.com>
Cc: Christoph Lameter (Ampere) <cl@gentwo.org>
Cc: Dennis Zhou <dennis@kernel.org>
Cc: Kent Overstreet <kent.overstreet@linux.dev>
Cc: Pasha Tatashin <pasha.tatashin@soleen.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-25 00:53:48 -07:00
Casey Chen
780138b123 alloc_tag: check mem_profiling_support in alloc_tag_init
If mem_profiling_support is false, for example by
sysctl.vm.mem_profiling=never, alloc_tag_init should skip module tags
allocation, codetag type registration and procfs init.

Link: https://lkml.kernel.org/r/20250513182602.121843-1-cachen@purestorage.com
Signed-off-by: Casey Chen <cachen@purestorage.com>
Reviewed-by: Yuanyuan Zhong <yzhong@purestorage.com>
Acked-by: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-22 14:55:38 -07:00
Eric Biggers
b82f72292a lib/crc32: remove unused support for CRC32C combination
crc32c_combine() and crc32c_shift() are no longer used (except by the
KUnit test that tests them), and their current implementation is very
slow.  Remove them.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Link: https://patch.msgid.link/20250519175012.36581-8-ebiggers@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-21 15:40:17 -07:00
Tzung-Bi Shih
772e50a76e kunit: Fix wrong parameter to kunit_deactivate_static_stub()
kunit_deactivate_static_stub() accepts real_fn_addr instead of
replacement_addr.  In the case, it always passes NULL to
kunit_deactivate_static_stub().

Fix it.

Link: https://lore.kernel.org/r/20250520082050.2254875-1-tzungbi@kernel.org
Fixes: e047c5eaa7 ("kunit: Expose 'static stub' API to redirect functions")
Signed-off-by: Tzung-Bi Shih <tzungbi@kernel.org>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2025-05-21 09:51:23 -06:00
Luca Ceresoli
592ebd77e6 vsprintf: remove redundant and unused %pCn format specifier
%pC and %pCn print the same string, and commit 900cca2944 ("lib/vsprintf:
add %pC{,n,r} format specifiers for clocks") introducing them does not
clarify any intended difference. It can be assumed %pC is a default for
%pCn as some other specifiers do, but not all are consistent with this
policy. Moreover there is now no other suffix other than 'n', which makes a
default not really useful.

All users in the kernel were using %pC except for one which has been
converted. So now remove %pCn and all the unnecessary extra code and
documentation.

Acked-by: Stephen Boyd <sboyd@kernel.org>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Luca Ceresoli <luca.ceresoli@bootlin.com>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Tested-by: Petr Mladek <pmladek@suse.com>
Reviewed-by: Yanteng Si <si.yanteng@linux.dev>
Link: https://lore.kernel.org/r/20250311-vsprintf-pcn-v2-2-0af40fc7dee4@bootlin.com
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2025-05-16 12:50:00 +02:00
Yury Norov [NVIDIA]
13f0a02bf4 find: Add find_first_andnot_bit()
The function helps to implement cpumask_andnot() APIs.

Signed-off-by: Yury Norov [NVIDIA] <yury.norov@gmail.com>
Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: James Morse <james.morse@arm.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Reviewed-by: Fenghua Yu <fenghuay@nvidia.com>
Tested-by: James Morse <james.morse@arm.com>
Tested-by: Tony Luck <tony.luck@intel.com>
Tested-by: Fenghua Yu <fenghuay@nvidia.com>
Link: https://lore.kernel.org/20250515165855.31452-3-james.morse@arm.com
2025-05-15 20:24:40 +02:00
Lee Trager
e505e14073 pldmfw: Don't require send_package_data or send_component_table to be defined
Not all drivers require send_package_data or send_component_table when
updating firmware. Instead of forcing drivers to implement a stub allow
these functions to go undefined.

Signed-off-by: Lee Trager <lee@trager.us>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Acked-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://patch.msgid.link/20250512190109.2475614-2-lee@trager.us
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-05-15 12:59:18 +02:00
Eric Biggers
289c99bec7 lib/crc32: add SPDX license identifier
lib/crc32.c and include/linux/crc32.h got missed by the bulk SPDX
conversion because of the nonstandard explanation of the license.
However, crc32.c clearly states that it's licensed under the GNU General
Public License, Version 2.  And the comment in crc32.h clearly indicates
that it's meant to have the same license as crc32.c.  Therefore, apply
SPDX-License-Identifier: GPL-2.0-only to both files.

Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://lore.kernel.org/r/20250514052409.194822-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
2025-05-14 09:15:38 -07:00
Eric Biggers
3937f6db6e lib/crc16: unexport crc16_table and crc16_byte()
Now that neither crc16_table nor crc16_byte() is used outside
lib/crc16.c, fold them into lib/crc16.c.

Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250513022115.39109-3-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
2025-05-13 20:37:16 -07:00
Christoph Hellwig
f1c2bca267 xarray: fix kerneldoc for __xa_cmpxchg
Fix the documentation for __xa_cmpxchg to actually describe the
cmpxch-like semantics correctly, based on the version for xa_cmpxchg.

Link: https://lkml.kernel.org/r/20250507051656.3900864-1-hch@lst.de
Signed-off-by: Christoph Hellwig <hch@lst.de>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-12 23:50:49 -07:00
Eric Biggers
698de82278 crypto: testmgr - make it easier to enable the full set of tests
Currently the full set of crypto self-tests requires
CONFIG_CRYPTO_MANAGER_EXTRA_TESTS=y.  This is problematic in two ways.
First, developers regularly overlook this option.  Second, the
description of the tests as "extra" sometimes gives the impression that
it is not required that all algorithms pass these tests.

Given that the main use case for the crypto self-tests is for
developers, make enabling CONFIG_CRYPTO_SELFTESTS=y just enable the full
set of crypto self-tests by default.

The slow tests can still be disabled by adding the command-line
parameter cryptomgr.noextratests=1, soon to be renamed to
cryptomgr.noslowtests=1.  The only known use case for doing this is for
people trying to use the crypto self-tests to satisfy the FIPS 140-3
pre-operational self-testing requirements when the kernel is being
validated as a FIPS 140-3 cryptographic module.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-05-12 13:34:03 +08:00
Eric Biggers
40b9969796 crypto: testmgr - replace CRYPTO_MANAGER_DISABLE_TESTS with CRYPTO_SELFTESTS
The negative-sense of CRYPTO_MANAGER_DISABLE_TESTS is a longstanding
mistake that regularly causes confusion.  Especially bad is that you can
have CRYPTO=n && CRYPTO_MANAGER_DISABLE_TESTS=n, which is ambiguous.

Replace CRYPTO_MANAGER_DISABLE_TESTS with CRYPTO_SELFTESTS which has the
expected behavior.

The tests continue to be disabled by default.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-05-12 13:33:14 +08:00
Eric Biggers
bdc2a55687 crypto: lib/chacha - add array bounds to function prototypes
Add explicit array bounds to the function prototypes for the parameters
that didn't already get handled by the conversion to use chacha_state:

- chacha_block_*():
  Change 'u8 *out' or 'u8 *stream' to u8 out[CHACHA_BLOCK_SIZE].

- hchacha_block_*():
  Change 'u32 *out' or 'u32 *stream' to u32 out[HCHACHA_OUT_WORDS].

- chacha_init():
  Change 'const u32 *key' to 'const u32 key[CHACHA_KEY_WORDS]'.
  Change 'const u8 *iv' to 'const u8 iv[CHACHA_IV_SIZE]'.

No functional changes.  This just makes it clear when fixed-size arrays
are expected.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-05-12 13:32:53 +08:00
Eric Biggers
607c92141c crypto: lib/chacha - add strongly-typed state zeroization
Now that the ChaCha state matrix is strongly-typed, add a helper
function chacha_zeroize_state() which zeroizes it.  Then convert all
applicable callers to use it instead of direct memzero_explicit.  No
functional changes.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-05-12 13:32:53 +08:00
Eric Biggers
32c9541189 crypto: lib/chacha - use struct assignment to copy state
Use struct assignment instead of memcpy() in lib/crypto/chacha.c where
appropriate.  No functional change.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-05-12 13:32:53 +08:00
Eric Biggers
98066f2f89 crypto: lib/chacha - strongly type the ChaCha state
The ChaCha state matrix is 16 32-bit words.  Currently it is represented
in the code as a raw u32 array, or even just a pointer to u32.  This
weak typing is error-prone.  Instead, introduce struct chacha_state:

    struct chacha_state {
            u32 x[16];
    };

Convert all ChaCha and HChaCha functions to use struct chacha_state.
No functional changes.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Acked-by: Kent Overstreet <kent.overstreet@linux.dev>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-05-12 13:32:53 +08:00
Dr. David Alan Gilbert
479d26ee01 lib/oid_registry.c: remove unused sprint_OID
sprint_OID() was added as part of 2012's commit 4f73175d03 ("X.509: Add
utility functions to render OIDs as strings") but it hasn't been used. 
Remove it.

Note that there's also 'sprint_oid' (lower case) which is used in a lot of
places; that's left as is except for fixing its case in a comment.

Link: https://lkml.kernel.org/r/20250501010502.326472-1-linux@treblig.org
Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-11 17:54:13 -07:00
Herton R. Krzesinski
92f3c5a005 lib/test_kmod: do not hardcode/depend on any filesystem
Right now test_kmod has hardcoded dependencies on btrfs/xfs.  That is not
optimal since you end up needing to select/build them, but it is not
really required since other fs could be selected for the testing.  Also,
we can't change the default/driver module used for testing on
initialization.

Thus make it more generic: introduce two module parameters (start_driver
and start_test_fs), which allow to select which modules/fs to use for the
testing on test_kmod initialization.  Then it's up to the user to select
which modules/fs to use for testing based on his config.  However, keep
test_module as required default.

This way, config/modules becomes selectable as when the testing is done
from selftests (userspace).

While at it, also change trigger_config_run_type, since at module
initialization we already set the defaults at __kmod_config_init and
should not need to do it again in test_kmod_init(), thus we can avoid to
again set test_driver/test_fs.

Link: https://lkml.kernel.org/r/20250418165047.702487-1-herton@redhat.com
Signed-off-by: Herton R. Krzesinski <herton@redhat.com>
Reviewed-by: Luis Chambelrain <mcgrof@kernel.org>
Cc: Daniel Gomez <da.gomez@samsung.com>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Petr Pavlu <petr.pavlu@suse.com>
Cc: Sami Tolvanen <samitolvanen@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-11 17:54:09 -07:00
Caleb Sander Mateos
8d1d4b538b scatterlist: inline sg_next()
sg_next() is a short function called frequently in I/O paths. Define it
in the header file so it can be inlined into its callers.

Link: https://lkml.kernel.org/r/20250416160615.3571958-1-csander@purestorage.com
Signed-off-by: Caleb Sander Mateos <csander@purestorage.com>
Cc: Eric Biggers <ebiggers@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-11 17:54:08 -07:00
Zijun Hu
3eff6a3e57 errseq: eliminate special limitation for macro MAX_ERRNO
Current errseq implementation depends on a very special precondition that
macro MAX_ERRNO must be (2^n - 1).

Eliminate the limitation by

- redefining macro ERRSEQ_SHIFT
- defining a new macro ERRNO_MASK instead of MAX_ERRNO for errno mask.

There is no plan to change the value of MAX_ERRNO, but this makes the
implementation more generic and eliminates the BUILD_BUG_ON().

Link: https://lkml.kernel.org/r/20250407-improve_errseq-v1-1-7b27cbeb8298@quicinc.com
Signed-off-by: Zijun Hu <quic_zijuhu@quicinc.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-11 17:54:06 -07:00
Mario Limonciello
ae5b350085 kstrtox: add support for enabled and disabled in kstrtobool()
In some places in the kernel there is a design pattern for sysfs
attributes to use kstrtobool() in store() and str_enabled_disabled() in
show().

This is counterintuitive to interact with because kstrtobool() takes
on/off but str_enabled_disabled() shows enabled/disabled.  Some of those
sysfs uses could switch to str_on_off() but for some attributes
enabled/disabled really makes more sense.

Add support for kstrtobool() to accept enabled/disabled.

Link: https://lkml.kernel.org/r/20250321022538.1532445-1-superm1@kernel.org
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-11 17:54:06 -07:00
Chisheng Chen
3dfd79cc87 lib/rbtree.c: fix the example typo
Replace `sr` with `Sr`.  The condition `!tmp1 || rb_is_black(tmp1)`
ensures that `tmp1` (which is `sibling->rb_right`) is either NULL or a
black node.  Therefore, the right child of the sibling must be black, and
the example should use `Sr` instead of `sr`.

Link: https://lkml.kernel.org/r/20250403112614.570140-1-johnny1001s000602@gmail.com
Signed-off-by: Chisheng Chen <johnny1001s000602@gmail.com>
Cc: Hsin Chang Yu <zxcvb600870024@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-11 17:54:04 -07:00
Uladzislau Rezki (Sony)
2d76e79315 lib/test_vmalloc.c: allow built-in execution
Remove the dependency on module loading ("m") for the vmalloc test suite,
enabling it to be built directly into the kernel, so both ("=m") and
("=y") are supported.

Motivation:
- Faster debugging/testing of vmalloc code;
- It allows to configure the test via kernel-boot parameters.

Configuration example:
  test_vmalloc.nr_threads=64
  test_vmalloc.run_test_mask=7
  test_vmalloc.sequential_test_order=1

Link: https://lkml.kernel.org/r/20250417161216.88318-2-urezki@gmail.com
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Reviewed-by: Baoquan He <bhe@redhat.com>
Reviewed-by: Adrian Huang <ahuang12@lenovo.com>
Tested-by: Adrian Huang <ahuang12@lenovo.com>
Cc: Christop Hellwig <hch@infradead.org>
Cc: Mateusz Guzik <mjguzik@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-11 17:48:34 -07:00
Uladzislau Rezki (Sony)
7a73348e5d lib/test_vmalloc.c: replace RWSEM to SRCU for setup
The test has the initialization step during which threads are created.  To
prevent the workers from starting prematurely a write lock was previously
used by the main setup thread, while each worker would block on a read
lock.

Replace this RWSEM based synchronization with a simpler SRCU based
approach.  Which does two basic steps:

- Main thread wraps the setup phase in an SRCU read-side critical
  section.  Pair of srcu_read_lock()/srcu_read_unlock().
- Each worker calls synchronize_srcu() on entry, ensuring it waits for
  the initialization phase to be completed.

This patch eliminates the need for down_read()/up_read() and
down_write()/up_write() pairs thus simplifying the logic and improving
clarity.

[urezki@gmail.com: fix compile error with CONFIG_TINY_RCU]
  Link: https://lkml.kernel.org/r/20250420142029.103169-1-urezki@gmail.com
Link: https://lkml.kernel.org/r/20250417161216.88318-1-urezki@gmail.com
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Reviewed-by: Adrian Huang <ahuang12@lenovo.com>
Tested-by: Adrian Huang <ahuang12@lenovo.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Christop Hellwig <hch@infradead.org>
Cc: Mateusz Guzik <mjguzik@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-11 17:48:33 -07:00
Sidhartha Kumar
2a6ed1b411 maple_tree: reorder mas->store_type case statements
Move the unlikely case that mas->store_type is invalid to be the last
evaluated case and put liklier cases higher up.

Link: https://lkml.kernel.org/r/20250410191446.2474640-7-sidhartha.kumar@oracle.com
Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Suggested-by: Liam R. Howlett <liam.howlett@oracle.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Wei Yang <richard.weiyang@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-11 17:48:29 -07:00
Sidhartha Kumar
271152a973 maple_tree: add sufficient height
In order to support rebalancing and spanning stores using less than the
worst case number of nodes, we need to track more than just the vacant
height.  Using only vacant height to reduce the worst case maple node
allocation count can lead to a shortcoming of nodes in the following
scenarios.

For rebalancing writes, when a leaf node becomes insufficient, it may be
combined with a sibling into a single node.  This means that the parent
node which has entries for this children will lose one entry.  If this
parent node was just meeting the minimum entries, losing one entry will
now cause this parent node to be insufficient.  This leads to a cascading
operation of rebalancing at different levels and can lead to more node
allocations than simply using vacant height can return.

For spanning writes, a similar situation occurs.  At the location at which
a spanning write is detected, the number of ancestor nodes may similarly
need to rebalanced into a smaller number of nodes and the same cascading
situation could occur.

To use less than the full height of the tree for the number of
allocations, we also need to track the height at which a non-leaf node
cannot become insufficient.  This means even if a rebalance occurs to a
child of this node, it currently has enough entries that it can lose one
without any further action.  This field is stored in the maple write state
as sufficient height.  In mas_prealloc_calc() when figuring out how many
nodes to allocate, we check if the vacant node is lower in the tree than a
sufficient node (has a larger value).  If it is, we cannot use the vacant
height and must use the difference in the height and sufficient height as
the basis for the number of nodes needed.

An off by one bug was also discovered in mast_overflow() where it is using
>= rather than >.  This caused extra iterations of the
mas_spanning_rebalance() loop and lead to unneeded allocations.  A test is
also added to check the number of allocations is correct.

Link: https://lkml.kernel.org/r/20250410191446.2474640-6-sidhartha.kumar@oracle.com
Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Wei Yang <richard.weiyang@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-11 17:48:29 -07:00
Sidhartha Kumar
300a5b4ffe maple_tree: break on convergence in mas_spanning_rebalance()
This allows support for using the vacant height to calculate the worst
case number of nodes needed for wr_rebalance operation. 
mas_spanning_rebalance() was seen to perform unnecessary node allocations.
We can reduce allocations by breaking early during the rebalancing loop
once we realize that we have ascended to a common ancestor.

Link: https://lkml.kernel.org/r/20250410191446.2474640-5-sidhartha.kumar@oracle.com
Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Suggested-by: Liam Howlett <liam.howlett@oracle.com>
Reviewed-by: Wei Yang <richard.weiyang@gmail.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-11 17:48:28 -07:00
Sidhartha Kumar
ad88fc17d2 maple_tree: use vacant nodes to reduce worst case allocations
In order to determine the store type for a maple tree operation, a walk of
the tree is done through mas_wr_walk().  This function descends the tree
until a spanning write is detected or we reach a leaf node.  While
descending, keep track of the height at which we encounter a node with
available space.  This is done by checking if mas->end is less than the
number of slots a given node type can fit.

Now that the height of the vacant node is tracked, we can use the
difference between the height of the tree and the height of the vacant
node to know how many levels we will have to propagate creating new nodes.
Update mas_prealloc_calc() to consider the vacant height and reduce the
number of worst-case allocations.

Rebalancing and spanning stores are not supported and fall back to using
the full height of the tree for allocations.

Update preallocation testing assertions to take into account vacant
height.

Link: https://lkml.kernel.org/r/20250410191446.2474640-4-sidhartha.kumar@oracle.com
Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Wei Yang <richard.weiyang@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-11 17:48:28 -07:00
Sidhartha Kumar
f9d3a963fe maple_tree: use height and depth consistently
For the maple tree, the root node is defined to have a depth of 0 with a
height of 1.  Each level down from the node, these values are incremented
by 1.  Various code paths define a root with depth 1 which is inconsisent
with the definition.  Modify the code to be consistent with this
definition.

In mas_spanning_rebalance(), l_mas.depth was being used to track the
height based on the number of iterations done in the main loop.  This
information was then used in mas_put_in_tree() to set the height.  Rather
than overload the l_mas.depth field to track height, simply keep track of
height in the local variable new_height and directly pass this to
mas_wmb_replace() which will be passed into mas_put_in_tree().  This
allows up to remove writes to l_mas.depth.

Link: https://lkml.kernel.org/r/20250410191446.2474640-3-sidhartha.kumar@oracle.com
Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Wei Yang <richard.weiyang@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-11 17:48:28 -07:00
Sidhartha Kumar
28092a652f maple_tree: convert mas_prealloc_calc() to take in a maple write state
Patch series "Track node vacancy to reduce worst case allocation counts", v5.

================ overview ========================
Currently, the maple tree preallocates the worst case number of nodes for
given store type by taking into account the whole height of the tree. 
This comes from a worst case scenario of every node in the tree being full
and having to propagate node allocation upwards until we reach the root of
the tree.  This can be optimized if there are vacancies in nodes that are
at a lower depth than the root node.  This series implements tracking the
level at which there is a vacant node so we only need to allocate until
this level is reached, rather than always using the full height of the
tree.  The ma_wr_state struct is modified to add a field which keeps track
of the vacant height and is updated during walks of the tree.  This value
is then read in mas_prealloc_calc() when we decide how many nodes to
allocate.

For rebalancing and spanning stores, we also need to track the lowest
height at which a node has 1 more entry than the minimum sufficient number
of entries.  This is because rebalancing can cause a parent node to become
insufficient which results in further node allocations.  In this case, we
need to use the sufficient height as the worst case rather than the vacant
height.

patch 1-2: preparatory patches
patch 3: implement vacant height tracking + update the tests
patch 4: support vacant height tracking for rebalancing writes
patch 5: implement sufficient height tracking
patch 6: reorder switch case statements

================ results =========================
Bpftrace was used to profile the allocation path for requesting new maple
nodes while running stress-ng mmap 120s.  The histograms below represent
requests to kmem_cache_alloc_bulk() and show the count argument.  This
represnts how many maple nodes the caller is requesting in
kmem_cache_alloc_bulk()

command: stress-ng --mmap 4 --timeout 120

mm-unstable 

@bulk_alloc_req:
[3, 4)                 4 |                                                    |
[4, 5)             54170 |@                                                   |
[5, 6)                 0 |                                                    |
[6, 7)            893057 |@@@@@@@@@@@@@@@@@@@@                                |
[7, 8)                 4 |                                                    |
[8, 9)           2230287 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
[9, 10)            55811 |@                                                   |
[10, 11)           77834 |@                                                   |
[11, 12)               0 |                                                    |
[12, 13)         1368684 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@                     |
[13, 14)               0 |                                                    |
[14, 15)               0 |                                                    |
[15, 16)          367197 |@@@@@@@@                                            |


@maple_node_total: 46,630,160
@total_vmas: 46184591

mm-unstable + this series

@bulk_alloc_req:
[2, 3)               198 |                                                    |
[3, 4)                 4 |                                                    |
[4, 5)                43 |                                                    |
[5, 6)                 0 |                                                    |
[6, 7)           1069503 |@@@@@@@@@@@@@@@@@@@@@                               |
[7, 8)                 4 |                                                    |
[8, 9)           2597268 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
[9, 10)           472191 |@@@@@@@@@                                           |
[10, 11)          191904 |@@@                                                 |
[11, 12)               0 |                                                    |
[12, 13)          247316 |@@@@                                                |
[13, 14)               0 |                                                    |
[14, 15)               0 |                                                    |
[15, 16)           98769 |@                                                   |


@maple_node_total: 37,813,856
@total_vmas: 43493287

This represents a ~19% reduction in the number of bulk maple nodes allocated.

For more reproducible results, a historgram of the return value of
mas_prealloc_calc() is displayed while running the maple_tree_tests whcih
have a deterministic store pattern

mas_prealloc_calc() return value mm-unstable
1   :                                                    (12068)
3   :                                                    (11836)
5   : *****                                              (271192)
7   : ************************************************** (2329329)
9   : ***********                                        (534186)
10  :                                                    (435)
11  : ***************                                    (704306)
13  : ********                                           (409781)

mas_prealloc_calc() return value mm-unstable + this series
1   :                                                    (12070)
3   : ************************************************** (3548777)
5   : ********                                           (633458)
7   :                                                    (65081)
9   :                                                    (11224)
10  :                                                    (341)
11  :                                                    (2973)
13  :                                                    (68)

do_mmap latency was also measured for regressions:
command: stress-ng --mmap 4 --timeout 120

mm-unstable:
avg = 7162 nsecs, total: 16101821292 nsecs, count: 2248034

mm-unstable + this series:
avg = 6689 nsecs, total: 15135391764 nsecs, count: 2262726


stress-ng --mmap4 --timeout 120

with vacant_height:
stress-ng: info:  [257]                   21526312 Maple Tree Read                0.176 M/sec
stress-ng: info:  [257]                  339979348 Maple Tree Write               2.774 M/sec

without vacant_height:
stress-ng: info:  [8228]                   20968900 Maple Tree Read                0.171 M/sec
stress-ng: info:  [8228]                  312214648 Maple Tree Write               2.547 M/sec

This represents an increase of ~3% read throughput and ~9% increase in
write throughput.


This patch (of 6):

In a subsequent patch, mas_prealloc_calc() will need to access fields only
in the ma_wr_state.  Convert the function to take in a ma_wr_state and
modify all callers.  There is no functional change.

Link: https://lkml.kernel.org/r/20250410191446.2474640-1-sidhartha.kumar@oracle.com
Link: https://lkml.kernel.org/r/20250410191446.2474640-2-sidhartha.kumar@oracle.com
Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Wei Yang <richard.weiyang@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-11 17:48:28 -07:00
Przemek Kitszel
4c97a17a25 xarray: make xa_alloc_cyclic() return 0 on all success cases
Change xa_alloc_cyclic() to return 0 even on wrap-around.  Do the same for
xa_alloc_cyclic_irq() and xa_alloc_cyclic_bh().

This will prevent any future bug of treating return of 1 as an error:
	int ret = xa_alloc_cyclic(...)
	if (ret) // currently mishandles ret==1
		goto failure;

If there will be someone interested in when wrap-around occurs, there is
still __xa_alloc_cyclic() that behaves as before.  For now there is no
such user.

Link: https://lkml.kernel.org/r/20250320102219.8101-1-przemyslaw.kitszel@intel.com
Signed-off-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Suggested-by: Matthew Wilcox <willy@infradead.org>
Link: https://lore.kernel.org/netdev/Z9gUd-5t8b5NX2wE@casper.infradead.org
Cc: Andriy Shevchenko <andriy.shevchenko@intel.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Cc: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-11 17:48:19 -07:00
Matthew Wilcox (Oracle)
70d1be00b4 iov_iter: convert iov_iter_extract_xarray_pages() to use folios
ITER_XARRAY is exclusively used with xarrays that contain folios, not
pages, so extract folio pointers from it, not page pointers.  Removes a
use of find_subpage().

Link: https://lkml.kernel.org/r/20250402210612.2444135-5-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-11 17:48:05 -07:00
Matthew Wilcox (Oracle)
b57f4f4f18 iov_iter: convert iter_xarray_populate_pages() to use folios
ITER_XARRAY is exclusively used with xarrays that contain folios, not
pages, so extract folio pointers from it, not page pointers.  Removes a
hidden call to compound_head() and a use of find_subpage().

Link: https://lkml.kernel.org/r/20250402210612.2444135-4-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-11 17:48:05 -07:00
Paul E. McKenney
ba575cea29 ratelimit: Drop redundant accesses to burst
Now that there is the "burst <= 0" fastpath, for all later code, burst
must be strictly greater than zero.  Therefore, drop the redundant checks
of this local variable.

Link: https://lore.kernel.org/all/fbe93a52-365e-47fe-93a4-44a44547d601@paulmck-laptop/
Link: https://lore.kernel.org/all/20250423115409.3425-1-spasswolf@web.de/
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Kuniyuki Iwashima <kuniyu@amazon.com>
Cc: Mateusz Guzik <mjguzik@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: John Ogness <john.ogness@linutronix.de>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
2025-05-08 16:13:27 -07:00
Paul E. McKenney
4b2cce999c ratelimit: Use nolock_ret restructuring to collapse common case code
Now that unlock_ret releases the lock, then falls into nolock_ret, which
handles ->missed based on the value of ret, the common-case lock-held
code can be collapsed into a single "if" statement with a single-statement
"then" clause.

Yes, we could go further and just assign the "if" condition to ret,
but in the immortal words of MSDOS, "Are you sure?".

Link: https://lore.kernel.org/all/fbe93a52-365e-47fe-93a4-44a44547d601@paulmck-laptop/
Link: https://lore.kernel.org/all/20250423115409.3425-1-spasswolf@web.de/
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Kuniyuki Iwashima <kuniyu@amazon.com>
Cc: Mateusz Guzik <mjguzik@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: John Ogness <john.ogness@linutronix.de>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
2025-05-08 16:13:27 -07:00
Paul E. McKenney
743a1942d5 ratelimit: Use nolock_ret label to collapse lock-failure code
Now that we have a nolock_ret label that handles ->missed correctly
based on the value of ret, we can eliminate a local variable and collapse
several "if" statements on the lock-acquisition-failure code path.

Link: https://lore.kernel.org/all/fbe93a52-365e-47fe-93a4-44a44547d601@paulmck-laptop/
Link: https://lore.kernel.org/all/20250423115409.3425-1-spasswolf@web.de/
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Kuniyuki Iwashima <kuniyu@amazon.com>
Cc: Mateusz Guzik <mjguzik@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: John Ogness <john.ogness@linutronix.de>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
2025-05-08 16:13:27 -07:00
Paul E. McKenney
a69114c2a1 ratelimit: Use nolock_ret label to save a couple of lines of code
Create a nolock_ret label in order to start consolidating the unlocked
return paths that conditionally invoke ratelimit_state_inc_miss().

Link: https://lore.kernel.org/all/fbe93a52-365e-47fe-93a4-44a44547d601@paulmck-laptop/
Link: https://lore.kernel.org/all/20250423115409.3425-1-spasswolf@web.de/
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Kuniyuki Iwashima <kuniyu@amazon.com>
Cc: Mateusz Guzik <mjguzik@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: John Ogness <john.ogness@linutronix.de>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
2025-05-08 16:13:27 -07:00
Paul E. McKenney
f2d0ea0f08 ratelimit: Simplify common-case exit path
By making "ret" always be initialized, and moving the final call to
ratelimit_state_inc_miss() out from under the lock, we save a goto and
a couple lines of code.  This also saves a couple of lines of code from
the unconditional enable/disable slowpath.

Link: https://lore.kernel.org/all/fbe93a52-365e-47fe-93a4-44a44547d601@paulmck-laptop/
Link: https://lore.kernel.org/all/20250423115409.3425-1-spasswolf@web.de/
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Kuniyuki Iwashima <kuniyu@amazon.com>
Cc: Mateusz Guzik <mjguzik@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: John Ogness <john.ogness@linutronix.de>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
2025-05-08 16:13:27 -07:00
Petr Mladek
a940d145cc ratelimit: Warn if ->interval or ->burst are negative
Currently, ___ratelimit() treats a negative ->interval or ->burst as
if it was zero, but this is an accident of the current implementation.
Therefore, splat in this case, which might have the benefit of detecting
use of uninitialized ratelimit_state structures on the one hand or easing
addition of new features on the other.

Link: https://lore.kernel.org/all/fbe93a52-365e-47fe-93a4-44a44547d601@paulmck-laptop/
Link: https://lore.kernel.org/all/20250423115409.3425-1-spasswolf@web.de/
Signed-off-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Kuniyuki Iwashima <kuniyu@amazon.com>
Cc: Mateusz Guzik <mjguzik@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: John Ogness <john.ogness@linutronix.de>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
2025-05-08 16:13:27 -07:00