IB_SIZE is only b0..b19. Starting with a6xx gen3, additional fields
were added above the IB_SIZE. Accidentially setting them can cause
badness. Fix this by properly defining the CP_INDIRECT_BUFFER packet
and using the generated builder macro to ensure unintended bits are not
set.
v2: add missing type attribute for IB_BASE
v3: fix offset attribute in xml
Reported-by: Connor Abbott <cwabbott0@gmail.com>
Fixes: a83366ef19 ("drm/msm/a6xx: add A640/A650 to gpulist")
Signed-off-by: Rob Clark <robdclark@chromium.org>
Patchwork: https://patchwork.freedesktop.org/patch/643396/
I'm currently maintaining the PCI endpoint subsystem and reviewing the
native host bridge and endpoint drivers. However, this affects my endpoint
maintainership role since I cannot merge endpoint patches that depend on
the controller drivers (which is more common). Moreover, the controller
driver patches would also benefit from a helping hand in maintaining them.
So I'd like to step up to maintain the native host bridge and endpoint
drivers together with the endpoint subsystem.
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://patch.msgid.link/20250418094905.9983-1-manivannan.sadhasivam@linaro.org
Pull misc x86 fixes from Ingo Molnar:
- Fix hypercall detection on Xen guests
- Extend the AMD microcode loader SHA check to Zen5, to block loading
of any unreleased standalone Zen5 microcode patches
- Add new Intel CPU model number for Bartlett Lake
- Fix the workaround for AMD erratum 1054
- Fix buggy early memory acceptance between SEV-SNP guests and the EFI
stub
* tag 'x86-urgent-2025-04-18' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/boot/sev: Avoid shared GHCB page for early memory acceptance
x86/cpu/amd: Fix workaround for erratum 1054
x86/cpu: Add CPU model number for Bartlett Lake CPUs with Raptor Cove cores
x86/microcode/AMD: Extend the SHA check to Zen5, block loading of any unreleased standalone Zen5 microcode patches
x86/xen: Fix __xen_hypercall_setfunc()
Pull timer fix from Ingo Molnar:
"Fix a lockdep false positive in the i8253 driver"
* tag 'timers-urgent-2025-04-18' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/i8253: Call clockevent_i8253_disable() with interrupts disabled
Calling into the MIDR checking framework from the PI code has recently
become much harder, due to the new fancy "multi-MIDR" support that
relies on tables being populated at boot time, but not that early that
they are available to the PI code. There are additional issues with
this framework, as the code really isn't position independend *at all*.
This leads to some ugly breakages, as reported by Ada.
It so appears that the only reason for the PI code to call into the
MIDR checking code is to cope with The Most Broken ARM64 System Ever,
aka Cavium ThunderX, which cannot deal with nG attributes that result
of the combination of KASLR and KPTI as a consequence of Erratum 27456.
Duplicate the check for the erratum in the PI code, removing the
dependency on the bulk of the MIDR checking framework. This allows
dropping that same check from kaslr_requires_kpti(), as the KPTI code
already relies on the ARM64_WORKAROUND_CAVIUM_27456 cap.
Fixes: c8c2647e69 ("arm64: Make _midr_in_range_list() an exported function")
Reported-by: Ada Couprie Diaz <ada.coupriediaz@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/3d97e45a-23cf-419b-9b6f-140b4d88de7b@arm.com
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Cc: Oliver Upton <oliver.upton@linux.dev>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20250418093129.1755739-1-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Pull x86 perf event fixes from Ingo Molnar:
"Miscellaneous fixes and a hardware-enabling change:
- Fix Intel uncore PMU IIO free running counters on SPR, ICX and SNR
systems
- Fix Intel PEBS buffer overflow handling
- Fix skid in Intel PEBS sampling of user-space general purpose
registers
- Enable Panther Lake PMU support - similar to Lunar Lake"
* tag 'perf-urgent-2025-04-18' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/x86/intel: Add Panther Lake support
perf/x86/intel: Allow to update user space GPRs from PEBS records
perf/x86/intel: Don't clear perf metrics overflow bit unconditionally
perf/x86/intel/uncore: Fix the scale of IIO free running counters on SPR
perf/x86/intel/uncore: Fix the scale of IIO free running counters on ICX
perf/x86/intel/uncore: Fix the scale of IIO free running counters on SNR
Pull misc irq fixes from Ingo Molnar:
- Fix BCM2712 irqchip driver Kconfig dependencies required on the
Raspberry PI5
- Fix spurious interrupts on RZ/G3E SMARC EVK systems
- Fix crash regression on Sun/NIU hardware
- Apply MSI driver quirk for Sun Neptune chips
* tag 'irq-urgent-2025-04-18' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
irqchip/irq-bcm2712-mip: Enable driver when ARCH_BCM2835 is enabled
irqchip/renesas-rzv2h: Prevent TINT spurious interrupt
net/niu: Niu requires MSIX ENTRY_DATA fields touch before entry reads
PCI/MSI: Add an option to write MSIX ENTRY_DATA before any reads
Pull misc core fixes from Ingo Molnar:
"Fix a genksyms related bug, triggered by recent changes to the percpu
code, and update the .clang-format file to not include obsolete
function names"
* tag 'core-urgent-2025-04-18' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
genksyms: Handle typeof_unqual keyword and __seg_{fs,gs} qualifiers
clang-format: Update the ForEachMacros list for v6.15-rc1
Pull hardening fixes from Kees Cook:
- lib/prime_numbers: KUnit test should not select PRIME_NUMBERS (Geert
Uytterhoeven)
- ubsan: Fix panic from test_ubsan_out_of_bounds (Mostafa Saleh)
- ubsan: Remove 'default UBSAN' from UBSAN_INTEGER_WRAP (Nathan
Chancellor)
- string: Add load_unaligned_zeropad() code path to sized_strscpy()
(Peter Collingbourne)
- kasan: Add strscpy() test to trigger tag fault on arm64 (Vincenzo
Frascino)
- Disable GCC randstruct for COMPILE_TEST
* tag 'hardening-v6.15-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
lib/prime_numbers: KUnit test should not select PRIME_NUMBERS
ubsan: Fix panic from test_ubsan_out_of_bounds
lib/Kconfig.ubsan: Remove 'default UBSAN' from UBSAN_INTEGER_WRAP
hardening: Disable GCC randstruct for COMPILE_TEST
kasan: Add strscpy() test to trigger tag fault on arm64
string: Add load_unaligned_zeropad() code path to sized_strscpy()
Pull gpio fix from Bartosz Golaszewski:
- check for both the new AND old (deprecated) setter callback when
changing GPIO direction to output
* tag 'gpio-fixes-for-v6.15-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
gpiolib: Allow to use setters with return value for output-only gpios
Pull thermal control fixes from Rafael Wysocki:
"Add missing DVFS support flags for the Lunar Lake and Panther Lake
platforms to the int340x Intel thermal driver and fix DLVR support
for Panther Lake in it (Srinivas Pandruvada)"
* tag 'thermal-6.15-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
thermal: intel: int340x: Fix Panther Lake DLVR support
thermal: intel: int340x: Add missing DVFS support flags
Pull power management fixes from Rafael Wysocki:
"These are mostly cpufreq fixes, some of which address recent
regressions and some address older issues that have come to light
during the last two weeks, and a runtime PM documentation correction:
- Fix the performance-to-frequency scaling factor computation on
systems using HWP in the intel_pstate driver after a recent
incorrect update of it (Rafael Wysocki)
- Fix the usage of the CPUFREQ_NEED_UPDATE_LIMITS cpufreq driver flag
in the schedutil cpufreq governor after a recent update of it that
has caused frequency limits changes to be missed sometimes (Rafael
Wysocki)
- Address some recently discovered synchronization issues related to
frequency limits changes in the schedutil cpufreq governor and in
the cpufreq core (Rafael Wysocki)
- Fix ITMT support in the amd-pstate cpufreq driver so that it is
enabled after asym priorities have been correctly initialized for
all CPUs (K Prateek Nayak)
- Fix changing min/max limits in the amd-pstate cpufreq driver while
on the performance governor (Dhananjay Ugwekar)
- Fix a function name in the runtime PM documentation that was
previously incorrectly updated by mistake (Sakari Ailus)"
* tag 'pm-6.15-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
cpufreq: Avoid using inconsistent policy->min and policy->max
cpufreq/sched: Set need_freq_update in ignore_dl_rate_limit()
cpufreq/sched: Explicitly synchronize limits_changed flag handling
cpufreq/sched: Fix the usage of CPUFREQ_NEED_UPDATE_LIMITS
Documentation: PM: runtime: Fix a reference to pm_runtime_autosuspend()
cpufreq: intel_pstate: Fix hwp_get_cpu_scaling()
cpufreq/amd-pstate: Enable ITMT support after initializing core rankings
cpufreq/amd-pstate: Fix min_limit perf and freq updation for performance governor
Pull RISC-V fixes from Palmer Dabbelt:
- A fix for an issue where C instructions ended up in non-C builds, due
to some broken inline assembly in the KGDB breakpoint insertion code
- A fix to avoid spurious printk messages about misaligned access
performance probing
- A fix for a handful of issues with /proc/iomem's reserved region
handling
- A pair of fixes for module relocation processing
- A few build-time fixes
* tag 'riscv-for-linus-6.15-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
riscv: KGDB: Remove ".option norvc/.option rvc" for kgdb_compiled_break
riscv: KGDB: Do not inline arch_kgdb_breakpoint()
riscv: Avoid fortify warning in syscall_get_arguments()
riscv: Provide all alternative macros all the time
riscv: module: Allocate PLT entries for R_RISCV_PLT32
riscv: module: Fix out-of-bounds relocation access
riscv: Properly export reserved regions in /proc/iomem
riscv: Fix unaligned access info messages
riscv: Avoid fortify warning in syscall_get_arguments()
Documentation: riscv: Fix typo MIMPLID -> MIMPID
riscv: Use kvmalloc_array on relocation_hashtable
Pull kselftest fix from Shuah Khan:
"Fixes dynevent_limitations.tc test failure on dash by detecting and
handling bash and dash differences in evaluating \\"
* tag 'linux_kselftest-fixes-6.15-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
selftests/ftrace: Differentiate bash and dash in dynevent_limitations.tc
Pull smb server fixes from Steve French:
- Fix integer overflow in server disconnect deadtime calculation
- Three fixes for potential use after frees: one for oplocks, and one
for leases and one for kerberos authentication
- Fix to prevent attempted write to directory
- Fix locking warning for durable scavenger thread
* tag 'v6.15-rc2-ksmbd-server-fixes' of git://git.samba.org/ksmbd:
ksmbd: Prevent integer overflow in calculation of deadtime
ksmbd: fix the warning from __kernel_write_iter
ksmbd: fix use-after-free in smb_break_all_levII_oplock()
ksmbd: fix use-after-free in __smb2_lease_break_noti()
ksmbd: fix WARNING "do not call blocking ops when !TASK_RUNNING"
ksmbd: Fix dangling pointer in krb_authenticate
CXL subsystem supports userspace to configure features via fwctl
interface, it will configure features by using Set Feature command.
Whatever Set Feature succeeds or fails, CXL driver always needs to
return a structure fwctl_rpc_cxl_out to caller, and returned size is
updated in a out_len parameter. The out_len should be updated not only
when the set feature succeeds, but also when the set feature fails.
Fixes: eb5dfcb9e3 ("cxl: Add support to handle user feature commands for set feature")
Signed-off-by: Li Ming <ming.li@zohomail.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://patch.msgid.link/20250410024521.514095-1-ming.li@zohomail.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Testing revealed the following error message for a CXL memdev that has
Feature support:
[ 56.690430] cxl mem0: Resources present before probing
Attach the allocation of cxl_fwctl to the parent device of cxl_memdev.
devm_add_* calls for cxl_memdev should not happen before the memdev
probe function or outside the scope of the memdev driver.
cxl_test missed this bug because cxl_test always arranges for the
cxl_mem driver to be loaded before cxl_mock_mem runs. So the driver core
always finds the devres list idle in that case.
[DJ: Updated subject title and added commit log suggestion from djbw]
Fixes: 858ce2f56b ("cxl: Add FWCTL support to CXL")
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Link: https://lore.kernel.org/linux-cxl/6801aea053466_71fe2944c@dwillia2-xfh.jf.intel.com.notmuch/
Link: https://patch.msgid.link/20250418002933.406439-1-dave.jiang@intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Pull block fixes from Jens Axboe:
- MD pull via Yu:
- fix raid10 missing discard IO accounting (Yu Kuai)
- fix bitmap stats for bitmap file (Zheng Qixing)
- fix oops while reading all member disks failed during
check/repair (Meir Elisha)
- NVMe pull via Christoph:
- fix scan failure for non-ANA multipath controllers (Hannes
Reinecke)
- fix multipath sysfs links creation for some cases (Hannes
Reinecke)
- PCIe endpoint fixes (Damien Le Moal)
- use NULL instead of 0 in the auth code (Damien Le Moal)
- Various ublk fixes:
- Slew of selftest additions
- Improvements and fixes for IO cancelation
- Tweak to Kconfig verbiage
- Fix for page dirtying for blk integrity mapped pages
- loop fixes:
- buffered IO fix
- uevent fixes
- request priority inheritance fix
- Various little fixes
* tag 'block-6.15-20250417' of git://git.kernel.dk/linux: (38 commits)
selftests: ublk: add generic_06 for covering fault inject
ublk: simplify aborting ublk request
ublk: remove __ublk_quiesce_dev()
ublk: improve detection and handling of ublk server exit
ublk: move device reset into ublk_ch_release()
ublk: rely on ->canceling for dealing with ublk_nosrv_dev_should_queue_io
ublk: add ublk_force_abort_dev()
ublk: properly serialize all FETCH_REQs
selftests: ublk: move creating UBLK_TMP into _prep_test()
selftests: ublk: add test_stress_05.sh
selftests: ublk: support user recovery
selftests: ublk: support target specific command line
selftests: ublk: increase max nr_queues and queue depth
selftests: ublk: set queue pthread's cpu affinity
selftests: ublk: setup ring with IORING_SETUP_SINGLE_ISSUER/IORING_SETUP_DEFER_TASKRUN
selftests: ublk: add two stress tests for zero copy feature
selftests: ublk: run stress tests in parallel
selftests: ublk: make sure _add_ublk_dev can return in sub-shell
selftests: ublk: cleanup backfile automatically
selftests: ublk: add io_uring uapi header
...
Pull io_uring fixes from Jens Axboe:
- Correctly cap iov_iter->nr_segs for imports of registered buffers,
both kbuf and normal ones.
Three cleanups to make it saner first, then two fixes for each of the
buffer types.
This fixes a performance regression where partial buffer usage
doesn't trim the tail number of segments, leading the block layer to
iterate the IOs to check if it needs splitting.
- Two patches tweaking the newly introduced zero-copy rx API, mostly to
keep the API consistent once we add multiple interface queues per
ring support in the 6.16 release.
- zc rx unmapping fix for a dead device
* tag 'io_uring-6.15-20250418' of git://git.kernel.dk/linux:
io_uring/zcrx: fix late dma unmap for a dead dev
io_uring/rsrc: ensure segments counts are correct on kbuf buffers
io_uring/rsrc: send exact nr_segs for fixed buffer
io_uring/rsrc: refactor io_import_fixed
io_uring/rsrc: separate kbuf offset adjustments
io_uring/rsrc: don't skip offset calculation
io_uring/zcrx: add pp to ifq conversion helper
io_uring/zcrx: return ifq id to the user
Although the support of VIRTIO_F_ANY_LAYOUT + VIRTIO_F_VERSION_1 was
signaled by the commit 664ed90e62 ("vhost/scsi: Set
VIRTIO_F_ANY_LAYOUT + VIRTIO_F_VERSION_1 feature bits"),
vhost_scsi_send_bad_target() still assumes the response in a single
descriptor.
Similar issue in vhost_scsi_send_bad_target() has been fixed in previous
commit. In addition, similar issue for vhost_scsi_complete_cmd_work() has
been fixed by the commit 6dd88fd59d ("vhost-scsi: unbreak any layout for
response").
Fixes: 3ca51662f8 ("vhost-scsi: Add better resource allocation failure handling")
Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Mike Christie <michael.christie@oracle.com>
Message-Id: <20250403063028.16045-4-dongli.zhang@oracle.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Although the support of VIRTIO_F_ANY_LAYOUT + VIRTIO_F_VERSION_1 was
signaled by the commit 664ed90e62 ("vhost/scsi: Set
VIRTIO_F_ANY_LAYOUT + VIRTIO_F_VERSION_1 feature bits"),
vhost_scsi_send_bad_target() still assumes the response in a single
descriptor.
In addition, although vhost_scsi_send_bad_target() is used by both I/O
queue and control queue, the response header is always
virtio_scsi_cmd_resp. It is required to use virtio_scsi_ctrl_tmf_resp or
virtio_scsi_ctrl_an_resp for control queue.
Fixes: 664ed90e62 ("vhost/scsi: Set VIRTIO_F_ANY_LAYOUT + VIRTIO_F_VERSION_1 feature bits")
Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Mike Christie <michael.christie@oracle.com>
Message-Id: <20250403063028.16045-3-dongli.zhang@oracle.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The vhost-scsi completion path may access vq->log_base when vq->log_used is
already set to false.
vhost-thread QEMU-thread
vhost_scsi_complete_cmd_work()
-> vhost_add_used()
-> vhost_add_used_n()
if (unlikely(vq->log_used))
QEMU disables vq->log_used
via VHOST_SET_VRING_ADDR.
mutex_lock(&vq->mutex);
vq->log_used = false now!
mutex_unlock(&vq->mutex);
QEMU gfree(vq->log_base)
log_used()
-> log_write(vq->log_base)
Assuming the VMM is QEMU. The vq->log_base is from QEMU userpace and can be
reclaimed via gfree(). As a result, this causes invalid memory writes to
QEMU userspace.
The control queue path has the same issue.
Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Mike Christie <michael.christie@oracle.com>
Message-Id: <20250403063028.16045-2-dongli.zhang@oracle.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Commit cb380909ae ("vhost: return task creation error instead of NULL")
changed the return value of vhost_task_create(), but did not update the
documentation.
Reflect the change in the documentation: on an error, vhost_task_create()
returns an ERR_PTR() and no longer NULL.
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Message-Id: <20250327124435.142831-1-sgarzare@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
According to section 5.3.6.2 (Multiport Device Operation) of the virtio
spec(version 1.2) a control buffer with the event VIRTIO_CONSOLE_RESIZE
is followed by a virtio_console_resize struct containing cols then rows.
The kernel implements this the wrong way around (rows then cols) resulting
in the two values being swapped.
Signed-off-by: Maximilian Immanuel Brandtner <maxbr@linux.ibm.com>
Message-Id: <20250324144300.905535-1-maxbr@linux.ibm.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
As per virtio spec the fields cols and rows are specified as little
endian. Although there is no legacy interface requirement that would
state that cols and rows need to be handled as native endian when legacy
interface is used, unlike for the fields of the adjacent struct
virtio_console_control, I decided to err on the side of caution based
on some non-conclusive virtio spec repo archaeology and opt for using
virtio16_to_cpu() much like for virtio_console_control.event. Strictly
by the letter of the spec virtio_le_to_cpu() would have been sufficient.
But when the legacy interface is not used, it boils down to the same.
And when using the legacy interface, the device formatting these as
little endian when the guest is big endian would surprise me more than
it using guest native byte order (which would make it compatible with
the current implementation). Nevertheless somebody trying to implement
the spec following it to the letter could end up forcing little endian
byte order when the legacy interface is in use. So IMHO this ultimately
needs a judgement call by the maintainers.
Fixes: 8345adbf96 ("virtio: console: Accept console size along with resize control message")
Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
Cc: stable@vger.kernel.org # v2.6.35+
Message-Id: <20250322002954.3129282-1-pasic@linux.ibm.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Commit df6f8c4d72 ("selftests/pcie_bwctrl: Add 'set_pcie_speed.sh' to
TEST_PROGS") added set_pcie_speed.sh into TEST_PROGS but that script is a
helper that is only being called by set_pcie_cooling_state.sh, not a test
case itself. When set_pcie_speed.sh is in TEST_PROGS, selftest harness will
execute also it leading to bwctrl selftest errors:
# selftests: pcie_bwctrl: set_pcie_speed.sh
# cat: /cur_state: No such file or directory
not ok 2 selftests: pcie_bwctrl: set_pcie_speed.sh # exit=1
Place set_pcie_speed.sh into TEST_FILES instead to have it included into
installed test files but not execute it from the test harness.
Fixes: df6f8c4d72 ("selftests/pcie_bwctrl: Add 'set_pcie_speed.sh' to TEST_PROGS")
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: stable@vger.kernel.org
Link: https://patch.msgid.link/20250417124529.11391-1-ilpo.jarvinen@linux.intel.com
Erratum 1054 affects AMD Zen processors that are a part of Family 17h
Models 00-2Fh and the workaround is to not set HWCR[IRPerfEn]. However,
when X86_FEATURE_ZEN1 was introduced, the condition to detect unaffected
processors was incorrectly changed in a way that the IRPerfEn bit gets
set only for unaffected Zen 1 processors.
Ensure that HWCR[IRPerfEn] is set for all unaffected processors. This
includes a subset of Zen 1 (Family 17h Models 30h and above) and all
later processors. Also clear X86_FEATURE_IRPERF on affected processors
so that the IRPerfCount register is not used by other entities like the
MSR PMU driver.
Fixes: 232afb5578 ("x86/CPU/AMD: Add X86_FEATURE_ZEN1")
Signed-off-by: Sandipan Das <sandipan.das@amd.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Borislav Petkov <bp@alien8.de>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/caa057a9d6f8ad579e2f1abaa71efbd5bd4eaf6d.1744956467.git.sandipan.das@amd.com
Johan writes:
USB-serial device ids for 6.15-rc3
Here's a new simple driver for Owon oscilloscopes and a couple of new
new modem and smart meter device ids.
All have been in linux-next with no reported issues.
* tag 'usb-serial-6.15-rc3' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/johan/usb-serial:
USB: serial: simple: add OWON HDS200 series oscilloscope support
USB: serial: ftdi_sio: add support for Abacus Electrics Optical Probe
USB: serial: option: add Sierra Wireless EM9291
Vlastimil points out an issue with kswapd in defrag_mode not waking up
kcompactd reliably.
Background: When kswapd is woken for any higher-order request, it
initially checks those high-order watermarks to decide if work is
necesary. However, it cannot (efficiently) meet the contiguity goal of
such a request by itself. So once it has reclaimed a compaction gap, it
adjusts the request down to check for free order-0 pages, then wakes
kcompactd to coalesce them into larger blocks.
In defrag_mode, the initial watermark check needs to be analogously
against free pageblocks. However, once kswapd drops the high-order to
hand off contiguity work, it also needs to fall back to base page
watermarks - otherwise it'll keep reclaiming until blocks are freed.
While it appears kcompactd is woken up frequently enough to do most of the
compaction work, kswapd ends up overreclaiming by quite a bit:
DEFRAGMODE DEFRAGMODE-thispatch
Hugealloc Time mean 79381.34 ( +0.00%) 88126.12 ( +11.02%)
Hugealloc Time stddev 85852.16 ( +0.00%) 135366.75 ( +57.67%)
Kbuild Real time 249.35 ( +0.00%) 226.71 ( -9.04%)
Kbuild User time 1249.16 ( +0.00%) 1249.37 ( +0.02%)
Kbuild System time 171.76 ( +0.00%) 166.93 ( -2.79%)
THP fault alloc 51666.87 ( +0.00%) 52685.60 ( +1.97%)
THP fault fallback 16970.00 ( +0.00%) 15951.87 ( -6.00%)
Direct compact fail 166.53 ( +0.00%) 178.93 ( +7.40%)
Direct compact success 17.13 ( +0.00%) 4.13 ( -71.69%)
Compact daemon scanned migrate 3095413.33 ( +0.00%) 9231239.53 ( +198.22%)
Compact daemon scanned free 2155966.53 ( +0.00%) 7053692.87 ( +227.17%)
Compact direct scanned migrate 265642.47 ( +0.00%) 68388.33 ( -74.26%)
Compact direct scanned free 130252.60 ( +0.00%) 55634.87 ( -57.29%)
Compact total migrate scanned 3361055.80 ( +0.00%) 9299627.87 ( +176.69%)
Compact total free scanned 2286219.13 ( +0.00%) 7109327.73 ( +210.96%)
Alloc stall 1890.80 ( +0.00%) 6297.60 ( +232.94%)
Pages kswapd scanned 9043558.80 ( +0.00%) 5952576.73 ( -34.18%)
Pages kswapd reclaimed 1891708.67 ( +0.00%) 1030645.00 ( -45.52%)
Pages direct scanned 1017090.60 ( +0.00%) 2688047.60 ( +164.29%)
Pages direct reclaimed 92682.60 ( +0.00%) 309770.53 ( +234.22%)
Pages total scanned 10060649.40 ( +0.00%) 8640624.33 ( -14.11%)
Pages total reclaimed 1984391.27 ( +0.00%) 1340415.53 ( -32.45%)
Swap out 884585.73 ( +0.00%) 417781.93 ( -52.77%)
Swap in 287106.27 ( +0.00%) 95589.73 ( -66.71%)
File refaults 551697.60 ( +0.00%) 426474.80 ( -22.70%)
Link: https://lkml.kernel.org/r/20250416135142.778933-3-hannes@cmpxchg.org
Fixes: a211c6550e ("mm: page_alloc: defrag_mode kswapd/kcompactd watermarks")
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reported-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Brendan Jackman <jackmanb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Vlastimil points out that commit a211c6550e ("mm: page_alloc:
defrag_mode kswapd/kcompactd watermarks") switched kswapd from
zone_watermark_ok_safe() to the standard, percpu-cached version of reading
free pages, thus dropping the watermark safety precautions for systems
with high CPU counts (e.g. >212 cpus on 64G). Restore them.
Since zone_watermark_ok_safe() is no longer the right interface, and this
was the last caller of the function anyway, open-code the
zone_page_state_snapshot() conditional and delete the function.
Link: https://lkml.kernel.org/r/20250416135142.778933-2-hannes@cmpxchg.org
Fixes: a211c6550e ("mm: page_alloc: defrag_mode kswapd/kcompactd watermarks")
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reported-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Brendan Jackman <jackmanb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
In __folio_remove_rmap() for RMAP_LEVEL_PMD/RMAP_LEVEL_PUD and with
CONFIG_PAGE_MAPCOUNT we first decrement the folio mapcount (and recompute
mapped shared vs. mapped exclusively) to then adjust the entire mapcount.
This means that another process might stumble in do_wp_page() over a
PTE-mapped PMD folio that is indicated as "exclusively mapped", but still
has an entire mapcount (PMD mapping), because it is racing with the
process that is unmapping the folio (PMD mapping). Note that do_wp_page()
will back off once it detects the remaining folio reference from the
process that is in the process of unmapping the folio.
This will trigger the early VM_WARN_ON_ONCE(folio_entire_mapcount(folio))
check in do_wp_page(), that can easily be reproduced by looping a couple
of times over allocating a PMD THP, forking a child where we immediately
unmap it again, and writing in the parent concurrently to the THP.
[ 252.738129][T16470] ------------[ cut here ]------------
[ 252.739267][T16470] WARNING: CPU: 3 PID: 16470 at mm/memory.c:3738 do_wp_page+0x2a75/0x2c00
[ 252.740968][T16470] Modules linked in:
[ 252.741958][T16470] CPU: 3 UID: 0 PID: 16470 Comm: ...
...
[ 252.765841][T16470] <TASK>
[ 252.766419][T16470] ? srso_alias_return_thunk+0x5/0xfbef5
[ 252.767558][T16470] ? rcu_is_watching+0x12/0x60
[ 252.768525][T16470] ? srso_alias_return_thunk+0x5/0xfbef5
[ 252.769645][T16470] ? srso_alias_return_thunk+0x5/0xfbef5
[ 252.770778][T16470] ? lock_acquire+0x33/0x80
[ 252.771697][T16470] ? __handle_mm_fault+0x5e8/0x3e40
[ 252.772735][T16470] ? __handle_mm_fault+0x5e8/0x3e40
[ 252.773781][T16470] __handle_mm_fault+0x1869/0x3e40
[ 252.774839][T16470] handle_mm_fault+0x22a/0x640
[ 252.775808][T16470] do_user_addr_fault+0x618/0x1000
[ 252.776847][T16470] exc_page_fault+0x68/0xd0
[ 252.777775][T16470] asm_exc_page_fault+0x26/0x30
While we could adjust the sequence in __folio_remove_rmap(), let's rater
move the mapcount sanity checks after the mapcount vs. refcount
stabilization phase. With this fix, a simple reproducer is happy.
While at it, convert the two VM_WARN_ON_ONCE() we are moving to
VM_WARN_ON_ONCE_FOLIO().
Link: https://lkml.kernel.org/r/20250415095007.569836-1-david@redhat.com
Fixes: 1da190f4d0 ("mm: Copy-on-Write (COW) reuse support for PTE-mapped THP")
Signed-off-by: David Hildenbrand <david@redhat.com>
Reported-by: syzbot+5e8feb543ca8e12e0ede@syzkaller.appspotmail.com
Closes: https://lkml.kernel.org/r/67fab4fe.050a0220.2c5fcf.0011.GAE@google.com
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
commit 4eeec8c89a ("mm: move hugetlb specific things in folio to
page[3]") shifted hugetlb specific stuff, and now mapping overlaps
_hugetlb_cgroup field.
Upon restoring the vmemmap for HVO, only the first two tail pages are
reset, and this causes the check in free_tail_page_prepare() to fail as it
finds an unexpected mapping value in some tails.
Increment the number of pages to be reset to 4 (head + 3 tail pages)
Link: https://lkml.kernel.org/r/20250415111859.376302-1-osalvador@suse.de
Fixes: 4eeec8c89a ("mm: move hugetlb specific things in folio to page[3]")
Signed-off-by: Oscar Salvador <osalvador@suse.de>
Suggested-by: David Hildenbrand <david@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Muchun Song <muchun.song@linux.dev>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
inode_to_wb() is used also for filesystems that don't support cgroup
writeback. For these filesystems inode->i_wb is stable during the
lifetime of the inode (it points to bdi->wb) and there's no need to hold
locks protecting the inode->i_wb dereference. Improve the warning in
inode_to_wb() to not trigger for these filesystems.
Link: https://lkml.kernel.org/r/20250412163914.3773459-3-agruenba@redhat.com
Fixes: aaa2cacf81 ("writeback: add lockdep annotation to inode_to_wb()")
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Reviewed-by: Andreas Gruenbacher <agruenba@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>