Corrects typo that set that register to only be 8 bit.
Fixes: 773621829a ("media: imx477: Convert to use V4L2_CCI library")
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
In order to prevent confusing problems where pin configuration is not
be applied, an upcoming change to the overlaycheck utility will add a
check that all added pin groups are in some way referenced by the
overlay. Before that can be done, it is necessary to ensure that all
existing overlays pass that test.
This patch modifies some overlays by adding the required "pinctrl-0"
properties, but for others that are just setting GPIOs to inputs and
outputs, where those same GPIOs are declared by <name>-gpios properties,
it is better to drop the pin groups and let the GPIO subsystem set up
the GPIOs as required. Removing this duplication may be helpful in the
future should we ever decide to enable the exclusive GPIO vs pinctrl
locking (.strict in struct pinmux_ops).
See: https://forums.raspberrypi.com/viewtopic.php?t=393742
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
Previously, if an HDMI display was connected at boot time when composite
was enabled, the firmware FB would get "stuck" and a login console would
not appear on either HDMI or composite.
With the change, a console will now appear, typically (invariably?) on
HDMI (just as happens with HDMI + DPI). This may be helpful when booting
to CLI when composite was enabled accidentally and cannot be viewed.
The old behaviour can be reinstated by adding the "nohdmi" option.
Signed-off-by: Nick Hollinghurst <nick.hollinghurst@raspberrypi.com>
The non-pi5 variant of tc358743 needs to update the compatible of
csi0 rather than csi1 if the cam0 override is used, otherwise it
gets loaded in Media Controller mode.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
Remove the need for the user to know the limitations of this PWM
implementation by adjusting configuration requests to be the closest
acceptable value. Add a get_state method so that the actual values can
be queried.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
EMMC clock speeds are based around divisions of 52Mhz, not the 50MHz
used by SD. As such, relax the "full speed" check (intended to stop
any overclock whenever an operation has to be retried) so that any
requested speed of 50MHz or higher will be overclocked.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
EMMC clock speeds are based around divisions of 52Mhz, not the 50MHz
used by SD. As such, relax the "full speed" check (intended to stop
any overclock whenever an operation has to be retried) so that any
requested speed of 50MHz or higher will be overclocked.
See: https://github.com/raspberrypi/linux/issues/7120
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
Switch the power management in the imx708 device driver to use auto-
suspend with a 5s timeout.
This improves mode switching time that avoids additional regulator
switch-on delays and common register I2C writes.
Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
Add an overlay to allow composite video to be output on GPIOs 4-11
on Raspberry Pi 5, 500, 500+ or CM5 only, with an optional 108 MHz
clock on GPIO 0 and duplicate MSB on GPIO 27.
Requires composite video to be enabled and DPI to be disabled.
Signed-off-by: Nick Hollinghurst <nick.hollinghurst@raspberrypi.com>
The regulator compatible string changed, and implements the PWM
API instead of the backlight one, therefore requiring pwm-backlight
to sit inbetween.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
Switch the power management in the imx477 device driver to use auto-
suspend with a 5s timeout.
This improves mode switching time that avoids additional regulator
switch-on delays and common register I2C writes.
Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
Fix these two smatch warnings for the vc-sm-cma driver, rest were false
positives:
../vc-sm-cma/vc_sm.c:413
vc_sm_dma_buf_attach() warn: inconsistent returns '&buf->lock'.
Locked on : 396
Unlocked on: 413
../vc-sm-cma/vc_sm.c:1225
vc_sm_cma_ioctl_alloc() error: we previously assumed 'buffer' could be
null (see line 1113)
Signed-off-by: Jai Luthra <jai.luthra@ideasonboard.com>
The 4056x3040 mode appears to need more vertical blanking lines
than any other, leaving a black bar at the bottom of the image.
Increase IMX477_VBLANK_MIN from 4 to 48 to compensate. (It may be
possible to reduce it slightly further, but fix the regression
for now).
https://github.com/raspberrypi/linux/issues/7109
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
hvs6 appears to have dropped support for the component order
field in 3 plane YUV formats.
Support them by swapping the Cb and Cr planes over when
reading the image pointers.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
Don't mix dma fence lock with the active_job lock. Use fence_lock to
protect the dma fence used by drm scheduler when signalling a job
completion and queue_lock to protect concurrent access to active bin job
in OOM and stats collection for a given file priv. The issue was
uncovered when PREEMPT_RT on with a system freeze when opening multiple
Chromium tabs on Raspberry Pi 5.
Link: https://github.com/raspberrypi/linux/issues/7035
Fixes: fa6a20c874 ("drm/v3d: Address race-condition between per-fd GPU stats and fd release")
Signed-off-by: Melissa Wen <mwen@igalia.com>
Acked-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Maíra Canal <mcanal@igalia.com>
Signed-off-by: Melissa Wen <melissa.srw@gmail.com>
Link: https://lore.kernel.org/r/20250916172022.2779837-1-mwen@igalia.com
Pass-through mode disables all gamma and brightness processing, sending
the raw pixel data directly to the LEDs. It is enabled by setting the
brightness to zero, either in Device Tree or using the runtime method of
writing a single byte (in this case 0) to the device.
See: https://github.com/raspberrypi/linux/issues/7108
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
Remove CONFIG_MEDIA_PCI_HAILO from all the arm64 defconfig files as this
driver is no longer built in the kernel tree.
Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
Modify the PDAF Datatype of the Arducam 64MP camera from 0x30 to 0x12
so that the Raspberry Pi 5 cfe driver can receive PDAF data.
Signed-off-by: Lee Jackson <info@arducam.com>
There were various points where the loader was using uninitialised
data, had the potential to run off the end of an array, or was
handling core functions incorrectly. Fix these up.
Also handle 24bpp and 32bpp framebuffers.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
We lost a line in the forward port, which meant that it always used
/dev/fb0, and complained that the sysfs nodes already existed.
Fixes: c91c9f257d ("fbdev: Allow client to request a particular /dev/fbN node")
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
Although the PIO throughput benefits from larger burst sizes, only the
first two DMA channels support a burst size of 8 - the others are capped
at 4. To avoid misconfiguring the PIO hardware, retrieve the actual
max_burst value from the DMA channel's capabilities.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
This patch adds support for external FSIN-triggered snapshot mode
to the OmniVision OV9282 sensor driver. It enables frame capture
synchronized with an external hardware trigger signal.
Signed-off-by: Omer Faruk Edemen <ofedemen@lectrontech.com>
Upstream series https://lore.kernel.org/linux-media/20250917-csi-bgr-rgb-v3-0-0145571b3aa4@kernel.org/
The tc358743 is an HDMI to MIPI-CSI2 bridge. It can output all three
HDMI 1.4 video formats: RGB 4:4:4, YCbCr 4:2:2, and YCbCr 4:4:4.
RGB 4:4:4 is converted to the MIPI-CSI2 RGB888 video format, and listed
in the driver as MEDIA_BUS_FMT_RGB888_1X24.
Most CSI2 receiver drivers then map MEDIA_BUS_FMT_RGB888_1X24 to
V4L2_PIX_FMT_RGB24.
However, V4L2_PIX_FMT_RGB24 is defined as having its color components in
the R, G and B order, from left to right. MIPI-CSI2 however defines the
RGB888 format with blue first.
This essentially means that the R and B will be swapped compared to what
V4L2_PIX_FMT_RGB24 defines.
The proper MBUS format would be BGR888, so let's use that.
Fixes: d32d98642d ("[media] Driver for Toshiba TC358743 HDMI to CSI-2 bridge")
Signed-off-by: Maxime Ripard <mripard@kernel.org>
Masquerading Interrupt split transfers as Control puts the transfer into
the non-periodic handler in the hub. This stops the hub dropping
complete-split data in the microframe after a CSPLIT should have
arrived, improving resilience to host IRQ latency. Devices are none
the wiser - the handshake tokens are the same.
Originally devised by Hans Petter Selasky @ FreeBSD.
(v2: dwc2 needs an un-masquerade prior to channel interrupt handling)
Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
Output of raspinfo is now over 65,000 characters, which is more than
GitHub allows in a single form field!
Also adds Pi 500+ and CM0 to the list of models.
8 bit readout is only a reconfiguration of the CSI2 block,
and recomputation of horizontal blanking. Enable it.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
The sensor supports readout as 10 or 12 bit. As we are now
computing the horizontal blanking limits dynamically, adding
support for both readout modes falls out trivially, so add
them both.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
The timing registers configured are for 450MHz.
If running at a different link frequency, use the automatic
timing control.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
Now that the link frequency can be varied, write the link bit
rate registers to reflect the speed being used.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
As we now support variable link frequency, compute the minimum
line_length value that the sensor will work with, and set
V4L2_CID_HBLANK based on that number.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
Rather than the hard coded PLL settings for fixed frequencies,
compute the PLL settings based on device tree, validating that
the specified rate can be achieved.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
There are a fair number of registers duplicated in all the mode
tables, so move those into the common table.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
This removes a load of boilerplate code around how registers
are grouped into multiple word values.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
The frame length default value doesn't change dynamically, and
neither does any of the other parameters that configure it,
so precompute it instead of working from a frame duration to
get to the value.
The minimum value was also computed, when actually the sensor
will take any value down to 4 lines.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
line_length_pix is a value that the developer wants to know,
so write the values in decimal.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
Register line_length_pix was being written by both the tables
of registers and the control handler for V4L2_CID_HBLANK.
Remove the duplication in the tables.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
This is largely to test a previous change that made IOMMU aperture
configurable and allocated lazily; it may be useful in its own right.
We expect IOMMU2 to be well-utilized e.g. when using 64MPix cameras.
Signed-off-by: Nick Hollinghurst <nick.hollinghurst@raspberrypi.com>
Allocate space for level-2 IOMMU translation tables on demand.
This should save memory in most cases but means that map_pages()
can now fail with -ENOMEM. Unused pages are retained for re-use.
Move all dma_sync* calls into map and unmap functions rather than
batching them up. This makes it easier to ensure they are safely
balanced, now that the tables are held as separate pages.
Add OF properties to override the default aperture size (2GB)
and base address (40GB); this doesn't include any dma-iova-offset.
Various tidy-ups. The internal representation of aperture limits
*does* include dma_iova_offset, as that is more commonly useful.
Clarify the distinction between Linux pages and IOMMU table pages.
Fix wrong definition of MMMU_CTRL_PT_INVALID_EN flag.
Signed-off-by: Nick Hollinghurst <nick.hollinghurst@raspberrypi.com>
At a high FPS with RAW10, there is frame corruption for 480p because the
rate_factor of 2 is used with the normal 2x2 bining [1]. This commit
ties the rate_factor to the selected binning mode. For the 480p mode,
analog 2x2 binning mode with a rate_factor of 2 is always used. For the
1232p mode the normal 2x2 binning mode is used for RAW10 while analog
2x2 binning mode is used for RAW8.
[1] raspberrypi#5493
Signed-off-by: Vinay Varma <varmavinaym@gmail.com>
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
Reworked due to upstream changes
The enum binning_mode is redundant, it only uses values from the earlier
defined binning modes. Remove it.
Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
The PL011 driver in this downstream kernel tree supports an extra
compatible string - arm,pl011-axi - for use by RP1. This registers as a
platform driver, not an AMBA driver, and has the advantage of responding
to dynamic Device Tree changes such as loading one of the "uart<n>"
overlays.
Change all of the downstream Raspberry Pi dts files to use the new
compatible string. At the same time, remove the override of the periphid
as the upstream code now has the correct value.
See: https://github.com/raspberrypi/linux/issues/7019
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
The D-PHY on RP1 support lane polarity swapping, and there
is a standard device tree mechanism for configuring this,
so tie the two together.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
There was lots of register definition information dumped from
the some source into the driver but unused. Remove it, and
format the remaining lines according to the Linux kernel coding
style.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
Put particularly the PHY registers into order, bitmasks
defined alongside the registers, and Use tabs for indentation.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
Replace one-element array with flexible-array member to fix:
[ 11.725017] ------------[ cut here ]------------
[ 11.725038] memcpy: detected field-spanning write (size 4) of single field "hdr->body" at drivers/staging/vc04_services/vc-sm-cma/vc_sm_cma_vchi.c:130 (size 0)
[ 11.725113] WARNING: CPU: 3 PID: 455 at drivers/staging/vc04_services/vc-sm-cma/vc_sm_cma_vchi.c:130 vc_vchi_cmd_create+0x1a8/0x1d0 [vc_sm_cma]
Signed-off-by: Juerg Haefliger <juerg.haefliger@canonical.com>
Extend the LED control features of the led-modes property by adding a
led-swap property. This allows the same led-modes values to be used
across designs where the LED assignments differ.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
Copy the firmware's handling of FSTROBE, but using module parameters
instead of config.txt entries.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
To ensure we don't get a regression on handling reflect_[xy]
in the middle of the command line string, add a test for it.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
The command line parser was looking for an "=" before the ","
separator.
If the command line had reflect_[xy] before another option
which took a parameter, it would find the "=" from the second
option and assume it was associated with the reflect option,
generally leading to a parsing failure.
Handle this case by looking for both "," and "=", and taking
the first instance.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
All the matrix entries for the YUV to RGB conversion matrices were
being filled with the same coefficients.
Compute the values for the BT601, BT709, and BT2020 matrices in
both full and limited range, and program those into the hardware.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
Upstream bcm2712 support added/split out the inbound window for MIP1 into
a separate range. For the pcie-32bit-dma overlay to work, both the MIP
and RC ranges need to agree.
Shift the MIP window to the top 4K page below 4GB.
Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
Use native arm64 runners to speed up build process. Cross compile is
still used for arm targets, but also benefit from the arm64 runner
architecture. Overall build time will be reduced by 25 to 30 minutes by
this.
Signed-off-by: Nicolai Buchwitz <nb@tipi-net.de>
Input tensor injection is a debug feature that allows a user-controlled
input to be passed directly to IMX500's inference engine (bypassing the
in-built ISP).
Three new custom controls are added to ENABLE_INJECTION before streaming
begins, to provide appropriate input tensors via an INPUT_TENSOR_FD, and
to provide notification of DNN results in the sensor output via
INJECTION_CMP_FRM.
Signed-off-by: Richard Oliver <richard.oliver@raspberrypi.com>
The driver previously set the global interrupt mask in the ECR register
in bcm54xx_config_init(), disabling all interrupts. This conflicts with
the configuration in bcm_phy_config_intr(), which enables or disables the
global interrupt mask as needed and is called earlier. As a result,
interrupts may remain globally disabled even when the IMR is configured
to unmask specific events.
Remove the ECR handling from bcm54xx_config_init() so that interrupt
enable/disable is managed exclusively by bcm_phy_config_intr().
Signed-off-by: Nicolai Buchwitz <nb@tipi-net.de>
On CM4/CM5, LED3 is used for ETH_LEDY, while LED4 may be unused or serve
as INT_N. Previously, both LEDs 3 and 4 were mirrored from LED1, which
overwrote the INT_N configuration on CM5.
Fix this by only shadowing LED1 to LED3, preserving the setting for
LED4/INT.
Fixes: 9704fab964 ("net: phy: broadcom: Allow ethernet LED mode to be set via device tree")
Signed-off-by: Nicolai Buchwitz <nb@tipi-net.de>
On CM5, the active-low interrupt pin (INT_N) of the Ethernet PHY is
connected to GPIO37. However, an internal pull-up resistor appears to
be missing, which causes the interrupt edge to be missed or not detected
reliably. Fix this by configuring a bias pull-up on the gpio controller.
Signed-off-by: Nicolai Buchwitz <nb@tipi-net.de>
The phy specific structure is missing the pointers for handling
interrupts and link change notification. This results in interrupt's
being polled on CM5:
Fix this and copy the existing pointers from BCM54210E, which match the
implementation.
Before:
[3.501498] macb 1f00100000.ethernet eth0: PHY [1f00100000.ethernet-ffffffff:00] driver [Broadcom BCM54213PE] (irq=POLL)
After:
[3.597582] macb 1f00100000.ethernet eth0: PHY [1f00100000.ethernet-ffffffff:00] driver [Broadcom BCM54213PE] (irq=168)
Signed-off-by: Nicolai Buchwitz <nb@tipi-net.de>
Added 2 overlays for the rpi-power HAT to operate in
either TOP or BOTTOM mode.
Modified makefile and readme accordingly
Signed-off-by: Lucas Hoffmann <lucas.hoffmann@raspberrypi.com>
Increase the timeout for the toolchain installation in the
dtoverlaycheck workflow, to match that for the kernel.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
Both drivers/gpu/drm/solomon/ssd130x-i2c.c and
drivers/video/fbdev/ssd1307fb.c were registering the compatible
"solomon,ssd1306fb-i2c", so bringing ambiguity as to which one
got loaded.
fbdev is largely deprecated, so adopt the updated compatible
for the drm driver.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
They're supported by the standard BMP280 driver, so only
needed the overlay configuration.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
Device links are used to keep track of suppliers and consumers of
resources, adding some control over the ordering of device probes other
than returning -EPROBE_DEFER. The way the RP1 device is created breaks
this mechanism in the rare case that the use of modules has been
completely disabled, thanks to some opimisations within the device link
code.
Fix this glitch by giving the corresponding fwnode a pointer to the
device, taking the opportunity to remove a pointless check on the
validity of the rp1_node pointer.
See: https://github.com/raspberrypi/linux/issues/7018
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
max-speed is a generic property for ethernet PHYs, so is
supported by the PHY on Pi5, Pi500, and CM5.
Add the override and update the documentation accordingly.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
max-speed is a generic property for ethernet PHYs, so should be supported
by the PHY on CM4/Pi4/Pi400.
Add the override and update the documentation accordingly.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
Prevents fullscreen logos from being drawn multiple times.
With small enough logos, the image would be drawn multiple times across the screen.
Signed-off-by: Ben Benson <ben.benson@raspberrypi.com>
To work around the 30fps buffer-flip rate limit when using VEC's
"native" interlaced modes, switch to sending individual fields
to the VEC BE, using an ISR to flip between fields.
When the TV mode is NTSC, change advertised progressive modes to
have 263 total lines; this ameliorates colour artifacts, although
it reduces the frame rate slightly from 60.05Hz to 59.83Hz.
Progressive modes with 262 lines remain supported.
Fix an error in equalising pulse configuration for PAL-M/PAL60.
Signed-off-by: Nick Hollinghurst <nick.hollinghurst@raspberrypi.com>
Reimplement the vc4-kms-dsi-ili9881-5inch display overlay by applying a
few changes to the vc4-kms-dsi-ili9881-7inch version. In doing so, it
inherits the rotation parameter that was previously absent, which then
needs documenting.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
Enable by adding the following to cmdline.txt:
`fullscreen_logo_name=logo.tga fullscreen_logo=1`
Will show the logo file present in /lib/firmware/ on the screen.
This will be fullscreen and rendered early at boot.
Any remaining space is filled with solid color from the image border.
If TGA file is too big, image is clipped accordingly.
Signed-off-by: Ben Benson <ben.benson@raspberrypi.com>
The patch "dmaengine: dw-axi-dmac: add per-channel AXI burst length
support" programs ARLEN/AWLEN from the snps,axi-max-burst-len array but
still exposed a single max_burst value via dma_get_slave_caps(). As a
result all channels reported 8 even when limited to 4, leading to
warnings:
dma dma2chan5: requested source burst length 8 exceeds supported 4
Add a .device_caps callback to return the correct per-channel max_burst.
This allows drivers like amba-pl011 to clamp burst lengths properly.
Fixes: 0e4e6a0c4f4e ("dmaengine: dw-axi-dmac: add per-channel AXI burst length support")
Signed-off-by: Nicolai Buchwitz <nb@tipi-net.de>
PIO benefits from increased DMA bandwidth when used with DMA channels
0 or 1, because they support longer bursts. Add DMA channel selection
attributes to prevent other users from claiming them.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
Give the DMAC property "snps,axi-max-burst-len" a value for each DMA
channel, encoding the fact that channels 1 and 2 are more capable
("heavy").
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
Add a mechanism to allow clients to prefer some DMA channels over
others. This is required to allow high-bandwidth clients to request
one of the two "heavy" channels, but could also be used to prevent
some clients from hogging all channels.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
The DesignWare AXI DMAC IP can be configured with heterogeneous channel
parameters. Allow maximum burst length to be set per-channel by making
snps,axi-max-burst-len an array.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
If the DMA channel allocation fails, the relevant dma_configs entry
should be marked as no longer claimed, otherwise rp1_pio_sm_dma_free
will be called with an error number as a DMA channel pointer.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
As we do a shallow clone of the repo, Fixes: tags
generally don't have the matching commit available
to lookup, and checkpatch logs it.
Ignore this error.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
While we continue to use the downstream RP1 driver, update some other
Kconfig settings to recognise MFD_RP1 as a valid RP1 driver.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
Drop from RGB to YUV422 output if RGB couldn't be supported
within the defined max_bpc and TMDS rates, and then try
dropping max_bpc.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
Apparently some NVMe SSD implementations don't fall back to MSI cleanly,
instead making the driver allocate one queue via the legacy interrupt.
There are still only 32 vectors available, but should be sufficient for
the majority of use-cases on BCM2711.
Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
Disabling tx_lpi or eee should not cause the value of tx_lpi_timer to
be lost, even though it is not useful until they are re-enabled.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
Add two new DT properties:
* microchip,eee-enabled - a boolean to enable EEE
* microchip,tx-lpi-timer - time in microseconds to wait before entering
low power state
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
For applications of the LAN78xx that don't have valid programmed
EEPROMs or OTPs, enabling both LEDs and auto-negotiation by default
seems reasonable.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
lan78xx_link_reset explicitly clears the MAC's view of the PHY's IRQ
status. In doing so it potentially leaves the PHY with a pending
interrupt that will never be acknowledged, at which point no further
interrupts will be generated.
Avoid the problem by acknowledging any pending PHY interrupt after
clearing the MAC's status bit.
See: https://github.com/raspberrypi/linux/issues/2937
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
All other architectures do different cache operations depending on the
dir parameter. Fix arm64 to do the same.
This fixes udmabuf operations when syncing for read e.g. when the CPU
reads back a V4L2 decoded frame buffer.
Signed-off-by: John Cox <jc@kynesim.co.uk>
Add a new minimal alignment field to the format structure. This minimal
alignment will be used if a stride has been provided by userland. If no
stride has been provided by userland (bytesperline == 0), the optimal
alignemnt will be used in the stride calculation.
Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
Rename the align field in the format structure to opt_align to indicate
the optimal alignment for the format.
There is no functional change in this commit.
Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
Each V3D queue works independently and all the dependencies between the
jobs are handled through the DRM scheduler. Therefore, there is no need
to use one single lock for all queues. Using it, creates unnecessary
contention between different queues that can operate independently.
Replace the global spinlock with per-queue locks to improve parallelism
and reduce contention between different V3D queues (BIN, RENDER, TFU,
CSD). This allows independent queues to operate concurrently while
maintaining proper synchronization within each queue.
Signed-off-by: Maíra Canal <mcanal@igalia.com>
Instead of storing the queue's active job in four different variables,
store the active job inside the queue's state. This way, it's possible
to access all active jobs using an index based in `enum v3d_queue`.
Signed-off-by: Maíra Canal <mcanal@igalia.com>
Instead of storing a pointer to the DRM file data, store a pointer
directly to the private V3D file struct.
Signed-off-by: Maíra Canal <mcanal@igalia.com>
There are now formats defined for 2-plane YUV420 at 10, 12,
and 16 bit depth using the most significant bits of the 16bit
word (P010, P012, and P016), and 3-plane YUV420 at those
depths using the least significant bits of the 16 bit word
(S010, S012, and S016).
VC4_GEN_6 can support all those formats although only composing
using at most 10bits of resolution, so add them as supported
formats for all planes.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
The function vc4_mock_atomic_add_output() should return a pointer, even
during error treatment. Use the proper macros to create pointers from
the error code.
Signed-off-by: Maíra Canal <mcanal@igalia.com>
The version of the dwc-otg core used in BCM2835 through BCM2712 only does
whole-word writes, as well as needing the documented requirement for DMA
buffers to start on a word boundary.
Also, the alignment method used in the dwc2 driver doesn't handle the
case where the URB has the NO_TRANSFER_DMA_MAP flag set, so reject
buffers that have unaligned DMA start addresses. At least one whole page
should be mapped, so the BCM283x whole-word-write bug should be benign
in this case.
Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
During the probe() routine, the PiSP BE driver needs to power up the
interface in order to identify and initialize the hardware.
The driver resumes the interface by calling the
pispbe_runtime_resume() function directly, without going
through the pm_runtime helpers, but later suspends it by calling
pm_runtime_put_autosuspend().
This causes a PM usage count imbalance at probe time, notified by the
runtime_pm framework with the below message in the system log:
pispbe 1000880000.pisp_be: Runtime PM usage count underflow!
Fix this by resuming the interface using the pm runtime helpers instead
of calling the resume function directly and use the pm_runtime framework
in the probe() error path. While at it, remove manual suspend of the
interface in the remove() function. The driver cannot be unloaded if in
use, so simply disable runtime pm.
To simplify the implementation, make the driver depend on PM as the
RPI5 platform where the ISP is integrated in uses the PM framework by
default.
Fixes: 12187bd5d4 ("media: raspberrypi: Add support for PiSP BE")
Cc: stable@vger.kernel.org
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Jacopo Mondi <jacopo.mondi@ideasonboard.com>
The driver uses pci_msi methods, only defined when CONFIG_PCI_MSI symbol
is set, and cannot be compiled without. Therefore, it depends on this
symbol.
Signed-off-by: Jorge Marques <jorge.marques@analog.com>
Document the gpio-gate-clock-releasing compatible string that enables
acquire/release GPIO semantics on gpio-gated clocks.
Signed-off-by: Richard Oliver <richard.oliver@raspberrypi.com>
Add support for the 'gpio-gate-clock-releasing' compatible string. The
behaviour is identical to that of 'gpio-gate-clock' but the gpio is
acquired on 'enable' and released on 'disable'.
Signed-off-by: Richard Oliver <richard.oliver@raspberrypi.com>
drm_helper_probe_add_cmdline_mode was looking for a match for
the width, height, and refresh rate within the EDID modes, but
didn't check the interlacing flag. That meant that with
video=1920x1080@50i would match any 1920x1080@50 mode that was
found.
The converse would be possible too if an interlaced mode with
matching resolution & refresh rate was found first.
Check the interlacing flag as well.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
The brcmfmac driver uses the SDIO vendor ID values to identify which
vendor's driver extensions to use. However, the Cypress/Infineon devices
used by Rasperry Pi devices have a vendor ID of 02d0, which is Broadcom.
In order to use the Cypress driver extensions, modify the static mapping
for "43430" and "4345" (sic) to indicate that they are Cypress parts.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
dwc2_hc_start_transfer() overwrites hc->xfer_len for split-IN transfers.
Drivers may not allocate buffers that are multiples of the endpoint max
packet size, which may cause buffer overruns in the last transfer.
The hardware needs HCTSIZ to be set to a multiple of HCCHAR.MPS, so trim
chan->max_packet in dwc2_assign_and_init_hc().
Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
The HFNUM register increments on every microframe in HS mode, and USB
device drivers expect the returned frame count to relate to the overall
frame. Right-shift the returned value by 3 to drop the microframe bits.
Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
BCM2836 with Cortex-A7 cores has almost the same ARM_LOCAL interrupt
routing logic as BCM2837, so relax the compile guard to CONFIG_SMP not
CONFIG_ARM64.
Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
Interrupts are dispatched round-robin but doing so trampled FIQ routing.
Taking a FIQ on a core without a handler installed is fatal.
Only modify bits 1:0 which are the IRQ route bits.
Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
The 6.9 kernel introduced a large patchset [1] designed to make gpiochip
usage safe in the presence of potential hotplugging events. The changes
included protecting every gpiochip access with a claim of an interlock.
Running on a Pi 5 these changes reduce GPIO performance from userspace
by around 10%. The penalty would be proportionally higher from kernel,
as seen by SPI speed reductions.
Patch the gpiolib implementation to remove the protection of gpiochip
accesses. By providing alternative implementations of the relevant
macros, the changes are localised and therefore easier to verify.
See: https://github.com/raspberrypi/linux/issues/6854
[1] https://lwn.net/Articles/960024/
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
The Renesas uPD controller is a bit more picky about validating Configure
Endpoint TRBs and requires that bit 0 of the ADD field is 1.
This is mentioned in xhci v1.2 s4.6.6.
Also drop a redundant helper function and reject invalid endpoints.
Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
Copying our downstream patch for imx477 that allows configuration
of external synchronisation signals via DT, add the same to imx296.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
Distinguish between releasing the watchdog without requesting that it is
stopped, and failing to stop it when requested. The former is standard
behaviour for systemd, while the latter may be unexpected.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
A conditional had been left as == GEN_6_C, when it also applied
to GEN_6_D, resulting in an invalid change to the dlist on async
updates.
Fixes: b7b14b31c8 ("drm/vc4: plane: Add support for 2712 D-step.")
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
Support vertical flip by setting REG_VREVERSE.
Additional registers also needs to be set per mode, according
to the readout direction (normal/inverted) as mentioned in the
data sheet.
Since the register IMX335_REG_AREA3_ST_ADR_1 is based on the
flip (and is set via vflip related registers), it has been
moved out of the 2592x1944 mode regs.
Signed-off-by: Umang Jain <umang.jain@ideasonboard.com>
Reviewed-by: Tommaso Merciai <tomm.merciai@gmail.com>
Reviewed-by: Kieran Bingham <kieran.bingham@ideasonboard.com>
In commit 81495a59ba ("media: imx335: Fix active area height discrepency")
the height for the mode struct was rectified to '1944'. However, the
name of mode struct is still reflecting to '1940'. Update it.
Signed-off-by: Umang Jain <umang.jain@ideasonboard.com>
Reviewed-by: Tommaso Merciai <tomm.merciai@gmail.com>
Reviewed-by: Kieran Bingham <kieran.bingham@ideasonboard.com>
The existing driver claims AHT20 support in i2c_device_id, but fails to:
1. Use the correct init command (0xBE for AHT20 vs 0xE1 for AHT10)
2. Omit AHT10_MODE_CYC which AHT20 doesn't support/require
Add proper initialization sequence and include "aosong,aht20" in the
device tree match table to fully support the AHT20.
Signed-off-by: Josh Martinez <8892161+joshermar@users.noreply.github.com>
As a follow-up to commit ef79eea9e4 ("drm/vc4: plane: Enable scaler
for YUV444 on GEN6"), the image looks a little soft when rendering at
1:1 due to the scaling filter being enabled.
Switch to using the nearest neighbour filter automatically when
not scaling in YUV444 to compensate.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
GEN6 requires the luma scaler to be enabled for YUV444 to
be rendered at 1:1, otherwise the source plane isn't rendered.
Fixes: 076eedaf76 ("drm/vc4: hvs: Add support for BCM2712 HVS")
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
YUV444 support isn't officially supported by the hardware, but
worked if you told it the image was YUV422 with double the width
and altered chroma scaling.
Adding BCM2712 support gained a fetcher memory (UPM). The code
handling UPM allocations didn't have a case for YUV444, so only
allocated based on the base width, and therefore underflowed.
Increase the UPM allocation size for the luma plane of YUV444
to match.
Fixes: 076eedaf76 ("drm/vc4: hvs: Add support for BCM2712 HVS")
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
Commit 0af46fbc33 ("media: i2c: imx219: Calculate crop rectangle
dynamically") meant that the 1920x1080 switched from using no binning
to using vertical binning but no horizontal binning.
Restore the original behaviour by ensuring the two binning settings
are the same.
Fixes: 0af46fbc33 ("media: i2c: imx219: Calculate crop rectangle dynamically")
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
By default, the System Exit Latency and Maximum Exit Latency are used to
calculate hub port U1 and U2 timeout values. This has the effect of
aggressively power-managing a SuperSpeed link but devices are known to
report unfeasibly short device exit latencies in their descriptors,
which under certain usage conditions can significantly degrade
throughput as the link spends longer retraining than being in a useable
state.
The Intel heuristic approach calculates a reasonably large
endpoint-dependent U1 timeout, and uses a minimum U2 timeout that is
several multiples of typical U2 exit latencies.
Add a module parameter that defaults to using this scheme.
This should have the effect of squelching interop edge-cases where LPM
noticeably degrades performance, and avoid the usual workaround where
userspace manually disables it.
Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
Certain versions of the VL805 firmware manipulate the endpoint Link
Control register to toggle ASPM on/off based on workload, but these
versions also report 0 in the Device Capability Acceptable Latency field
leaving the RC with ASPM disabled.
As it turns out, this EP has a broken L0s implementation so a) override
L1 latency to a sensible value and b) mask L0s.
Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
The existing implementation for clkreq-mode="safe" leaves the HARD_DEBUG
with both control bits clear. This can cause link failure if L1
sub-states are enabled and if either of these conditions occurrs:
- The platform does not connect the CLRKEQ# signal to the EP, and a
pull-up is present on the line
- The platform connects the signal to the EP, and the EP enters an L1.x
or ClkPM state
Additional register bits in the HARD_DEBUG register can be used to force
the RC to drive CLKREQ# low. Also, un-advertise L1ss as a) additional
power savings can't be realised and b) enabling L1ss may incur
additional wake latency from L1.0.
Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
The number of words to fetch for SAND formats on vc6 needs to account
for all pixels requested by width.
If cropping fractional pixels, then the width was being increased, but
fetch_count had already been computed. That led to insufficient words
being fetched, and the HVS locked up solid.
Apply the fixup for fractional pixel source cropping before computing
fetch_count.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
BCM2712/vc6 uses 256bit words when reading in P030/SAND128,
increased from 128bit on BCM2711/vc5.
Update the code for cropping the read area to handle the correct
word length.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
Add two features that assist in diagnosing link instability issues.
The debugfs additions allow for snapshots of the Physical Layer
statistics registers to be taken, during either free-running capture or
after a hardware-controlled capture interval.
To arm the capture engine (and reset the stats counters), write an
integer N to:
/sys/kernel/debug/pcie@<addr>/stats_trigger
The engine will run forever with a value of 0, or disarm after N
microseconds.
To snapshot the hardware stats counters, write to:
/sys/kernel/debug/pcie@<addr>/stats_snapshot
Reading this file will return the snapshot. If no writes have occurred
since boot, the snapshot will be of the initial link training period.
The ltssm_trace module parameter printk's the states during initial link
startup, in situations where failure to establish the link is a fatal
error.
Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
Remove a bogus memory alignment check - transfers will be run bytewise
if needed - and add a check that the overall length is multiple of the
register size, otherwise there is residue.
See: https://github.com/raspberrypi/linux/issues/6733
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
Commit 69dbba71ac ("drm/vc4: Add algorithmic handling for SAND")
lost a multiplication by the tile width when doing the pointer arithmetic
for cropping off columns for vc6.
Correct that computation.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
The Ilitek ILI9806E driver is used in the Pimoroni HyperPixel4
and potentially other displays. Whilst it can support multiple
interfaces, this driver only accounts for SPI configuration and
DPI video data.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
This property can be used to delay deassertion of external fundamental
reset, which may be useful for endpoints that require an extended time for
internal setup to complete.
Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
Some endpoints need longer than the minimum Tperst_clk time of 100us
that the PCIe specification allows for, as they may need to sequence
internal resets off the stable output of internal PLLs prior to removal
of fundamental reset. PCIe switches are an especially bad case, in some
cases requiring up to 100 milliseconds for stable downstream link
behaviour.
Parse the DT property brcm,tperst-clk-ms and use this to hold PERST# low
during brcm_pcie_start_link().
The BRCM RC typically outputs 200us of stable refclk before deasserting
PERST#. By masking/forcing the output signal while deasserting the
internal reset, the effect is to extend the length of time that the
refclk is active and stable before PERST# is released.
The TX lanes will enter the Polling state before PERST# is released, but
this appears to be harmless.
Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
Some platforms may require an extended time with refclk active before
PERST# is released. Add a property to let the RC driver know how long to
wait.
Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
The BCM2712 root complexes can interpret priority signalling in two
different ways, based on the incoming Traffic Class of a TLP.
The TLP TCs are assigned to separate internal request/response queues,
and assigned different AXI IDs. These queues can have outgoing AXI
transactions tagged based on:
- Static QoS values
- Dynamic QoS through internal backpressure
- Dynamic QoS with elevation based on Vendor Messages received by the RC
The VDM mechanism is of limited use due to implementation bugs, but the
implicit reordering due to separate ID assignment allows higher-priority
traffic from an EP to overtake other traffic in the RC and rest of the
system.
RP1 assigns TCs based on its internal bus managers, and internally tags
read requests to allow out-of-order completions, so these two features
operate in concert to provide priority service to e.g. MIPI camera or
display traffic.
Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
There is configurable priority forwarding hardware in this variant of the
Root Complex controller. Add optional properties to configure FIFO
backpressure or Vendor-Defined Message priority forwarding.
Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
The PHY MDIO register map is different on BCM2712, and as the PHY input
clock is 54MHz not 100MHz, enabling refclk SSC is both broken and
unfixable.
Mask out attempts to enable SSC with a controller quirk.
Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
It appears that bits in the Root Control Register are reset with
perst_n, which means the PCI layer's call to enable CRS prior to
adding/scanning the bus has no effect. Open-code the enable in
brcm_pcie_start_link as a workaround.
Without CRS visibility, configuration reads issued by the CPU don't
retire if the endpoint returns a CRS response - the RC will poll until a
(large) timeout is reached. This means the core can stall for a long
time during boot.
Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
These chips use a UBUS-AXI bridge component that has configurable
timeout and error response handling.
Suppress AXI error responses to CPU requests, otherwise these are fatal
if they reach the ARM cluster, and set reasonably large timeouts for
both Mem and Cfg requests.
Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
Pitch has no meaning if the modifier isn't DRM_FORMAT_MOD_LINEAR
as there is no guarantee that the value passed follows the
pattern that pitch * height = size.
Remove that check from framebuffer_check.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
The SAND modifier with height 0 is now using the provided pitch as
the column stride, but the UBM allocation needs to be done based
on the plane width.
Recompute the width in these conditions.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
The SAND handling had been using what was believed to be a runtime
parameter in the modifier, however that has been clarified that
all permitted variants of the modifier must be advertised, so
making it variable wasn't practical.
With a rationalisation of how the producers of this format are
configured, we can switch to a variant that doesn't have as much
variation, and can be configured such that only 2 options are
required.
Add a modifier with value 0 to denote that the height of the luma
column matches the buffer height, and chroma column will be half
that due to YUV420.
A modifier of 1 denotes that the height of the luma column still
matches the buffer height, but the chroma column height is the same.
This can be used to replicate the previous behaviour.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
vc4's debugfs support was updated with drm_debugfs_entry whilst
BCM2712 support was in progress, and missed that the lookup
in vc6_hvs_debugfs_dlist still followed the old pattern.
Correct that lookup to avoid an invalid dereference.
Fixes: f7af8ae9d3 ("drm/vc4: hvs: Add support for BCM2712 HVS")
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
Supporting GL rendering of the new HEVC decoder pixel formats requires
Mesa 24.2.5 or later. There are a couple of minor issues holding up
switching to Mesa 24.
Drop the new pixel formats from enum_fmt so that FFMpeg will use
the older ones that earlier versions of Mesa do support.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
The BCM2711 and BCM2712 SoCs used on Rapsberry Pi 4 and Raspberry
Pi 5 boards include an HEVC decoder block. Add a driver for it.
Signed-off-by: John Cox <john.cox@raspberrypi.com>
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
media: hevc_dec: Add in downstream single planar SAND variant
Upstream will take the multi-planar SAND format, but add back
in the downstream single planar variant for backwards compatibility
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
media: hevc_dec: Add module parameter for video_nr
To avoid user complaints that /dev/video0 isn't their USB
webcam, add downstream patch that allows setting the preferred
video device number.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
Adds a binding for the HEVC decoder found on the BCM2711 / Raspberry Pi 4,
and BCM2712 / Raspberry Pi 5.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
Add V4L2_PIXFMT_NV12MT_COL128 and V4L2_PIXFMT_NV12MT_10_COL128
to describe the Raspberry Pi HEVC decoder NV12 multiplanar formats.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
The Raspberry Pi HEVC decoder uses a tiled format based on
columns for 8 and 10 bit YUV images, so document them as
NV12MT_COL128 and NV12MT_10_COL128.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
Keep track of the number of requests and request objects of a media
device. Helps to verify that all request-related memory is freed.
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
By default when the last request object is completed, the whole
request completes as well.
But sometimes you want to manually complete a request in a driver,
so add a manual complete mode for this.
In req_queue the driver marks the request for manual completion by
calling media_request_mark_manual_completion, and when the driver
wants to manually complete the request it calls
media_request_manual_complete().
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
commit 59ac702a93 ("drm/vc4: Get the rid of DRM_ERROR()") converted
all calls to DRM_ERROR into drm_err, but also converted one DRM_DEBUG
into a drm_err.
Switch it back to drm_dbg.
Fixes: 59ac702a93 ("drm/vc4: Get the rid of DRM_ERROR()")
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
If an HDMI connector has no EDID and the mode is set via the
kernel command line, then drm_reset_display_info() is the only
thing that will have set up any of connector->display_info.
With commit 26ff1c38fc ("drm/connector: hdmi: Compute bpc
and format automatically"), it is now checked that
DRM_COLOR_FORMAT_RGB444 is supported. Whilst it doesn't fail
the request, it does log dev_warn for every commit, spamming
the log.
For HDMI connectors initialise the color_format field to say
it supports RGB444.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
It was noted that if PV1 was in use to drive DSI1, then the
writeback connector could not be used as HVS channel 2 was
already in use.
The HVS allows PV1 (HVS output 2) to be driven by any HVS
channel via the DSP3_MUX setting, but that was hardcoded to be
either 2 (for PV1) or disabled for TXP.
Expand the available channels field for PV1, and configure
DSP3_MUX accordingly.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
The commit titled "bcm2835-dma: Derive slave DMA addresses correctly"
(now squashed into DMA roll-up) moved the responsibility for calculating
DMA addresses to the DMA driver. Unfortunately it committed the sin of
using phys_to_dma directly rather than using the approved API, i.e.
dma_map_resource.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
A basic device-specific linear memory mapping was introduced back in
commit ("dma: Take into account dma_pfn_offset") as a single-valued offset
preserved in the device.dma_pfn_offset field, which was initialized for
instance by means of the "dma-ranges" DT property. Afterwards the
functionality was extended to support more than one device-specific region
defined in the device.dma_range_map list of maps. But all of these
improvements concerned a single pointer, page or sg DMA-mapping methods,
while the system resource mapping function turned to miss the
corresponding modification. Thus the dma_direct_map_resource() method now
just casts the CPU physical address to the device DMA address with no
dma-ranges-based mapping taking into account, which is obviously wrong.
Let's fix it by using the phys_to_dma_direct() method to get the
device-specific bus address from the passed memory resource for the case
of the directly mapped DMA.
Fixes: 25f1e18870 ("dma: Take into account dma_pfn_offset")
Signed-off-by: Serge Semin <Sergey.Semin@baikalelectronics.ru>
Slave addresses for DMA are meant to be supplied as physical addresses
(contrary to what struct snd_dmaengine_dai_dma_data does).
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
Add ALSA jack detection to the vc4-hdmi audio driver so userspace knows
when to add/remove HDMI audio devices.
Signed-off-by: David Turner <david.turner@raspberrypi.com>
bcm2835_dma_suspend_late is only used if CONFIG_PM_SLEEP is defined,
so make it's presence similarly conditional to avoid a build warning
(and hence error).
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
These fields should not be set by either the user or the kernel driver
so remove them. Replace them with padding bytes to maintain backward
compatibility with existing userland applications.
Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
The principal differences between the downstream SDHOST driver and the
version accepted upstream driver are that the upstream version loses the
overclock support and DMA configuration via DT, but gains some tidying
up (and maintenance by the upstream devs).
Add the missing features (with the exception of the low-overhead logging)
as a patch to the upstream driver.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
v4l2: Add pisp compression format support to v4l2
Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
media: rp1: cfe: Fix use of freed memory on errors
cfe_probe_complete() calls cfe_put() on both success and fail code paths.
This works for the success path, but causes the cfe_device struct to be
freed, even if it will be used later in the teardown code.
Fix this by making the ref handling a bit saner: Let the video nodes
have the refs as they do now, but also keep a ref in the "main" driver,
released only at cfe_remove() time. This way the driver does not depend
on the video nodes keeping the refs.
Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com>
media: rp1: cfe: Fix width & height in cfe_start_channel()
The logic for handling width & height in cfe_start_channel() is somewhat
odd and, afaics, broken. The code reads:
bool start_fe = is_fe_enabled(cfe) &&
test_all_nodes(cfe, NODE_ENABLED, NODE_STREAMING);
if (start_fe || is_image_output_node(node)) {
width = node->fmt.fmt.pix.width;
height = node->fmt.fmt.pix.height;
}
cfe_start_channel() is called for all video nodes that will be used. So
this means that if, say, fe_stats is enabled as the last node, start_fe
will be true, and width and height will be taken from fe_stats' node.
The width and height will thus contain garbage, which then gets
programmed to the csi2 registers.
It seems that this often still works fine, though, probably if the width
& height are large enough.
Drop the above code, and instead get the width & height from the csi2
subdev's sink pad for the csi2 channel that is used. For metadata the
width & height will be 0 as before.
Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com>
media: rp1: cfe: Fix verbose debug print
The debug print in cfe_schedule_next_csi2_job() is printed every frame,
and should thus use cfe_dbg_irq() to avoid spamming, rather than cfe_dbg().
Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com>
media: rp1: cfe: Rename xxx_dbg_irq() to xxx_dbg_verbose()
Rename the xxx_dbg_irq() macros to xxx_dbg_verbose(), as they can be
used to verbose debugs outside irq context too.
Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com>
media: rp1: cfe: Add verbose debug module parameter
Expose the verbose debug flag as a module parameter.
Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com>
media: rp1: cfe: Drop unused field
Drop 'sensor_embedded_data' field, as it is unused.
Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com>
media: rp1: cfe: Fix default meta format's field
Set default meta format's field to V4L2_FIELD_NONE, instead of zeroing
it which indicates V4L2_FIELD_ANY. Metadata doesn't have fields, so NONE
makes sense, and furthermore the default v4l2 link validation will check
for matching fields, or that the sink field is NONE.
Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com>
media: rp1: cfe: Fail streaming if FE_CONFIG node is not enabled
When the FE is enabled, ensure that the FE_CONFIG node is enabled.
Otherwise fail cfe_start_streaming() entirely.
Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
drivers: media: rp1_cfe: Remove PISP specific MBUS formats
Remove the MEDIA_BUS_FMT_PISP* format codcs entirely. For the image
pad formats, use the 16-bit Bayer format mbus codes instead. For the
config and stats pad formats, use MEDIA_BUS_FMT_FIXED.
Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
drivers: media: rp1_cfe: Fix link validate test for pixel format
Now that we have removed unique PISP media bus codes, the cfe format
table has multiple entries with the same media bus code for 16-bit
formats. The test in cfe_video_link_validate() did not account for this.
Fix it by testing the media bus code and the V4L2 pixelformat 4cc
together.
As a drive-by, ensure we have a valid CSI2 datatype id when programming
the hardware block.
Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
drivers: media: cfe: Set the CSI-2 link frequency correctly
Use the sensor provided link frequency to set the DPHY timing parameters
on stream_on. This replaces the hard-coded 999 MHz value currently being
used. As a fallback, revert to the original 999 Mhz link frequency.
As a drive-by, fix a 80-character line formatting error.
Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
drivers: media: cfe: Don't confuse MHz and Mbps
The driver was interchaning these units when talking about link rate.
Fix this to avoid confusion. Apart from the logging message change,
there is no function change in this commit.
Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
media: rp1: csi2: Fix missing reg writes
The driver has two places where it writes a register based on a
condition, and when that condition is false, the driver presumes that
the register has the reset value. This is not a good idea, so fix those
places to always write the register.
Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com>
media: rp1: fe: Use ~0, not -1, when working with unsigned values
Use ~0, not -1, when working with unsigned values (-1 is not unsigned).
Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com>
media: rp1: Add back reg write debug prints
Add back debug prints in csi2 and pisp_fe reg_write() functions, but use
the 'irq' variants to avoid spamming in normal situation.
Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com>
media: rp1: csi2: Track CSI-2 errors
Track the errors from the CSI-2 receiver: overflows and discards. These
are recorded in a table which can be read by the userspace via debugfs.
As tracking the errors may cause much more interrupt load, the tracking
needs to be enabled with a module parameter.
Note that the recording is not perfect: we only record the last
discarded DT for each discard type, instead of recording all of them.
This means that e.g. if the device is discarding two unmatched DTs, the
debugfs file only shows the last one recorded. Recording all of them
would need a more sophisticated recording system to avoid the need of a
very large table, or dynamic allocation.
Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com>
media: rp1: csi2: Set values for enum csi2_mode
Set hardcoded values for enum csi2_mode, as the values will be
programmed to HW registers.
Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com>
media: rp1: fe: Fix default mbus code
When pisp_fe_pad_set_fmt() is given an mbus code that CFE does not
support, it currently defaults to MEDIA_BUS_FMT_SBGGR10_1X10. This is
not correct, as FE does not support SBGGR10.
Set the default to MEDIA_BUS_FMT_SRGGB16_1X16 instead.
Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com>
drivers: media: cfe: Find the source pads on the sensor entity
The driver was assuming that pad 0 on the sensor entity was the
appropriate source pad, but this isn't necessarily the case.
With video-mux, it has the sink pads first, and then the source
pad as the last one.
Iterate through the sensor pads to find the relevant source pads.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
media: rp1: cfe: Expose find_format_by_pix()
Make find_format_by_pix() accessible to other files in the driver.
Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com>
media: rp1: cfe: Add cfe_find_16bit_code() and cfe_find_compressed_code()
Add helper functions which, given an mbus code, return the 16-bit
remapped mbus code or the compressed mbus code.
Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com>
media: rp1: csi2: Use get_frame_desc to get CSI-2 VC and DT
Use get_frame_desc pad op for asking the CSI-2 VC and DT from the source
device driver, instead of hardcoding to VC 0, and getting the DT from a
formats table. To keep backward compatibility with sources that do not
implement get_frame_desc, implement a fallback mechanism that always
uses VC 0, and gets the DT from the formats table, based on the CSI2's
sink pad's format.
Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com>
media: rp1: cfe: Add is_image_node()
The hardware supports streaming from memory (in addition to streaming
from the CSI-2 RX), but the driver does not support this at the moment.
There are multiple places in the driver which uses
is_image_output_node(), even if the "output" part is not relevant. Thus,
in a minor preparation for the possible support for streaming from
memory, and to make it more obvious that the pieces of code are not
about the "output", add is_image_node() which will return true for both
input and output video nodes.
While at it, reformat also the metadata related macros to fit inside 80
columns.
Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com>
media: rp1: cfe: Dual purpose video nodes
The RP1 CSI-2 DMA can capture both video and metadata just fine, but at
the moment the video nodes are only set to support either video or
metadata.
Make the changes to support both video and metadata. This mostly means
tracking both video format and metadata format separately for each video
node, and using vb2_queue_change_type() to change the vb2 queue type
when needed.
Briefly, this means that the user can get/set both video and meta
formats to a single video node. The vb2 queue buffer type will be
changed when the user calls REQBUFS or CREATE_BUFS ioctls. This buffer
type will be then used as the "mode" for the video node when the user
starts the streaming, and based on that either the video or the meta
format will be used.
A bunch of macros are added (node_supports_xxx()), which tell if a node
can support a particular mode, whereas the existing macros
(is_xxx_node()) will tell if the node is currently in a particular mode.
Note that the latter will only work correctly between the start of the
streaming and the end of the streaming, and thus should be only used in
those code paths.
However, as the userspace (libcamera) does not support dual purpose
video nodes, for the time being let's keep the second video node as
V4L2_CAP_META_CAPTURE only to keep the userspace working.
Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com>
media: rp1: Drop LE handling
The driver registers for line-end interrupts, but never uses them. This
just causes extra interrupt load, with more complexity in the driver.
Drop the LE handling. It can easily be added back if later needed.
Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com>
media: rp1: cfe: Improve link validation for metadata
Improve the link validation for metadata by:
- Allowing capture buffers that are larger than the incoming frame
(instead of requiring exact match).
- Instead of assuming that a metadata unit ("pixel") is 8 bits, use
find_format_by_code() to get the format and use the bit depth from
there. E.g. bit depth for RAW10 metadata will be 10 bits, when we
move to the upstream metadata formats.
Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com>
drivers: media: cfe: Add more robust ISR handlers
Update the ISR logic to be more robust to sensors in problematic states
where interrupts may start arriving overlapped and/or missing.
1) Test for cur_frame in the FE handler, and if present, dequeue it in
an error state so that it does not get orphaned.
2) Move the sequence counter and timestamp variables to the node
structures. This allows the ISR to track channels running ahead when
interrupts arrive unordered.
3) Add a test to ensure we don't have a spurios (but harmlesS) call to
the FE handler in some circumstances.
Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
media: rp1: cfe: Fix error paths in cfe_start_streaming
Noted that if we get "node link is not enabled", then we also
get the videobuf2 splat for the driver not cleaning up correctly
on a failed start_streaming, and indeed we weren't returning the
buffers.
Checking the other error paths, noted that the "FE enabled, but
FE_CONFIG node is not" path was not calling pm_runtime_put.
Fix both paths.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
drivers: media: cfe: Increase default size of embedded buffer
Increase the size of the default embedded buffer to 16k. This is done to
match what is advertised by the IMX219 driver and workaround a problem
where the embedded stream is not actually used. Without full streams API
support, the media pipeline validation will fail in these circumstances.
Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
media: rp1: cfe: Actually use the number of lanes configured
The driver was calling get_mbus_config to ask the sensor subdev
how many CSI2 data lanes it wished to use and with what other
properties, but then failed to pass that to the DPHY configuration.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
media: rp1: csi2: Fix csi2_pad_set_fmt()
The CSI-2 subdev's set_fmt currently allows setting the source and sink
pad formats quite freely. This is not right, as the CSI-2 block can only
do one of the following when processing the stream: 1) pass through as
is, 2) expand to 16-bits, 3) compress.
The csi2_pad_set_fmt() should take this into account, and only allow
changing the source side mbus code, compared to the sink side format.
Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com>
media: rp1: csi2: Use standard link_validate
The current csi2_link_validate() skips some important checks. Let's
rather use the standard v4l2_subdev_link_validate_default() as the
link_validate hook.
Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com>
media: rp1: csi2: Squash fixes
media: rp1: fe: Fix pisp_fe_pad_set_fmt()
pisp_fe_pad_set_fmt() allows setting the pad formats quite freely. This
is not correct, and the function should only allow formats as supported
by the hardware. Fix this by:
Allow no format changes for FE_CONFIG_PAD and FE_STATS_PAD. They should
always be the hardcoded initial ones.
Allow setting FE_STREAM_PAD freely (but the mbus code must be
supported), and propagate the format to the FE_OUTPUT0_PAD and
FE_OUTPUT1_PAD pads.
Allow changing the mbus code for FE_OUTPUT0_PAD and FE_OUTPUT1_PAD pads
only if the mbus code is the compressed version of the sink side code.
TODO: FE supports scaling and cropping. This should be represented here
too?
Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com>
media: rp1: fe: Use standard link_validate
The current pisp_fe_link_validate() skips some important checks. Let's
rather use the standard v4l2_subdev_link_validate_default() as the
link_validate hook.
Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com>
drivers: media: cfe: Add 16-bit and compressed mono format support
Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
media: rp1: cfe: Add missing remaps
8-bit bayer formats are missing remap definitions. Add them.
Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com>
media: rp1: cfe: Add missing compressed remaps
16-bit bayer formats are missing compressed remap definitions. Add them.
Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com>
drivers: media: pisp_be: pisp_fe: Update UAPI header licenses
Update the license tags on the pisp UAPI header files with the
"Linux-syscall-note" clause. Also replace the "GPL-2.0" tag with the
preferred "GPL-2.0-only" tag.
Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
media: rp1: cfe: Use the MIPI_CSI2_DT_xxx defines for csi_dt
Seeing as we now have the CSI2 data types defined, make use of
them instead of hardcoding the values.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
media: rp1: cfe: Add a csi_dt value for 16bit formats
Raw 16bit formats didn't have a csi_dt value defined, which
presumably would trip the WARN_ON(!fmt->csi_dt); in
cfe_start_channel.
The value is defined in CSI2 v2.0 as 0x2e, so set it accordingly.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
drivers: media: pisp_be: Add mono and 48-bit RGB pixel format support
Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
drivers: media: pisp_be: Update seqeuence numbers of the buffers
Add a framebuffer sequence counter and increment on every completed job.
This counter is then used to update the VB2 buffer sequence count before
calling vb2_buffer_done().
Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
drivers: media: cfe: Add remap entries for mono formats
The 8-bit and 16-bit mono formats were missing the appropriate remap
entries in the format table.
Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
media: rp1-cfe: Fix up link validation for CFE CFG input
After commit 5fd3e2412a ("media: v4l2-subdev: Support hybrid links
in v4l2_subdev_link_validate()") link_validate is called on V4L2
OUTPUT devices such as the CFE cfg buffers input.
The CFE link_validate function was assuming it was always the
sink of a link, which goes wrong on that port and does an invalid
dereference.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
media: rp1-cfe: Swap "raspberypi,rp1-cfe" compatible to downstream driver
Whilst we are wanting to maintain the downstream driver at the same time
as having the upstream merged, swap the "raspberypi,rp1-cfe" compatible
string across.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
media/platform/raspberypi/rp1_cfe: Candidate fix for #5821
To avoid lost frame start in a subsequent session, avoid setting
the number of lanes back to 1 or putting CSI-2 Host into reset.
It's not clear if this is a watertight fix -- what if the camera
itself produced a truncated or garbled packet, or continued to
send until the next start? -- but it does seem to fix the issue.
Signed-off-by: Nick Hollinghurst <nick.hollinghurst@raspberrypi.com>
drivers: media: rpi: cfe: Avoid unpack operation for 16-bit formats
The unpack operation is redundant for 16-bit sensor formats, don't set
the hardware to do it in these cases.
Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
drivers: media: cfe: Workaround for 16-bit mismatch in the hardware
Set the data type for 16-bit modes to 0 (wildcard) to workarond the
16-bit mismatch in hardware.
We also need to suppress the warning about DT == 0 on start stream in
these cases.
Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
Implement a tristate-style option for "supports-cqe". If the property is
absent or zero, disable CQ completely. For 1, enable CQ unconditionally
for eMMC cards, and known-good SD cards. For 2, enable for eMMC cards,
and all SD cards that are not known-bad.
The sdhci-brcmstb driver needs to know about the tristate as its probe
sequence would otherwise override a disable in mmc_of_parse().
Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
These cards have a known-good CQ implementation and are based on a
Longsys product. Add the MANFID for Longsys SD, and the particular CID
details for the Pi card.
Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
We have found that many SD cards in the field, even of the same make and
model, have latent bugs in their CQ implementation. Some product lines
have fewer bugs with newer manufacture dates, but this is not a
guarantee that a particular card is at a particular firmware revision
level.
Many of these bugs lead to card hangs or data corruption. Add a quirk to
mark a card as having a tested, working CQ implementation and ignore the
capability if absent.
Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
The sensor has Low Conversion Gain (HCG) and High Conversion Gain (HCG)
modes, with the supposedly the HCG mode having better noise performance
at high gains.
As this parameter changes the gain range of the sensor, it isn't
possible to make this an automatic property, and there is no
suitable V4L2 control to set it, so just add it as a module parameter.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
Copy the same validation logic as from the plane rotation property.
Fixes: 8fec3ff870 ("drm: Add a rotation parameter to connectors.")
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
When initialized from panel_bridge_attach(), connector should
allow interlaced modes rather than invariably rejecting them,
so that other components can validate them.
Signed-off-by: Nick Hollinghurst <nick.hollinghurst@raspberrypi.com>
The upstream addition of the kernel parameter cgroup_disable makes it
possible to configure cgroups at boot time. In theory, re-enabling a
disabled cgroup is simply a case of removing the relevant cgroup_disable
setting, but this is difficult if the setting comes from Device Tree.
Re-introduce cgroup_enable as a way around the problem.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
On Raspberry Pi 4 and earlier models the firmware provides
a low speed (up to 115200 baud) bit-bashed UART on arbitrary
GPIOs using the second VPU core.
The firmware driver is designed to support 19200 baud. Higher
rates up to 115200 seem to work but there may be more jitter.
This can be useful for debug or managing additional low
speed peripherals if the hardware PL011 and 8250 hardware
UARTs are already used for console / bluetooth.
The firmware driver requires a fixed core clock frequency
and also requires the VPU PWM audio driver to be disabled
(dtparam=audio=off)
Runtime configuration is handled via the vc-mailbox APIs
with the FIFO buffers being allocated in uncached VPU
addressable memory. The FIFO pointers are stored in spare
VideoCore multi-core sync registers in order to reduce the number
of uncached SDRAM accesses thereby reducing jitter.
Signed-off-by: Tim Gover <tim.gover@raspberrypi.com>
serial: rpi-fw-uart: Demote debug log messages
A dev_info call in rpi_fw_uart_configure causes kernel log output every
time one opens the UART. Demote it to dev_dbg.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
LBM is only relevant for each active dlist, so there is
no need to double-buffer the allocations.
Cache the allocations per plane so that we can ensure the
allocations are possible.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
Following the example of [1], move the state allocation out of the init
function to make it thread safe.
[1] commit 7e0351ae91 ("drm/vc4: tests: Stop allocating the state in
test init")
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
The tests on vc4 (BCM2835-7) were checking for DSI1 muxing being
to restricted channel 2, and therefore muxing with TXP was impossible.
As we no longer have that restriction, update the capabilities
defined for DSI1, move the tests that used to be impossible to the
valid list, and extend for additional combinations that are now
possible.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
The BCM2712 comes with a different LBM size computation than the
previous generations, so let's add the few examples provided as kunit
tests to make sure we always satisfy those requirements.
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
We'll start testing our planes code in situations where we will use more
than XRGB8888, so let's add a few common pixel formats.
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
We'll start to add some tests for the plane state logic, so let's create
a helper to add a plane to an existing atomic state.
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
Some tests will need to find a plane to run a test on for a given CRTC.
Let's create a small helper to do that.
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
The current mock planes were just using the regular drm_plane_state,
while the driver expect struct vc4_plane_state that subclasses
drm_plane_state.
Hook the proper implementations of reset, duplicate_state, destroy and
atomic_check to create vc4_plane_state.
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
The BCM2712 has a simpler pipeline than the BCM2711, and thus the muxing
requirements are different. Create some tests to make sure we get proper
muxing decisions.
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
The BCM2712 has a simpler pipeline that can only output to a writeback
connector and two HDMI controllers.
Let's allow our kunit tests to create a mock of that pipeline.
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
Some tests will need to retrieve the output that was just allocated by
vc4_mock_atomic_add_output().
Instead of making them look them up in the DRM device, we can simply
make vc4_mock_atomic_add_output() return an error pointer that holds the
allocated output instead of the error code.
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
The DRM device pointer and the DRM encoder pointer are redundant, since
the latter is attached to the former and we can just follow the
drm_encoder->dev pointer.
Let's remove the drm_device pointer argument.
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
Testing whether the VideoCore generation we want to mock is vc5 or vc4
worked so far, but will be difficult to extend to support BCM2712 (VC6).
Convert to a switch.
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
This is a squash of all firmware-kms related patches from previous
branches, up to and including
"drm/vc4: Set the possible crtcs mask correctly for planes with FKMS"
plus a couple of minor fixups for the 5.9 branch.
Please refer to earlier branches for full history.
This patch includes work by Eric Anholt, James Hughes, Phil Elwell,
Dave Stevenson, Dom Cobley, and Jonathon Bell.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
drm/vc4: Fixup firmware-kms after "drm/atomic: Pass the full state to CRTC atomic enable/disable"
Prototype for those calls changed, so amend fkms (which isn't
upstream) to match.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
drm/vc4: Fixup fkms for API change
Atomic flush and check changed API, so fix up the downstream-only
FKMS driver.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
drm/vc4: Make normalize_zpos conditional on using fkms
Eric's view was that there was no point in having zpos
support on vc4 as all the planes had the same functionality.
Can be later squashed into (and fixes):
drm/vc4: Add firmware-kms mode
Signed-off-by: Dom Cobley <popcornmix@gmail.com>
drm/vc4: FKMS: Change of Broadcast RGB mode needs a mode change
The Broadcast RGB (aka HDMI limited/full range) property is only
notified to the firmware on mode change, so this needs to be
signalled when set.
https://github.com/raspberrypi/firmware/issues/1580
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
vc4/drv: Only notify firmware of display done with kms
fkms driver still wants firmware display to be active
Signed-off-by: Dom Cobley <popcornmix@gmail.com>
ydrm/vc4: fkms: Fix margin calculations for the right/bottom edges
The calculations clipped the right/bottom edge of the clipped
range based on the left/top margins.
https://github.com/raspberrypi/linux/issues/4447
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
drm/vc4: fkms: Use new devm_rpi_firmware_get api
drm/kms: Add allow_fb_modifiers
Signed-off-by: Dom Cobley <popcornmix@gmail.com>
drm/vc4: Add async update support for cursor planes
Now that cursors are implemented as regular planes, all cursor
movements result in atomic updates. As the firmware-kms driver
doesn't support asynchronous updates, these are synchronous, which
limits the update rate to the screen refresh rate. Xorg seems unaware
of this (or at least of the effect of this), because if the mouse is
configured with a higher update rate than the screen then continuous
mouse movement results in an increasing backlog of mouse events -
cue extreme lag.
Add minimal support for asynchronous updates - limited to cursor
planes - to eliminate the lag.
See: https://github.com/raspberrypi/linux/pull/4971https://github.com/raspberrypi/linux/issues/4988
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
drivers/gpu/drm/vc4: Add missing 32-bit RGB formats
The missing 32-bit per pixel ABGR and various "RGB with an X value"
formats are added. Change sent by Dave Stevenson.
Signed-off-by: David Plowman <david.plowman@raspberrypi.com>
drm: vc4: Fixup duplicated macro definition in vc4_firmware_kms
Both vc4_drv.h and vc4_firmware_kms.c had definitions for
to_vc4_crtc.
Rename the fkms one to make it unique, and drop the magic
define vc4_crtc vc4_kms_crtc
define to_vc4_crtc to_vc4_kms_crtc
that renamed half the variable and function names in a slightly
unexpected way.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
drm/vc4: Fix FKMS for when the YUV chroma planes are different buffers
The code was assuming that it was a single buffer with offsets,
when kmstest uses separate buffers and 0 offsets for each plane.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
drm/vc4: fkms: Rename plane related functions
The name collide with the Full KMS functions that are going to be made
public.
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
drm/vc4_fkms: Fix up interrupt handler for both 2835/2711 and 2712
2712 has switched from using the SMI peripheral to another interrupt
source for the vsync interrupt, so handle both sources cleanly.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
drm/vc4: fkms: No SMI abuse needed on BCM2712
Since we don't use the (absent) SMI block to create interrupts on
BCM2712, there's no need to map any registers.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
Currently, booting with no hdmi connected has:
pi@pi4:~ $ vcgencmd measure_clock hdmi pixel
frequency(9)=120010256
frequency(29)=74988280
After connecting hdmi we get:
pi@pi4:~ $ vcgencmd measure_clock hdmi pixel
frequency(9)=300005856
frequency(29)=149989744
and that persists after disconnecting hdmi
I can measure this on a power supply as 10mA@5.2V (52mW).
We should always remove clk_set_min_rate requests
when we no longer need them.
Signed-off-by: Dom Cobley <popcornmix@gmail.com>
The txp block can implement transpose as it writes out the image
data, so expose that through the new connector rotation property.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
drm: vc4: txp: Do not allow 24bpp formats when transposing
The hardware doesn't support transposing to 24bpp (RGB888/BGR888)
formats. There's no way to advertise this through DRM, so block
it from atomic_check instead.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
As the writeback connector doesn't have the same realtime
constraints of a live display, drop the panic priority for it.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
The transposer/writeback connector should be running with a
lower priority, so shouldn't be factored into the load
calculations.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
Instead of having 48 generic overlay planes, assign 32 to the
writeback connector so that there is no ambiguity in wlroots
when trying to find a plane for composition using the writeback
connector vs display.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
The HVS can accept an arbitrary number of planes, provided
that the overall pixel read load is within limits, and
the display list can fit into the dlist memory.
Now that DRM will support 64 planes per device, increase
the number of overlay planes from 16 to 48 so that the
dlist complexity can be increased (eg 4x4 video wall on
each of 3 displays).
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
The register to enable/disable background fill was being set
from atomic flush, however that will be applied immediately and
can be a while before the vblank. If it was required for the
current frame but not for the next one, that can result in
corruption for part of the current frame.
Store the state in vc4_hvs, and update it on vblank.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
The documentation says that the TPZ filter can not upscale,
and requesting a scaling factor > 1:1 will output the original
image in the top left, and repeat the right/bottom most pixels
thereafter.
That fits perfectly with upscaling a 1x1 image which is done
a fair amount by some compositors to give solid colour, and it
saves a large amount of LBM (TPZ is based on src size, whilst
PPF is based on dest size).
Select TPZ filter for images with source rectangle <=1.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
Seeing as the HVS can be configured with regard the scaling filter,
and DRM now supports selecting scaling filters at a per CRTC or
per plane level, we can implement it.
Default remains as the Mitchell/Netravali filter, but nearest
neighbour is now also implemented.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
There are no MEDIA_BUS_FMT_* defines for GRB or BRG, and adding
them is a pain.
Add a DT override to allow setting the order.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
The current reset code doesn't actually stop the hdmi output.
That makes it difficult for displays to handle a mode set.
Powering down the PLL does actually remove the hdmi signal
and makes mode sets more reliable
Signed-off-by: Dom Cobley <popcornmix@gmail.com>
There appears to be a requirement for some devices
(I'm testing with a 8K VRROOM 40Gbps HDMI switch)
for a measable delay between removing the hdmi phy output from
the old mode, to enabling the hdmi phy output for the new mode.
Without the delay, a mode switch has a small change of getting a permanent
'no signal', which requires a subsequent mode switch or a unplug/replug
to redetect.
Switching between 4kp24/25/30 modes fails about 5% of time in my testing.
Add a delay to make it impossible to switch faster than this.
Signed-off-by: Dom Cobley <popcornmix@gmail.com>
The intention of the vc4.force_hotplug setting is to
ignore hotplug completely.
It can be used when a display toggles hotplug when
switching AV inputs, going into standby or changing a
KVM switch, and some side effect of that is unwanted.
It turns out while vc4.force_hotplug currently makes
hotplug always read as asserted, that isn't enough to
stop drm doing lots of stuff, including re-reading
the edid.
An example of what drm does with a hotplug deasert/assert
and vc4.force_hotplug=1 currently is:
https://paste.debian.net/hidden/dc07434b/
That is unwanted. Lets ignore the hotplug interrupt
completely so drm is blissfully unaware of the hotplug change.
Signed-off-by: Dom Cobley <popcornmix@gmail.com>
See: https://forum.libreelec.tv/thread/24783-tv-avr-turns-back-on-right-after-turning-them-off
While the kernel provides a :D flag for assuming device is connected,
it doesn't stop this function from being called and generating a cec_phys_addr_invalidate
message when hotplug is deasserted.
That message provokes a flurry of CEC messages which for many users results in the TV
switching back on again and it's very hard to get it to stay switched off.
It seems to only occur with an AVR and TV connected but has been observed across a
number of manufacturers.
The issue started with https://github.com/raspberrypi/linux/pull/4371
and this provides an optional way of getting back the old behaviour
Signed-off-by: Dom Cobley <popcornmix@gmail.com>
If you disable HDR metadata, then the hardware should stop
sending the infoframe, and that is implemented by the
clear_infoframe hook which wasn't implemented. Add it.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
With the command line parser now providing the information about
the tv mode, use that as the preferred choice for initialising the
default of the tv_mode property.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
Similar to the ch7006 and nouveau drivers, introduce a "tv_mode" module
parameter that allow setting the TV norm by specifying vc4.tv_norm= on
the kernel command line.
If that is not specified, try inferring one of the most popular norms
(PAL or NTSC) from the video mode specified on the command line. On
Raspberry Pis, this causes the most common cases of the sdtv_mode
setting in config.txt to be respected.
Signed-off-by: Mateusz Kwiatkowski <kfyatek+publicgit@gmail.com>
drm/vc4: Do not reset tv mode as this is already handled by framework
In vc4_vec_connector_reset, the tv mode is already reset to the
property default by drm_atomic_helper_connector_tv_reset, so there
is no need for a local fixup to potentially some other default.
Fixes: 96922af144 ("drm/vc4: Allow setting the TV norm via module parameter")
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
The DSI block appears to be able to come up stuck in a condition where
it leaves the lanes in HS mode or just jabbering. This stops LP
transfers from completing as there is no LP time available. This is
signalled via the LP1 contention error.
Enabling video briefly clears that condition, so if we detect the
error condition, enable video mode and then retry.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
Some DSI peripheral drivers wish to send commands in the
post_disable or panel unprepare callback. These are called
after the DSI host's disable call, but before the host's
post_disable if pre_enable_prev_first is set.
Don't reset the block until post_disable to allow these
commands to be sent.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
The TC358762 bridge and panel decodes the mode differently on
DSI0 to DSI1 for no obvious reason, and results in a shift off
the screen.
Whilst it would be possible to change the compatible used for
the panel, that then messes up Pi5.
As it appears to be restricted to vc4 DSI0, fix up the mode
in vc4_dsi.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
The pixel to byte FIFO appears to not always reset correctly,
which can lead to colour errors and/or horizontal shifts.
Reset on every vblank to work around the issue.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
The block must be enabled for the FIFO resets to be actioned,
so ensure this is the case.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
The initialisation sequence differs slightly from the documentation
in that the clocks are meant to be running before resets and
similar.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
DSI0 is misbehaving and needs to action things on vblank to
work around it.
Add a new hook to call across during vblank.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
The dmabuf import already checks that the backing buffer is contiguous
and rejects it if it isn't. vc4 also requires that the buffer is
in the bottom 1GB of RAM, and this is all correctly defined via
dma-ranges.
However the kernel silently uses swiotlb to bounce dma buffers
around if they are in the wrong region. This relies on dma sync
functions to be called in order to copy the data to/from the
bounce buffer.
DRM is based on all memory allocations being coherent with the
GPU so that any updates to a framebuffer will be acted on without
the need for any additional update. This is fairly fundamentally
incompatible with needing to call dma_sync_ to handle the bounce
buffer copies, and therefore we have to detect and reject mappings
that use bounce buffers.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
We have a read-modify-write race when updating SCALER_DISPCTRL for
underrun and end-of-frame interrupts.
Ideally it would be fixed via a spinlock or similar, but that will
require a reasonable amount of study to ensure we don't get deadlocks.
The underrun reporting is only for debug, so disable it for now.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
Users are reporting running out of DLIST memory. Add a
debugfs file to dump out all the allocations.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
During normal operations, the cursor position update is done through an
asynchronous plane update, which on the vc4 driver basically just
modifies the right dlist word to move the plane to the new coordinates.
However, when we have the overscan margins setup, we fall back to a
regular commit when we are next to the edges. And since that commit
happens to be on a cursor plane, it's considered a legacy cursor update
by KMS.
The main difference it makes is that it won't wait for its completion
(ie, next vblank) before returning. This means if we have multiple
commits happening in rapid succession, we can have several of them
happening before the next vblank.
In parallel, our dlist allocation is tied to a CRTC state, and each time
we do a commit we end up with a new CRTC state, with the previous one
being freed. This means that we free our previous dlist entry (but don't
clear it though) every time a new one is being committed.
Now, if we were to have two commits happening before the next vblank, we
could end up freeing reusing the same dlist entries before the next
vblank.
Indeed, we would start from an initial state taking, for example, the
dlist entries 10 to 20, then start a commit taking the entries 20 to 30
and setting the dlist pointer to 20, and freeing the dlist entries 10 to
20. However, since we haven't reach vblank yet, the HVS is still using
the entries 10 to 20.
If we were to make a new commit now, chances are the allocator are going
to give the 10 to 20 entries back, and we would change their content to
match the new state. If vblank hasn't happened yet, we just corrupted
the active dlist entries.
A first attempt to solve this was made by creating an intermediate dlist
buffer to store the current (ie, as of the last commit) dlist content,
that we would update each time the HVS is done with a frame. However, if
the interrupt handler missed the vblank window, we would end up copying
our intermediate dlist to the hardware one during the composition,
essentially creating the same issue.
Since making sure that our interrupt handler runs within a fixed,
constrained, time window would require to make Linux a real-time kernel,
this seems a bit out of scope.
Instead, we can work around our original issue by keeping the dlist
slots allocation longer. That way, we won't reuse a dlist slot while
it's still in flight. In order to achieve this, instead of freeing the
dlist slot when its associated CRTC state is destroyed, we'll queue it
in a list.
A naive implementation would free the buffers in that queue when we get
our end of frame interrupt. However, there's still a race since, just
like in the shadow dlist case, we don't control when the handler for
that interrupt is going to run. Thus, we can end up with a commit adding
an old dlist allocation to our queue during the window between our
actual interrupt and when our handler will run. And since that buffer is
still being used for the composition of the current frame, we can't free
it right away, exposing us to the original bug.
Fortunately for us, the hardware provides a frame counter that is
increased each time the first line of a frame is being generated.
Associating the frame counter the image is supposed to go away to the
allocation, and then only deallocate buffers that have a counter below
or equal to the one we see when the deallocation code should prevent the
above race from occurring.
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
The RP1 chip has the Cadence GEM block, but wants the tx_clock
to always run at 125MHz, in the same way as sama7g5.
Add the relevant configuration.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
The Raspberry Pi RP1 chip has the Cadence GEM ethernet
controller, so add a compatible string for it.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
DSI0 and DSI1 have different widths for the command FIFO (24bit
vs 32bit), but the driver was assuming the 32bit width of DSI1
in all cases.
DSI0 also wants the data packed as 24bit big endian, so the
formatting code needs updating.
Handle the difference via the variant structure.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
Using increased bit depth for no reason increases power
consumption, and differs from the behaviour prior to the
conversion to use the HDMI helper functions.
Initialise the state max_bpc and requested_max_bpc to the
minimum value supported. This only affects Raspberry Pi,
as the other users of the helpers (rockchip/inno_hdmi and
sunx4i) only support a bit depth of 8.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
ws2812-pio-rp1 is a PIO-based driver for WS2812 LEDS. It creates a
character device in /dev, the default name of which is /dev/leds<n>,
where <n> is the instance number. The number of LEDS should be set
in the DT overlay, as should whether it is RGB or RGBW, and the default
brightness.
Write data to the /dev/* entry in a 4 bytes-per-pixel format in RGBW
order:
RR GG BB WW RR GG BB WW ...
The white values are ignored unless the rgbw flag is set for the device.
To change the brightness, write a single byte to offset 0, 255 being
full brightness and 0 being off.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
Use the PIO hardware on RP1 to implement a PWM interface.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
pwm: rp1: use pwmchip_get_drvdata() instead of container_of()
The PWM framework may not embed struct pwm_chip within the driver’s
private data. Using container_of() can result in accessing invalid
memory or NULL pointers, especially after recent kernel changes.
Switch to pwmchip_get_drvdata() to reliably access the driver data.
This resolves kernel warnings and probe failures seen after updating
from kernel 6.12.28 to 6.12.34 [1]
While at it remove the now obsolete `struct pwm_chip chip` member from
`struct pwm_pio_rp1`.
[1] https://github.com/raspberrypi/linux/issues/6971
Signed-off-by: Nicolai Buchwitz <nb@tipi-net.de>
Provide remote access to the PIO hardware in RP1. There is a single
instance, with 4 state machines.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
misc: rp1-pio: Support larger data transfers
Add a separate IOCTL for larger transfer with a 32-bit data_bytes
field.
See: https://github.com/raspberrypi/utils/issues/107
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
misc: rp1-pio: More logical probe sequence
Sort the probe function initialisation into a more logical order.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
misc: rp1-pio: Minor cosmetic tweaks
No functional change.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
misc: rp1-pio: Add in-kernel DMA support
Add kernel-facing implementations of pio_sm_config_xfer and
pio_xm_xfer_data.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
misc: rp1-pio: Handle probe errors
Ensure that rp1_pio_open fails if the device failed to probe.
Link: https://github.com/raspberrypi/linux/issues/6593
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
misc: rp1-pio: SM_CONFIG_XFER32 = larger DMA bufs
Add an ioctl type - SM_CONFIG_XFER32 - that takes uints for the buf_size
and buf_count values.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
misc/rp1-pio: Fix copy/paste error in pio_rp1.h
As per the subject, there was a copy/paste error that caused
pio_sm_unclaim from a driver to result in a call to
pio_sm_claim. Fix it.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
misc: rp1-pio: Fix parameter checks wihout client
Passing bad parameters to an API call without a pio pointer will cause
a NULL pointer exception when the persistent error is set. Guard
against that.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
misc: rp1-pio: Convert floats to 24.8 fixed point
Floating point arithmetic is not supported in the kernel, so use fixed
point instead.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
misc: rp1-pio: Error out on incompatible firmware
If the RP1 firmware has reported an error then return that from the PIO
probe function, otherwise defer the probing.
Link: https://github.com/raspberrypi/linux/issues/6642
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
misc: rp1-pio: Demote fw probe error to warning
Support for the RP1 firmware mailbox API is rolling out to Pi 5 EEPROM
images. For most users, the fact that the PIO is not available is no
cause for alarm. Change the message to a warning, so that it does not
appear with "quiet" in cmdline.txt.
Link: https://github.com/raspberrypi/linux/issues/6642
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
misc: rp1-pio: Don't just reuse the same DMA buf
A missing pointer increment meant that not only was the same buffer
being reused again and again, there was also no protection against
using it simultaneously for multiple transfers. Fix that basic bug, and
also move a similar increment to before the transfer is started, which
feels less racy.
See: https://github.com/raspberrypi/linux/issues/6919
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
The RP1 firmware runs a simple communications channel over some shared
memory and a mailbox. This driver provides access to that channel.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
firmware: rp1: Simplify rp1_firmware_get
Simplify the implementation of rp1_firmware_get, requiring its clients
to have a valid 'firmware' property. Also make it return NULL on error.
Link: https://github.com/raspberrypi/linux/issues/6593
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
firmware: rp1: Linger on firmware failure
To avoid pointless retries, let the probe function succeed if the
firmware interface is configured correctly but the firmware is
incompatible. The value of the private drvdata field holds the outcome.
Link: https://github.com/raspberrypi/linux/issues/6642
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
firmware: rp1: Rename to rp1-fw to avoid module name collision
There is already the driver in drivers/mfd/rp1.ko, so having
drivers/firmware/rp1.ko can cause issues when using modinfo
and similar, and we can get errors with "Module rp1 is already
loaded" when trying to load it.
Rename the module so that the name is unique.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
mailbox: rp1: Don't claim channels in of_xlate
The of_xlate method saves the calculated event mask in the con_priv
field. It also rejects subsequent attempt to use that channel because
the mask is non-zero, which causes a repeated instantiation of a client
driver to fail.
The of_xlate method is not meant to be a point of resource acquisition.
Leave the con_priv initialisation, but drop the test that it was
previously zero.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
The Raspberry Pi RP1 includes 2 M3 cores running firmware. This driver
adds a mailbox communication channel to them via a doorbell and some
shared memory.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
Attempting to start a non-idle channel causes an error message to be
logged, and is inefficient. Test for emptiness of the desc_issued list
before doing so.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
The xHC may commence Host Initiated Data Moves for streaming endpoints -
see USB3.2 spec s8.12.1.4.2.4. However, this behaviour is typically
counterproductive as the submission of UAS URBs in {Status, Data,
Command} order and 1 outstanding IO per stream ID means the device never
enters Move Data after a HIMD for Status or Data stages with the same
stream ID. For OUT transfers this is especially inefficient as the host
will start transmitting multiple bulk packets as a burst, all of which
get NAKed by the device - wasting bandwidth.
Also, some buggy UAS adapters don't properly handle the EP flow control
state this creates - e.g. RTL9210.
Set Host Initiated Data Move Disable to always defer stream selection to
the device. xHC implementations may treat this field as "don't care,
forced to 1" anyway - xHCI 1.2 s4.12.1.
Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
SPI transfers are of defined length, unlike some UART traffic, so it is
safe to let the DMA controller choose a suitable memory width.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
For devices where transfer lengths are not known upfront, there is a
danger when the destination is wider than the source that partial words
can be lost at the end of a transfer. Ideally the controller would be
able to flush the residue, but it can't - it's not even possible to tell
that there is any.
Instead, allow the client driver to avoid the problem by setting a
smaller width.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
Some connectors, particularly writeback, can implement flip
or transpose operations as writing back to memory.
Add a connector rotation property to control this.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
Some hardware will implement transpose as a rotation operation,
which when combined with X and Y reflect can result in a rotation,
but is a discrete operation in its own right.
Add an option for transpose only.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
The limit of 32 planes per DRM device is dictated by the use
of planes_mask returning a u32.
Change to a u64 such that 64 planes can be supported by a device.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
The non-desktop property "Indicates the output should be ignored for
purposes of displaying a standard desktop environment or console."
That sounds like it should be true for all writeback and virtual
connectors as you shouldn't render a desktop to them, so set it
by default.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
The DHT11 datasheet is pretty cryptic, but it does suggest that after
each integer value (humidity and temperature) there are "decimal"
values. Validate these as integers in the range 0-9 and treat them as
tenths of a unit.
Link: https://github.com/raspberrypi/linux/issues/6220
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
For platforms that have xHCI controllers attached over PCIe, and
non-coherent routes to main memory, a theoretical race exists between
posting new TRBs to a ring, and writing to the doorbell register.
In a contended system, write traffic from the CPU may be stalled before
the memory controller, whereas the CPU to Endpoint route is separate
and not likely to be contended. Similarly, the DMA route from the
endpoint to main memory may be separate and uncontended.
Therefore the xHCI can receive a doorbell write and find a stale view
of a transfer ring. In cases where only a single TRB is ping-ponged at
a time, this can cause the endpoint to not get polled at all.
Adding a readl() before the write forces a round-trip transaction
across PCIe, definitively serialising the CPU along the PCI
producer-consumer ordering rules.
Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
If a device frequently NAKs, it can exhaust the scheduled handshakes in
a frame. It will then not get polled by the controller until the next
frame interval. This is most noticeable on FS devices as the controller
schedules a small set of transactions only once per full-speed frame.
Setting the ENH_PER_NAK_FS/LS bits in the GUCTL1 register increases the
number of transactions that can be scheduled to Async (Control/Bulk)
endpoints in the respective frame time. In the FS case, this only
applies to FS devices directly connected to root ports.
Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
Add two quirk properties that control whether or not the controller
issues many more handshakes to FS/HS Async endpoints in a single
(micro)frame. Enabling these can significantly increase throughput for
endpoints that frequently respond with NAKs.
Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
A user has reported that a card of this model from late 2021 doesn't
work, so extend the date range and make it match on all card sizes.
Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
As a workaround (and possibly a fix) for CPU spins observed on BCM2837,
use ptep_clear_flush_young instead of ptep_test_and_clear_young inside
lru_gen_look_around in order to expose PTE changes to the MMU. Note that
on architectures that don't require an explicit flush,
ptep_clear_flush_young just calls ptep_test_and_clear_young.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
Some apps like linpack use numa_setpolicy to disable numa,
but that tends to have a significant performance hit for us.
If you have a cmdline.txt setting of numa_policy (to something other
than default), then lets ignore runtime changes and stick with
the cmdline.txt setting.
Not specifying numa_setpolicy in cmdline, or setting
numa_setpolicy=default(*) will allow runtime settings to work.
(*) easier to do when numa_setpolicy=interleave is set in DT.
Ignore logging for the first 40 seconds as there are some
expected switches during boot.
Signed-off-by: Dom Cobley <popcornmix@gmail.com>
Show process name in set_mempolicy() ignored message
Signed-off-by: Trevor Man <tman_github@trejan.com>
To help work around certain memory controller limitations or similar, a
random NUMA allocation memory policy is added.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Add iommu_dma_numa_policy= kernel parameter which can be used to modify
the NUMA allocation policy of remapped buffer allocations.
Policy is only used for devices which are not associated with a NUMA node.
Syntax identical to what tmpfs accepts as it's mpol argument is accepted.
Some examples:
iommu_dma_numa_policy=interleave
iommu_dma_numa_policy=interleave=skip-interleave
iommu_dma_numa_policy=bind:0-3,5,7,9-15
iommu_dma_numa_policy=bind=static:1-2
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Add numa_policy kernel argument to allow overriding the kernel's default
NUMA policy at boot time.
Syntax identical to what tmpfs accepts as it's mpol argument is accepted.
Some examples:
numa_policy=interleave
numa_policy=interleave=skip-interleave
numa_policy=bind:0-3,5,7,9-15
numa_policy=bind=static:1-2
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
The i.MX8MP makes calls on it's source device to determine
the link-frequency that should be configured on the CSI2 receiver.
When the source is behind a video mux, we need to pass this call through
to the connected device.
Map the control handler of the source device to the video-mux,
essentially proxying all controls on the mux to the device which has
it's link enabled.
Signed-off-by: Kieran Bingham <kieran.bingham@ideasonboard.com>
Add EXPORT_SYMBOL_GPL() for find_cpio_data() so that loadable modules
may also parse uncompressed cpio.
Signed-off-by: Richard Oliver <richard.oliver@raspberrypi.com>
The Sony IMX500 is a stacked 1/2.3-inch CMOS digital image sensor and
inbuilt AI processor with an active array CNN (Convolutional Neural
Network) inference engine. The native sensor size is 4056H x 3040V, and
the module also contains an in-built ISP for the CNN. The module is
programmable through an I2C interface with firmware and neural network
uploads being made over SPI. This driver supports imaging only.
Signed-off-by: Richard Oliver <richard.oliver@raspberrypi.com>
media: i2c: imx500: Inbuilt AI processor support
Add support for the IMX500's inbuilt AI processor. The IMX500 program
loader, AI processor firmware, DNN weights are accessed via the kernel's
firmware interface on 'open' and are transferred to the IMX500 over SPI.
Signed-off-by: Richard Oliver <richard.oliver@raspberrypi.com>
media: i2c: imx500: Enable LED during SPI transfers
The Raspberry Pi 'AI Camera' is equipped with an LED. Enable this LED
during SPI transfers to indicate to the end-user that progress is being
made during large tramsfers.
Signed-off-by: Richard Oliver <richard.oliver@raspberrypi.com>
drivers: media: imx500: Fixes for vblank control
Reduce the default/max framerate of the 2x2 binned mode to 30fps.
The current limit of 50fps can cause the sensor to produce corrupt
frames and cause missing framing events.
Also fixup the vblank control min/max/default/step paramters when
setting up.
Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
drivers: media: imx500: Simplify the vblank control init
Set the VBLANK control minimum and default values to IMX500_VBLANK_MIN
unconditionally everywhere.
Remove the mode specific framerate_default parameter, it is now unused.
Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
drivers: media: imx500: Enable LS correction
This correction is calibrated to approx 5000K.
Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
imx500: Fix for long exposure setup
The IMX500 (unlike the IMX477/IMX708) requires two regsiters to be set
for the exposure shift value to work correctly. The additional register
write (which was missing) is for the integration time shift.
Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
drivers: media: imx500: Enable sensor temperature monitoring
The register needs to be disabled before loading any firmware, otherwise
the upload fails for unknown reasons. Re-enable before starting the
sensor streaming.
Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
drivers: media: imx500: Add device id readback control
Add a new custom control V4L2_CID_USER_GET_IMX500_DEVICE_ID to allow
userland to query the device id from the IMX500 sensor eeprom.
Note that this device id can only be accessed when a network firmware
has been upoloaded to the device, so cannot be cached on probe.
Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
media: i2c: imx500: pm_runtime error paths
This change amends various error-paths in imx500_start_streaming() to
ensure that pm_runtime refcounts do not remain erroneously incremented
on failure.
Signed-off-by: Richard Oliver <richard.oliver@raspberrypi.com>
media: i2c: imx500: GPIO acquire/release semantics
When the imx500 driver is used as part of the 'AI Camera', the poweroff
state is never reached as the camera and gpio driver share a regulator.
By releasing the GPIOs when they are not in use, 'AI Camera' is able to
achieve a powered-down state.
Signed-off-by: Richard Oliver <richard.oliver@raspberrypi.com>
Add YAML device tree binding for the Sony IMX500 CMOS image sensor /
CNN inference engine. Also, add a MAINTAINERS entry.
Signed-off-by: Richard Oliver <richard.oliver@raspberrypi.com>
The check if the oscillator stop bit is set was reading from Control_1
register instead of the Seconds register.
This caused the Seconds register to be incorrectly changed if bit 7 of
Control_1 happens to be set.
Signed-off-by: Axel Hammarberg <axel.hammarberg@gmail.com>
In the same way that other subsystems support the setting of device
id numbers from Device Tree aliases, allow gpiochip numbers to be
derived from "gpiochip<n>" aliases.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
The DW SPI interface has a 16-bit clock divider, where the bottom bit
of the divisor must be 0. Limit how low the clock speed can go to
prevent the clock divider from being truncated, as that could lead to
a much higher clock rate than requested.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
Ensure the transmit FIFO has emptied before ending the transfer by
dropping the TX threshold to 0 when the last byte has been pushed into
the FIFO. Include a similar fix for the non-IRQ paths.
See: https://github.com/raspberrypi/linux/issues/6285
Fixes: 6014649de7 ("spi: dw: Save bandwidth with the TMOD_TO feature")
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
Do an end-run around ASoC in lieu of not being able to easily find the
associated DMA controller capabilities.
Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
There's no real need to constrain MEM access widths to 32-bit (or
narrower), as the DMAC is intelligent enough to size memory accesses
appropriately. Wider accesses are more efficient.
Similarly, MEM burst lengths don't need to be a function of DEV burst
lengths - the DMAC packs/unpacks data into/from its internal channel
FIFOs appropriately. Longer accesses are more efficient.
However, the DMAC doesn't have complete support for unaligned accesses,
and blocks are always defined in integer multiples of SRC_WIDTH, so odd
source lengths or buffer alignments will prevent wide accesses being
used, as before.
There is an implicit requirement to limit requested DEV read burst
lengths to less than the hardware's maximum configured MSIZE - otherwise
RX data will be left over at the end of a block. There is no config
register that reports this value, so the AXI burst length parameter is
used to produce a facsimile of it. Warn if such a request arrives that
doesn't respect this.
Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
Disabling the peripheral resets controller state which has a dangerous
side-effect of disabling the DMA handshake interface while it is active.
This can cause DMA channels to hang.
The error recovery pathway will wait for DMA to stop and reset the chip
anyway, so mask further FIFO interrupts and let the transfer finish
gracefully.
Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
TMOD_RO is the receive-only mode that doesn't require data in the
transmit FIFO in order to generate clock cycles. Using TMOD_RO when the
device doesn't care about the data sent to it saves CPU time and memory
bandwidth.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
TMOD_TO is the transmit-only mode that doesn't put data into the receive
FIFO. Using TMOD_TO when the user doesn't want the received data saves
CPU time and memory bandwidth.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
The firmware advertises its features as a string of words separated by
spaces. Ensure that feature names are only matched in their entirety.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
The snps,block-size DT property declares the maximum block size for each
channel of the dw-axi-dmac. However, the driver ignores these when
setting max_seg_size and uses MAX_BLOCK_SIZE (4096) instead.
To take advantage of the efficiencies of larger blocks, calculate the
minimum block size across all channels and use that instead.
See: https://github.com/raspberrypi/linux/issues/6256
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
The Raspberry Pi RP2040 GPIO bridge is an I2C-attached device exposing
both a Tx-only SPI controller, and a GPIO controller.
Due to the relative difference in transfer rates between standard-mode
I2C and SPI, the GPIO bridge makes use of 12 MiB of non-volatile storage
to cache repeated transfers. This cache is arranged in ~8 KiB blocks and
is addressed by the MD5 digest of the data contained therein.
Optionally, this driver is able to take advantage of Raspberry Pi RP1
GPIOs to achieve faster than I2C data transfer rates.
Signed-off-by: Richard Oliver <richard.oliver@raspberrypi.com>
spi: rp2040-gpio-bridge: Add debugfs progress indicator
Useful for tracking upload progress via userspace.
Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
spi: rp2040-gpio-bridge: add missing MD5 dependency
rp2040-gpio-bridge relies on the md5 crypto driver. This dependency
cannot be determined automatically as rp2040-gpio-bridge does not
use any of md5's symbols directly.
Declare a soft 'pre' dependency on md5 to ensure that it is included and
loaded before rp2040-gpio-bridge.
Signed-off-by: Richard Oliver <richard.oliver@raspberrypi.com>
spi: rp2040-gpio-bridge: fix gpiod error handling
In some circumstances, devm_gpiod_get_array_optional() can return
PTR_ERR rather than NULL to indicate failure. Handle these cases.
Signed-off-by: Richard Oliver <richard.oliver@raspberrypi.com>
spi: rp2040-gpio-bridge: probe: Cfg fast_xfer clk
Fast transfer mode requires that the first bit of data is clocked with a
rising edge. This can cause extra bits of data to be clocked on hardware
where the clock signal uses a pull-up. This change ensures that clk is
driven low before fast data transfer mode is entered.
Signed-off-by: Richard Oliver <richard.oliver@raspberrypi.com>
Offset the backend dev-nodes starting at /dev/video20
onwards to maintain backward compatibility with the
pre-upstreamed kernel driver.
Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
Add helpers to set and get vchiq driver data. vchiq_set_drvdata() and
vchiq_get_drvdata() wraps dev_set_drvdata() and dev_get_drvdata()
respectively.
Signed-off-by: Umang Jain <umang.jain@ideasonboard.com>
Signed-off-by: Kieran Bingham <kieran.bingham@ideasonboard.com>
Add version 4.17.1 of the Hailo PCIe device drivers.
Sourced from https://github.com/hailo-ai/hailort-drivers/
Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
drivers: media: pcie: hailo: Fix include paths
An attempt to fix the include paths - they look reasonable, but the
GitHub auto-builds fail.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
drivers: media: pci: Update Hailo accelerator device driver to v4.18.0
Sourced from https://github.com/hailo-ai/hailort-drivers/
Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
drivers: media: pci: Add wrapper after removal of follow_pfn
drivers: media: pci: Fix Hailo compile warnings
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
drivers: media: pci: Update Hailo accelerator device driver to v4.19
Sourced from https://github.com/hailo-ai/hailort-drivers/
Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
drivers: media: pci: Update Hailo accelerator device driver to v4.20
Sourced from https://github.com/hailo-ai/hailort-drivers
Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
drivers: pci: hailo: Fix kernel warning when calling find_vdma()
Calling this function without holding the mmap_read_lock causes the
kernel to throw an error message, spamming the dmesg logs when running
the Hailo hardware.
Fix it by adding the approprite lock/unlock functions around find_vdma().
Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
drivers: pci: hailo: Better lock handling when calling find_vdma()
Due to possible instabilities, reduce the mmap read lock time to only
cover the call to find_vdma().
Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
Pass the DRM connector name to any configured backlight
device so that userspace can associate the two items.
Ideally this should be in drm_panel, but it is bridge/panel
that creates the drm_connector and therefore knows the name.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
drm/bridge: panel: Ensure backlight is reachable
Ensure that the various options of modules vs builtin results
in being able to call into the backlight code.
https://github.com/raspberrypi/linux/issues/6198
Fixes: 573f8fd0ab ("drm/bridge: panel: Name an associated backlight device")
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
The naming of backlight devices is not terribly useful for
associating a backlight controller with a display (assuming
it is attached to one).
Add a sysfs node that will return a display name that can be set
by other subsystems.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
Even when configured to use only gpiod CS lines, the DW SPI controller
still expects a bit to be set in the SER register, otherwise transfers
stall. For the csgpiod case, nominate bit 0 for the job.
See: https://github.com/raspberrypi/linux/issues/6159
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
Reverts 8a4b2fc9c9 ("drm/bridge: tc358762: Split register programming from pre-enable to enable")
as we want the config commands sent before video starts.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
The autodetection of resolution/timing by the TC358762 can lead
to the display being shifted by a pixel or two.
Program the TC358762 with the requested mode timing so that
it can reproduce it accurately.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
"rotation" is listed as a standard property of panels in panel-common.yaml,
therefore it would be logical to process that from within the core
code should a panel driver not implement the get_orientation hook.
Call of_drm_get_panel_orientation from
drm_connector_set_orientation_from_panel to get that information.
This removes the need for any boiler-plate in panel drivers for calling
drm_connector_set_orientation_from_panel or
drm_connector_set_panel_orientation.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
This code:
for_each_sg(sgl, sg, sg_len, i)
num_sgs += DIV_ROUND_UP(sg_dma_len(sg), axi_block_len);
determines how many hw_desc are allocated.
If sg_dma_len(sg)=0 we don't allocate for this sgl.
However in the next loop, we will increment loop
for this case, and loop gets higher than num_sgs
and we trample memory.
Signed-off-by: Dom Cobley <popcornmix@gmail.com>
Newer versions of the DesignWare I2C block support the detection of
stuck signals, and a mechanism to recover from them. Add the required
software support to the driver.
This change was prompted by the observation that reading a single byte
from register 0 of a VEML7700 seems to cause it to issue an ACK too
early, and the controller to complain about losing arbitration. There
is a suspicion that this may be a more widespread problem, but at least
this patch prevents the bus from locking up.
See: https://github.com/raspberrypi/linux/issues/6057
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
Recent Integral cards end up with corrupt sectors after a flash erase.
This covers sizes for the A2 range, which can't be differentiated from
the A1 range which might not have the same issue.
Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
Only CMD38 with Arg=0x1 (Discard) is supported when in CQ mode, so
turn it off before issuing a non-discard erase op.
Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
Cards with manufacture dates in 2019 and 2020 have been seen in the wild
that hang indefinitely if issued a cache flush command in CQ mode.
Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
Samsung EVO Plus, Pro Plus and Evo Ultimate cards of this era appear to
have a broken cache-flush implementation when operating in CQ mode.
Unfortunately the cards seem to use a separate CID name string for every
variant and capacity, so nobble the cache feature for this MANFID, OEMID
and year. Turning this off seems to have negligible impact on
random-write throughput in non-CQ mode.
Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
Posted write tracking introduced in the commit below raced with re-use
of the requests between completion and submission, potentially causing
underflow of the pending write count.
Fixes: e6c1e862b2 ("mmc: restrict posted write counts for SD cards in CQ mode")
Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
Command Queueing requires Write Cache and Power off Notification support
from the card - but using the write cache forms a contract with the host
whereby the card expects to be told about impending power-down.
The implication is that (for performance) the card can do unsafe things
with pending write data - including reordering what gets committed to
nonvolatile storage at what time.
Exposed SD slots and platforms powered by hotpluggable means (i.e.
Raspberry Pis) can't guarantee that surprise removal won't happen.
To limit the scope for cards to invent new ways to trash filesystems,
limit pending writes to 1 (equivalent to the non-CQ behaviour).
Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
fixup: mmc: restrict posted write counts for SD cards in CQ mode
Leaving card->max_posted_writes unintialised was a bad thing to do.
Also, cqe_enable is 1 if hsq is enabled as hsq substitutes the cqhci
implementation with its own.
Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
Recovery claims the MMC card so the card-detect work gets significantly
delayed - leading to lots of error recovery loops that can never do
anything but fail.
Explicitly detect the card after CQE has halted and bail if it's not
there.
Also ratelimit a not-very-descriptive warning - one occurrence in dmesg
is enough to signal that something is amiss.
Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
If the controller is being reset, then the CQE needs to be reset as well.
For removable cards, CQHCI_SSC1 must specify a polling mode (CBC=0)
otherwise it's possible that the controller stops emitting periodic
CMD13s on card removal, without raising an error status interrupt.
Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
This gains about 8-12% sequential write speed with the fastest SD/eMMC
cards, and Class A1/A2 card sequential performance is only assured with
a 4MiB write length.
Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
The attached PHY performs parameter validation, so the switch from HS200
to HS (before selecting HS400/HS400es) with a 200MHz clock fails to
update pad timings and results in CRC errors from the card.
Underclocking the interface is safe, so do that in the downgrade callback.
Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
The spec allows for up to two 512-byte pages to be allocated for the
Extension Register General Info block, so allocate accordingly.
Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
Don't attempt to turn on CQ if the other mandatory features are not
indicated as supported by the card. Also make sure that the register write
actually stuck, as some cards claim support but never report back that
the queue engine is enabled.
Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
Certain status bits in these registers may need polling outside of
SD-specific code. Export in sd_ops.h
Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
The eMMC spec says that in certain circumstances the controller can't
respond to a halt request - in practice, this occurs if a CMD
timeout happens (card went away/crashed).
Clear the halt request by writing 0 to CQHCI_CTL. Also fix a logic error
testing for halt in cqhci_request.
Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
For unknown reasons the controller seems to reset the idle polling timer
interval on CQE enable/disable to 8 clocks which is extremely short.
Just use the reset value in the eMMC spec (4096 clock periods which at
200MHz is ~20uS).
Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
Add a LED_FULL trigger equivalent to mmc_start_request() in
mmc_cqe_start_req(), otherwise it stays off forever.
Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
The Performance Extension register is regularly accessed in a hot path
to do write cache flushes. Don't invoke kmalloc/kfree for every access,
preallocate a 512B buffer for this purpose.
Also remove an unused alloc in sd_enable_cache().
Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
Application class A2 cards require CQ to be enabled to realise their
stated performance figures. Add support to enable/disable card CQ via
the Performance Enhancement extension register, and cater for the slight
differences in command set versus eMMC.
Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
Commit 7d239fbf9d broke 802.1X authentication by setting
profile->use_fwsup = NONE whenever PSK is not used. However
802.1X does not use PSK and requires profile->use_fwsup set
to 1X, or brcmf_cfg80211_set_pmk() fails. Fix this by checking
that profile->use_fwsup is not already set to 1X and avoid
setting it to NONE in that case.
Fixes: 7d239fbf9d (brcmfmac: Fix interoperating DPP and other encryption network access)
Fixes: https://github.com/raspberrypi/linux/issues/5964
1. If firmware supports 4-way handshake offload but not supports DPP
4-way offload, when user first connects encryption network, driver will
set "sup_wpa 1" to firmware, but it will further result in DPP
connection failure since firmware won't send EAPOL frame to host.
2. Fix DPP AP mode handling action frames.
3. For some firmware without fwsup support, the join procedure will be
skipped due to "sup_wpa" iovar returning not-support. Check the fwsup
feature before do such iovar.
Signed-off-by: Kurt Lee <kurt.lee@cypress.com>
Signed-off-by: Double Lo <double.lo@cypress.com>
Signed-off-by: Chi-hsien Lin <chi-hsien.lin@cypress.com>
In deep sleep mode (DS1) ARM is off and once exit trigger comes than
mailbox Interrupt comes to host and whole reinitiation should be done
in the ARM to start TX/RX.
Also fix below issus for DS1 exit:
1. Sent Tx Control frame only after firmware redownload complete (check
F2 Ready before sending Tx Control frame to Firmware)
2. intermittent High DS1 TX Exit latency time (almost 3sec) ==> This is
fixed by skipping host Mailbox interrupt Multiple times (ulp state
mechanism)
3. RX GlOM save/restore in Firmware
4. Add ULP event enable & event_msgs_ext iovar configuration in FMAC
5. Add ULP_EVENT_RECV state machine for sbwad support
6. Support 2 Byte Shared memory read for DS1 Exit HUDI implementation
Signed-off-by: Praveen Babu C <pucn@cypress.com>
Signed-off-by: Naveen Gupta <nagu@cypress.com>
[Merge from 4.14.77 to 5.4.18; set BRCMF_SDIO_MAX_ACCESS_ERRORS to 20]
Signed-off-by: Chi-hsien Lin <chi-hsien.lin@cypress.com>
JIRA: SWWLAN-135583
JIRA: SWWLAN-136577
i2c_mux_add_adapter takes a force_nr parameter that allows an explicit
bus number to be associated with a channel. However, only i2c-mux-reg
and i2c-mux-gpio make use of it.
To help with situations where it is desirable to have a fixed, known
base address for the channels of a mux, create a "base-nr" property.
When force_nr is 0 and base-nr is set and non-zero, form a force_nr
value from the sum of base-nr and the channel ID.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
This patch adds the device ID for the BCM4343A2 module, found e.g. in
the Infineon (Cypress) CYW43439 chip. The required firmware file is
named 'BCM4343A2.hcd'.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
If enabled, DMA_BOUNCE_UNALIGNED_KMALLOC causes the swiotlb buffers
(64MB, by default) to be allocated, even on systems where the DMA
controller can reach all of RAM. This is a huge amount of RAM to
waste on a device with only 512MB to start with, such as the Zero 2 W.
See: https://github.com/raspberrypi/linux/issues/5975
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
Add support for non-standard bus speeds by treating them as detuned
versions of the slowest standard speed not less than the requested
speed.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
Calculate the HCNT and LCNT values for all modes using the rise and
fall times of SCL, the aim being a 50/50 mark/space ratio.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
There are three parkmode disable bits, one for each bus instance type.
Add FS/LS and parse the quirk out of DT. Also update the slightly
mangled quirk descriptions.
Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
There are three disable bits, one for each bus-instance type. Add a
quirk to cover the FS/LS type, and update the slightly mangled quirk
descriptions in the process.
Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
The forced conversion of native CS lines into software CS lines is done
whether or not the controller has been given any CS lines to use. This
breaks the use of the spi0-0cs overlay to prevent SPI from claiming any
CS lines, particularly with spidev which doesn't pass in the SPI_NO_CS
flag at creation.
Use the presence of an empty cs-gpios property as an indication that no
CS lines should be used, bypassing the native CS conversion code.
See: https://github.com/raspberrypi/linux/issues/5835
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
For CSI2 receivers that need to know the link frequency,
add it as a control to the driver.
Interlaced modes are 216Mbp/s or 108MHz, whilst going through
the I2P to deinterlace gives 432Mb/s or 216MHz.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
CSI2 devices are meant to use the 1Xnn formats rather than 2Xnn
such as MEDIA_BUS_FMT_UYVY8_2X8.
For devices with ADV7180_FLAG_MIPI_CSI2 set, use
MEDIA_BUS_FMT_UYVY8_1X16.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
Pi 5 uses BL31 as its armstub file, so the reset goes via PSCI. Parse
any "reboot" parameter as a partition number to reboot into.
N.B. This code path is only used if reboot mode has been set to warm
or soft.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
Add support for the ROHM BU64754 Motor Driver for Camera Autofocus. A
V4L2 Subdevice is registered and provides a single
V4L2_CID_FOCUS_ABSOLUTE control.
Signed-off-by: Kieran Bingham <kieran.bingham@ideasonboard.com>
Signed-off-by: Jacopo Mondi <jacopo.mondi@ideasonboard.com>
Commit 7cd70656d1 ("drm/bridge: display-connector: implement
bus fmts callbacks") added use of drm_atomic_helper_bridge_*
functions, but didn't select the dependency of DRM_KMS_HELPER.
If nothing else selected that dependency it resulted in a
build failure.
Select the missing dependency.
Fixes: 7cd70656d1 ("drm/bridge: display-connector: implement bus fmts callbacks")
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
The mainline driver has implemented analogue gain using the control
V4L2_CID_GAIN instead of V4L2_CID_ANALOGUE_GAIN.
libcamera requires V4L2_CID_ANALOGUE_GAIN, and therefore fails.
Update the driver to use V4L2_CID_ANALOGUE_GAIN.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
Step wise governor increases the mitigation level when the temperature
goes above a threshold and will decrease the mitigation when the
temperature falls below the threshold. If it were a case, where the
temperature hovers around a threshold, the mitigation will be applied
and removed at every iteration. This reaction to the temperature is
inefficient for performance.
The use of hysteresis temperature could avoid this ping-pong of
mitigation by relaxing the mitigation to happen only when the
temperature goes below this lower hysteresis value.
Signed-off-by: Ram Chandrasekar <rkumbako@codeaurora.org>
Signed-off-by: Lina Iyer <ilina@codeaurora.org>
drivers: thermal: step_wise: avoid throttling at hysteresis temperature after dropping below it
Signed-off-by: Serge Schneider <serge@raspberrypi.org>
Fix hysteresis support in gov_step_wise.c
Directly get hyst value instead of going through an
optional and, now, unimplemented function.
Signed-off-by: Jürgen Kreileder <jk@blackdown.de>
Users have reported log spam created by "Event Ring Full" xHC event
TRBs. These are caused by interrupt latency in conjunction with a very
busy set of devices on the bus. The errors are benign, but throughput
will suffer as the xHC will pause processing of transfers until the
event ring is drained by the kernel. Expand the number of event TRB slots
available by increasing the number of event ring segments in the ERST.
Controllers have a hardware-defined limit as to the number of ERST
entries they can process, so make the actual number in use
min(ERST_MAX_SEGS, hw_max).
Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
With the new support for a chain of sys_off handlers, gpio-poweroff
does not disable a normal shutdown (though it does delay it). There
is therefore no need for the noisy WARN from the kernel.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
Certain controllers (dwc-mshc) generate timeout conditions separately to
command-completion conditions, where the end result is interrupts are
separated in time depending on the current SDCLK frequency.
This causes spurious interrupts if SDCLK is slow compared to the CPU's
ability to process and return from interrupt. This occurs during card
probe with an empty slot where all commands that would generate a
response time out.
Add a quirk to squelch command response interrupts when a command
timeout interrupt is received.
Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
For situations where there are multiple DRM cards in a system,
add a query of DT for "drm_fb" designations for cards to set
their preferred /dev/fbN designation.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
drm/fb_helper: Change query for FB designation from drm_fb to drm-fb
Fixes: 1216ea56c2 ("drm/fb-helper: Look up preferred fbdev node number from DT")
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
Add a flag custom_fb_num to denote that the client has
requested a specific fbdev node number via node.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
Vision Components have made an OV9281 module which blocks reading
back the majority of registers to comply with NDAs, and in doing
so doesn't allow auto-increment register reading as used when
reading the chip ID.
Use two reads and manually combine the results.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
Add binding for the new RTC driver for Raspberry Pi.
This platform has an RTC managed by firmware, and this RTC
driver provides the simple mailbox interface to access it.
Signed-off-by: Dom Cobley <popcornmix@gmail.com>
dt: bindings: update rpi-rtc binding
Add property for bcm2712 firmware RTC driver charger control
Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
This supports setting and reading the real time clock
and supports wakeup alarms.
To support wake up alarms you want this bootloader config:
POWER_OFF_ON_HALT=1
WAKE_ON_GPIO=0
You can test with:
echo +600 | sudo tee /sys/class/rtc/rtc0/wakealarm
sudo halt
That will halt (in an almost no power state),
then wake and restart after 10 minutes.
Signed-off-by: Dom Cobley <popcornmix@gmail.com>
drivers: rtc-rpi: add battery charge circuit control and readback
Parse devicetree for a charger voltage and apply it. If nonzero and a
valid voltage, the firmware will enable charging, otherwise the charger
circuit is disabled.
Add sysfs attributes to read back the supported charge voltage range,
the measured battery voltage, and the charger setpoint.
Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
We currently see these regularly:
[ 25.157560] irq 31, desc: 00000000c15e6d2c, depth: 0, count: 0, unhandled: 0
[ 25.164658] ->handle_irq(): 00000000b1775675, brcmstb_l2_intc_irq_handle+0x0/0x1a8
[ 25.172352] ->irq_data.chip(): 00000000fea59f1c, gic_chip_mode1+0x0/0x108
[ 25.179166] ->action(): 000000003eda6d6f
[ 25.183096] ->action->handler(): 000000002c09e646, bad_chained_irq+0x0/0x58
[ 25.190084] IRQ_LEVEL set
[ 25.193142] IRQ_NOPROBE set
[ 25.196198] IRQ_NOREQUEST set
[ 25.199255] IRQ_NOTHREAD set
with:
$ cat /proc/interrupts | grep 31:
31: 1 0 0 0 GICv2 129 Level (null)
The interrupt is described in DT with IRQ_TYPE_LEVEL_HIGH
But the current compatible string uses the controller in edge triggered mode
(as that config matches our register layout).
Add a new compatible structure for level driven interrupt with our register layout.
We had already been using this compatible string in device tree, so no change needed
there.
Signed-off-by: Dom Cobley <popcornmix@gmail.com>
Add a driver for BCM2712 IOMMUs.
There is a small driver for the Shared IOMMU TLB Cache.
Each IOMMU instance is a separate device.
IOMMUs are set up with a "pass-through" range covering
the lowest 40BGytes (which should cover all of SDRAM)
for the benefit of non-IOMMU-aware devices that share
a physical IOMMU; and translation for addresses in the
range 40GB to 42GB.
An optional parameter adds a DMA offset (which otherwise
would be lost?) to virtual addresses for DMA masters on a
bus such as PCIe.
Signed-off-by: Nick Hollinghurst <nick.hollinghurst@raspberrypi.com>
iommu: bcm2712-iommu: Map and unmap multiple pages in a single call
For efficiency, the map_pages() and unmap_pages() calls now pay
attention to their "count" argument.
Remove a special case for a "pass-through" address range, which
the DMA/IOMMU subsystem isn't told exists and should never use.
Fix a bug where we omitted to set *mapped to 0 in error cases.
Minor style fixes and tidying.
Signed-off-by: Nick Hollinghurst <nick.hollinghurst@raspberrypi.com>
iommu/bcm2712: don't allow building as module
Since bcm2712-iommu{,-cache}.c doesn't have usual module descriptors
such as `MODULE_LICENSE`, configuring this as 'M' fails the build with
`ERROR: modpost: missing MODULE_LICENSE() in <...>/bcm2712-iommu.o`.
Since it seems like the code is not intended to be built as a module
anyway (it registers the driver with `builtin_platform_driver()`), don't
allow building this code as a module.
Signed-off-by: Ratchanan Srirattanamet <peathot@hotmail.com>
iommu: bcm2712-iommu: Add locking; fix address offset; tidy
- Now using spin_lock_irqsave in map, unmap, sync and iova_to_phys.
- Simplify bounds checks as all allocations should be in aperture.
- Use iommu_iotlb_gather_add_range(); NB gather range is inclusive.
- Fix missing address offset in bcm2712_iommu_sync_all.
Signed-off-by: Nick Hollinghurst <nick.hollinghurst@raspberrypi.com>
The BCM2712 has a DMA-Lite controller that is basically a BCM2835-style
DMA controller that supports 40 bits DMA addresses.
We need it for HDMI audio to work, but this breaks BCM2835-38 so we
should rework this later.
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
dmaengine: bcm2835: Fix dma driver for BCM2835-38
The previous commit broke support on older devices.
Make the breaking parts of patch conditional on
the device being used.
Fixes: 6e1856ac7c39 ("dmaengine: bcm2835: HACK: Support DMA-Lite channels")
Signed-off-by: Dom Cobley <popcornmix@gmail.com>
BCM2712 has 6 40-bit channels - DMA6 to DMA11. Add a new compatible
string to indicate that the current platform is BCM2712.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
Formerly the delay was omitted as bit-banged SPI seldom achieved
even one Mbit/s; but some modern platforms can run faster, and
some SPI devices may need to be clocked slower.
Signed-off-by: Nick Hollinghurst <nick.hollinghurst@raspberrypi.com>
Formerly, if configured using DT, CS GPIOs were driven from spi.c
and it was possible for CS to be asserted (low) *before* starting
to drive SCK. CS GPIOs have been brought under control of this
driver in both ACPI and DT cases, with a fixup for GPIO polarity.
Signed-off-by: Nick Hollinghurst <nick.hollinghurst@raspberrypi.com>
BCM2712 has a PM block but neither ASB nor RPIVID_ASB. Use the absence
of the "asb" register range to indicate BCM2712 and its different PM
register range.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
BCM2712 lacks the "asb" and "rpivid_asb" register ranges, but still
requires the use of the bcm2835-power driver to reset the V3D block.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
drivers: hwmon: rp1-adc: check conversion validity before supplying value
The SAR ADC architecture may complete a conversion but instability in the
comparator can corrupt the result. Such corruption is signalled in the CS
ERR bit, asserted alongside each conversion result.
Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
Add optional properties to tune the AXI interface -
cdns,aw2w-max-pipe, cdns,ar2r-max-pipe and cdns,use-aw2b-fill.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
This supports reading and writing OTP using the firmware
mailbox interface.
It needs supporting firmware to run.
Signed-off-by: Dom Cobley <popcornmix@gmail.com>
Add support for the RP1 VEC hardware.
Signed-off-by: Nick Hollinghurst <nick.hollinghurst@raspberrypi.com>
drm: rp1: rp1-vec: Allow non-standard modes with various crops
Tweak sync timings in the advertised modelines.
Accept other, custom modes, provided they fit within the active
area of one of the existing hardware-supported TV modes.
Instead of always padding symmetrically, try to respect the user's
[hv]sync_start values, allowing the image to be shifted around
the screen (to fine-tune overscan correction).
Signed-off-by: Nick Hollinghurst <nick.hollinghurst@raspberrypi.com>
drm/rp1: depends on, instead of select, MFD_RP1
According to kconfig-language.txt [1], select should be used only for
"non-visible symbols ... and for symbols with no dependencies". Since
MFD_RP1 both is visible and has a dependency, "select" should not be
used and "depends on" should be used instead.
In particular, this fixes the build of this kernel tree on NixOS, where
its kernel config system will try to answer 'M' to as many config as
possible.
[1] https://www.kernel.org/doc/html/latest/kbuild/kconfig-language.html
Signed-off-by: Ratchanan Srirattanamet <peathot@hotmail.com>
drm: rp1: Use tv_mode from the command line and fix for Linux 6.6
Use the standard enum drm_connector_tv_mode instead of a private
enum and switch from the legacy to the standard tv_mode property.
Remove the module parameter "tv_norm". Instead, get tv_mode from
the command line and make this the connector's default TV mode.
Don't restrict the choice of modes based on tv_mode, but interpret
nonstandard combinations as NTSC or PAL, depending on resolution.
Thus the default tv_mode=NTSC effectively means "Auto".
Tweak the advertised horizontal timings for 625/50 to match Rec.601
Signed-off-by: Nick Hollinghurst <nick.hollinghurst@raspberrypi.com>
drm: rp1: VEC and DPI drivers: Fix bug #5901
Rework probe() to use devm_drm_dev_alloc(), embedding the DRM
device in the DPI or VEC device as now seems to be recommended.
Change order of resource allocation and driver initialization.
This prevents it trying to write to an unmapped register during
clean-up, which previously could crash.
Signed-off-by: Nick Hollinghurst <nick.hollinghurst@raspberrypi.com>
drm: rp1: vec: Support more video modes in the RP1 VEC driver
Support a wider range of pixel clock rates. The driver will round
pixclock up to 108MHz/n but tries to honour the desired image width
and position (of the centre of the display relative to HSYNC_STARTs).
This adds complexity but removes the need for separate 13.5MHz and
15.428MHz modes.
Support "fake" double-rate progressive modes (in which only every
2nd scanline is displayed). To work around aspect ratio issues.
Add Monochrome TV mode support. Add "vintage" modes (544x380i for
System A; 848x738i for System E) when configured for Monochrome.
Add a way to create a "custom" display mode from a module parameter.
Signed-off-by: Nick Hollinghurst <nick.hollinghurst@raspberrypi.com>
drm: rp1: rp1-vec: Add DRM_FORMAT_ARGB8888 and DRM_FORMAT_ABGR8888
Android requires this.
As the underlying hardware doesn't support alpha blending,
we ignore the alpha value.
Signed-off-by: Jan Kehren <jan.kehren@emteria.com>
drivers: drm: rp1-vec: Increase width limit, for PAL 16:9 @ 18MHz
There was no technical reason for the DRM mode's width limit of 848;
increase it to 960 (720*18MHz/13.5MHz) to support ~square pixels on
16:9 screens. Tweak the PAL active window to start slightly earlier.
(The maximum number of visible columns at 18MHz is about 942.)
Signed-off-by: Nick Hollinghurst <nick.hollinghurst@raspberrypi.com>
drm/vc4: Make VEC progressive modes readily accessible
Add predefined modelines for the 240p (NTSC) and 288p (PAL) progressive
modes, and report them through vc4_vec_connector_get_modes().
Signed-off-by: Mateusz Kwiatkowski <kfyatek+publicgit@gmail.com>
drm/rp1-vec: Run DRM default client setup
Call drm_client_setup() to run the kernel's default client setup
for DRM. Set fbdev_probe in struct drm_driver, so that the client
setup can start the common fbdev client.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
drm: rp1: Enable VEC->GPIO output; cosmetic change to registers
In the VEC driver, enable mapping VEC (not DPI) to DPI GPIOs.
This is to support VEC output over GPIO on Raspberry Pi CM5.
It is harmless as DPI and VEC could not be used concurrently,
and the output is anyway conditional on pinctrl.
Also, tweak the style of VIDEO_OUT_CFG register definitions
(in both DPI and VEC drivers) to be more Linux-friendly.
Signed-off-by: Nick Hollinghurst <nick.hollinghurst@raspberrypi.com>
Add support for the RP1 DPI hardware.
Signed-off-by: Nick Hollinghurst <nick.hollinghurst@raspberrypi.com>
drm/rp1: depends on, instead of select, MFD_RP1
According to kconfig-language.txt [1], select should be used only for
"non-visible symbols ... and for symbols with no dependencies". Since
MFD_RP1 both is visible and has a dependency, "select" should not be
used and "depends on" should be used instead.
In particular, this fixes the build of this kernel tree on NixOS, where
its kernel config system will try to answer 'M' to as many config as
possible.
[1] https://www.kernel.org/doc/html/latest/kbuild/kconfig-language.html
Signed-off-by: Ratchanan Srirattanamet <peathot@hotmail.com>
drm: rp1: VEC and DPI drivers: Fix bug #5901
Rework probe() to use devm_drm_dev_alloc(), embedding the DRM
device in the DPI or VEC device as now seems to be recommended.
Change order of resource allocation and driver initialization.
This prevents it trying to write to an unmapped register during
clean-up, which previously could crash.
Signed-off-by: Nick Hollinghurst <nick.hollinghurst@raspberrypi.com>
drm: rp1: dpi: Add support for MEDIA_BUS_FMT_RGB565_1X24_CPADHI
This new format corresponds to the Raspberry Pi legacy DPI mode 3.
Signed-off-by: Nick Hollinghurst <nick.hollinghurst@raspberrypi.com>
drm: rp1: rp1-dpi: Add DRM_FORMAT_ARGB8888 and DRM_FORMAT_ABGR8888
Android requires this.
As the underlying hardware doesn't support alpha blending,
we ignore the alpha value.
Signed-off-by: Jan Kehren <jan.kehren@emteria.com>
drm: rp1: rp1-dpi: Add interlaced modes and PIO program to fix VSYNC
Implement interlaced modes by wobbling the base pointer and VFP width
for every field. This results in correct pixels but incorrect VSYNC.
Now use PIO to generate a fixed-up VSYNC by sampling DE and HSYNC.
This requires DPI's DE output to be mapped to GPIO1, which we check.
When DE is not exposed, the internal fixup is disabled. VSYNC/GPIO2
becomes a modified signal, designed to help an external device or
PIO program synthesize CSYNC or VSYNC.
Signed-off-by: Nick Hollinghurst <nick.hollinghurst@raspberrypi.com>
drm: rp1: rp1-dpi: Fix optional dependency on RP1_PIO
Add optional dependency to Kconfig, and conditionally compile
PIO-dependent code. Add a mode validation function to reject
interlaced modes when RP1_PIO is not present.
Signed-off-by: Nick Hollinghurst <nick.hollinghurst@raspberrypi.com>
drm: rp1: rp1-dpi: Add "rgb_order" property (to match VC4 DPI)
As on VC4, the OF property overrides the order implied by media
bus format. Only 4 of the 6 possible orders are supported. New
add-on hardware designs should not rely on this "legacy" feature.
Signed-off-by: Nick Hollinghurst <nick.hollinghurst@raspberrypi.com>
drm/rp1: DPI interlace: Improve precision of PIO-generated VSYNC
Instead of trying to minimize the delay between seeing HSYNC edge
and asserting VSYNC, try to predict the next HSYNC edge precisely.
This eliminates the round-trip delay but introduces mode-dependent
rounding error. HSYNC->VSYNC lag reduced from ~30ns to -5ns..+10ns
(plus up to 5ns synchronization jitter as before).
This may benefit e.g. SCART HATs, particularly those that generate
Composite Sync using a XNOR gate.
Signed-off-by: Nick Hollinghurst <nick.hollinghurst@raspberrypi.com>
drm/rp1-dpi: Run DRM default client setup
Call drm_client_setup() to run the kernel's default client setup
for DRM. Set fbdev_probe in struct drm_driver, so that the client
setup can start the common fbdev client.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
drm/rp1/rp1_dpi: Move Composite Sync generation into the kernel
Move RP1 DPI's PIO-assisted Composite Sync generation code,
previously released as a separate utility, into the kernel driver.
There are 3 variants for progressive, generic interlaced and TV-
style interlaced CSync, alongside the existing VSync fixup.
Check that all of GPIOs 1-3 are mapped to DPI, so PIO won't try
to snoop on a missing output, or override another device's pins.
Add "force_csync" module parameter, for convenience of testing,
as few tools can set DRM_MODE_FLAG_CSYNC.
Signed-off-by: Nick Hollinghurst <nick.hollinghurst@raspberrypi.com>
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
drm: rp1: Enable VEC->GPIO output; cosmetic change to registers
In the VEC driver, enable mapping VEC (not DPI) to DPI GPIOs.
This is to support VEC output over GPIO on Raspberry Pi CM5.
It is harmless as DPI and VEC could not be used concurrently,
and the output is anyway conditional on pinctrl.
Also, tweak the style of VIDEO_OUT_CFG register definitions
(in both DPI and VEC drivers) to be more Linux-friendly.
Signed-off-by: Nick Hollinghurst <nick.hollinghurst@raspberrypi.com>
Add support for the RP1 DSI hardware.
Signed-off-by: Nick Hollinghurst <nick.hollinghurst@raspberrypi.com>
drm/rp1: depends on, instead of select, MFD_RP1
According to kconfig-language.txt [1], select should be used only for
"non-visible symbols ... and for symbols with no dependencies". Since
MFD_RP1 both is visible and has a dependency, "select" should not be
used and "depends on" should be used instead.
In particular, this fixes the build of this kernel tree on NixOS, where
its kernel config system will try to answer 'M' to as many config as
possible.
[1] https://www.kernel.org/doc/html/latest/kbuild/kconfig-language.html
Signed-off-by: Ratchanan Srirattanamet <peathot@hotmail.com>
DRM: rp1: rp1-dsi: Fix escape clock divider and timeouts.
Escape clock divider was fixed at 5, which is correct at 800Mbps/lane
but increasingly out of spec for higher rates. Compute it correctly.
High speed timeout was fixed at 5*512 == 2560 byte-clocks per lane.
Compute it conservatively to be 8/7 times the line period (assuming
there will be a transition to LP some time during each scanline?)
keeping the old value as a lower bound. Increase LPRX TO to 1024,
and BTA TO to 0xb00 (same value as in bridge/synopsys/dw-mipi-dsi).
(No change to LP_CMD_TIM. To do: compute this correctly.)
Signed-off-by: Nick Hollinghurst <nick.hollinghurst@raspberrypi.com>
drm: rp1: rp1-dsi: Switch to PLL_SYS source for DPI when 8 * lanes > bpp
To support 4 lanes, re-parent DPI clock source between DSI byteclock
(using the new "variable sources" defined in clk-rp1) and PLL_SYS.
This is to cover cases in which byteclock < pixclock <= 200MHz.
Tidying: All frequencies now in Hz (not kHz), where DSI speed is now
represented by byteclock to simplify arithmetic. Clamp DPI and byte
clocks to their legal ranges; fix up HSTX timeout to avoid an unsafe
assumption that it would return to LP state for every scanline.
Because of RP1's clock topology, the ratio between DSI and DPI clocks
may not be exact with 3 or 4 lanes, leading to slightly irregular
timings each time DSI switches between HS and LP states. Tweak to
inhibit LP during Horizontal BP when sync pulses were requested.
Signed-off-by: Nick Hollinghurst <nick.hollinghurst@raspberrypi.com>
drm: rp1: rp1-dsi: Add DRM_FORMAT_ARGB8888 and DRM_FORMAT_ABGR8888
Android requires this.
As the underlying hardware doesn't support alpha blending,
we ignore the alpha value.
Signed-off-by: Jan Kehren <jan.kehren@emteria.com>
drivers: drm: rp1-dsi: Implement more DSI options and flags
Now implementing:
- Per-command selection of LP or HS for commands (previously LP)
- EoTp transmission option (previously EoTp was always disabled)
- Non-continuous clock option (previously always continuous)
- Per-command enabling of ACK request (in command mode only)
Make a plausible (and possibly correct) attempt to measure the
longest LP command that will fit into vertical blanking lines.
DON'T set both "Burst Mode" and "Sync Events" flags together.
This is redundant in the standard IP; in this RP1 variant it
would enable Sync Pulses but may break with some video timings.
Signed-off-by: Nick Hollinghurst <nick.hollinghurst@raspberrypi.com>
drm/rp1-dsi: Run DRM default client setup
Call drm_client_setup() to run the kernel's default client setup
for DRM. Set fbdev_probe in struct drm_driver, so that the client
setup can start the common fbdev client.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
Don't assume that DMA addresses of devices are the same as their
physical addresses - convert correctly.
The CFG2 register layout is used when there are more than 8 channels,
but also when configured for more than 16 target peripheral devices
because the index of the handshake signal has to be made wider.
Reset the DMAC on probe
The driver goes to the trouble of tracking when transfers have been
paused, but then doesn't report that state when queried.
Not having APB registers is not an error - for most use cases it's
not even of interest, it's expected. Demote the message to debug level,
which is disabled by default.
Each channel has a descriptor pool, which is shared between transfers.
It is unsafe to treat the total number of descriptors allocated from a
pool as the number allocated to a specific transfer; doing so leads
to releasing buffers that shouldn't be released and walking off the
ends of descriptor lists. Instead, give each transfer descriptor its
own count.
Support partial transfers:
Some use cases involve streaming from a device where the transfer only
proceeds when the device's FIFO occupancy exceeds a certain threshold.
In such cases (e.g. when pulling data from a UART) it is important to
know how much data has been transferred so far, in order that remaining
bytes can be read from the FIFO directly by software.
Add the necessary code to provide this "residue" value with a finer,
sub-transfer granularity.
In order to prevent the occasional byte getting stuck in the DMA
controller's internal buffers, restrict the destination memory width
to the source register width.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
dmaengine: dw-axi-dmac: Fix a non-atomic update
dw_axi_dma_interrupt disables interrupts for the duration of the channel
handling. It does so by clearing a bit in the DMA_CFG register - an
action that involves a read-modify-write. That in itself would be safe
because there will be no further interrupts, hence no reentrancy, were
it the only bit of code accessing that register.
The only neighbour of INT_EN is DMAC_EN - the main enable for the block.
That's not the sort of thing you would expect to be modified during the
normal course of operation, but bizarrely it is set at the start of the
transfer of every block, in axi_chan_block_xfer_star, by a call to
axi_dma_enable. This can lead to INT_EN being accidentally cleared,
which causes all DMA transfers to time out.
One might think that the enabling was being delayed until the first
transfer, but the probe function calls axi_dma_resume which in turn
calls axi_dma_enable, so that isn't the case.
Fix the atomicity problem by removing the spurious call to
axi_dma_enable.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
The SMBUS emulation code turns an SMBUS quick command into a zero-
length read. This controller can't do zero length accesses, but it
can do quick commands, so reverse the emulation. The alternative
would be to properly implement the SMBUS support but that is a lot
more work, and unnecessary just to get i2cdetect working.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
Signed-off-by: Liam Fraser <liam@raspberrypi.com>
mmc: sdhci-of-dwcmshc: rp1 sdio changes
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
drivers: mmc: sdhci-of-dwcmshc: add RP1 dt ID and quirks
Differentiate the RP1 variant of the Designware MSHC controller(s).
Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
ARM: pl011: Add rs485 to the RP1 support
pl011_axi_probe, added for RP1 support, lacks the rs485 additions that
appeared during its development.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
tty/serial: pl011: restrict RX burst FIFO threshold
If the associated DMA controller has lower burst length support than the
level the FIFO is set to, then bytes will be left in the RX FIFO at the
end of a DMA block - requiring a round-trip through the timeout interrupt
handler rather than an end-of-block DMA interrupt.
Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
tty/serial: pl011: Also unregister pl011_axi_platform_driver
See: https://github.com/raspberrypi/linux/issues/6379
Signed-off-by: Dom Cobley <popcornmix@gmail.com>
dwc3 allocates scratch and event buffers in the top-level driver. Hack the
probe function to set the DMA mask before trying to allocate these.
I think the event buffers are only used in device mode, but the scratch
buffers may be used if core hibernation is enabled.
usb: dwc3: add support for new DT quirks
Apply the optional axi-pipe-limit and dis-in-autoretry-quirk properties
during driver probe.
Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
phy: phy-brcm-usb: Add 2712 support
usb: dwc3: if the host controller instance number is present in DT, use it
If two instances of a dwc3 host controller are specified in devicetree,
then the probe order may be arbitrary which results in the device names
swapping on a per-boot basis.
If a "usb" alias with the instance number is specified, then use
that to construct the device name instead of autogenerating one.
Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
rp1 dwc3 changes
drivers: usb: dwc3: allow setting GTXTHRCFG on dwc_usb3.0 hardware
Equivalent register fields exist in the SuperSpeed Host version of the
hardware, so allow the use of TX thresholds if specified in devicetree.
Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
drivers: usb: dwc3: remove downstream quirk dis-in-autoretry
Upstream have unilaterally disabled the feature.
Partially reverts 6e9142a26ee0fdc3a5adc49ed6cedc0b16ec2ed1 (downstream)
Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
macb: Add device tree properties that allow configuration of the AXI max pipeline register
net: macb: add support for ethtool interrupt moderation configuration
Only global throttling of rx or tx by time quanta is supported.
Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
macb: add platform device shutdown function. Prevents AXI master over PCIE from hanging when the host is rebooted.
net: macb: increase polling interval for MDIO completion
MDIO is a slow bus (single-digit MHz). Polling at 1us intervals
is a bit aggressive, so increase to 100us as the transaction
usually takes 100-200us to complete.
Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
net: macb: Several patches for RP1
64-bit RX fix
Also set DMA coherent mask
Add device tree properties that allow configuration of the AXI max
pipeline register
Add support for ethtool interrupt moderation configuration
Only global throttling of rx or tx by time quanta is supported.
Add platform device shutdown function. Prevents AXI master over PCIE
from hanging when the host is rebooted.
Increase polling interval for MDIO completion
MDIO is a slow bus (single-digit MHz). Polling at 1us intervals
is a bit aggressive, so increase to 100us as the transaction
usually takes 100-200us to complete.
Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
net: macb: Support the phy-reset-gpios property
Allow a PHY to be reset with an optional GPIO. The reset duration can
be specified in milliseconds - the default is 10ms.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
drivers: net: macb: close device on driver shutdown
Fix some suspicious locking and instead call into macb_close, which
deregisters and frees all resources the corresponding macb_open
claimed.
Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
net: macb: add hack to prevent TX stalls in a quiet system
See https://github.com/raspberrypi/linux-2712/issues/89
There is some critical window during TX where a further write to the
TSTART bit while TX is active does not cause newly queued TX descriptors
to be consumed.
For now "wait a bit, then try anyway" seems to work.
Requires further investigation, but this unsticks NFS reliably.
Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
net: macb: set default interrupt moderation for GEM hardware
Defaulting to intmod = 0 is antisocial, as the MAC can generate over
130,000 interrupts per second. 50us is a sensible default.
Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
reset_control_reset should not be used with shared reset controllers.
Add support for reset_control_assert and _deassert to get the desired
behaviour and avoid ugly warnings in the kernel log.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
BCM2712 has an SD Express capable SDHCI implementation and uses
the SDIO CFG register block present on other STB chips.
Add plumbing for SD Express handover and BCM2712-specific functions.
Due to the common bus infrastructure between BCM2711 and BCM2712,
the driver also needs to implement 32-bit IO accessors.
mmc: brcmstb: override card presence if broken-cd is set
Not just if the card is declared as nonremovable.
sdhci: brcmstb: align SD express switchover with SD spec v8.00
Part 1 of the Physical specification, figure 3-24, details the switch
sequence for cards initially probed as SD. Add a missing check for DAT2
level after switching VDD2 on.
sdhci: brcmstb: clean up SD Express probe and error handling
Refactor to avoid spurious error messages in dmesg if the requisite SD
Express DT nodes aren't present.
Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
mmc: sdhci-brcmstb: only use the delay line PHY for tuneable speeds
The MMC core has a 200MHz core clock which allows the use of DDR50 and
below without incremental phase tuning. SDR50/SDR104 and the EMMC HS200
speeds require tuning.
Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
mmc: sdhci-brcmstb: remove 32-bit accessors for BCM2712
The reason for adding these are lost to the mists of time (and for a
previous chip revision). Removing these accessors appears to have no ill
effect on production chips, so get rid of the unnecessary RMW cycles.
Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
drivers: mmc: sdhci-brcmstb: fix usage of SD_PIN_SEL on BCM2712
The SDIO_CFG register SD_PIN_SEL conflates two settings - whether eMMC
HS or SD UHS timings are applied to the interface, and whether or not
the card-detect line is functional. SD_PIN_SEL can only be changed when
the SD clock isn't running, so add a bcm2712-specific clock setup.
Toggling SD_PIN_SEL at runtime means the integrated card-detect feature
can't be used, so this controller needs a cd-gpios property.
Also fix conditionals for usage of the delay-line PHY - no-1-8-v will
imply no bits set in hsemmc_mask or uhs_mask, so remove it.
Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
drivers: sdhci-brcmstb: set CQE timer clock frequency
CQHCI keeps track of tags in flight with internal timers, so the clock
frequency driving the timer needs to be specified. The config registers
default to 0 (100kHz) which means timeouts will be significantly shorter
than they should be. Assume the timer clock comes from the controller
base clock.
Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
pinctrl: bcm2712: Reject invalid pulls
Reject attempts to set pulls on aon-sgpios, and fix pull shift
values.
pinctrl: bcm2712: Add 7712 support, fix 2712 count
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
pinctrl-bcm2712: add EMMC pins so pulls can be set
These pins have pad controls but not mux controls. They look enough like
GPIOs to squeeze in at the end of the list though.
pinctrl: bcm2712: correct BCM2712C0 AON_GPIO pad pull control offset
Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
pinctrl: bcm2712: on C0 the regular GPIO pad control register moves too
Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
pinctrl: bcm2712: Implement (partially) pinconf_get
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
pinctrl: bcm2712: Convert to generic pinconf
Remove the legacy brcm,* pin configuration support and replace it with
a proper generic pinconf interface, using named functions instead of
alt function numbers. This is nicer for users, less error-prone, and
immune to some of the C0->D0 changes.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
pinctrl: bcm2712: Remove vestigial pull parameter
Now the legacy brcm, pinconf parameters are no longer supported, this
custom pin config parameter is not needed.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
pinctrl: bcm2712: Guard against bad func numbers
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
pinctrl: bcm2712: A better attempt at D0 support
The BCM2712D0 sparse pinctrl maps play havoc with the old GPIO_REGS
macro, so make the bit positions explicit. And delete the unwanted
GPIO and pinmux declarations on D0.
Note that a Pi 5 with D0 requires a separate DTS file with "bcm2712d0"
compatible strings.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
pinctrl: bcm2712: Delete base register constants
BCM2712D0 deletes many GPIOs and their associated mux and pad bits,
so much so that the offsets to the start of the pad control registers
changes. Remove the constant offsets from the *GPIO_REGS macros,
compensating by adjusting the per-GPIO values.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
pinctrl: bcm2712: Fix for sparse GPIOs
BCM2712D0's sparse GPIO map revealed that it is not safe to treat
group_selector as the GPIO number - it is an index into the array of
pinctrl_pin_descs, and the "number" member says which GPIO it refers to.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
pinctrl: bcm2712: Fix for the first valid GPIO
A non-zero mux bit number is used to detect a valid entry in the
pin_regs tables, but GPIO 0 (GPIO 1 on D0) is a valid GPIO with a mux
bit number of zero, so add a high-bit on all valid entries to
distinguish this from an uninitialised row in the table.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
drivers: pinctrl: add BCM2712D0 EMMC pins
The pad control registers are concatenated onto the GPIO pad control
registers, as with previous steppings.
Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
gpio-brcmstb: Report the correct bank width
gpio: brcmstb: Use bank address as gpiochip label
If the path to the device node is used as gpiochip label then
gpio-brcmstb instances with multiple banks end up with duplicated
names. Instead, use a combination of the driver name with the physical
address of the bank, which is both unique and helpful for devmem
debugging.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
gpio: mmio: Add DIRECT mode for shared access
The generic MMIO GPIO library uses shadow registers for efficiency,
but this breaks attempts by raspi-gpio to change other GPIOs in the
same bank. Add a DIRECT mode that makes fewer assumptions about the
existing register contents, but note that genuinely simultaneous
accesses are likely to lose updates.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
gpio: brcmstb: Don't always clear interrupt mask
If the GPIO controller is not being used as an interrupt source
leave the interrupt mask register alone. On BCM2712 it might be used
to generate interrupts to the VPU firmware, and on other devices it
doesn't matter since no interrupts will be generated.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
gpio: brcmstb: Use dynamic GPIO base numbers
Forcing a gpiochip to have a fixed base number now leads to a warning
message. Remove the need to do so by calculating hwirq numbers based
on bank numbers.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
Fixes: 3b0213d56e ("gpio: Add GPIO support for Broadcom STB SoCs")
For "Really Good Reasons" [1] the SPI core requires a match
between compatible device strings and the name in spi_device_id.
The ili9486 driver uses compatible strings "waveshare,rpi-lcd-35"
and "ozzmaker,piscreen", but "rpi-lcd-35" and "piscreen" are missing,
so add them.
Compatible string "ilitek,ili9486" is already used by
staging/fbtft/fb_ili9486, therefore leaving it present in ili9486 as an
spi_device_id causes the incorrect module to be loaded, therefore remove
this id.
[1] https://elixir.bootlin.com/linux/latest/source/drivers/spi/spi.c#L487
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
Loading the regulatory database from the debian files fails with
"loaded regulatory.db is malformed or signature is missing/invalid"
due to missing certificates. Add these debian-specific certificates
from upstream to fix this error. See #5536 for details.
The certificates have been imported as following:
patch -p1 <<<$(
curl https://salsa.debian.org/kernel-team/linux/-/raw/\
master/debian/patches/debian/\
wireless-add-debian-wireless-regdb-certificates.patch
)
Signed-off-by: Nicolai Buchwitz <n.buchwitz@kunbus.com>
The integrated USB2.0 hub in the VL805 chipset has a bug where it
incorrectly determines the remaining available frame time before the
host next sends a SOF packet with an incremented frame_number.
See the USB2.0 specification sections 11.3 and 11.14.2.3.
The hub's non-periodic TT handler can transmit the IN/OUT handshake
token too late, so a following 64-byte DATA0/1 packet causes the ACK
handshake to collide with the propagated SOF. This causes port babble.
Avoid ringing doorbells for vulnerable endpoints during uFrame 7 if the
TR is Idle to stop one source of babble. An IN transfer for a Running TR
may happen at any time, so there's not much we can do about that.
Ideally a hub firmware update to properly implement frame timeouts is
needed, and to avoid spinning for up to 125us when submitting TDs to
Idle rings.
Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
xhci: constrain XHCI_VLI_HUB_TT_QUIRK to old firmware versions
VLI have a firmware update for the VL805 which resolves the incorrect
frame time calculation in the hub's TT. Limit applying the quirk to
known-bad firmwares.
Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
The VL805 can cause data corruption if a SS Bulk OUT endpoint enters a
flow-control condition and there are TRBs in the transfer ring that are
not an integral size of wMaxPacket and the endpoint is behind one or more
hubs.
This is frequently the case encountered when FAT32 filesystems are
present on mass-storage devices with cluster sizes of 1 sector, and the
filesystem is being written to with an aggregate of small files.
The initial implementation of this quirk separated TRBs that didn't
adhere to this limitation into two - the first a multiple of wMaxPacket
and the second the 512-byte remainder - in an attempt to force TD
fragments to align with packet boundaries. This reduced the incidence
rate of data corruption but did not resolve it.
The fix as recommended by VIA is to disable bursts if this sequence of
TRBs can occur.
Limit turning off bursts to just USB mass-storage devices by searching
the device's configuration for an interface with a class type of
USB_CLASS_MASS_STORAGE.
Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
The VL805 controller can't cope with the TR Dequeue Pointer for an endpoint
being set to a Link TRB. The hardware-maintained endpoint context ends up
stuck at the address of the Link TRB, leading to erroneous ring expansion
events whenever the enqueue pointer wraps to the dequeue position.
If the search for the end of the current TD and ring cycle state lands on
a Link TRB, move to the next segment.
Link: https://github.com/raspberrypi/linux/issues/3919
[6.5.y Fixup - move downstream quirk bits further along]
Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
As of [1], using PPS_FETCH on a 64-bit ARM kernel with a 32-bit userland
is broken, returning a timeout. This is because the requested 4-byte
alignment for struct pps_ktime_compat (illegal on arm64) results in the
timeout flags field being uninitialised.
Make the hack specific to X86_64 builds with CONFIG_COMPAT defined.
[1] commit c2a49fe8ee ("pps: fix padding issue with PPS_FETCH for
ioctl_compat")
See: https://github.com/raspberrypi/linux/issues/5430
Fixes: c2a49fe8ee ("pps: fix padding issue with PPS_FETCH for ioctl_compat")
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
Contrary to what struct snd_dmaengine_dai_dma_data suggests, the
configuration of addresses of DMA slave interfaces should be done in
CPU physical addresses.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
Contrary to what struct snd_dmaengine_dai_dma_data suggests, the
configuration of addresses of DMA slave interfaces should be done in
CPU physical addresses.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
It has been observed that edge events can be lost when GPIO edges occur
close to each other. Investigation suggests this is due to a hardware
bug, although no mechanism has been identified.
Work around the event loss by moving the IRQ acknowledgement into the
main ISR, adding missing events by explicit level-change detection.
See: https://forums.raspberrypi.com/viewtopic.php?t=350295
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
The kernel needs to recognise the default BDADDRs used by the Bluetooth
modems, so add a few more that we care about.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
The kernel Bluetooth framework understands that devices may not
be programmed with valid Bluetooth addresses. It also has the ability
to override a Bluetooth address with the value of the local-bd-address
DT property, but it ignores the validity of the existing address when
doing so.
Add a new boolean property, fallback-bd-address, which indicates that
the given local-bd-address property should only be used if the device
does not already have a valid BDADDR.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
The BCM2835 mini-UART has no modem status interrupt, causing all
transmission to stop, never to use, if a speed change ever happens
while the CTS signal is high.
Add a simple polling mechanism in order to allow recovery in that
situation.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
Since [1], the fbdev deferred IO framework is careful to cancel
pending updates on close to prevent dirty pages being accessed after
they may have been reused. However, this is not necessary in the case
that the pagelist is empty, and drivers that don't make use of the
pagelist may have wanted updates cancelled for no good reason.
Avoid penalising fbdev drivers that don't make use of the pagelist by
making the cancelling of deferred IO on close conditional on there
being a non-empty pagelist.
See: https://github.com/raspberrypi/linux/issues/5398
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
[1] 3efc61d952 ("fbdev: Fix invalid page access after closing deferred I/O devices")
While waiting for random data, use sleeps that are proportional
to the amount of data expected. Prevent indefinite waits by
giving up if nothing is received for a second.
See: https://github.com/raspberrypi/linux/issues/5390
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
The imx708 is a 12MP MIPI sensor with a 16:9 aspect ratio, here using
two CSI-2 lanes. It is a "quad Bayer" sensor with all 3 modes offering
10-bit output:
12MP: 4608x2592 up to 14.35fps (full FoV)
1080p: 2304x1296 up to 56.02fps (full FoV)
720p: 1536x864 up to 120.12fps (cropped)
This imx708 sensor driver is based heavily on the imx477 driver and
has been tested on the Raspberry Pi platform using libcamera.
Signed-off-by: Nick Hollinghurst <nick.hollinghurst@raspberrypi.com>
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
drivers: media: imx708: Enable long exposure mode
Enable long exposure modes by using the long exposure shift register setting
in the imx708 sensor.
Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
drivers: media: i2c: imx708: Fix crop information
The 1536x864 mode contained incorrect crop information.
Signed-off-by: David Plowman <david.plowman@raspberrypi.com>
drivers: media: i2c: imx708: Fix WIDE_DYNAMIC_RANGE control with long exposure
Setting V4L2_CID_WIDE_DYNAMIC_RANGE was causing the long exposure
shift count to be reset, which is incorrect if the user has already
changed the frame length to cause it to have a non-zero value.
Because it only updates control ranges and doesn't set any registers,
the control can also be applied when the sensor is not powered on.
Signed-off-by: David Plowman <david.plowman@raspberrypi.com>
drivers: media: imx708: Increase usable link frequencies
Add support for three different usable link frequencies (default 450Mhz,
447Mhz, and 453MHz) for the IMX708 camera sensor. The choice of
frequency is handled thorugh the "link-frequency" overlay parameter.
Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
drivers: media: imx708: Remove unused control fields
Remove unused and redundant control fields from the state structure.
Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
drivers: media: imx708: Tidy-ups to address upstream review comments
This commit addresses vaious tidy-ups requesed for upstreaming the
IMX708 driver. Notably:
- Remove #define IMX708_NUM_SUPPLIES and use ARRAY_SIZE() directly
- Use dev_err_probe where possible in imx708_probe()
- Fix error handling paths in imx708_probe()
Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
drivers: media: imx708: Follow the standard devicetree labels
Switch the system clock name from "xclk" to "inclk".
Use lower case lables for all regulator names.
Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
drives: media: imx708: Put HFLIP and VFLIP controls in a cluster
Create a cluster for the HVLIP and VFLIP controls so they are treated
as a single composite control.
Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
drivers: media: imx708: Adjust broken line correction parameter
In full-resolution mode, the LPF_INTENSITY_EN and LPF_INTENSITY
registers control Quad Bayer Re-mosaic broken line correction.
Expose this as a module parameter "qbc_adjust": zero disables
the correction and values in the range 2 to 5 set its strength.
There is a trade-off between coloured and monochrome patterns.
The previous fixed value 4 could produce ladder/spots artefacts
in coloured textures. The new default value 2 may suit a wider
range of scenes.
Signed-off-by: Nick Hollinghurst <nick.hollinghurst@raspberrypi.com>
media: i2c: imx708: Squash fixes
media: i2c: imx708: Fix lockdep issues.
The driver had a lockdep_assert_held in imx708_get_format_code,
but the calls from enum_mbus_code and enum_frame_size didn't take
the mutex.
Likewise imx708_set_framing_limits calling __v4l2_ctrl_modify_range
had a lockdep, but when going through the probe function the mutex
hadn't been taken.
Fix both cases.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
media: i2c: Tweak default PDAF gain table in imx708 driver
After analyzing more Raspberry Pi V3 cameras, adjust the
default PDAF shield-pixel gain tables (they can still be
overridden by camera OTP where programmed).
Signed-off-by: Nick Hollinghurst <nick.hollinghurst@raspberrypi.com>
Replace the existing imx708.yaml file with sony,imx708.yaml that follows
the latest devicetree conventions for camera sensors.
Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
Add YAML devicetree binding for IMX708 CMOS image sensor.
Let's also add a MAINTAINERS entry for the binding and driver.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
The power up/down sequence is already ramped. Extend this to
the first user movement as well, as this will generally avoid
the "tick" noises due to rapid movements and overshooting.
Subsequent movements are generally smaller and so don't cause
issues.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
Uses the regulator notifier framework so that the current
focus position will be restored whenever any user of the
regulator powers it up. This means that should the VCM
and sensor share a common regulator then starting the sensor
will automatically restore the default position. If they
have independent regulators then it will behave be powered
up when the VCM subdev is opened.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
The VCM driver will often be controlled via a regulator,
therefore add in the relevant DT hooks.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
The DW9817 is effectively the same as DW9807 from a programming
interface, however it drives +/-100mA instead of 0-100mA. This means
that the power on ramp needs to take the lens from the midpoint, and
power off return it there. It also changes the default position for
the module.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
The DW9817 is programmatically the same as DW9807, but
the output drive is a bi-directional -100 to +100mA
instead of 0-100mA.
Add the appropriate compativle string.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
On some switches, having EEE enabled causes the link to become
unstable. With this patch, adding 'genet.eee=N' to the kernel command
line will cause EEE to be disabled on the link.
See: https://github.com/raspberrypi/linux/issues/4289
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
As there isn't currently a defined mechanism for selecting an
external trigger mode on image sensors, copy the imx477
approach of using a module parameter to enable ext trig.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
Whilst the adv7180 driver support s_routing, nothing else
does, and there is a missing lump of framework code to
define the mapping from connectors on a board to the inputs
they represent on the ADV7180.
Add a nasty hack to take a module parameter that is passed in
to s_routing on any call to G_STD, or S_STD (or subdev
g_input_status call).
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>
There is no obligation for all source devices on a video-mux to
require the same bus configuration, so read the configuration
from the sink ports, and relay via get_mbus_config on the source
port.
If the sources support get_mbus_config, then call that first.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
The BCM54213PE identifier is a RPI-specific addition.
Populate the remainder of the driver functions, including the
required probe routine.
Add a version of bcm54xx_suspend, from upstream.
Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Also check for which HDMI devices are connected and only create
devices for those that are present.
Signed-off-by: James Hughes <james.hughes@raspberrypi.org>
Signed-off-by: Dom Cobley <popcornmix@gmail.com>
snd_bcm2835: disable HDMI audio when vc4 is used (#3640)
Things don't work too well when both the vc4 driver and the firmware
driver are trying to control the same audio output:
[ 763.569406] bcm2835_audio bcm2835_audio: vchi message timeout, msg=5
Hence, when the vc4 HDMI driver is used, let it control audio. This is done
by introducing a new device tree property to the audio node, and
extending the vc4-kms-v3d overlays to set it appropriately.
Signed-off-by: Hristo Venev <hristo@venev.name>
staging: bcm2835-audio: Add disable-headphones flag
Add a property to allow the headphone output to be disabled. Use an
integer property rather than a boolean so that an overlay can clear it.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
staging: bcm2835-audio: Find compatible firmware node
Commit "ARM: dts: Adopt the upstream snd_bcm2835 handling" removed the
audio section from the DT and the driver can no longer access the
referenced firmware node 'brcm,firmware'. Fix that by searching for a
compatible firmware node instead, similar to drivers/gpu/drm/vc4.
Fixes: b9e62329e096 ("ARM: dts: Adopt the upstream snd_bcm2835 handling")
Signed-off-by: Juerg Haefliger <juergh@proton.me>
staging: bcm2835-audio: Fix firmware node refcounting
Decrement firmware node refcounts on all exit paths in set_hdmi_enables().
Signed-off-by: Juerg Haefliger <juergh@proton.me>
staging: bcm2835-audio: Log errors in case of firmware query failures
The driver queries the firmware for the number of detected HDMI displays
and their IDs. Log error messages if queries fail.
Signed-off-by: Juerg Haefliger <juergh@proton.me>
staging: bcm2835-audio: Fix unused enable_hdmi module parameter
The commit "Add HDMI1 facility to the driver." made the enable_hdmi module
parameter unused. Fix that by making it a global switch for all available
HDMI audio outputs.
Fixes: 755f336608 ("Add HDMI1 facility to the driver.")
Signed-off-by: Juerg Haefliger <juergh@proton.me>
staging: bcm2835-audio: Fix unused enable_headphones module parameter
Since commit "staging: bcm2835-audio: Add disable-headphones flag" the
enabling/disabling of the headphones output is solely determined by the
presence of the DT property 'brcm,disable-headphones' and the
enable_headphones module parameter is unused. Fix that by making it a
global switch.
Fixes: ee90e47d88 ("staging: bcm2835-audio: Add disable-headphones flag")
Signed-off-by: Juerg Haefliger <juergh@proton.me>
Add a driver for the Arducam 64MP camera sensor.
Whilst the sensor supports 2 or 4 CSI2 data lanes, this driver
currently only supports 2 lanes.
The following Bayer modes are currently available:
9152x6944 10-bit @ 2.7fps
4624x3472 10-bit (binned) @ 10fps
3840x2160 10-bit (cropped/binned) @ 20fps
2312x1736 10-bit (binned) @ 30fps
1920x1080 10-bit (cropped/binned) @ 60fps
1280x720 10-bit (cropped/binned) @ 120fps
Signed-off-by: Lee Jackson <info@arducam.com>
media: i2c: arducam_64mp: Advertise embedded data node on media pad 1
This commit updates the arducam_64mp driver to adverise support for
embedded data streams.
The arducam_64mp sensor subdevice overloads the media pad to differentiate
between image stream (pad 0) and embedded data stream (pad 1) when
performing the v4l2_subdev_pad_ops functions.
Signed-off-by: Lee Jackson <info@arducam.com>
media: i2c: arducam_64mp: Modify the line length of 1280x720 resolution
Arducam 64MP has specific requirements for the line length, and if these
conditions are not met, the camera will not function properly. Under the
previous configuration, once a stream off operation is performed, the
camera will not output any data, even if a stream on operation is
performed. This prevents us from switching from 1280x720 to another
resolution.
Signed-off-by: Lee Jackson <lee.jackson@arducam.com>
media: i2c: arducam_64mp: Add 8000x6000 resolution
Added 8000x6000 10-bit (cropped) @ 3fps mode for Arducam 64MP
Signed-off-by: Lee Jackson <lee.jackson@arducam.com>
media: i2c: arducam_64mp: Add PDAF support
Enable PDAF output for all modes, and also need to modify Embedded Line
Width to 11560 * 3 (two lines of Embedded Data + one line of PDAF).
Signed-off-by: Lee Jackson <lee.jackson@arducam.com>
drivers: media: arducam_64mp: Add V4L2_CID_LINK_FREQ control
Add V4L2_CID_LINK_FREQ as a read-only control with a value of 456 Mhz.
This will be used by the CFE driver to corretly setup the DPHY timing
parameters in the CSI-2 block.
Signed-off-by: Lee Jackson <lee.jackson@arducam.com>
[ I would like to pursue fixing this more directly first before actually
merging this, but I thought I'd send this to the list now anyway as a
the "backup" plan. If I can't figure out how to make headway on the
main plan in the next few days, it'll be easy to just do this. ]
Stephen reported that a static key warning splat appears during early
boot on systems that credit randomness from device trees that contain an
"rng-seed" property, because because setup_machine_fdt() is called
before jump_label_init() during setup_arch():
static_key_enable_cpuslocked(): static key '0xffffffe51c6fcfc0' used before call to jump_label_init()
WARNING: CPU: 0 PID: 0 at kernel/jump_label.c:166 static_key_enable_cpuslocked+0xb0/0xb8
Modules linked in:
CPU: 0 PID: 0 Comm: swapper Not tainted 5.18.0+ #224 44b43e377bfc84bc99bb5ab885ff694984ee09ff
pstate: 600001c9 (nZCv dAIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : static_key_enable_cpuslocked+0xb0/0xb8
lr : static_key_enable_cpuslocked+0xb0/0xb8
sp : ffffffe51c393cf0
x29: ffffffe51c393cf0 x28: 000000008185054c x27: 00000000f1042f10
x26: 0000000000000000 x25: 00000000f10302b2 x24: 0000002513200000
x23: 0000002513200000 x22: ffffffe51c1c9000 x21: fffffffdfdc00000
x20: ffffffe51c2f0831 x19: ffffffe51c6fcfc0 x18: 00000000ffff1020
x17: 00000000e1e2ac90 x16: 00000000000000e0 x15: ffffffe51b710708
x14: 0000000000000066 x13: 0000000000000018 x12: 0000000000000000
x11: 0000000000000000 x10: 00000000ffffffff x9 : 0000000000000000
x8 : 0000000000000000 x7 : 61632065726f6665 x6 : 6220646573752027
x5 : ffffffe51c641d25 x4 : ffffffe51c13142c x3 : ffff0a00ffffff05
x2 : 40000000ffffe003 x1 : 00000000000001c0 x0 : 0000000000000065
Call trace:
static_key_enable_cpuslocked+0xb0/0xb8
static_key_enable+0x2c/0x40
crng_set_ready+0x24/0x30
execute_in_process_context+0x80/0x90
_credit_init_bits+0x100/0x154
add_bootloader_randomness+0x64/0x78
early_init_dt_scan_chosen+0x140/0x184
early_init_dt_scan_nodes+0x28/0x4c
early_init_dt_scan+0x40/0x44
setup_machine_fdt+0x7c/0x120
setup_arch+0x74/0x1d8
start_kernel+0x84/0x44c
__primary_switched+0xc0/0xc8
---[ end trace 0000000000000000 ]---
random: crng init done
Machine model: Google Lazor (rev1 - 2) with LTE
A trivial fix went in to address this on arm64, 73e2d827a5 ("arm64:
Initialize jump labels before setup_machine_fdt()"). But it appears that
fixing it on other platforms might not be so trivial. Instead, defer the
setting of the static branch until later in the boot process.
Fixes: f5bda35fba ("random: use static branch for crng_ready()")
Reported-by: Stephen Boyd <swboyd@chromium.org>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Phil Elwell <phil@raspberrypi.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
The driver had a number of issues, checkpatch warnings/errors,
and other limitations, so fix these up to make it usable.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
hwmon: emc2305: Add calls to initialise of cooling maps
Commit 46ef9d4ed2 ("hwmon: emc2305: fixups for driver submitted to
mailing lists") missed adding the call to thermal_of_cooling_device_register
required to configure any cooling maps for the device, hence stopping it
from actually ever changing speed.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
hwmon: emc2305: Change OF properties pwm-min & pwm-max to u8
There is no DT binding for emc2305 as mainline are still
discussing how to do a generic fan binding.
The 5.15 driver was reading the "emc2305," properties
"cooling-levels", "pwm-max", "pwm-min", and "pwm-channel" as u8.
The overlay was writing them as u16 (;) so it was working.
The 6.1 driver was reading as u32, which failed as there is
insufficient data.
As this is all downstream only, revert to u8 to match 5.15.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
It is quite common for the devm_thermal_zone_of_sensor_register
to need to defer, so avoid spamming the log by using
dev_err_probe instead of dev_err.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
Add a driver for the Arducam Pivariety series CSI2 camera sensor.
Signed-off-by: Lee Jackson <info@arducam.com>
SQUASH: Fix VIDEO_ARDUCAM_PIVARIETY Kconfig entry
The cherry-pick from rpi-5.17.y put it in the wrong section, and failed
to update it for 5.18.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
media: i2c: arducam-pivariety: Add custom controls
Add support for strobe_shift, strobe_width and mode custom controls.
Signed-off-by: Lee Jackson <info@arducam.com>
media: i2c: arducam-pivariety: Fix mutex init and NULL pointer
The mutex used in arducam-pivariety was not properly initialized,
which could lead to undefined behavior. This also caused a NULL
pointer dereference under certain conditions.
This patch ensures the mutex is correctly initialized during probe
and prevents NULL pointer dereferences.
Signed-off-by: Yuriy Pasichnyk <yurijpasichnyk11@gmail.com>
Add YAML device tree binding for Arducam Pivariety CMOS image sensor, and
the relevant MAINTAINERS entries.
Signed-off-by: Lee Jackson <info@arducam.com>
On some platforms the cma area can be half the entire system memory,
meaning that allocations start happening in the cma area immediately.
This leads to fragmentation and subsequent fatal cma_alloc failures.
We introduce an "alloc_in_cma_threshold" parameter which requires that
this many sixteenths of the free pages must be in cma before it will
try to use them. By default this is set to 12, but the previous
behaviour can be restored by setting it to 8 on startup.
Signed-off-by: David Plowman <david.plowman@raspberrypi.com>
Adds a driver for the Analog Devices AD5398 10 bit
I2C DAC which is commonly used for driving VCM lens
mechanisms.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
media: i2c: Rename ad5398 to ad5398_vcm
There's already a regulator module called ad5398 that exposes
this device through the regulator API. That is meaningless in
the terms that it uses and how it maps to V4L2, so a new driver
was added. However the module name collision wasn't noted, so
rename it now.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
Add a binding for Analog Devices AD5398 10bit current
sinking DAC when used as a lens VCM driver.
FIXME: Convert to YAML
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
Omnivision OV2311 is a CSI2 1600x1300 global shutter image sensor.
Add a driver for it.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
media: i2c: Update ov2311 Kconfig entry
Bring the OV2311 Kconfig declaration in line with upstream entries.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
media: i2c: ov2311: Fix uninitialized variable usage
Signed-off-by: Alexander Winkowski <dereference23@outlook.com>
"media: i2c: Remove .s_power() from ov7251" removed the call that
sent ov7251_global_init_setting to the sensor. Send it when starting
streaming.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
The sck-idle-input property indicates that the spi-gpio driver should
return the SCK line to an input when the chip select signals are
inactive.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
https://github.com/raspberrypi/linux/issues/4440
Upstream has added additional device specific controls, so the
V4L2_CID_USER_BASE + 0x10e0 value that had been defined for use with
the ISP has been taken by something else (and +0x10f0 has been used as
well)
Duplicate the use on V4L2_CID_USER_BASE + 0x10e0 so that userspace
(libcamera) doesn't need to change. Once the driver is upstream, then
we'll update libcamera to adopt the new value as it then won't change.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
Not all implementations wire up the enable GPIO and may just tie
it to a supply rail.
Make it optional.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
The I2C to the Atmel is very fussy, and locks up easily on
Pi0-3 particularly on reads.
The LCD power status is controlled solely by this driver, so
rather than reading it back from the Atmel, use the cached
status last set.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
regulator/rpi-panel: Power off display on shutdown
Adds a shutdown function to turn off the backlight, bridge, and
touch controller on shutdown.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
regulator/rpi-panel: Remove the ID read
Reading from the Atmel has always been troublesome due to
clock stretching, and the driver does nothing with it anyway.
Remove the read and assume that if the overlay has been
configured (most likely through the firmware autodetection)
that the hardware is present.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
As happens occasionally, an upstream change has once again prevented
spidev from being loaded via Device Tree. We now need "spidev" to be
included in the new spi_device_id list, otherwise although the
spidev driver gets loaded no /dev/spidev*.* entries will appear.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
An unwanted side effect of enabling the BRCMDBG config setting is
redefining brcmf_info to be brcmf_err. This can be alarming to users
and makes it harder to spot real errors, so don't do it.
See: https://github.com/raspberrypi/linux/issues/4663
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
For cases where spare PWM outputs are available, but are desired
to be addressed a standard outputs instead.
Wraps a PWM channel as a new GPIO chip with the one output.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
pwm: gpio-pwm: follow pwm_apply_might_sleep() rename
Fixes: 03286093be68("drivers/gpio: Add a driver that wraps the PWM API as a GPIO controller")
Signed-off-by: Ratchanan Srirattanamet <peathot@hotmail.com>
Some platforms include a fan-speed register that reports RPM directly
as an alternative to counting interrupts from the fan tachometer input.
Add support for reading a register at a given offset (rpm-offset) within
a block declared in another node (rpm-regmap). This indirection allows
the usual address mapping to be performed, and for address sharing with
another driver.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
Should we go through the timeout failure case with port_disable
not returning all buffers for whatever reason, the
buffers_with_vpu counter gets left at a non-zero value, which
will cause reference counting issues should the instance be
reused.
Reset the count when the port is enabled again, but before
any buffers have been sent to the VPU.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
The firmware can only use I2C0 if the kernel isn't, therefore
with libcamera and DRM using it the PoE HAT fan control needs
to move to the kernel.
Add the option for the driver to be created by the PoE HAT core
MFD driver, and use the I2C regmap that provides to control fan
functions.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
The Raspbery Pi PoE+ HAT exposes a fan controller and power
supply status reporting via a single I2C address.
Create an MFD device that allows loading of the relevant
sub-drivers, with a shared I2C regmap.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
Signed-off-by: Serge Schneider <serge@raspberrypi.com>
power: rpi-poe: Drop CURRENT_AVG as it is not hardware averaged
As documented the _AVG parameters are meant to be hardware
averaged, but the implementation for the PoE+ HAT was done in
software in the firmware.
Drop the property.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
power: rpi-poe: Add option of being created by MFD or FW
The firmware can only use I2C0 if the kernel isn't, therefore
with libcamera and DRM using it the PoE HAT fan control needs
to move to the kernel.
Add the option for the driver to be created by the PoE HAT core
MFD driver, and use the I2C regmap that provides to control fan
functions.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
As we're wanting to wrap the image_fx component for deinterlacing,
add the deinterlace algorithm values to enum mmal_parameter_imagefx
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
Adds enum mmal_interlace_type and struct
mmal_parameter_video_interlace_type to allow for querying the
interlacing mode on decoders.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
Bindings for the panel-dsi specific additions to panel-simple.
Allow for DSI specific bus settings and panel timing
to be define in devicetree. Very similar to panel-dpi.
Signed-off-by: Timon Skerutsch <kernel@diodes-delight.com>
The Geekworm MZP280 panel is a 480x640 (portrait) panel with a
capacitive touch interface and a 40 pin header meant to interface
directly with the Raspberry Pi. The screen is 2.8 inches diagonally,
and there appear to be at least 4 distinct versions all with the same
panel timings.
Timings were derived from drivers posted on the github located here:
https://github.com/tianyoujian/MZDPI/tree/master/vga
Additional details about this panel family can be found here:
https://wiki.geekworm.com/2.8_inch_Touch_Screen_for_Pi_zero
Signed-off-by: Chris Morgan <macromorgan@hotmail.com>
The regulator of the Waveshare DSI-TOUCH series panels is different.
Add a new driver for this regulator.
Signed-off-by: Waveshare_Team <support@waveshare.com>
drivers/regulator : Adjust power enable sequence
Avoid direct enabling of LCD power here.
Certain screens require immediate initialization after power-on
to prevent polarization.
Signed-off-by: Waveshare_Team <support@waveshare.com>
Waveshare sell a range of DSI panels of varying sizes, all
using a common MCU to control the panel and backlight.
Add a panel driver that supports these panels.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
drm/panel: waveshare: Fix up timings for 10.1" panel
The 10.1" panel doesn't work with the timings defined. vc4
will always have been fixing up the timing due to the limited
integer divider, so compute the fixed up mode and use it
directly.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
drivers/gpu/drm/panel:fix waveshare panel software restart/shutdown display is abnormal
Fixed the screen stays white when the user restarts or shuts down
Signed-off-by: eng33 <eng33@waveshare.com>
drivers/gpu/drm/panel:Modify the DSI mode to fix the problem that 7.9inch cannot be displayed
Signed-off-by: eng33 <eng33@waveshare.com>
drivers/gpu/drm/panel:Modified the timing of 11.9inch to fix the issue that 11.9inch was displayed abnormally
Signed-off-by: eng33 <eng33@waveshare.com>
Driver:add waveshare 4inch dsi lcd (C) driver
Signed-off-by: Eng33 <eng33@waveshare.net>
drivers:gpu:drm:panel: Added waveshare 5.0inch, 6.25inch, and 8.8inch dsi screen devices
Signed-off-by: eng33 <eng33@waveshare.com>
drm: panel: Added waveshare 13.3inch panel
Signed-off-by: eng33 <eng33@waveshare.com>
drm: panel: Added waveshare 7.0inch h dsi screen support
Signed-off-by: Waveshare_Team <support@waveshare.com>
drivers/gpu/drm/panel : Add the device for the Waveshare DSI-TOUCH series panels.
the driver are provided for the Waveshare DSI-TOUCH series panels,
modelled after the other Ilitek controller drivers,
but not limited to Ilitek series controllers.
The aim is to offer a more consistent operation and
experience for the Waveshare DSI-TOUCH series panels.
Signed-off-by: Waveshare_Team <support@waveshare.com>
drivers/gpu/drm/panel : Update display driver
1) Add LCD power control
2) Added support for:
A) Add support for 3.4-DSI-TOUCH-C
B) Add support for 4-DSI-TOUCH-C
C) Add support for 8-DSI-TOUCH-A
D) Add support for 9-DSI-TOUCH-B
E) Add support for 10.1-DSI-TOUCH-B
F) Add support for 12.3-DSI-TOUCH-A
Signed-off-by: Waveshare_Team <support@waveshare.com>
The Top DisplayOptoelectronics (TDO) T17B driver chip is used
in the TL040HDS20CT panel (found in the Pimoroni HyperPixel4
Square display) and potentially other displays.
The driver chip supports SPI for configuration and DPI
video data.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
This new panel uses the ILI9881C IC but needs an alternate
init sequence, and therefore requires a new compatible string.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
Signed-off-by: Penk Chen <penk@cutiepi.io>
drm/panel: ilitek-ili9881c: Clean up on mipi_dsi_attach failure
mipi_dsi_attach is allowed to fail, and currently the probe
code doesn't clean up (mainly drm_panel_remove) if this happens.
Add cleanup code on failure.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
drm/panel: panel-ilitek9881c: Add prepare_upstream_first flag
The panel sends MIPI DCS commands during prepare and is expecting
the bus to remain in LP-11 state in-between.
Set the prepare_upstream_first flag so that the upstream DSI host
is prepared / pre_enabled first, and therefore the bus is in a
defined state.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
drm/panel: panel-ilitek9881c: Use cansleep methods
Use cansleep version of gpiod_set_value so external IO drivers (like
via I2C) can be used.
Signed-off-by: Mark Williams <mwp@mwp.id.au>
drm/panel: panel-ilitek9881c: Crystalfontz support
Add support for Crystalfontz CFAF7201280A0-050Tx panel.
Signed-off-by: Mark Williams <mwp@mwp.id.au>
drm/panel: ilitek-ili9881c: Allow configuration of the number of lanes
Not all panels use all 4 data lanes, so allow configuration based
on the compatible string.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
drm/panel: ili9881: Add configuration for the new panels
Add configuration for the 5" and 7" Raspberry Pi 720x1280
DSI panels based on ili9881.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
drm: panel: ili9881: Correct symmetry on enable/disable return codes
ili9881c_enable is always returning 0.
ili9881c_disable was returning the error code from
mipi_dsi_dcs_set_display_off.
If non-zero, the drm_panel framework will leave the panel marked as
enabled, and not run the enable hook next time around. That isn't
helpful, particularly as we're expecting unprepare to disable
resets and regulators.
Change ili9881c_disable to match enable in always returning 0.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
drm: panel: ili9881: Add option to reconfigure setup commands
The driver is typically asking for LP commands, but then tries
to send set_display_[on|off] from enable/disable when the host
will be in HS mode.
It also sends shutdown commands just before it asserts reset and
disables the regulator, which is rather redundant.
Add an option to configure these two choices from the panel_desc.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
Update panel-ilitek-ili9881c.c
Correcting display dimension typo
ILI9881C: Update timings for CFAF7201280A0-050TX
Update of ILI9881C CFAF7201280A0-050TX panel timing to work with the CM5.
Signed-off-by: Mark Williams <mark@crystalfontz.com>
drm/panel: ilitek-ili9881c: Restore lanes configuration for nwe080 panel
This config was missing with the forward porting of the rasp pi kernel to
6.12. Refer to https://github.com/raspberrypi/linux/issues/6856
Signed-off-by: Jack O'Brien <obri.jack.02@gmail.com>
Signed-off-by: Penk Chen <penk@cutiepi.io>
drm/panel: ilitek-ili9881c: Clean up on mipi_dsi_attach failure
mipi_dsi_attach is allowed to fail, and currently the probe
code doesn't clean up (mainly drm_panel_remove) if this happens.
Add cleanup code on failure.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
drm/panel: panel-ilitek9881c: Add prepare_upstream_first flag
The panel sends MIPI DCS commands during prepare and is expecting
the bus to remain in LP-11 state in-between.
Set the prepare_upstream_first flag so that the upstream DSI host
is prepared / pre_enabled first, and therefore the bus is in a
defined state.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
drm/panel: panel-ilitek9881c: Use cansleep methods
Use cansleep version of gpiod_set_value so external IO drivers (like
via I2C) can be used.
Signed-off-by: Mark Williams <mwp@mwp.id.au>
drm/panel: panel-ilitek9881c: Crystalfontz support
Add support for Crystalfontz CFAF7201280A0-050Tx panel.
Signed-off-by: Mark Williams <mwp@mwp.id.au>
drm/panel: ilitek-ili9881c: Allow configuration of the number of lanes
Not all panels use all 4 data lanes, so allow configuration based
on the compatible string.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
drm/panel: ili9881: Add configuration for the new panels
Add configuration for the 5" and 7" Raspberry Pi 720x1280
DSI panels based on ili9881.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
drm: panel: ili9881: Correct symmetry on enable/disable return codes
ili9881c_enable is always returning 0.
ili9881c_disable was returning the error code from
mipi_dsi_dcs_set_display_off.
If non-zero, the drm_panel framework will leave the panel marked as
enabled, and not run the enable hook next time around. That isn't
helpful, particularly as we're expecting unprepare to disable
resets and regulators.
Change ili9881c_disable to match enable in always returning 0.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
drm: panel: ili9881: Add option to reconfigure setup commands
The driver is typically asking for LP commands, but then tries
to send set_display_[on|off] from enable/disable when the host
will be in HS mode.
It also sends shutdown commands just before it asserts reset and
disables the regulator, which is rather redundant.
Add an option to configure these two choices from the panel_desc.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
Update panel-ilitek-ili9881c.c
Correcting display dimension typo
ILI9881C: Update timings for CFAF7201280A0-050TX
Update of ILI9881C CFAF7201280A0-050TX panel timing to work with the CM5.
Signed-off-by: Mark Williams <mark@crystalfontz.com>
drm/panel: ilitek-ili9881c: Restore lanes configuration for nwe080 panel
This config was missing with the forward porting of the rasp pi kernel to
6.12. Refer to https://github.com/raspberrypi/linux/issues/6856
Signed-off-by: Jack O'Brien <obri.jack.02@gmail.com>
There is no reason why the control GPIOs for the panel can not
be connected to I2C or similar GPIO interfaces that may need to
sleep, therefore switch from gpiod_set_value to
gpiod_set_value_cansleep calls to configure them.
Without that you get complaints from gpiolib every time the state
is changed.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
drm: panel: jdi-lt070me05000: Add prepare_upstream_first flag
The panel driver wants to send DCS commands from the prepare
hook, therefore the DSI host wants to be pre_enabled first.
Set the flag to achieve this.
https://forums.raspberrypi.com/viewtopic.php?t=354708
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
The Raspberry Pi 7" 800x480 panel uses a Toshiba TC358762 DSI
to DPI bridge chip, so there is a requirement for the timings
to be specified for the end panel. Add such a definition.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
drm/panel-simple: Populate bpc when using panel-dpi
panel-dpi doesn't know the bit depth, so in the same way that
DPI is guessed for the connector type, guess that it'll be 8bpc.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
drm/panel-simple: Allow the bus format to be read from DT for panel-dpi
The "panel-dpi" compatible string configures panel from device tree,
but it doesn't provide any way of configuring the bus format (colour
representation), nor does it populate it.
Add a DT parameter "bus-format" that allows the MEDIA_BUS_FMT_xxx value
to be specified from device tree.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
drm/panel: simple: add Geekworm MZP280 Panel
Add support for the Geekworm MZP280 Panel
Signed-off-by: Chris Morgan <macromorgan@hotmail.com>
Acked-by: Maxime Ripard <maxime@cerno.tech>
drm/panel: simple: Add Innolux AT056tN53V1 5.6" VGA
Add support for the Innolux AT056tN53V1 5.6" VGA (640x480) TFT LCD
panel.
Signed-off-by: Joerg Quinten <aBUGSworstnightmare@gmail.com>
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
drm/panel: simple: Alter the timing for the Pi 7" DSI display
vc4 has always fixed up the timing, so the values defined have
never actually appeared on the wire.
The display appears to want a slightly longer HFP, so extend
the timings and recompute the clock to give the same frame rate.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
drm/panel: add panel-dsi
Equivalent to panel-dpi for configuring a simple DSI panel with
device tree side timings and bus settings.
Motiviation is the same as for panel-dpi of wanting to support
new simple panels without needing to patch the kernel.
Signed-off-by: Timon Skerutsch <kernel@diodes-delight.com>
drm/panel-simple: Remove custom handling of orientation
The framework now handles reading orientation from DT, therefore
remove the custom get_orientation hook from panel-simple.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
drm/panel-simple: Fix 7inch panel mode for misalignment
The 7inch panel is one line off the screen both horizontally
and vertically.
Alter the panel mode to correct this.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
drm/panel-simple: Increase pixel clock on Pi 7inch panel
The Toshiba bridge is very fussy and doesn't like the CM3
output when being told to produce a 27.777MHz pixel clock, which
is an almost perfect match to the DSI link integer divider.
Increasing to 30MHz will switch the DSI link from 333MHz to 400MHz
and makes the bridge happy with the same video timing as works
on Pi4.
(Pi4 will be using a link frequency of 375MHz due to a 3GHz
parent PLL).
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
We now have the hardware I2C controller pinmuxed to the drive the
display I2C, but this controller does not support clock stretching.
The Atmel micro-controller in the panel requires clock stretching
to allow it to prepare any data to be read.
Split the rpi_touchscreen_i2c_read into two independent transactions with
a delay between them for the Atmel to prepare the data.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
drm/panel/raspberrypi-ts: Insert delay before polling for startup state
In switching to the hardware I2C controller there is an issue
where we seem to not get back the correct state from the Pi
touchscreen.
Insert a delay before polling to avoid this condition.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
drm/panel/raspberrypi-touchscreen: Handle I2C errors.
rpi_touchscreen_i2c_read returns any errors from i2c_transfer,
or the 8 bit received value.
Check for error values before trying to process the data as
valid.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
drm/panel/raspberrypi-touchscreen: Insert more delays.
This avoids failures in cases where the panel is enabled
or re-probed very soon after being disabled or probed.
These can occur because the Atmel device can mis-behave
over I2C for a few ms after any write to the POWERON register.
Add call to v4l2_ctrl_new_fwnode_properties to read and
create the fwnode based controls.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
Whilst the hardware can't achieve the limits of level 4.2 under
all situations, it can exceed level 4.0.
Allow selection of levels 4.1 and 4.2.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
The Adafruit Mini-PiTFT13 display needs offsets applying when rotated,
so use the "variant" mechanism to select a custom set_addr_win method
using a dedicated compatible string of "fbtft,minipitft13".
See: https://github.com/raspberrypi/firmware/issues/1524
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
DMABUFs are all handled by videobuf2, so there is no reason not
to enable support for them.
Note that this driver is still using the vmalloc allocator, so
the buffers it allocates will not be compatible with the codec
or ISP driver that require contiguous buffers. However this
driver should be able to import the buffers allocated by them.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
In order to simplify the driver slightly, use the same PLL
configuration, and hence pixel rate and link frequency (to be
added) for the full, 1080p, and binned modes.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
There are many registers in common between all the modes.
Pull those out into one common table.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
To make comparisons of the mode registers easier, put the registers
for the binned and VGA modes in the same order as the others.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
The driver did expose V4L2_CID_HBLANK, but as a READ_ONLY control.
The sensor only uses the HTS register to control the line length,
so convert this control to read/write, with the appropriate ranges.
Adopt the old fixed values as the minimum values permitted in each
mode to avoid issues of it not streaming.
This should allow exposure times up to ~3 seconds (up from ~1sec).
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
v4l2_async_register_subdev doesn't bind in lens or flash drivers,
but v4l2_async_register_subdev_sensor does.
Switch to using v4l2_async_register_subdev_sensor.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
The driver supported using GPIOs to control the shutdown line,
but no regulator control.
Add regulator hooks.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
Add these missing V4L2 controls. Tested binned and full resolution
modes in all four orientations using Raspberry Pi running libcamera.
Signed-off-by: David Plowman <david.plowman@raspberrypi.com>
Fixes the following v4l2-compliance failure:
fail: v4l2-test-controls.cpp(871): subscribe event for control 'User Controls' failed test
Signed-off-by: David Plowman <david.plowman@raspberrypi.com>
Trial and error reveals that the minimum vblank value appears to be 24
(the OV5647 data sheet does not give any clues). This fixes streaming
lock-ups in full resolution mode.
Fixes: 9b5a5ebedc ("media: i2c: ov5647: Add support for V4L2_CID_VBLANK")
Signed-off-by: David Plowman <david.plowman@raspberrypi.com>
The top offset in the pixel array is actually 6 (see page 3-1 of the
OV5647 data sheet).
Fixes: f2f7ad5ce5 ("media: i2c: ov5647: Selection compliance fixes")
Signed-off-by: David Plowman <david.plowman@raspberrypi.com>
Parse device properties and register controls for them using the V4L2
fwnode properties helpers.
Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Previously they were returning positive non-zero codes for success,
which were getting passed up the call stack. Since release 4.19,
do_dentry_open (fs/open.c) has been catching these and flagging an
error. (So this driver has been broken since that date.)
Fixes: 3c2472a [media] media: i2c: Add support for OV5647 sensor
Signed-off-by: David Plowman <david.plowman@raspberrypi.org>
Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
The kernel modules aes-neon-blk and aes-neon-bs perform poorly, at least on
Cortex-A72 without crypto extensions. In fact, aes-arm64 outperforms them
on benchmarks, despite it being a simpler implementation (only accelerating
the single-block AES cipher).
For modes of operation where multiple cipher blocks can be processed in
parallel, aes-neon-bs outperforms aes-neon-blk by around 60-70% and aes-arm64
is another 10-20% faster still. But the difference is even more marked with
modes of operation with dependencies between neighbouring blocks, such as
CBC encryption, which defeat parallelism: in these cases, aes-arm64 is
typically around 250% faster than either aes-neon-blk or aes-neon-bs.
The key trade-off with aes-arm64 is that the look-up tables are situated in
RAM. This leaves them potentially open to cache timing attacks. The two other
modules, by contrast, load the look-up tables into NEON registers and so are
able to perform in constant time.
This patch aims to load aes-arm64 more often.
If none of the currently-loaded crypto modules implement a given algorithm,
a new one is typically selected for loading using a platform-neutral alias
describing the required algorithm. To enable users to still
load aes-neon-blk or aes-neon-bs if they really want them, while still
ensuring that aes-arm64 is usually selected, remove the aliases from
aes-neonbs-glue.c and aes-glue.c and apply them to aes-cipher-glue.c, but
still build the two NEON modules.
Since aes-glue.c can also be used to build aes-ce-blk, leave them enabled
if USE_V8_CRYPTO_EXTENSIONS is defined, to ensure they are selected if we
in future use a CPU which has the crypto extensions enabled.
Note that the algorithm priority specifiers are unchanged, so if
aes-neon-bs is loaded at the same time as aes-arm64, the former will be
used in preference. However, aes-neon-blk and aes-arm64 have tied priority,
so whichever module was loaded first will be used (assuming aes-neon-bs is
not loaded).
Signed-off-by: Ben Avison <bavison@riscosopen.org>
A relatively recent commit ([1]) contained optimisation for the PIO
SPI FIFO-filling functions. The commit message includes the phrase
"[t]he blind and counted loops are always called with nonzero count".
This is technically true, but it is still possible for count to become
zero before the loop is entered - if tfr->len is zero. Moving the loop
exit condition to the end of the loop saves a few cycles, but results
in a near-infinite loop should the revised count be zero on entry.
Strangely, zero-lengthed transfers aren't filtered by the SPI framework
and, even more strangely, the Python3 spidev library is triggering them
for no obvious reason.
Avoid the problem completely by bailing out of the main transfer
function early if trf->len is zero, although there may be a case for
moving the mitigation into the framework.
See: https://github.com/raspberrypi/linux/issues/4100
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
[1] 26751de25d ("spi: bcm2835: Micro-optimise FIFO loops")
Support has been added for the unpacked (16bpp) versions of
the MIPI raw 10, 12, and 14 formats, so add the 4CCs for them.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
DSI1 on BCM2711 doesn't require the DMA workaround that is used
on BCM2835/6/7, therefore it needs a new compatible string.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
Not all systems have the interrupt line wired up, so switch to
polling the touchscreen off a timer if no interrupt line is
configured.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
input: edt-ft5x06: Handle unreliable TOUCH_UP events
The ft5x06 is unreliable in sending touch up events, so some
touch IDs can become stuck in the detected state.
Ensure that IDs that are unreported by the controller are
released.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
input: edt-ft5x06: Only look at the number of points reported
Register 0x02 in the FT5x06 is TD_STATUS containing the number
of valid touch points being reported.
Iterate over that number of points rather than all that are
supported on the device.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
input: edt-ft5x06: Only read data for number of points reported
Rather than always reading the maximum number of points supported
by the chip (which may be as high as 10), read the number of
active points first, and read data for just those.
In most cases this will result in less data on the I2C bus,
with only the maximum touch points taking more due to a second
read that has to configure the start address.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
input: edt-ft5x06: Fix patch reading only the number of points reported
Fix bad conflict resolution from upstream updates. Need to read
from tsdata->tdata_offset bytes, not from tsdata->offset.
Also fix logging of i2c read errors to cover both transactions.
Fixes: 7216fcfe2e ("input: edt-ft5x06: Only read data for number of points reported")
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
Input: edt-ft54x6: Clean up timer and workqueue on remove
If no interrupt is defined then a timer and workqueue are used
to poll the controller.
On remove these were not being cleaned up correctly.
Fixes: ca61fdaba7 "Input: edt-ft5x06: Poll the device if no interrupt is
configured."
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
input: touchscreen: edt-ft5x06: Suppress bogus data on startup
When polled without the use of IRQ, FT5x06 registers may return
undefined initial data, causing unwanted touches or event spamming.
A simple way to filter this out is to suppress touches until the
TD_STATUS register changes for the first time.
Increase the delay before first polling to 300ms, to avoid
transient I2C read flakiness that seems to occur after reset.
Signed-off-by: Nick Hollinghurst <nick.hollinghurst@raspberrypi.com>
input: edt-ft5x06: Include I2C details in names for the devices
libinput uses the input device name alone. If you have two
identical input devices, then there is no way to differentiate
between them, and in the case of touchscreens that means no
way to associate them with the appropriate display device.
Add the I2C bus and address to the start of the input device
name so that the name is always unique within the system.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
input: edt-ft5x06: Correct prefix length in snprintf
snprintf takes the length of the array that we can print into,
and has to fit the NULL terminator in there too.
Printing the prefix is generally "12-3456 " which is 8 desired
characters (the length of EDT_NAME_PREFIX_LEN) and the NULL.
The space is therefore being truncated to fit the NULL in.
Increase the length snprintf is allowed to use.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
Define a new mailbox (SET_REBOOT_FLAGS) which may be used to
pass optional flags to the Raspberry Pi firmware that changes
the behaviour of the bootloader and firmware during a reboot.
Currently this just defines the 'tryboot' flag which causes
the firmware to load tryboot.txt instead config.txt. This
alternate configuration file can be used to specify the
path of an alternate firmware and kernels allowing a fallback
mechanism to be implemented for OS upgrades.
The gpio-fsm driver implements simple state machines that allow GPIOs
to be controlled in response to inputs from other GPIOs - real and
soft/virtual - and time delays. It can:
+ create dummy GPIOs for drivers that demand them,
+ drive multiple GPIOs from a single input, with optional delays,
+ add a debounce circuit to an input,
+ drive pattern sequences onto LEDs
etc.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
gpio-fsm: Fix a build warning
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
gpio-fsm: Rename 'num-soft-gpios' to avoid warning
As of 5.10, the Device Tree parser warns about properties that look
like references to "suppliers" of various services. "num-soft-gpios"
resembles a declaration of a GPIO called "num-soft", causing the value
to be interpreted as a phandle, the owner of which is checked for a
"#gpio-cells" property.
To avoid this warning, rename the gpio-fsm property to "num-swgpios".
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
gpio-fsm: Show state info in /sys/class/gpio-fsm
Add gpio-fsm sysfs entries under /sys/class/gpio-fsm. For each state
machine show the current state, which state (if any) will be entered
after a delay, and the current value of that delay.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
gpio-fsm: Fix shutdown timeout handling
The driver is intended to jump directly to a shutdown state in the
event of a timeout during shutdown, but the sense of the test was
inverted.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
gpio-fsm: Clamp the delay time to zero
The sysfs delay_ms value is calculated live, and it is possible for
the time left to appear to be negative briefly if the timer handling
hasn't completed. Ensure the displayed value never goes below zero,
for the sake of appearances.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
gpio-fsm: Sort functions into a more logical order
Move some functions into a more logical ordering. This change causes
no functional change and is essentially cosmetic.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
gpio_fsm: Rework the atomic-vs-non-atomic split
Partition the code to separate atomic and non-atomic methods so that
none of them have to handle both cases. The result avoids using deferred
work unless necessary, and should be easier to understand.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
Driver for the BCM2835 ISP hardware block. This driver uses the MMAL
component to program the ISP hardware through the VC firmware.
The ISP component can produce two video stream outputs, and Bayer
image statistics. This can't be encompassed in a simple V4L2
M2M device, so create a new device that registers 4 video nodes.
This patch squashes all the development patches from the earlier
rpi-5.4.y branch into one
Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
staging/bcm2835-isp: Add the unpacked (16bpp) raw formats
Now that the firmware supports the unpacked (16bpp) variants
of the MIPI raw formats, add the mappings.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
staging/bcm2835-isp: Log the number of excess supported formats
When logging that the firmware has provided more supported formats
than we had allocated storage for, log the number allocated and
returned.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
staging: vc04_services: ISP: Add colour denoise control
Add colour denoise control to the bcm2835 driver through a new v4l2
control: V4L2_CID_USER_BCM2835_ISP_CDN.
Add the accompanying MMAL configuration structure definitions as well.
Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
bcm2835-isp: Allow formats with different colour spaces.
Each supported format now includes a mask showing the allowed colour
spaces, as well as a default colour space for when one was not
specified.
Additionally we translate the colour space to mmal format and pass it
over to the VideoCore.
Signed-off-by: David Plowman <david.plowman@raspberrypi.com>
media: i2c: add ov9281 driver.
Change-Id: I7b77250bbc56d2f861450cf77271ad15f9b88ab1
Signed-off-by: Zefa Chen <zefa.chen@rock-chips.com>
media: i2c: ov9281: fix mclk issue when probe multiple camera.
Takes the ov9281 part only from the Rockchip's patch.
Change-Id: I30e833baf2c1bb07d6d87ddb3b00759ab45a90e4
Signed-off-by: Zefa Chen <zefa.chen@rock-chips.com>
media: i2c: ov9281: add enum_frame_interval function for iq tool 2.2 and hal3
Adds the ov9281 parts of the Rockchip patch adding enum_frame_interval to
a large number of drivers.
Change-Id: I03344cd6cf278dd7c18fce8e97479089ef185a5c
Signed-off-by: Zefa Chen <zefa.chen@rock-chips.com>
media: i2c: ov9281: Fixup for recent kernel releases, and remove custom code
The Rockchip driver was based on a 4.4 kernel, and had several custom
Rockchip parts.
Update to 5.4 kernel APIs, with the relevant controls required by
libcamera, and remove custom Rockchip parts.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
media: i2c: ov9281: Read chip ID via 2 reads
Vision Components have made an OV9281 module which blocks reading
back the majority of registers to comply with NDAs, and in doing
so doesn't allow auto-increment register reading as used when
reading the chip ID.
Use two reads and manually combine the results.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
media: i2c: ov9281: Add support for 8 bit readout
The sensor supports 8 bit mode as well as 10bit, so add the
relevant code to allow selection of this.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
media: ov9281: Add 1280x720 and 640x480 modes
Breaks out common register set and adds the different registers
for 1280x720 (cropped) and 640x480 (skipped) modes
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
Fixed picture line bug in all ov9281 modes
Signed-off-by: Mathias Anhalt <mathiasanhalt@web.de>
Added hflip and vflip controls to ov9281
Signed-off-by: Mathias Anhalt <mathiasanhalt@web.de>
media: i2c: ov9281: Remove override of subdev name
From the original Rockchip driver, the subdev was renamed
from the default to being "mov9281 <dev_name>" whereas the
default would have been "ov9281 <dev_name>".
Remove the override to drop back to the default rather than
a vendor custom string.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
media: v4l2-subdev: add subdev-wide state struct
Signed-off-by: Dom Cobley <popcornmix@gmail.com>
media: i2c: ov9281: Add fwnode properties controls
Add call to v4l2_ctrl_new_fwnode_properties to read and
create the fwnode based controls.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
media: i2c: ov9281: Sensor should report RAW color space
Tested on Raspberry Pi running libcamera.
Signed-off-by: David Plowman <david.plowman@raspberrypi.com>
Partial revert "media: i2c: add ov9281 driver."
This partially reverts commit 84e98e3a4f.
The commit had merged some changes to other drivers with adding the ov9281
driver. Only the ov9281 parts have been reverted.
staging/bcm2835-isp: Fix compiler warning
The result of dividing a u32 by a size_t is an unsigned int on arm32
and a long unsigned int on arm64. Use "%zu" (the size_t format) to
remove the build warning for 64-bit builds.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
staging: vc04_services: isp: Set the YUV420/YVU420 format stride to 64 bytes
The bcm2835 ISP requires the base address of all input/output planes to have 32
byte alignment. Using a Y stride of 32 bytes would not guarantee that the V
plane would fulfil this, e.g. a height of 650 lines would mean the V plane
buffer is not 32 byte aligned for YUV420 formats.
Having a Y stride of 64 bytes would ensure both U and V planes have a 32 byte
alignment, as the luma height will always be an even number of lines.
Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
vc04_services: isp: Report input node as wanting full range RAW color space
RAW color spaces are more usually reported as having full range
quantization.
Tested using libcamera.
Signed-off-by: David Plowman <david.plowman@raspberrypi.com>
drivers: bcm2835_isp: Allow multiple users for the ISP driver.
Add a second (identical) set of device nodes to allow concurrent use of the ISP
hardware by another user. This change effectively creates a second state
structure (struct bcm2835_isp_dev) to maintain independent state for the second
user. Node and media entity names are appened with the instance index
appropriately.
Further users can be added by changing the BCM2835_ISP_NUM_INSTANCES define.
Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
drivers: bcm2835_isp: Fix div by 0 bug.
Fix a possible division by 0 bug when setting up the mmal port for the stats
port.
Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
staging/bcm2835-isp: Fix cleanup after init fail
bcm2835_isp_remove is called on an initialisation failure, but at that
point the drvdata hasn't been set. This causes a crash when e.g. using
the cutdown firmware (gpu_mem=16).
Move platform_set_drvdata before the instance probing loop to avoid the
problem.
See: https://github.com/raspberrypi/linux/issues/4774
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
bcm2835-v4l2-isp: Add missing lock initialization
ISP device allocation is dynamic hence the locks too.
struct mutex queue_lock is not initialized which result in bug.
Fixing same by initializing it.
[ 29.847138] INFO: trying to register non-static key.
[ 29.847156] The code is fine but needs lockdep annotation, or maybe
[ 29.847159] you didn't initialize this object before use?
[ 29.847161] turning off the locking correctness validator.
[ 29.847167] CPU: 1 PID: 343 Comm: v4l_id Tainted: G C 5.15.11-rt24-v8+ #8
[ 29.847187] Hardware name: Raspberry Pi 4 Model B Rev 1.4 (DT)
[ 29.847194] Call trace:
[ 29.847197] dump_backtrace+0x0/0x1b8
[ 29.847227] show_stack+0x20/0x30
[ 29.847240] dump_stack_lvl+0x8c/0xb8
[ 29.847254] dump_stack+0x18/0x34
[ 29.847263] register_lock_class+0x494/0x4a0
[ 29.847278] __lock_acquire+0x80/0x1680
[ 29.847289] lock_acquire+0x214/0x3a0
[ 29.847300] mutex_lock_nested+0x70/0xc8
[ 29.847312] _vb2_fop_release+0x3c/0xa8 [videobuf2_v4l2]
[ 29.847346] vb2_fop_release+0x34/0x60 [videobuf2_v4l2]
[ 29.847367] v4l2_release+0xc8/0x108 [videodev]
[ 29.847453] __fput+0x8c/0x258
[ 29.847476] ____fput+0x18/0x28
[ 29.847487] task_work_run+0x98/0x180
[ 29.847502] do_notify_resume+0x228/0x3f8
[ 29.847515] el0_svc+0xec/0xf0
[ 29.847523] el0t_64_sync_handler+0x90/0xb8
[ 29.847531] el0t_64_sync+0x180/0x184
Signed-off-by: Padmanabha Srinivasaiah <treasure4paddy@gmail.com>
staging: vc04_services: isp: Permit all sRGB colour spaces on ISP outputs
ISP outputs actually support all colour spaces that are fundamentally
sRGB underneath, regardless of whether an RGB or YUV output format is
actually requested.
Signed-off-by: David Plowman <david.plowman@raspberrypi.com>
drivers: staging: bcm2835-isp: Do not cleanup mmal vcsm buffer on stop_streaming
On stop_streaming() the vcsm buffer handle gets released by the buffer cleanup
code. This will subsequently cause and error if userland re-queues the same
buffer on the next start_streaming() call.
Remove this cleanup code and rely on the vb2_ops->buf_cleanup() call to do the
cleanups instead.
Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
drivers: staging: bcm2835-isp: Clear LS table handle in the firmware
When all nodes have stopped streaming, ensure the firmware has released its
handle on the LS table dmabuf. This is done by passing a null handle in the
LS params.
Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
drivers: staging: bcm2835-isp: Respect caller's stride value
The stride value reported for output image buffers should be at least
as large as any value that was passed in by the caller (subject to
correct alignment for the pixel format). If the value is zero (meaning
no value was passed), or is too small, the minimum acceptable value
will be substituted.
Signed-off-by: David Plowman <david.plowman@raspberrypi.com>
staging: vc04_services: bcm2835-isp: Drop include Makefile directive
Drop the include directive. They can break the build, when one only
wants to build a subdirectory. Replace with "../" for the includes in
the bcm2835-isp instead.
The fix is equivalent to the four patches between 29d49a76c5
("staging: vc04_services: bcm2835-audio: Drop include Makefile
directive")...2529ca211402 ("staging: vc04_services: interface: Drop
include Makefile directive")
Fixes: c8f89c9551c1 ("staging: vc04_services: ISP: Add a more complex ISP processing component")
Suggested-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kieran Bingham <kieran.bingham@ideasonboard.com>
staging: vc04_services: bcm2835-v4l2-isp: Register with vchiq_bus_type
Register the bcm2835-v4l2-isp driver with the vchiq_bus_type instead of
using the platform driver/device.
Signed-off-by: Kieran Bingham <kieran.bingham@ideasonboard.com>
staging: vc04_services: bcm2835-v4l2-isp: Explicitly set DMA mask
The platform model originally handled the DMA mask. Now that
we are on the vchiq_bus we need to explicitly set this.
Signed-off-by: Kieran Bingham <kieran.bingham@ideasonboard.com>
drivers: media: bcm2835_isp: Cache LS table dmabuf
Clients such as libcamera do not change the LS table dmabuf on every
frame. In such cases instead of mapping/remapping the same dmabuf on
every frame to send to the firmware, cache the dmabuf once and only
update and remap if the dmabuf has been changed by the userland client.
Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
This file defines the userland interface to the bcm2835-isp driver
that will follow in a separate commit.
Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
uapi: bcm2835-isp: Add colour denoise configuration
Add a configuration structure for colour denoise to the bcm2835_isp
driver.
Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
This adds a V4L2 memory to memory device that wraps the MMAL
video decode and video_encode components for H264 and MJPEG encode
and decode, MPEG4, H263, and VP8 decode (and MPEG2 decode
if the appropriate licence has been purchased).
This patch squashes all the work done in developing the driver
on the Raspberry Pi rpi-5.4.y kernel branch.
Thanks to Kieran Bingham, Aman Gupta, Chen-Yu Tsai, and
Marek Behún for their contributions. Please refer to the
rpi-5.4.y branch for the full history.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
staging/bcm2835-codec: Ensure OUTPUT timestamps are always forwarded
The firmware by default tries to ensure that decoded frame
timestamps always increment. This is counter to the V4L2 API
which wants exactly the OUTPUT queue timestamps passed to the
CAPTURE queue buffers.
Disable the firmware option.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
staging/vc04_services/codec: Add support for CID MPEG_HEADER_MODE
Control V4L2_CID_MPEG_VIDEO_HEADER_MODE controls whether the encoder
is meant to emit the header bytes as a separate packet or with the
first encoded frame.
Add support for it.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
staging/vc04_services/codec: Clear last buf dequeued flag on START
It appears that the V4L2 M2M framework requires the driver to manually
call vb2_clear_last_buffer_dequeued on the CAPTURE queue during a
V4L2_DEC_CMD_START.
Add such a call.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
staging/vc04-services/codec: Fix logical precedence issue
Two issues identified with operator precedence in logical
expressions. Fix them.
https://github.com/raspberrypi/linux/issues/4040
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
vc04_services: bcm2835-codec: Switch to s32fract
staging/bcm2835-codec: Add the unpacked (16bpp) raw formats
Now that the firmware supports the unpacked (16bpp) variants
of the MIPI raw formats, add the mappings.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
staging/bcm2835-codec: Log the number of excess supported formats
When logging that the firmware has provided more supported formats
than we had allocated storage for, log the number allocated and
returned.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
staging/bcm2835-codec: Add support for pixel aspect ratio
If the format is detected by the driver and a V4L2_EVENT_SOURCE_CHANGE
event is generated, then pass on the pixel aspect ratio as well.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
staging/bcm2835-codec: Implement additional g_selection calls for decode
v4l_cropcap calls our vidioc_g_pixelaspect function to get the pixel
aspect ratio, but also calls g_selection for V4L2_SEL_TGT_CROP_BOUNDS
and V4L2_SEL_TGT_CROP_DEFAULT. Whilst it allows for vidioc_g_pixelaspect
not to be implemented, it doesn't allow for either of the other two.
Add in support for the additional selection targets.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
staging/bcm2835-codec: Add VC-1 support.
Providing the relevant licence has been purchased, then Pi0-3
can decode VC-1.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
staging/bcm2835-codec: Fix support for levels 4.1 and 4.2
The driver said it supported H264 levels 4.1 and 4.2, but
was missing the V4L2 to MMAL mappings.
Add in those mappings.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
staging/bcm2835-codec: Set the colourspace appropriately for RGB formats
Video decode supports YUV and RGB formats. YUV needs to report SMPTE170M
or REC709 appropriately, whilst RGB should report SRGB.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
staging/bcm2835-codec: Pass corrupt frame flag.
MMAL has the flag MMAL_BUFFER_HEADER_FLAG_CORRUPTED but that
wasn't being passed through, so add it.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
staging/bcm2835-codec: Do not update crop from S_FMT after res change
During decode, setting the CAPTURE queue format was setting the crop
rectangle to the requested height before aligning up the format to
cater for simple clients that weren't expecting to deal with cropping
and the SELECTION API.
This caused problems on some resolution change events if the client
didn't also then use the selection API.
Disable the crop update after a resolution change.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
bcm2835: Allow compressed frames to set sizeimage (#4386)
Allow the user to set sizeimage in TRY_FMT and S_FMT if the format
flags have V4L2_FMT_FLAG_COMPRESSED set
Signed-off-by: John Cox <jc@kynesim.co.uk>
staging/bcm2835-codec: Change the default codec res to 32x32
In order to effectively guarantee that a V4L2_EVENT_SOURCE_CHANGE
event occurs, adopt a default resolution of 32x32 so that it
is incredibly unlikely to be decoding a stream of that resolution
and therefore failing to note a "change" requiring the event.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
staging/bcm2835-codec: Add support for decoding interlaced streams
The video decoder can support decoding interlaced streams, so add
the required plumbing to signal this correctly.
The encoder and ISP do NOT support interlaced data, so trying to
configure an interlaced format on those nodes will be rejected.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
staging/bcm2835-codec: Correct ENUM_FRAMESIZES stepsize to 2
Being YUV420 formats, the step size is always 2 to avoid part
chroma subsampling.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
staging/bcm2835-codec: Return buffers to QUEUED not ERROR state
Should start_streaming fail, or buffers be queued during
stop_streaming, they should be returned to the core as QUEUED
and not (as currently) as ERROR.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
staging/bcm2835_codec: Log MMAL flags in hex
The flags is a bitmask, so it's far easier to interpret as hex
data instead of decimal.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
staging: bcm2835-codec: Allow custom specified strides/bytesperline.
If the client provides a bytesperline value in try_fmt/s_fmt then
validate it and correct if necessary.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
staging/bcm2835_codec: Add support for image_fx to deinterlace
Adds another /dev/video node wrapping image_fx doing deinterlace.
Co-developed-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
Signed-off-by: Dom Cobley <popcornmix@gmail.com>
staging/bcm2835-v4l2_codec: Fix for encode selection API
Matches correct behaviour from DECODE and DEINTERLACE
Signed-off-by: Dom Cobley <popcornmix@gmail.com>
staging: bcm2835-codec: Allow decode res changed before STREAMON(CAPTURE)
The V4L2 stateful video decoder API requires that you can STREAMON
on only the OUTPUT queue, feed in buffers, and wait for the
SOURCE_CHANGE event.
This requires that we enable the MMAL output port at the same time
as the input port, because the output port is the one that creates
the SOURCE_CHANGED event.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
staging/bcm2835-codec: Do not send buffers to the VPU unless streaming
With video decode we now enable both input and output ports on
the component. This means that buffers will get passed to the VPU
earlier than desired if they are queued befoer STREAMON.
Check that the queue is streaming before sending buffers to the VPU.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
staging: bcm2835-codec: Format changed should trigger drain
When a format changed event occurs, the spec says that it
triggers an implicit drain, and that needs to be signalled
via -EPIPE.
For BCM2835, the format changed event happens at the point
the format change occurs, so no further buffers exist from
before the resolution changed point. We therefore signal the
last buffer immediately.
We don't have a V4L2 available to us at this point, so set
the videobuf2 queue last_buffer_dequeued flag directly.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
staging: bcm2835-codec: Signal the firmware to stop on all changes
The firmware defaults to not stopping video decode if only the
pixel aspect ratio or colourspace change. V4L2 requires us
to stop decoding on any change, therefore tell the firmware
of the desire for this alternate behaviour.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
staging: bcm2835-codec: Queue flushed buffers instead of completing
When a buffer is returned on a port that is disabled, return it
to the videobuf2 QUEUED state instead of DONE which returns it
to the client.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
staging: bcm2835_codec: Correct flushing code for refcounting
Completions don't reference count, so setting the completion
on the first buffer returned and then not reinitialising it
means that the flush function doesn't behave as intended.
Signal the completion when the last buffer is returned.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
staging: bcm2835-codec: Ensure all ctrls are set on streamon
Currently the code was only setting some controls from
bcm2835_codec_set_ctrls, but it's simpler to use
v4l2_ctrl_handler_setup to avoid forgetting to adding new
controls to the list.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
staging: bcm2835-codec: Add support for H&V Flips to ISP
The ISP can do H & V flips whilst resizing or converting
the image, so expose that via V4L2_CID_[H|V]FLIP.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
bcm2835-v4l2-codec: Remove advertised support of VP8
The support for this format by firmware is very limited
and won't be faster than the arm.
Signed-off-by: Dom Cobley <popcornmix@gmail.com>
Pass V4L2_CID_MPEG_VIDEO_H264_MIN_QP/MAX_QP to bcm2835-v4l2-codec
Following raspberrypi/linux#4704. This is necessary to set up
quantization for variable bitrate to avoid video flickering.
staging/bcm2835-codec: bytesperline for YUV420/YVU420 needs to be 64
Matching https://github.com/raspberrypi/linux/pull/4419, the ISP
block (which is also used on the input of the encoder, and output
of the decoder) needs the base address of all planes to be aligned
to multiples of 32. This includes the chroma planes of YUV420 and
YVU420.
If the height is only a multiple of 2 (not 4), then you get an odd
number of lines in the second plane, which means the 3rd plane
starts at a multiple of bytesperline/2.
Set the minimum bytesperline alignment to 64 for those formats
so that the plane alignment is always right.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
staging/bcm2835-codec: Allow a different stride alignment per role
Deinterlace and decode aren't affected in the same way as encode
and ISP by the alignment requirement on 3 plane YUV420.
Decode would be affected, but it always aligns the height up to
a macroblock, and uses the selection API to reflect that.
Add in the facility to set the bytesperline alignment per role.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
staging: vc04_services: codec: Add support for V4L2_PIX_FMT_RGBA32 format
We already support V4L2_PIX_FMT_BGR32 which is the same thing with red
and blue swapped, so it makes sense to include this variant as well.
Signed-off-by: David Plowman <david.plowman@raspberrypi.com>
bcm2835-codec: Return empty buffers to the VPU instead of queueing to vbuf2
The encoder can skip frames totally should rate control overshoot
the target bitrate too far. In this situation it generates an
output buffer of length 0.
V4L2 treats a buffer of length 0 as an end of stream flag, which is
not appropriate in this case, therefore we can not return that buffer
to the client.
The driver was returning the buffer to videobuf2 in the QUEUED state,
however that buffer was then not dequeued again, so the number of
buffers was reduced each time this happened. In the pathological
case of using GStreamer's videotestsrc in mode 1 for noise, this happens
sufficiently frequently to totally stall the pipeline.
If the port is still enabled then return the buffer straight back to
the VPU rather than to videobuf2.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
vc04_services: bcm2835-codec: Add support for V4L2_PIX_FMT_NV12_COL128
V4L2_PIX_FMT_NV12_COL128 is supported by the ISP and the input of
video_encode, output of video_decode, and both input and output
of the ISP.
Add in the plumbing to support the format on those ports.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
vc04_services: bcm2835-codec: Set crop_height for compressed formats
In particular for the encoder where the CAPTURE format dictates
the parameters given to the codec we need to be able to set the
value passed as the crop_height for the compressed format.
There's no crop available for cropped modes, so always set
crop_height to the requested height.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
vc04_services: bcm2835-codec: Set port format from s_selection
s_selection allows the crop region of an uncompressed pixel
format to be specified, but it wasn't passing the setting on to
the firmware. Depending on call order this would potentially
mean that the crop wasn't actioned.
Set the port format on s_selection if we have a component created.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
bcm2835-codec: /dev/video31 as interface to image_encode JPEG encoder
Signed-off-by: Maxim Devaev <mdevaev@gmail.com>
bcm2835-v4l2-codec: support H.264 5.0 and 5.1 levels
vc04_services: bcm2835-codec: Remove redundant role check
vidioc_try_encoder_cmd checks the role, but the ioctl is disabled
for any roles in which it is invalid.
Remove the redundant check.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
vc04_services: bcm2835-codec: Allow encoder_cmd on ISP and deinterlace
ISP and deinterlace also need a mechanism for passing effectively
an EOS through the pipeline to signal when all buffers have been
processed.
VIDIOC_ENCODER_CMD does exactly this for encoders, so reuse the same
function for ISP and deinterlace.
(VIDIOC_DECODER_CMD is slightly different in that it also passes
details of when and how to stop, so is not as relevant).
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
vc04_services: bcm2835_codec: Allow larger images through the ISP
Whilst the codecs are restricted to 1920x1080 / 1080x1920, the ISP
isn't, but the limits advertised via V4L2 was 1920x1920 for all
roles.
Increase the limit to 16k x 16k for the ISP.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
media: bcm2835-v4l2-codec: Enable selection ioctl for ISP
The ISP cases do nothing. Remove the break that separates them from the
deinterlace case so they now do the same as deinterlace. This enables
simple width & height setting, but does not enable setting left and
top coordinates.
Signed-off-by: John Cox <jc@kynesim.co.uk>
media: bcm2835-v4l2-codec: Add profile & level ctrls to decode
In order to support discovery of what profile & levels are supported by
stateful decoders implement the profile and level controls where they
are defined by V4L2.
Signed-off-by: John Cox <jc@kynesim.co.uk>
vc04_services: bcm2835_codec: Ignore READ_ONLY ctrls in s_ctrl
In adding the MPEG2/MPEG4/H264 level and profile controls to
the decoder, they weren't declared as read-only, nor handlers
added to bcm2835_codec_s_ctrl. That resulted in an error message
"Invalid control" being logged every time v4l2_ctrl_handler_setup
was called from bcm2835_codec_create_component.
Define those controls as read only, and exit early from s_ctrl
on read only controls.
Fixes: "media: bcm2835-v4l2-codec: Add profile & level ctrls to decode"
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
vc04_services: bcm2835_codec: Set MPEG2_LEVEL control to READ_ONLY
V4L2_CID_MPEG_VIDEO_MPEG2_LEVEL was missed from
"vc04_services: bcm2835_codec: Ignore READ_ONLY ctrls in s_ctrl"
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
staging: bcm2835-codec: Add V4L2_CID_MPEG_VIDEO_B_FRAMES control
FFmpeg insists on trying to set V4L2_CID_MPEG_VIDEO_B_FRAMES to
0, and generates an error should it fail.
As our encoder doesn't support B frames, add a stub handler for
it to silence FFmpeg.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
staging: bcm2835-codec: Add support for V4L2_CID_MPEG_VIDEO_GOP_SIZE
For H264, V4L2_CID_MPEG_VIDEO_H264_I_PERIOD is meant to be the intra
I-frame period, whilst V4L2_CID_MPEG_VIDEO_GOP_SIZE is the intra IDR
frame period.
The firmware encoder doesn't produce I-frames that aren't IDR as well,
therefore V4L2_CID_MPEG_VIDEO_GOP_SIZE is technically the correct
control, however users may have adopted V4L2_CID_MPEG_VIDEO_H264_I_PERIOD.
Add support for V4L2_CID_MPEG_VIDEO_GOP_SIZE controlling the encoder,
and have VIDIOC_S_CTRL for V4L2_CID_MPEG_VIDEO_H264_I_PERIOD update
the value for V4L2_CID_MPEG_VIDEO_GOP_SIZE (the reverse is not
implemented).
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
staging: bcm2835-codec: Add missing alignment for V4L2_PIX_FMT_RGBA32
The patch adding image encode (JPEG) to the driver missed adding
the alignment constraint for V4L2_PIX_FMT_RGBA32, which meant
it ended up giving a stride and size of 0.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
staging: bcm2835-codec: Downgrade the level for a debug message
The debug message from bcm2835_codec_buf_prepare when the buffer
size is incorrect can be a little spammy if the application isn't
careful on how it drives it, therefore drop the priority of the
message.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
vc04_services: bcm2835-codec: Correct alignment requirements for YUYV
The firmware wants the YUYV format stride alignment to be to a multiple
of 32pixels / 64 bytes. The kernel driver was configuring it to a multiple
of 16 pixels / 32 bytes, which then failed when it tried starting to
stream.
Correct the alignment requirements.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
staging: bcm2835-codec: Fix up for 6.8 - use ignore_cap_streaming
Drops downstream patch to v4l2_mem2mem, and uses the new mainline
flag to achieve the same functionality
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
staging: bcm2835-codec: 32bpp RGB formats need a 64byte alignment
The firmware needs 16 pixel alignment on RGBx 32bpp formats, which
would be 64 byte. The driver was only setting 32byte alignment.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
staging: bcm2835_codec: Pass framerate to the component if set late
For video encoding, if the framerate was set after the component
was created, then it wasn't set correctly on the port, and an
old value was encoded in the bitstream.
Update the port status when the framerate is set.
https://github.com/raspberrypi/rpicam-apps/issues/664
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
staging: vc04_services: bcm2835-codec: Drop include Makefile directive
Drop the include directive. They can break the build, when one only
wants to build a subdirectory. Replace with "../" for the includes in
the bcm2835-v4l2-codec instead.
The fix is equivalent to the four patches between 29d49a76c5
("staging: vc04_services: bcm2835-audio: Drop include Makefile
directive")...2529ca211402 ("staging: vc04_services: interface: Drop
include Makefile directive")
Fixes: afaec52747 ("staging: vc04_services: Add a V4L2 M2M codec driver")
Suggested-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kieran Bingham <kieran.bingham@ideasonboard.com>
staging: vc04_services: bcm2835-v4l2-codec: Register with vchiq_bus_type
Signed-off-by: Kieran Bingham <kieran.bingham@ideasonboard.com>
staging: vc04_services: bcm2835-codec: Explicitly set DMA mask
The platform model originally handled the DMA mask. Now that
we are on the vchiq_bus we need to explicitly set this.
Signed-off-by: Kieran Bingham <kieran.bingham@ideasonboard.com>
staging: bcm2835-codec: Disable HEADER_ON_OPEN for video encode
Video encode can defer generating the header until the first
frame is presented, which allows it to take the colourspace
information from the frame rather than just the format.
Enable that for the V4L2 driver now that the firmware populates
all the parameters.
https://github.com/raspberrypi/firmware/issues/1885
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
staging: bcm2835-codec: Add support for H264 level 5.0 and 5.1
We do NOT claim to support decoding in real-time for these levels,
but can transcode some content, and handle 1920x1200.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
vc04_services: codec: Allocate the max number of buffers on the VPU
The VPU's API can't match the use of VIDIOC_CREATE_BUFS to add buffers
to the internal pool whilst a port is enabled, therefore allocate
the maximum number of buffers possible in V4L2 to avoid the issue.
As these are only buffer headers, the overhead is relatively small.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
vc04_services: vchiq-mmal: Add defines for mmal_es_format flags
There is a flags field in struct mmal_es_format, but the defines
for what the bits meant weren't included in the headers.
For V4L2_PIX_FMT_NV12_COL128 support we need them, so add them in.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
With the vc-sm-cma driver we can support zero copy of buffers between
the kernel and VPU. Add this support to mmal-vchiq.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
vc-sm-cma: fixed kbuild problem
error logs:
drivers/staging/vc04_services/vc-sm-cma/Kconfig:1:error: recursive dependency detected!
drivers/staging/vc04_services/vc-sm-cma/Kconfig:1: symbol BCM_VC_SM_CMA is selected by BCM2835_VCHIQ_MMAL
drivers/staging/vc04_services/vchiq-mmal/Kconfig:1: symbol BCM2835_VCHIQ_MMAL depends on BCM2835_VCHIQ
drivers/staging/vc04_services/Kconfig:14: symbol BCM2835_VCHIQ is selected by BCM_VC_SM_CMA
For a resolution refer to Documentation/kbuild/kconfig-language.rst
subsection "Kconfig recursive dependency limitations"
Tested-by: make ARCH=arm64 bcm2711_defconfig
Test platform: fedora 33
Branch: rpi-5.10.y
Add Broadcom VideoCore Shared Memory support.
This new driver allows contiguous memory blocks to be imported
into the VideoCore VPU memory map, and manages the lifetime of
those objects, only releasing the source dmabuf once the VPU has
confirmed it has finished with it.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
staging: vcsm-cma: Fix memory leak from not detaching dmabuf
When importing there was a missing call to detach the buffer,
so each import leaked the sg table entry.
Actually the release process for both locally allocated and
imported buffers is identical, so fix them to both use the same
function.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
staging/vc-sm-cma: Avoid log spamming on Pi0/1 over cache alias.
Pi 0/1 use the 0x80000000 cache alias as the ARM also sees the world
through the VPU L2 cache.
vc-sm-cma was trying to ensure it was in an uncached alias (0xc), and
complaining on every allocation if it weren't. Reduce this logging.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
vc-sm-cma: Restore correct cache maintainance operations
We have been using the more expensive flush operations rather than
invalidate and clean since kernel rpi-5.9.y
These are exposed with:
52f1453513 Re-expose some dmi APIs for use in VCSM
But I believe that commit was dropped when (non-cma) vc-sm was dropped,
and didn't get updated when the commit was restored
Signed-off-by: Dom Cobley <popcornmix@gmail.com>
staging: vc04_services: Fix clang14 warning
Insert a break to fix a fallthrough warning from clang14. Since the
fallthrough was to another break, this is a cosmetic change.
See: https://github.com/raspberrypi/linux/issues/5078
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
vc04_services/vc-sm-cma: Handle upstream require vchiq_instance to be passed around
vc04_services/vc-sm-cma: Switch one-bit bitfields to bool
Clang 16 warns:
../drivers/staging/vc04_services/vc-sm-cma/vc_sm.c:816:19: warning: implicit truncation from 'int' to a one-bit wide bit-field changes value from 1 to -1 [-Wsingle-bit-bitfield-constant-conversion]
buffer->imported = 1;
^ ~
../drivers/staging/vc04_services/vc-sm-cma/vc_sm.c:822:17: warning: implicit truncation from 'int' to a one-bit wide bit-field changes value from 1 to -1 [-Wsingle-bit-bitfield-constant-conversion]
buffer->in_use = 1;
^ ~
2 warnings generated.
Signed-off-by: Alexander Winkowski <dereference23@outlook.com>
vc04_services: vcsm-cma: Detach from the correct dmabuf
Commit d3292daee3 ("dma-buf: Make locking consistent in dma_buf_detach()")
added checking that the same dmabuf for which dma_buf_attach
was called is passed into dma_buf_detach, which flagged up
that vcsm-cma was passing in the wrong dmabuf.
Correct this so that we don't get the WARN on every dma_buf
release.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
staging: vc04_services: vc-sm-cma: Remove deprecated header
The vchiq_connected.h header was removed in f875976ecf ("staging:
vc04_services: Drop vchiq_connected.[ch] files") to simplify the
implementation.
Update the vc_sm driver accordingly which can still use the same
functions through the vchiq_arm.h header declarations.
Fixes: b1ab7a05eb6c ("staging: vc04_services: Add new vc-sm-cma driver")
Signed-off-by: Kieran Bingham <kieran.bingham@ideasonboard.com>
staging: vc04_services: vc-sm-cma: Drop include Makefile directive
Drop the include directive. They can break the build, when one only
wants to build a subdirectory. Replace with "../" for the includes in
the vc_sm files instead.
The fix is equivalent to the four patches between 29d49a76c5
("staging: vc04_services: bcm2835-audio: Drop include Makefile
directive")...2529ca211402 ("staging: vc04_services: interface: Drop
include Makefile directive")
Fixes: b1ab7a05eb6c ("staging: vc04_services: Add new vc-sm-cma driver")
Suggested-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kieran Bingham <kieran.bingham@ideasonboard.com>
staging: vc04_services: vc-sm-cma: Register with vchiq_bus_type
Register the vcsm rive with the vchiq_bus_type instead of useing the
platform driver/device.
Signed-off-by: Kieran Bingham <kieran.bingham@ideasonboard.com>
staging: vc04_services: vc-sm-cma: Explicitly set DMA mask
The platform model originally handled the DMA mask. Now that
we are on the vchiq_bus we need to explicitly set this.
Signed-off-by: Kieran Bingham <kieran.bingham@ideasonboard.com>
staging: vc04_services: vc-sm-cma: Use [map|unmap]_attachment_unlocked
lockdep throws warnings when using libcamera as buffers are
mapped and unmapped as the dmabuf->resv lock hasn't been taken.
Switch to using the _unlocked variants so that the framework takes
the lock.
https://github.com/raspberrypi/linux/issues/6814
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
staging: vc04_services: vc-sm-cma: Use a mutex instead of spinlock
There are no contexts where we should be calling the kernelid_map
IDR functions where we can't sleep, so switch from using a spinlock
to using a mutex.
https://github.com/raspberrypi/linux/issues/6815
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
Increase the delay before entering the lower power state to 2 seconds
(the maximum allowed) in order to reduce the packet latencies,
particularly for inbound packets.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
Display variants are intended as a replacement for the now-deleted
fbtft_device drivers. Drivers can register additional compatible
strings with a custom callback that can make the required changes
to the fbtft_display structure.
Start the ball rolling by adding adafruit18, adafruit18_green and
sainsmart18 displays.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
The actpwr trigger is a meta trigger that cycles between an inverted
mmc0 and default-on. It is written in a way that could fairly easily
be generalised to support alternative sets of source triggers.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
The upstream Unicam driver needs a set of userland changes to get
libcamera to run, and those aren't written or merged yet.
Move the "brcm,bcm2835-unicam" compatible from the upstream driver
to the old downstream version so that users can run libcamera
against 6.10.
Once the libcamera changes have been merged this can be reverted
to use the upstream driver.
If using the non-legacy compatible then assume we want to use
media-controller API for configuration.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
Add a driver for the Unicam camera receiver block on BCM283x processors.
Compared to the bcm2835-camera driver present in staging, this driver
handles the Unicam block only (CSI-2 receiver), and doesn't depend on
the VC4 firmware running on the VPU.
The commit is made up of a series of changes cherry-picked from the
rpi-5.4.y branch of https://github.com/raspberrypi/linux/ with
additional enhancements, forward-ported to the mainline kernel.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Reported-by: kbuild test robot <lkp@intel.com>
media: bcm2835-unicam: Add support for get_mbus_config to set num lanes
Use the get_mbus_config pad subdev call to allow a source to use
fewer than the number of CSI2 lanes defined in device tree.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
media: bcm2835-unicam: Avoid gcc warning over {0} on endpoint
Older gcc versions object to = { 0 } initialisation if the first
elemtn in the structure is a substructure.
Use = { } to avoid this compiler warning.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
media: bcm2835-unicam: Reinstate V4L2_CAP_READWRITE in the caps
v4l2-compliance throws a failure if the device doesn't advertise
V4L2_CAP_READWRITE but allows read or write operations.
We do support read, so reinstate the flag.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
media: bcm2835-unicam: Ensure type is VIDEO_CAPTURE in [g|s]_selection
[g|s]_selection pass in a buffer type that needs to be validated
before passing on to the sensor subdev.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
media: bcm2835: unicam: Set VPU min clock freq to 250Mhz.
When streaming with Unicam, the VPU must have a clock frequency of at
least 250Mhz. Otherwise, the input fifos could overrun, causing
image corruption.
Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
media: bcm2835-unicam: Drop WARN on uing direct cache alias
Pi 0&1 pass all ARM accesses through the VPU L2 cache, therefore
the dma-ranges property sets the cache alias bits to other
than the direct alias, hence this WARN was firing.
It was overprotective coding, so assume that everything is OK
with the dma-ranges, and remove the WARN.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
media: bcm2835-unicam: Always service interrupts
From when bringing up the driver, there was a check in the isr
to ignore interrupts (claiming them handled) should the driver
not be streaming.
The VPU now will not register a camera driver if it finds a
CSI2 node enabled in device tree, therefore this flawed check is
redundant.
https://github.com/raspberrypi/linux/issues/3602
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
media: bcm2835: unicam: Fix uninitialized warning
Signed-off-by: Jacko Dirks <jdirks.linuxdev@gmail.com>
media: bcm2835-unicam: Fixup review comments from Hans.
Updates the driver based on the upstream review comments from
Hans Verkuil at https://patchwork.linuxtv.org/patch/63531/
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
media: bcm2835-unicam: Retain packing information on G_FMT
The change to retrieve the pixel format always on g_fmt didn't
check whether the native or unpacked version of the format
had been requested, and always returned the packed one.
Correct this so that the packing setting is retained whereever
possible.
Fixes "9d59e89 media: bcm2835-unicam: Re-fetch mbus code from subdev
on a g_fmt call"
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
media: bcm2835-unicam: change minimum number of vb2_queue buffers to 1
Since the unicam driver was modified to write to a dummy buffer when no
user-supplied buffer is available, it can now write to and return a
buffer even when there's only a single one. Enable this by changing the
min_buffers_needed in the vb2_queue; it will be useful for enabling
still captures without allocating more memory than absolutely necessary.
Signed-off-by: David Plowman <david.plowman@raspberrypi.com>
staging: vc04_services: ISP: Add a more complex ISP processing component
Driver for the BCM2835 ISP hardware block. This driver uses the MMAL
component to program the ISP hardware through the VC firmware.
The ISP component can produce two video stream outputs, and Bayer
image statistics. This can't be encompassed in a simple V4L2
M2M device, so create a new device that registers 4 video nodes.
This patch squashes all the development patches from the earlier
rpi-5.4.y branch into one
Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
staging/bcm2835-isp: Add the unpacked (16bpp) raw formats
Now that the firmware supports the unpacked (16bpp) variants
of the MIPI raw formats, add the mappings.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
staging/bcm2835-isp: Log the number of excess supported formats
When logging that the firmware has provided more supported formats
than we had allocated storage for, log the number allocated and
returned.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
staging: vc04_services: ISP: Add colour denoise control
Add colour denoise control to the bcm2835 driver through a new v4l2
control: V4L2_CID_USER_BCM2835_ISP_CDN.
Add the accompanying MMAL configuration structure definitions as well.
Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
bcm2835-isp: Allow formats with different colour spaces.
Each supported format now includes a mask showing the allowed colour
spaces, as well as a default colour space for when one was not
specified.
Additionally we translate the colour space to mmal format and pass it
over to the VideoCore.
Signed-off-by: David Plowman <david.plowman@raspberrypi.com>
media: i2c: add ov9281 driver.
Change-Id: I7b77250bbc56d2f861450cf77271ad15f9b88ab1
Signed-off-by: Zefa Chen <zefa.chen@rock-chips.com>
media: i2c: ov9281: fix mclk issue when probe multiple camera.
Takes the ov9281 part only from the Rockchip's patch.
Change-Id: I30e833baf2c1bb07d6d87ddb3b00759ab45a90e4
Signed-off-by: Zefa Chen <zefa.chen@rock-chips.com>
media: i2c: ov9281: add enum_frame_interval function for iq tool 2.2 and hal3
Adds the ov9281 parts of the Rockchip patch adding enum_frame_interval to
a large number of drivers.
Change-Id: I03344cd6cf278dd7c18fce8e97479089ef185a5c
Signed-off-by: Zefa Chen <zefa.chen@rock-chips.com>
media: i2c: ov9281: Fixup for recent kernel releases, and remove custom code
The Rockchip driver was based on a 4.4 kernel, and had several custom
Rockchip parts.
Update to 5.4 kernel APIs, with the relevant controls required by
libcamera, and remove custom Rockchip parts.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
media: i2c: ov9281: Read chip ID via 2 reads
Vision Components have made an OV9281 module which blocks reading
back the majority of registers to comply with NDAs, and in doing
so doesn't allow auto-increment register reading as used when
reading the chip ID.
Use two reads and manually combine the results.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
media: i2c: ov9281: Add support for 8 bit readout
The sensor supports 8 bit mode as well as 10bit, so add the
relevant code to allow selection of this.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
media: ov9281: Add 1280x720 and 640x480 modes
Breaks out common register set and adds the different registers
for 1280x720 (cropped) and 640x480 (skipped) modes
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
Fixed picture line bug in all ov9281 modes
Signed-off-by: Mathias Anhalt <mathiasanhalt@web.de>
Added hflip and vflip controls to ov9281
Signed-off-by: Mathias Anhalt <mathiasanhalt@web.de>
media: i2c: ov9281: Remove override of subdev name
From the original Rockchip driver, the subdev was renamed
from the default to being "mov9281 <dev_name>" whereas the
default would have been "ov9281 <dev_name>".
Remove the override to drop back to the default rather than
a vendor custom string.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
media: v4l2-subdev: add subdev-wide state struct
Signed-off-by: Dom Cobley <popcornmix@gmail.com>
media: i2c: ov9281: Add fwnode properties controls
Add call to v4l2_ctrl_new_fwnode_properties to read and
create the fwnode based controls.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
media: i2c: ov9281: Sensor should report RAW color space
Tested on Raspberry Pi running libcamera.
Signed-off-by: David Plowman <david.plowman@raspberrypi.com>
Partial revert "media: i2c: add ov9281 driver."
This partially reverts commit 84e98e3a4f.
The commit had merged some changes to other drivers with adding the ov9281
driver. Only the ov9281 parts have been reverted.
staging/bcm2835-isp: Fix compiler warning
The result of dividing a u32 by a size_t is an unsigned int on arm32
and a long unsigned int on arm64. Use "%zu" (the size_t format) to
remove the build warning for 64-bit builds.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
staging: vc04_services: isp: Set the YUV420/YVU420 format stride to 64 bytes
The bcm2835 ISP requires the base address of all input/output planes to have 32
byte alignment. Using a Y stride of 32 bytes would not guarantee that the V
plane would fulfil this, e.g. a height of 650 lines would mean the V plane
buffer is not 32 byte aligned for YUV420 formats.
Having a Y stride of 64 bytes would ensure both U and V planes have a 32 byte
alignment, as the luma height will always be an even number of lines.
Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
vc04_services: isp: Report input node as wanting full range RAW color space
RAW color spaces are more usually reported as having full range
quantization.
Tested using libcamera.
Signed-off-by: David Plowman <david.plowman@raspberrypi.com>
drivers: bcm2835_isp: Allow multiple users for the ISP driver.
Add a second (identical) set of device nodes to allow concurrent use of the ISP
hardware by another user. This change effectively creates a second state
structure (struct bcm2835_isp_dev) to maintain independent state for the second
user. Node and media entity names are appened with the instance index
appropriately.
Further users can be added by changing the BCM2835_ISP_NUM_INSTANCES define.
Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
drivers: bcm2835_isp: Fix div by 0 bug.
Fix a possible division by 0 bug when setting up the mmal port for the stats
port.
Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
staging/bcm2835-isp: Fix cleanup after init fail
bcm2835_isp_remove is called on an initialisation failure, but at that
point the drvdata hasn't been set. This causes a crash when e.g. using
the cutdown firmware (gpu_mem=16).
Move platform_set_drvdata before the instance probing loop to avoid the
problem.
See: https://github.com/raspberrypi/linux/issues/4774
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
bcm2835-v4l2-isp: Add missing lock initialization
ISP device allocation is dynamic hence the locks too.
struct mutex queue_lock is not initialized which result in bug.
Fixing same by initializing it.
[ 29.847138] INFO: trying to register non-static key.
[ 29.847156] The code is fine but needs lockdep annotation, or maybe
[ 29.847159] you didn't initialize this object before use?
[ 29.847161] turning off the locking correctness validator.
[ 29.847167] CPU: 1 PID: 343 Comm: v4l_id Tainted: G C 5.15.11-rt24-v8+ #8
[ 29.847187] Hardware name: Raspberry Pi 4 Model B Rev 1.4 (DT)
[ 29.847194] Call trace:
[ 29.847197] dump_backtrace+0x0/0x1b8
[ 29.847227] show_stack+0x20/0x30
[ 29.847240] dump_stack_lvl+0x8c/0xb8
[ 29.847254] dump_stack+0x18/0x34
[ 29.847263] register_lock_class+0x494/0x4a0
[ 29.847278] __lock_acquire+0x80/0x1680
[ 29.847289] lock_acquire+0x214/0x3a0
[ 29.847300] mutex_lock_nested+0x70/0xc8
[ 29.847312] _vb2_fop_release+0x3c/0xa8 [videobuf2_v4l2]
[ 29.847346] vb2_fop_release+0x34/0x60 [videobuf2_v4l2]
[ 29.847367] v4l2_release+0xc8/0x108 [videodev]
[ 29.847453] __fput+0x8c/0x258
[ 29.847476] ____fput+0x18/0x28
[ 29.847487] task_work_run+0x98/0x180
[ 29.847502] do_notify_resume+0x228/0x3f8
[ 29.847515] el0_svc+0xec/0xf0
[ 29.847523] el0t_64_sync_handler+0x90/0xb8
[ 29.847531] el0t_64_sync+0x180/0x184
Signed-off-by: Padmanabha Srinivasaiah <treasure4paddy@gmail.com>
staging: vc04_services: isp: Permit all sRGB colour spaces on ISP outputs
ISP outputs actually support all colour spaces that are fundamentally
sRGB underneath, regardless of whether an RGB or YUV output format is
actually requested.
Signed-off-by: David Plowman <david.plowman@raspberrypi.com>
drivers: staging: bcm2835-isp: Do not cleanup mmal vcsm buffer on stop_streaming
On stop_streaming() the vcsm buffer handle gets released by the buffer cleanup
code. This will subsequently cause and error if userland re-queues the same
buffer on the next start_streaming() call.
Remove this cleanup code and rely on the vb2_ops->buf_cleanup() call to do the
cleanups instead.
Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
drivers: staging: bcm2835-isp: Clear LS table handle in the firmware
When all nodes have stopped streaming, ensure the firmware has released its
handle on the LS table dmabuf. This is done by passing a null handle in the
LS params.
Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
drivers: staging: bcm2835-isp: Respect caller's stride value
The stride value reported for output image buffers should be at least
as large as any value that was passed in by the caller (subject to
correct alignment for the pixel format). If the value is zero (meaning
no value was passed), or is too small, the minimum acceptable value
will be substituted.
Signed-off-by: David Plowman <david.plowman@raspberrypi.com>
staging: vc04_services: bcm2835-isp: Drop include Makefile directive
Drop the include directive. They can break the build, when one only
wants to build a subdirectory. Replace with "../" for the includes in
the bcm2835-isp instead.
The fix is equivalent to the four patches between 29d49a76c5
("staging: vc04_services: bcm2835-audio: Drop include Makefile
directive")...2529ca211402 ("staging: vc04_services: interface: Drop
include Makefile directive")
Fixes: c8f89c9551c1 ("staging: vc04_services: ISP: Add a more complex ISP processing component")
Suggested-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kieran Bingham <kieran.bingham@ideasonboard.com>
staging: vc04_services: bcm2835-v4l2-isp: Register with vchiq_bus_type
Register the bcm2835-v4l2-isp driver with the vchiq_bus_type instead of
using the platform driver/device.
Signed-off-by: Kieran Bingham <kieran.bingham@ideasonboard.com>
staging: vc04_services: bcm2835-v4l2-isp: Explicitly set DMA mask
The platform model originally handled the DMA mask. Now that
we are on the vchiq_bus we need to explicitly set this.
Signed-off-by: Kieran Bingham <kieran.bingham@ideasonboard.com>
drivers: media: bcm2835_isp: Cache LS table dmabuf
Clients such as libcamera do not change the LS table dmabuf on every
frame. In such cases instead of mapping/remapping the same dmabuf on
every frame to send to the firmware, cache the dmabuf once and only
update and remap if the dmabuf has been changed by the userland client.
Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
media: bcm2835-unicam: Correctly handle error propagation for stream on
On a failure in start_streaming(), the error code would not propagate to
the calling function on all conditions. This would cause the userland
caller to not know of the failure.
Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
media: bcm2835-unicam: Return early from stop_streaming() if stopped
clk_disable_unprepare() is called unconditionally in stop_streaming().
This is incorrect in the cases where start_streaming() fails, and
unprepares all clocks as part of the failure cleanup. To avoid this,
ensure that clk_disable_unprepare() is only called in stop_streaming()
if the clocks are in a prepared state.
Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
media: bcm2835-unicam: Clear clock state when stopping streaming
Commit 65e08c4650 failed to clear the
clock state when the device stopped streaming. Fix this, as it might
again cause the same problems when doing an unprepare.
Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
media: bcm2835-unicam: Fix bug in buffer swapping logic
If multiple sets of interrupts occur simultaneously, it may be unsafe
to swap buffers, as the hardware may already be re-using the current
buffers. In such cases, avoid swapping buffers, and wait for the next
opportunity at the Frame End interrupt to signal completion.
Additionally, check the packet compare status when watching for frame
end for buffers swaps, as this could also signify a frame end event.
Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
media: bcm2835-unicam: Forward input status from subdevice
The vidioc_enum_input() v4l2 ioctl is capable of returning
sensor/input status as well. This is used in current
GStreamer HEAD for signal detection [1].
bcm2835-unicam does handle this syscall, but it didn't ask
the subdevice driver about the input status. The input then
appeared as always present.
This commit adds the necessary query. There is a precedent for
this - the R-Car VIN V4L2 driver does a similar call [2].
[1]: ce0be27caf/sys/v4l2/gstv4l2src.c (L553)
[2]: 7fb9d006d3/drivers/media/platform/rcar-vin/rcar-v4l2.c (L548)
Signed-off-by: Jakub Vaněk <linuxtardis@gmail.com>
media/bcm2835-unicam: Parse pad numbers correctly
The driver was making big assumptions about the source device
using pad 0 and 1, which doesn't follow for more complex
devices where Unicam's source device may be a sink device for
something else.
Read the pad numbers through media controller, and reference
them appropriately.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
media/bcm2835-unicam: Add support for configuration via MC API
Adds Media Controller API support for more complex pipelines.
libcamera is about to switch to using this mechanism for configuring
sensors.
This can be enabled by either a module parameter, or device tree.
Various functions have been moved to group video-centric and
mc-centric functions together.
Based on a similar conversion done to ti-vpe.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
media: bcm2835-unicam: Fixup for 5.18 and new get_mbus_config struct
The number of active CSI2 data lanes has moved within the struct
v4l2_mbus_config used by the get_mbus_config API call.
Update the driver to match the changes in mainline.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
drivers: bcm2835_unicam: Add logging message when a frame is dropped.
If a dummy buffer is still active on a frame start, it indicates that this frame
will be dropped. The explicit logging helps users identify performance issues.
Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
drivers: bcm2835_unicam: Disable trigger mode operation
On a Pi3 B/B+ platform the imx219 sensor frequently generates a single corrupt
frame when the sensor first starts. This can either be a missing line, or
invalid samples within the line. This only occurrs using the Unicam kernel
driver.
Disabling trigger mode elimiates this corruption. Since trigger mode is a
legacy feature copied from the firmware driver and not expected to be needed,
remove it. Tested on the Raspberry Pi cameras and shows no ill effects.
Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
media: bcm2835-unicam: Set ret on error path in unicam_async_complete()
Clang warns:
drivers/media/platform/bcm2835/bcm2835-unicam.c:3109:6: warning: variable 'ret' is used uninitialized whenever 'if' condition is true [-Wsometimes-uninitialized]
if (!source_pads) {
^~~~~~~~~~~~
drivers/media/platform/bcm2835/bcm2835-unicam.c:3152:9: note: uninitialized use occurs here
return ret;
^~~
drivers/media/platform/bcm2835/bcm2835-unicam.c:3109:2: note: remove the 'if' if its condition is always false
if (!source_pads) {
^~~~~~~~~~~~~~~~~~~
drivers/media/platform/bcm2835/bcm2835-unicam.c:3091:9: note: initialize the variable 'ret' to silence this warning
int ret;
^
= 0
1 warning generated.
When the if condition is true, ret will be used uninitialized, which
could result in undesirable behavior. Set ret to -ENODEV on the error
path, which is a standard error code for the ->complete() callback.
Fixes: d056e86eb3 ("media/bcm2835-unicam: Parse pad numbers correctly")
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
media: bcm2835-unicam: Handle a repeated frame start with no end
In the case of 2 frame starts being received with no frame end
between, the queued buffer held in next_frm was lost as the
pointer was overwritten with the dummy buffer.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
media: bcm2835-unicam: Correctly handle FS + FE ISR condtion
If we get a simultaneous FS + FE interrupt for the same frame, it cannot be
marked as completed and returned to userland as the framebuffer will be refilled
by Unicam on the next sensor frame. Additionally, the timestamp will be set to 0
as the FS interrupt handling code will not have run yet.
To avoid these problems, the frame is considered dropped in the FE handler,
and will be returned to userland on the subsequent sensor frame.
Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
media: bcm2835-unicam: Fix for possible dummy buffer overrun
The Unicam hardware has been observed to cause a buffer overrun when using the
dummy buffer as a circular buffer. The conditions that cause the overrun are not
fully known, but it seems to occur when the memory bus is heavily loaded.
To avoid the overrun, program the hardware with a buffer size of 0 when using
the dummy buffer. This will cause overrun into the allocated dummy buffer, but
avoid out of bounds writes.
Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
media: bcm2835-unicam: Fix up start/stop api change
Signed-off-by: Dom Cobley <popcornmix@gmail.com>
media: bcm2835-unicam: Use mipi-csi2.h header for data type values
The MIPI CSI2 data type ID values are now defined in the
mipi-csi2.h header, so use those defines instead of hard
coding them in the driver.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
media: bcm2835-unicam: Add support for RAW16 formats
With the RAW16 formats now having a defined CSI2 data type ID,
they can be added to the driver.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
media: bcm2835-unicam: Start and stop media_pipeline with same node
media_pipeline_start and media_pipeline_stop now validate that
the pipeline is being started and stopped with the same pipe
and pad handles.
When running with embedded metadata (eg imx477 and imx708), the
start typically happens from the metadata pad, whilst stop is
always from the image pad.
Always pass the image pad to media_pipeline_start to ensure
that the calls are balanced.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
drivers: media: bcm2835_unicam: Improve frame sequence count handling
Ensure that the frame sequence counter is incremented only if a previous
frame start interrupt has occurred, or a frame start + frame end has
occurred simultaneously.
This corresponds the sequence number with the actual number of frames
produced by the sensor, not the number of frame buffers dequeued back
to userland.
Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
bcm2835-unicam: hacks to allow it to build
media: bcm2835-unicam: Fix up async notifier usage
Fixes "8a090fc3e549 bcm2835-unicam: hacks to allow it to build"
media: bcm2835-unicam: Add option for a GPIO to reflect FS/FE timing
The legacy stack had an option to have a GPIO track frame start and
end events to give basic synchronisation to the incoming image stream.
https://forums.raspberrypi.com/viewtopic.php?t=190314
Replicate this in the kernel Unicam driver.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
media: bcm2835-unicam: Add support for 12bit mono packed format
Now that V4L2_PIX_FMT_Y12P is defined, allow passing raw 12bit
mono packed data through the peripheral.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
media: bcm2835-unicam: Add support for 14bit mono sources
Now that V4L2_PIX_FMT_Y14 and V4L2_PIX_FMT_Y14P are defined,
allow passing 14bit mono data through the peripheral.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
media: bcm2835-unicam: Add support for unpacked 14bit Bayer formats
Now that the 14bit non-packed Bayer formats are defined, add them
into the supported formats lookup table.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
media: bcm2835-unicam: Reinstate old downstream driver as legacy
Whilst the Unicam driver has now been upstreamed it only supports
configuration via Media Controller (not driven from the /dev/videoN
node), which makes life significantly harder for simple devices such
as mono sensors, and HDMI or analogue video to CSI2 bridge chips
(eg TC358743 and ADV7282M).
Fix up the downstream driver so that it builds, reinstate the links
from Kconfig and Makefile to it, and give it a new Kconfig name
(VIDEO_BCM2835_UNICAM_LEGACY).
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
Upstream Linux deems using output GPIOs to generate IRQs as a bogus
use case, even though the BCM2835 GPIO controller is capable of doing
so. A number of users would like to make use of this facility, so
disable the checks.
See: https://github.com/raspberrypi/linux/issues/2527
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
V4L2 wishes to have the codec header bytes in the same buffer as the
first encoded frame, so it does become 1-in 1-out for encoding.
The firmware now has an option to do this, so enable it.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
Add V4L2_META_FMT_BCM2835_ISP_STATS V4L2 format type.
This new format will be used by the BCM2835 ISP device to return
out ISP statistics for 3A.
Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
This patch adds MEDIA_BUS_FMT_SENSOR_DATA used by the bcm2835-unicam
driver to support CSI-2 embedded data streams from camera sensors.
Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
Add V4L2_META_FMT_SENSOR_DATA format 4CC.
This new format will be used by the BCM2835 Unicam device to return
out camera sensor embedded data.
Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
Commit f3186dd876 ("spi: Optionally use GPIO descriptors for CS GPIOs")
amended of_spi_parse_dt() to always set SPI_CS_HIGH for SPI slaves whose
Chip Select is defined by a "cs-gpios" devicetree property.
This change breaks drivers whose probe functions set the mode field of
the spi_device because in doing so they clear the SPI_CS_HIGH flag.
Fix by setting SPI_CS_HIGH in spi_setup (under the same conditions as
in of_spi_parse_dt()).
See also: 83b2a8fe43 ("spi: spidev: Fix CS polarity if GPIO descriptors are used")
Fixes: f3186dd876 ("spi: Optionally use GPIO descriptors for CS GPIOs")
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
SQUASH: spi: Demote SPI_CS_HIGH warning to KERN_DEBUG
This warning is unavoidable from a client's perspective and
doesn't indicate anything wrong (just surprising).
SQUASH with "spi: use_gpio_descriptor fixup moved to spi_setup"
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
The DT bindings description of the Brcmstb PCIe device is described. This
node can be used by almost all Broadcom settop box chips, using
ARM, ARM64, or MIPS CPU architectures.
Signed-off-by: Jim Quinlan <jim2101024@gmail.com>
A failure in gpiochip_irqchip_add leads to a leak of a gpiochip. Fix
the leak with the use of devm_gpiochip_add_data.
Fixes: 85ae9e512f ("pinctrl: bcm2835: switch to GPIOLIB_IRQCHIP")
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
bcmn2835_isp is a platform driver dependent on vchiq,
therefore add the load/unload functions for it to vchiq.
Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
During a bulk transfer we request a DMA allocation to hold the
scatter-gather list. Most of the time, this allocation is small
(<< PAGE_SIZE), however it can be requested at a high enough frequency
to cause fragmentation and/or stress the CMA allocator (think time
spent in compaction here, or during allocations elsewhere).
Implement a pool to serve up small DMA allocations, falling back
to a coherent allocation if the request is greater than
VCHIQ_DMA_POOL_SIZE.
Signed-off-by: Oliver Gjoneski <ogjoneski@gmail.com>
The VCHIQ driver now loads the audio, camera, codec, and vc-sm
drivers as platform drivers. However they were not being given
the correct DMA configuration.
Call of_dma_configure with the parent (VCHIQ) parameters to be
inherited by the child.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>
staging: vchiq: Use the old dma controller for OF config on platform devices
vchiq on Pi4 is no longer under the soc node, therefore it
doesn't get the dma-ranges for the VPU.
Switch to using the configuration of the old dma controller as
that will set the dma-ranges correctly.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>
staging: vchiq_arm: Give vchiq children DT nodes
vchiq kernel clients are now instantiated as platform drivers rather
than using DT, but the children of the vchiq interface may still
benefit from access to DT properties. Give them the option of a
a sub-node of the vchiq parent for configuration and to allow
them to be disabled.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
staging: vchiq_arm: Add 36-bit address support
Conditional on a new compatible string, change the pagelist encoding
such that the top 24 bits are the pfn, leaving 8 bits for run length
(-1), giving a 36-bit address range.
Manage the split between addresses for the VPU and addresses for the
40-bit DMA controller with a dedicated DMA device pointer that on non-
BCM2711 platforms is the same as the main VCHIQ device. This allows
the VCHIQ node to stay in the usual place in the DT.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
staging: vchiq_arm: children inherit DMA config
Although it is no longer necessary for vchiq's children to have a
different DMA configuration to the parent, they do still need to
explicitly to have their DMA configuration set - to be that of the
parent.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
BCM54213PE is an Ethernet PHY that supports PTP hardware timestamping.
BCM54210PW ia another Ethernet PHY, but one without PTP support.
Unfortunately the two PHYs return the same ID when queried, so some
extra information is required to determine whether the PHY is PTP-
capable.
There are two Raspberry Pi products that use these PHYs - Pi 4B and
CM4 - and fortunately they use different PHY addresses, so use that as
a differentiator. Choose to treat a PHY with the same ID but another
address as a BCM54210PE, which seems more common.
See: https://github.com/raspberrypi/linux/issues/5104
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
It's really a function of the board whether or not to use this feature
as it may require MAC compatibility as well as interop testing.
Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
The last nibble is a revision ID, and the 54213pe is a later rev
than the 54210e. Running the 54210e setup code on a 54213pe results
in a broken RGMII interface.
Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>
Add device tree entries and code to allow the specification of
the lighting modes for the LED's on the ethernet connector.
Signed-off-by: James Hughes <james.hughes@raspberrypi.org>
net:phy:2711 Change the default ethernet LED actions
This should return default behaviour back to that of previous
releases.
net: phy: broadcom: Make LEDs 3+4 shadow LEDs 1+2
CM4 uses BCM54210PE, which supports 2 additional LEDs, choosing LED3
for the amber LED because it shows activity by default (LED4 is not
connected). However, this makes it uncontrollable by the eth_led<n>
dtparams which target LEDs 1+2.
Solve the problem by making LEDs 3+4 mirror LEDs 1+2 (which is much
simpler than adding baseboard-specific overrides, but comes with a
risk of making one of the LEDs redundant).
See: https://github.com/raspberrypi/linux/issues/5289
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
Following the same pattern as bcm2835-camera and bcm2835-audio,
register the V4L2 codec driver as a platform driver
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
Following the same pattern as bcm2835-camera and bcm2835-audio,
register the vcsm-cma driver as a platform driver
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
The Infineon IRS1125 is a time of flight depth sensor that
has a CSI-2 interface.
Add a V4L2 subdevice driver for this device.
Signed-off-by: Markus Proeller <markus.proeller@pieye.org>
media: irs1125: Using i2c_transfer for ic2 reads
Reading data over i2c is done by using i2c_transfer to ensure that this
operation can't be interrupted.
Signed-off-by: Markus Proeller <markus.proeller@pieye.org>
media: irs1125: Refactoring and debug messages
Changed some variable names to comply with checkpatch --strict mode.
Debug messages added.
Signed-off-by: Markus Proeller <markus.proeller@pieye.org>
media: irs1125: Atomic access to imager reconfiguration
Instead of changing the exposure and framerate settings for all sequences,
they can be changed for every sequence individually now. Therefore the
IRS1125_CID_SAFE_RECONFIG ctrl has been removed and replaced by
IRS1125_CID_SAFE_RECONFIG_S<seq_num>_EXPO and *_FRAME ctrls.
The consistency check in the sequence ctrl IRS1125_CID_SEQ_CONFIG
is removed.
Signed-off-by: Markus Proeller <markus.proeller@pieye.org>
media: irs1125: Keep HW in sync after imager reset
When closing the video device, the irs1125 is put in power down state.
To keep V4L2 ctrls and the HW in sync, v4l2_ctrl_handler_setup is
called after power up.
The compound ctrl IRS1125_CID_MOD_PLL however has a default value
of all zeros, which puts the imager into a non responding state.
Thus, this ctrl is not written by the driver into HW after power up.
The userspace has to take care to write senseful data.
Signed-off-by: Markus Proeller <markus.proeller@pieye.org>
media: i2c: add ov9281 driver.
Change-Id: I7b77250bbc56d2f861450cf77271ad15f9b88ab1
Signed-off-by: Zefa Chen <zefa.chen@rock-chips.com>
media: i2c: ov9281: fix mclk issue when probe multiple camera.
Takes the ov9281 part only from the Rockchip's patch.
Change-Id: I30e833baf2c1bb07d6d87ddb3b00759ab45a90e4
Signed-off-by: Zefa Chen <zefa.chen@rock-chips.com>
media: i2c: ov9281: add enum_frame_interval function for iq tool 2.2 and hal3
Adds the ov9281 parts of the Rockchip patch adding enum_frame_interval to
a large number of drivers.
Change-Id: I03344cd6cf278dd7c18fce8e97479089ef185a5c
Signed-off-by: Zefa Chen <zefa.chen@rock-chips.com>
media: i2c: ov9281: Fixup for recent kernel releases, and remove custom code
The Rockchip driver was based on a 4.4 kernel, and had several custom
Rockchip parts.
Update to 5.4 kernel APIs, with the relevant controls required by
libcamera, and remove custom Rockchip parts.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
media: i2c: ov9281: Read chip ID via 2 reads
Vision Components have made an OV9281 module which blocks reading
back the majority of registers to comply with NDAs, and in doing
so doesn't allow auto-increment register reading as used when
reading the chip ID.
Use two reads and manually combine the results.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
media: i2c: ov9281: Add support for 8 bit readout
The sensor supports 8 bit mode as well as 10bit, so add the
relevant code to allow selection of this.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
media: ov9281: Add 1280x720 and 640x480 modes
Breaks out common register set and adds the different registers
for 1280x720 (cropped) and 640x480 (skipped) modes
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
Fixed picture line bug in all ov9281 modes
Signed-off-by: Mathias Anhalt <mathiasanhalt@web.de>
Added hflip and vflip controls to ov9281
Signed-off-by: Mathias Anhalt <mathiasanhalt@web.de>
media: i2c: ov9281: Remove override of subdev name
From the original Rockchip driver, the subdev was renamed
from the default to being "mov9281 <dev_name>" whereas the
default would have been "ov9281 <dev_name>".
Remove the override to drop back to the default rather than
a vendor custom string.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
media: v4l2-subdev: add subdev-wide state struct
Signed-off-by: Dom Cobley <popcornmix@gmail.com>
media: i2c: ov9281: Add fwnode properties controls
Add call to v4l2_ctrl_new_fwnode_properties to read and
create the fwnode based controls.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
media: i2c: ov9281: Sensor should report RAW color space
Tested on Raspberry Pi running libcamera.
Signed-off-by: David Plowman <david.plowman@raspberrypi.com>
Partial revert "media: i2c: add ov9281 driver."
This partially reverts commit 84e98e3a4f.
The commit had merged some changes to other drivers with adding the ov9281
driver. Only the ov9281 parts have been reverted.
media: i2c: Update irs1125 Kconfig entry
Bring the IRS1125 Kconfig declaration in line with upstream entries.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
Add name for greyworld to white_balance preset names.
This patch previously applied to v4l2-ctrl.c but that was split
and deleted.
Signed-off-by: John Cox <jc@kynesim.co.uk>
This is mainly used for the NoIR camera which has no IR
filter and can completely confuse normal AWB presets.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>
Adds a simple greyworld white balance preset, mainly for use
with cameras without an IR filter (eg Raspberry Pi NoIR)
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>
Through emperical testing, the sensor can crop upto a 96x88 window to
produce a valid Bayer frame. Adjust the ROIWH1_MIN ROIWV1_MIN
appropriately for this limit.
Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
Add support for setting horizontal and/or vertial flips in the IMX296
sensor through the V4L2_CID_HFLIP and V4L2_CID_VFLIP controls.
Add a new helper function to return the media bus format code that
depends on the sensor flips.
Grab the V4L2_CID_HFLIP and V4L2_CID_VFLIP controls on stream on, and
release on stream off to ensure flips cannot be changed while the sensor
is streaming.
Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
Set the gain delay to 1 frame in the sensor. This avoids any race
condition or ambiguity over when the setting is applied through
userland.
Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
Add a helper function to setup the horizontal blanking control. Update
the control limits on set_format as the horizontal blanking time must
remain constant regardless of sensor output width.
Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
Add a missing register write (MIPIC_AREA3W) when setting up a crop
window in the sensor to get this functionality working.
Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
In Fast Trigger mode (external shutter control), FE packet was
not sent at end of frame. Sony recommend this change to fix it.
Signed-off-by: Nick Hollinghurst <nick.hollinghurst@raspberrypi.com>
Don't assume the camera has been reset each time we start streaming,
but always write registers relating to trigger_mode, even in mode 0.
IMX477: Stop driving XVS on stop streaming, to avoid spurious pulses.
Signed-off-by: Nick Hollinghurst <nick.hollinghurst@raspberrypi.com>
Updated imx296 driver to support external trigger mode via XTR pin.
Added module parameter to control this mode.
Signed-off-by: Ben Benson <ben.benson@raspberrypi.com>
Disable enumerating and setting of the 2x2 binned mode entirely as it
does not seem to work for either mono or colour sensor variants.
Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
dt-bindings: media: i2c: Add IMX519 CMOS sensor binding
Add YAML device tree binding for IMX519 CMOS image sensor, and
the relevant MAINTAINERS entries.
Signed-off-by: Lee Jackson <info@arducam.com>
media: i2c: Add driver for IMX519 sensor
Adds a driver for the 16MPix IMX519 CSI2 sensor.
Whilst the sensor supports 2 or 4 CSI2 data lanes, this driver
currently only supports 2 lanes.
The following Bayer modes are currently available:
4656x3496 10-bit @ 10fps
3840x2160 10-bit (cropped) @ 21fps
2328x1748 10-bit (binned) @ 30fps
1920x1080 10-bit (cropped/binned) @ 60fps
1280x720 10-bit (cropped/binned) @ 120fps
Signed-off-by: Lee Jackson <info@arducam.com>
media: i2c: imx519: Advertise embedded data node on media pad 1
This commit updates the imx519 driver to adverise support for embedded
data streams.
The imx519 sensor subdevice overloads the media pad to differentiate
between image stream (pad 0) and embedded data stream (pad 1) when
performing the v4l2_subdev_pad_ops functions.
Signed-off-by: Lee Jackson <info@arducam.com>
media: i2c: imx519: Sensor should report RAW color space
Tested on Raspberry Pi running libcamera.
Signed-off-by: David Plowman <david.plowman@raspberrypi.com>
media: i2c: Update imx519 Kconfig entry
Bring the IMX519 Kconfig declaration in line with the upstream entries.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
media: i2c: Add PDAF support for IMX519
Add PDAF support for IMX519, and reduce the pixel rate to 426666667,
link freq to 408000000.
Signed-off-by: Lee Jackson <lee.jackson@arducam.com>
drivers: media: imx519: Add V4L2_CID_LINK_FREQ control
Add V4L2_CID_LINK_FREQ as a read-only control with a value of 408 Mhz.
This will be used by the CFE driver to corretly setup the DPHY timing
parameters in the CSI-2 block.
Signed-off-by: Lee Jackson <lee.jackson@arducam.com>
media: i2c: imx519: Squash fixes
dt-bindings: media: i2c: Add IMX477 CMOS sensor binding
Add YAML device tree binding for IMX477 CMOS image sensor.
Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
media: i2c: Add driver for Sony IMX477 sensor
Adds a driver for the 12MPix Sony IMX477 CSI2 sensor.
Whilst the sensor supports 2 or 4 CSI2 data lanes, this driver
currently only supports 2 lanes.
The following Bayer modes are currently available:
4056x3040 12-bit @ 10fps
2028x1520 12-bit (binned) @ 40fps
2028x1050 12-bit (cropped/binned) @ 50fps
1012x760 10-bit (scaled) @ 120 fps
Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
media: i2c: imx477: Add support for adaptive frame control
Use V4L2_CID_EXPOSURE_AUTO_PRIORITY to control if the driver should
automatically adjust the sensor frame length based on exposure time,
allowing variable frame rates and longer exposures.
Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
media: i2c: imx477: Return correct result on sensor id verification
The test should return -EIO if the register read id does not match
the expected sensor id.
Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
media: i2c: imx477: Parse and register properties
Parse device properties and register controls for them using the V4L2
fwnode properties helpers.
Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
media: i2c: imx477: Selection compliance fixes
To comply with the intended usage of the V4L2 selection target when
used to retrieve a sensor image properties, adjust the rectangles
returned by the imx477 driver.
The top/left crop coordinates of the TGT_CROP rectangle were set to
(0, 0) instead of (8, 16) which is the offset from the larger physical
pixel array rectangle. This was also a mismatch with the default values
crop rectangle value, so this is corrected. Found with v4l2-compliance.
While at it, add V4L2_SEL_TGT_CROP_BOUNDS support: CROP_DEFAULT and
CROP_BOUNDS have the same size as the non-active pixels are not readable
using the selection API. Found with v4l2-compliance.
This commit mirrors 543790f777 done for
the imx219 sensor.
Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
media: i2c: imx477: Remove auto frame length adjusting
The V4L2_CID_EXPOSURE_AUTO_PRIORITY was used to let the sensor control
frame length (effectively framerate) based on the requested exposure
time requested. Remove this feature as it is never used, and goes
against how V4L2 likes to handle exposure and vblank controls.
Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
media: i2c: imx477: Add very long exposure control to the driver
Add support for very long exposures by using the exposure multiplier
register. Userland does not need to pass any additional controls to
enable long exposures, it simply requests a larger vblank to extend the
exposure control range appropriately.
Currently, since hblank is fixed, a maximum of approximately 124 seconds
of exposure time can be used. In a future change, hblank could also be
controlled in userland to give over 200 seconds of exposure time.
Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
media: i2c: imx477: Fix crop height for 2028x1080 mode
The crop height for this mode was set at 2600 lines, it should be 2160
lines instead.
Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
media: i2c: imx477: Replace existing 1012x760 mode
The existing 1012x760 120 fps mode has significant IQ problem using
the internal sensor scaler. Replace this mode with a 1332x990 120 fps
mode instead. This new mode has a smaller field of view, but does not
suffer from the bad IQ of the original mode.
Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
media: i2c: imx477: Remove internal v4l2_mbus_framefmt from the state
The only field in this struct that is used is the format code, so
replace the struct with this single field.
Save the format code in imx477_set_pad_format() when setting up a new
mode so that imx477_get_pad_format() performs the right lookup.
Otherwise, this caused a bug where the mode lookup occurred on the
12-bit table rather than the 10-bit table.
Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
media: i2c: imx477: Remove unused function parameter
The struct imx477 *ctrl parameter is not used in the function
imx477_adjust_exposure_range(), so remove it.
Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
media: i2c: imx477: Fix for long exposure limit calculations
Do not scale IMX477_EXPOSURE_OFFSET with the long exposure factor during
the limit calculations. This allows larger exposure times, and does seem to be
what the sensor is doing internally.
Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
media: i2c: imx477: Extend driver to support imx378 sensor
The imx378 sensor is almost identical to the imx477 and can be
supported as a "compatible" sensor with just a few extra register
writes.
Signed-off-by: David Plowman <david.plowman@raspberrypi.com>
media: i2c: imx477: Fix framerates for 1332x990 mode
The imx477 driver's line length for this mode had not been updated to
the value supplied to us by the sensor manufacturer. With this
correction the sensor delivers the framerates that are expected.
Signed-off-by: David Plowman <david.plowman@raspberrypi.com>
media: i2c: imx477: Allow control of on-sensor DPC
A module parameter "dpc_enable" is added to allow the control of the
sensor's on-board DPC (Defective Pixel Correction) function.
This is a global setting to be configured before using the sensor;
there is no intention that this would ever be changed on-the-fly.
Signed-off-by: David Plowman <david.plowman@raspberrypi.com>
media: i2c: imx477: Sensor should report RAW color space
Tested on Raspberry Pi running libcamera.
Signed-off-by: David Plowman <david.plowman@raspberrypi.com>
media: i2c: imx477: Add vsync trigger_mode parameter
trigger_mode == 0 (default) => no effect / no registers written
trigger_mode == 1 => source
trigger_mode == 2 => sink
This can be set e.g. in /boot/cmdline.txt as imx477.trigger_mode=N
Signed-off-by: Jonas Jacob <jonas.jacob@neocortexvision.com>
media: i2c: Update imx477 Kconfig entry
Bring the IMX477 Kconfig declaration in line with upstream entries.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
media: i2c: imx477: Correct minimum exposure lines
The minimum number of exposure lines value (IMX477_EXPOSURE_MIN) was
previously 20 but this is not correct. The datasheet is not completely
explicit, however the new value of 4 has been tested with all the
sensor modes supported by this driver, and matches the lowest exposure
value of 114us that could be achieved wtih Raspberry Pi's legacy
firmware driver.
Signed-off-by: David Plowman <david.plowman@raspberrypi.com>
media: i2c: imx477: Allow dynamic horizontal blanking control
Currently, the V4L2_CID_HBLANK control is marked as read-only. Remove this
restriction and allow userland to modify the control if needed.
Set the maximum limit of the line length to 0xfff0.
Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
media: i2c: imx477: Reset hblank on mode switch
Reset the hblank control to the minimum value on every mode switch. This is to
account for userland instances that do not yet control hblank, otherwise it
gets set to a non-optimal value.
Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
media: i2c: imx477: Do not unconditionally adjust hblank and vblank limits
On a mode change, only call imx477_set_framing_limits() to adjust the hblank
and vblank limits if the new mode is different from the existing mode. This
preserves any manual control values the user might have set.
Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
driver: media: i2c: imx477: Re-enable temperature sensor
The temperature sensor enable register write got lost at some point.
Re-enable it.
Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
media: i2c: imx477: Fix locking in imx477_init_controls()
The driver does not lock the imx477 mutex when calling
imx477_set_framing_limits(), leading to:
WARNING: CPU: 3 PID: 426 at drivers/media/v4l2-core/v4l2-ctrls-api.c:934 __v4l2_ctrl_modify_range+0x1a0/0x210 [
videodev]
Fix this by taking the lock.
Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com>
drivers: media: imx477: Disable the scaler
The horizontal scaler was enabled for the 2028x1520 and 2028x1080 modes,
with a scale factor of 1. It caused a single column of bad pixels on the
right edge of the image. Since scaling is not needed for these modes,
disable it entirely.
Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
drivers: media: imx477: Set horizontal binning when disabling the scaler
The horizontal scaler has been disabled but actually the sensor is not
binning horizontally, resulting in images that are stretched 2x
horizontally (missing the right half of the field of view completely).
Therefore we must additionally set the horizontal binning mode. There
is only marginal change in output quality and noise levels.
Signed-off-by: David Plowman <david.plowman@raspberrypi.com>
Fixes: f075893e9b ("drivers: media: imx477: Disable the scaler")
drivers: media: imx477: Add V4L2_CID_LINK_FREQ control
Add V4L2_CID_LINK_FREQ as a read-only control with a value of 450 Mhz.
This will be used by the CFE driver to corretly setup the DPHY timing
parameters in the CSI-2 block.
Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
drivers: media: imx477: Correctly set IMX477_PIXEL_RATE as a r/o control
This control is meant to be read-only, mark it as such.
Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
drivers: media: i2c: imx296,imx477: Configure tigger_mode every time
Don't assume the camera has been reset each time we start streaming,
but always write registers relating to trigger_mode, even in mode 0.
IMX477: Stop driving XVS on stop streaming, to avoid spurious pulses.
Signed-off-by: Nick Hollinghurst <nick.hollinghurst@raspberrypi.com>
media: i2c: imx477: Squash fixes
imx477: make trigger-mode more configurable
Allow trigger-mode to be overridden using device tree so that it can be
set per camera. Previously the mode could only be changed using a module
parameter, which would then affect all cameras.
Signed-off-by: Erik Botö <erik.boto@gmail.com>
drivers: media: imx477: Add V4L2_CID_LINK_FREQ control
Add V4L2_CID_LINK_FREQ as a read-only control with a value of 450 Mhz.
This will be used by the CFE driver to corretly setup the DPHY timing
parameters in the CSI-2 block.
Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
media: i2c: imx477: Add options for slightly modifying the link freq
The default link frequency of 450MHz has been noted to interfere
with GPS if they are in close proximty.
Add the option for 453 and 456MHz to move the signal slightly out
of the band. (447MHz can not be offered as corruption is then observed
on the 133x992 10bit mode).
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
fixup imx477 gps
media: i2c: imx477: Fix link frequency menu
"media: i2c: imx477: Add options for slightly modifying the link freq"
created a link frequency menu with 2 items in instead of one.
Correct this.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
media: i2c: imx477: Add further link frequency options
https://github.com/raspberrypi/linux/issues/6004 reports further
issues with GPS interference.
Untested, but adds further link frequency options.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
media: i2c: imx477: Fix lockdep errors
imx477_get_format_code has a lockdep_assert_held test, however
the call paths from enum_mbus_code and enum_frame_size don't
lock the mutex before calling it.
Add in the relevant mutex locking.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
media: i2c: imx477: Disable temperature sensor when enabling XVS
On IMX477 it appears that the on-chip temperature sensor causes
XVS (external sync out) to pulse every ~2ms when not streaming.
So now we do a little dance: Temperature sensor is enabled during
common register setup, giving it time to warm up (almost literally;
otherwise the first frame's reading might be 0C), disabled before
enabling sync out, then enabled again once the camera is streaming.
We already took care to disable XVS output in stop_streaming()
(though previously it wasn't understood why this was needed).
Signed-off-by: Nick Hollinghurst <nick.hollinghurst@raspberrypi.com>
Some combinations of Pi 4Bs and Ethernet switches don't reliably get a
DCHP-assigned IP address, leaving the unit with a self=assigned 169.254
address. In the failure case, the Pi is left able to receive packets
but not send them, suggesting that the MAC<->PHY link is getting into
a bad state.
It has been found empirically that skipping a reset step by the genet
driver prevents the failures. No downsides have been discovered yet,
and unlike the forced renegotiation it doesn't increase the time to
get an IP address, so the workaround is enabled by default; add
genet.skip_umac_reset=n
to the command line to disable it.
See: https://github.com/raspberrypi/linux/issues/3108
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
These wireless mouse/keyboard combo remote control devices specify
multiple "wheel" events in their report descriptors. The wheel events
are incorrectly defined and apparently map to accelerometer data, leading
to spurious mouse scroll events being generated at an extreme rate when
the device is moved.
As a workaround, use HID_QUIRK_INCREMENT_USAGE_ON_DUPLICATE to mask
feeding the extra wheel events to the input subsystem.
See: https://github.com/raspberrypi/firmware/issues/1189
Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>
The v3d driver currently encounters a lot of MMU PTE exceptions, so
only log the first to avoid swamping the kernel log.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
My various attempts at re-enabling runtime PM have failed, so just
crank the clock down when V3D is idle to reduce power consumption.
Signed-off-by: Eric Anholt <eric@anholt.net>
drm/v3d: Plug dma_fence leak
The irq_fence and done_fence are given a reference that is never
released. The necessary dma_fence_put()s seem to have been
deleted in error in an earlier commit.
Fixes: 0b73676836b2 ("drm/v3d: Clock V3D down when not in use.")
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
v3d_drv: Handle missing clock more gracefully
Signed-off-by: popcornmix <popcornmix@gmail.com>
v3d_gem: Kick the clock so firmware knows we are using firmware clock interface
Setting the v3d clock to low value allows firmware to handle dvfs in case
where v3d hardware is not being actively used (e.g. console use).
Signed-off-by: popcornmix <popcornmix@gmail.com>
drm/v3d: Switch clock setting to new api
Signed-off-by: Dom Cobley <popcornmix@gmail.com>
drm/v3d: Convert to new clock range API
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
drm/v3d: Correct clock settng calls to new APIs
There was a report that 6.12 kernel has lower benchmark
scores than 6.6.
I can confirm, and found it started with 6.8 kernel
which moved some code into a new file (v3d_submit.c)
and in two places the change to the clock api were missed.
The effect of the bug is the v3d clock sometimes
unwantedly drops to a lower rate.
With this patch the benchmark scores are good again.
Fixes: 86963038cb
Signed-off-by: Dom Cobley <popcornmix@gmail.com>
drm/v3d: CPU job submissions shouldn't affect V3D GPU clock
We can avoid calling the v3d_clock_up_put and v3d_clock_up_get
when a job is submitted to a CPU queue. We don't need to change
the V3D core frequency to run a CPU job as it is executed on
the CPU. This way we avoid delaying timestamps CPU jobs by 4.5ms
that is the time that it takes the firmware to increase the V3D
core frequency.
Fixes: fe6a858096 ("drm/v3d: Correct clock settng calls to new APIs")
Signed-off-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Reviewed-by: Maíra Canal <mcanal@igalia.com>
The BCM2835 I2C blocks have a register to set the clock-stretch
timeout - how long the device is allowed to hold SCL low - in bus
cycles. The current driver doesn't write to the register, therefore
the default value of 64 cycles is being used for all devices.
Set the timeout to the value recommended for SMBus - 35ms.
See: https://github.com/raspberrypi/linux/issues/3064
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
i2c: bcm2835: Make clock-stretch timeout configurable
The default clock-stretch timeout is 35 mS, which works well for
SMBus, but there are some I2C devices which can stretch the clock even
longer. Rather than trying to prescribe a safe default for everyone,
allow the timeout to be configured.
Signed-off-by: Alex Crawford <raspberrypi/linux@code.acrawford.com>
i2c-bcm2835: Flush FIFOs cleanly on error
On error condition, note the error return code, but still
handle the FIFOs in the normal way rather than relying on
C_CLEAR flushing everything cleanly.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
i2c-bcm2835: Do not abort transfers on ERR if still active
If a transaction is aborted immediately on ERR being reported,
then the bus is not returned to the STOP condition, and devices
generally get very upset.
Handle the ERR and CLKT conditions only when TA is not set.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
i2c-bcm2835: Implement I2C_M_IGNORE_NAK
Now that transfers aren't aborted immediately (and uncleanly) on
errors, and the FIFOs are always drained after all transfers, we
can implement I2C_M_IGNORE_NAK by ignoring the returned error
value.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
Make the BCM2711 a different machine, but keep it in board_bcm2835.
Signed-off-by: Stefan Wahren <wahrenst@gmx.net>
arm: bcm2835: Add bcm2838 compatible string.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
ARM: bcm: Switch board, clk and pinctrl to bcm2711 compatible
After the decision to use bcm2711 compatible for upstream, we should
switch all accepted compatibles to bcm2711. So we can boot with
one DTB the down- and the upstream kernel.
Signed-off-by: Stefan Wahren <wahrenst@gmx.net>
Setting both the Drop and Add bits on the input context prevents the
corruption of split transactions seen with the BCM2711 XHCI controller,
which is a dwc3 variant.
This is a downstream feature that allows usbhid to restrict polling
intervals on mice and keyboards, and was only tested on a VL805 which
didn't complain about the fact the endpoint got added twice.
Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
See https://github.com/raspberrypi/linux/issues/3981
An unknown unsafe memory access can result in the ep_state variable
in xhci_virt_ep being trampled with a stuck SET_DEQ_PENDING state
despite successful completion of a Set TR Deq Pointer command.
All URB enqueue/dequeue calls for the endpoint will fail in this state
so no transfers are possible until the device is reconnected.
As a workaround, clear the flag if we see it set and issue a new Set
TR Deq command anyway - this should be harmless, as a prior Set TR Deq
command will only have been issued in the Stopped state, and if the
endpoint is Running then the controller is required to ignore it and
respond with a Context State Error event TRB.
Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
Must be called in a non-atomic context, after the endpoint
has been registered with the hardware via xhci_add_endpoint
and before the first URB is submitted for the endpoint.
Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>
xHCI caches device and endpoint data after the interface is configured,
so an explicit command needs to be issued for any device driver wanting
to alter the polling interval of an endpoint.
Add usb_fixup_endpoint() to allow drivers to do this. The fixup must be
called after calculating endpoint bandwidth requirements but before any
URBs are submitted.
If polling intervals are shortened, any bandwidth reservations are no
longer valid but in practice polling intervals are only ever relaxed.
Limit the scope to interrupt transfers for now.
Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>
There are several warts surrounding bcmgenet_mii_probe() as this
function is called from ndo_open, but it's calling registration-type
functions. The probe should be called at probe time and refactored
such that the PHY device data can be extracted to limit the scope
of this flag to Broadcom PHYs.
For now, pass this flag in as it puts our attached PHY into a low-power
state when disconnected.
Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>
Set defaults for TX and RX packet coalescing to be equivalent to:
# ethtool -C eth0 tx-frames 10
# ethtool -C eth0 rx-usecs 50
This may be something we want to set via DT parameters in the
future.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
The HWRNG on the BCM2838 is compatible to iproc-rng200, so add the
support to this driver instead of bcm2835-rng.
Signed-off-by: Stefan Wahren <wahrenst@gmx.net>
hwrng: iproc-rng200: Correct SoC name
The Pi 4 SoC is called BCM2711, not BCM2838.
Fixes: "hwrng: iproc-rng200: Add BCM2838 support"
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
The legacy peripherals can only address the first gigabyte of RAM, so
ensure that DMA allocations are restricted to that region.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
The ioremapping creates mappings within the vmalloc area. The
equivalent early function, create_mapping, now checks that the
requested explicit virtual address is between VMALLOC_START and
VMALLOC_END. As there is no reason to have any correlation between
the physical and virtual addresses, put the required mappings at
VMALLOC_START and above.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
The logic to drive the data line high to implement a strong pullup
assumed that the pin was already an output - setting a value does
not change an input.
See: https://github.com/raspberrypi/firmware/issues/1143
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
drivers: w1-gpio: add flag to force read-polling while delaying
On Pi 5, the link to RP1 will bounce in and out of L1 depending on
inactivity timers at both the RC and EP end. Unfortunately for
bitbashing 1-wire, this means that on an otherwise idle Pi 5 many of the
reads/writes to GPIO registers are delayed by up to 8us which causes
mis-sampling of read data and trashes write bits.
By issuing dummy reads at a rate greater than the link inactivity
timeout while spinning on a delay, PCIe stays in L0 which does not incur
additional latency.
Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
drivers: w1-gpio: Fixup uninitialised variable use in w1_gpio_probe
Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
On error, vchiq_mmal_component_init could leave the
event context allocated for ports.
Clean them up in the error path.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>
vchiq_mmal_component_init calls init_event_context for the
control port, but vchiq_mmal_component_finalise didn't free
it, causing a memory leak..
Add the free call.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>
mmal_parameters.h hasn't been updated to reflect additions made
over the last few years. Update it to reflect the currently
supported parameters.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>
The list of formats was copied before Bayer support was added.
The ISP supports Bayer and is being supported by the bcm2835_codec
driver, so add in the encodings for them.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>
The MMAL client_component field is used with the event
mechanism to allow the client to identify the component for
which the event is generated.
The field is only 32bits in size, therefore we can't use a
pointer to the component in a 64 bit kernel.
Component handles are already held in an array per VCHI
instance, so use the array index as the client_component handle
to avoid having to create a new IDR for this purpose.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>
videobuf2 only allowed exporting a dmabuf as a file descriptor,
but there are instances where having the struct dma_buf is
useful within the kernel.
Split the current implementation into two, one step which
exports a struct dma_buf, and the second which converts that
into an fd.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>
Add the ability to send data to ports. This only supports
zero copy mode as the required bulk transfer setup calls
are not done.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>
(Preparation for the codec driver).
The codec uses the event mechanism to report things such as
resolution changes. It is signalled by the cmd field of the buffer
being non-zero.
Add support for passing this information out to the client.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>
Fixes up a checkpatch error "Avoid using bool structure members
because of possible alignment issues".
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>
Document the DT bindings for the CSI2/CCP2 receiver peripheral
(known as Unicam) on BCM283x SoCs.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>
Acked-by: Rob Herring <robh@kernel.org>
dt-bindings: bcm2835-unicam: Update documentation with new clock params
Update the documentation to reflect the new "VPU" clock needed
by the bcm2835-unicam driver.
Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
New helper defines that allow printing of a FOURCC using
printf(V4L2_FOURCC_CONV, V4L2_FOURCC_CONV_ARGS(fourcc));
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>
The ADV7282M can support YPbPr on AIN1-3, but this was
not selectable from the driver. Add it to the list of
supported input modes.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>
The hardware default is differential CVBS on AIN1 & 2, which
isn't very useful.
Select the first input that is defined as valid for the
chip variant (typically CVBS_AIN1).
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>
The bInterval is set to 4 (i.e. 8 microframes => 1ms) and the only bit
that the driver pays attention to is "link was reset". If there's a
flapping status bit in that endpoint data, (such as if PHY negotiation
needs a few tries to get a stable link) then polling at a slower rate
would act as a de-bounce.
See: https://github.com/raspberrypi/linux/issues/2447
Ethernet cables with faulty or missing pairs (specifically pairs C and
D) allow auto-negotiation to 1000Mbs, but do not support the successful
establishment of a link. Add a DT property, "microchip,downshift-after",
to configure the number of auto-negotiation failures after which it
falls back to 100Mbs. Valid values are 2, 3, 4, 5 and 0, where 0 means
never downshift.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
The driver already reported the firmware build date during probe.
The mailbox calls have been extended to also report the variant
1 = standard start.elf
2 = start_x.elf (includes camera stack)
3 = start_db.elf (includes assert logging)
4 = start_cd.elf (cutdown version for smallest memory footprint).
Log the variant during probe.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>
firmware: raspberrypi: Report the fw git hash during probe
The firmware can now report the git hash from which it was built
via the mailbox, so report it during probe.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>
ad83c7cb2f ("irqchip/irq-bcm2836: Add support for DT interrupt polarity")
changed the way that the BCM2836/7 local interrupts are mapped; instead
of being pre-mapped they are now mapped on-demand. A side effect of this
change is that the call to irq_of_parse_and_map from armctrl_of_init
creates a new mapping, forming a gap between the IRQs and the FIQs. This
gap breaks the FIQ<->IRQ mapping which up to now has been done by assuming:
1) that the value of FIQ_START is the same as the number of normal IRQs
that will be mapped (still true), and
2) that this value is also the offset between an IRQ and its equivalent
FIQ (which is no longer the case).
Remove both assumptions by measuring the interval between the last IRQ
and the last FIQ, passing it as the parameter to init_FIQ().
Fixes: https://github.com/raspberrypi/linux/issues/2432
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
Register for reboot notifications, sending RPI_FIRMWARE_NOTIFY_REBOOT
over the mailbox interface on reception.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
Reduces overhead when using X
usbhid: call usb_fixup_endpoint after mangling intervals
Lets the mousepoll override mechanism work with xhci.
Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>
I2C busses can be assigned specific bus numbers using aliases in
Device Tree - string properties where the name is the alias and the
value is the path to the node. The current DT parameter mechanism
does not allow property names to be derived from a parameter value
in any way, so it isn't possible to generate unique or matching
aliases for nodes from an overlay that can generate multiple
instances, e.g. i2c-gpio.
Work around this limitation (at least temporarily) by allowing
the i2c adapter number to be initialised from the "reg" property
if present.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
There is a new test in __irq_startup that the IRQ is activated, which
hasn't been the case for FIQs since they bypass some of the usual setup.
Augment enable_fiq to include a call to irq_activate to avoid the
warning.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
Create a semi-static mapping for the USB registers early in the boot
process, before additional kernel threads are started, so all threads
will have the mappings from the start. This avoids the need for
data aborts to lazily update them.
See: https://github.com/raspberrypi/linux/issues/2450
Signed-off-by: Floris Bos <bos@je-eigen-domein.nl>
The VideoCore bootloader passes in Serial number and
Revision number through Device Tree. Make these available to
userspace through /proc/cpuinfo.
Mainline status:
There is a commit in linux-next that standardize passing the serial
number through Device Tree (string: /serial-number):
ARM: 8355/1: arch: Show the serial number from devicetree in cpuinfo
There was an attempt to do the same with the revision number, but it
didn't get in:
[PATCH v2 1/2] arm: devtree: Set system_rev from DT revision
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
Uses the debugfs I/F to provide access to the AXI
bus performance monitors.
Requires the new mailbox peripheral access for access
to the VPU performance registers, system bus access
is done using direct register reads.
Signed-off-by: James Hughes <james.hughes@raspberrypi.org>
raspberrypi_axi_monitor: suppress warning
Suppress the following warning by casting the pointer to and uintptr_t
before to u32:
Signed-off-by: Matteo Croce <mcroce@redhat.com>
perf/raspberry: Add support for 2712 axi performance monitors
Also handle 2711 correctly which has a different configuration
from 2835.
Signed-off-by: Dom Cobley <popcornmix@gmail.com>
IRQ-CPU mapping is round robined on ARM64 to increase
concurrency and allow multiple interrupts to be serviced
at a time. This reduces the need for FIQ.
Signed-off-by: Michael Zoran <mzoran@crowfest.net>
drivers: irqchip: irq-bcm2835: Concurrency fix
The commit shown in Fixes: aims to improve interrupt throughput by
getting the handlers invoked on different CPU cores. It does so (*) by
using an irq_ack hook to change the interrupt routing.
Unfortunately, the IRQ status bits must be cleared at source, which only
happens once the interrupt handler has run - there is no easy way for
one core to claim one of the IRQs before sending the remainder to the
next core on the list, so waking another core immediately results in a
race with a chance of both cores handling the same IRQ. It is probably
for this reason that the routing change is deferred to irq_ack, but that
doesn't guarantee no clashes - after irq_ack is called, control returns
to bcm2836_chained_handler_irq which proceeds to check for other pending
IRQs at a time when the next core is probably doing the same thing.
Since the whole point of the original commit is to distribute the IRQ
handling, there is no reason to attempt to handle multiple IRQs in one
interrupt callback, so the problem can be solved (or at least made much
harder to reproduce) by changing a "while" into an "if", so that each
invocation only handles one IRQ.
(*) I'm not convinced it's as effective as claimed since irq_ack is
called _after_ the interrupt handler, but the author thought it made a
difference.
See: https://github.com/raspberrypi/linux/issues/5214https://github.com/raspberrypi/linux/pull/1794
Fixes: fd4c9785bd ("ARM64: Round-Robin dispatch IRQs between CPUs.")
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
irqchip: irq-bcm2836: Avoid prototype warning
Declare bcm2836_arm_irqchip_spin_gpu_irq in irq-bcm2836.h to avoid a
compiler warning about a missing prototype.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
This is a port of Pantelis Antoniou's v3 port that makes use of the
new upstreamed configfs support for binary attributes.
Original commit message:
Add a runtime interface to using configfs for generic device tree overlay
usage. With it its possible to use device tree overlays without having
to use a per-platform overlay manager.
Please see Documentation/devicetree/configfs-overlays.txt for more info.
Changes since v2:
- Removed ifdef CONFIG_OF_OVERLAY (since for now it's required)
- Created a documentation entry
- Slight rewording in Kconfig
Changes since v1:
- of_resolve() -> of_resolve_phandles().
Originally-signed-off-by: Pantelis Antoniou <pantelis.antoniou@konsulko.com>
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
DT configfs: Fix build errors on other platforms
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
DT configfs: fix build error
There is an error when compiling rpi-4.6.y branch:
CC drivers/of/configfs.o
drivers/of/configfs.c:291:21: error: initialization from incompatible pointer type [-Werror=incompatible-pointer-types]
.default_groups = of_cfs_def_groups,
^
drivers/of/configfs.c:291:21: note: (near initialization for 'of_cfs_subsys.su_group.default_groups.next')
The .default_groups is linked list since commit
1ae1602de0.
This commit uses configfs_add_default_group to fix this problem.
Signed-off-by: Slawomir Stepien <sst@poczta.fm>
configfs: New of_overlay API
of: configfs: Use of_overlay_fdt_apply API call
The published API to the dynamic overlay application mechanism now
takes a Flattened Device Tree blob as input so that it can manage the
lifetime of the unflattened tree. Conveniently, the new API call -
of_overlay_fdt_apply - is virtually a drop-in replacement for
create_overlay, which can now be deleted.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
Add a virtual GPIO driver that uses the firmware mailbox interface to
request that the VPU toggles LEDs.
gpio: bcm-virt: Fix the get() method
The get() method does not understand the on-the-wire encoding of the
remote GPIO states, thinking they are simple on/off bits when they are
really pairs of 16-bit counts. Rewrite the get() handler to return the
value last written, which will eventually match the actual GPIO state
if there are no other changes.
See: https://github.com/raspberrypi/linux/issues/4638
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
bcm2835-virtgpio: Update for Linux 6.6
The gpio subsystem is happier if the gpiochip is given a parent, and
if it doesn't have a fixed base gpio number. While we're in here,
use the fact that the firmware node is the parent to locate it,
and use the devm_ version of rpi_firmware_get.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
Add a mailbox-driven backlight controller for the Raspberry Pi DSI
touchscreen display. Requires updated GPU firmware to recognise the
mailbox request.
Signed-off-by: Gordon Hollingworth <gordon@raspberrypi.org>
Add Raspberry Pi firmware driver to the dependencies of backlight driver
Otherwise the backlight driver fails to build if the firmware
loading driver is not in the kernel
Signed-off-by: Alex Riesen <alexander.riesen@cetitec.com>
ASoC: Add support for Rpi-DAC
ASoC: Add prompt for ICS43432 codec
Without a prompt string, a config setting can't be included in a
defconfig. Give CONFIG_SND_SOC_ICS43432 a prompt so that Pi soundcards
can use the driver.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
Add IQaudIO Sound Card support for Raspberry Pi
Set a limit of 0dB on Digital Volume Control
The main volume control in the PCM512x DAC has a range up to
+24dB. This is dangerously loud and can potentially cause massive
clipping in the output stages. Therefore this sets a sensible
limit of 0dB for this control.
Allow up to 24dB digital gain to be applied when using IQAudIO DAC+
24db_digital_gain DT param can be used to specify that PCM512x
codec "Digital" volume control should not be limited to 0dB gain,
and if specified will allow the full 24dB gain.
Modify IQAudIO DAC+ ASoC driver to set card/dai config from dt
Add the ability to set the card name, dai name and dai stream name, from
dt config.
Signed-off-by: DigitalDreamtime <clive.messer@digitaldreamtime.co.uk>
IQaudIO: auto-mute for AMP+ and DigiAMP+
IQAudIO amplifier mute via GPIO22. Add dt params for "one-shot" unmute
and auto mute.
Revision 2, auto mute implementing HiassofT suggestion to mute/unmute
using set_bias_level, rather than startup/shutdown....
"By default DAPM waits 5 seconds (pmdown_time) before shutting down
playback streams so a close/stop immediately followed by open/start
doesn't trigger an amp mute+unmute."
Tested on both AMP+ (via DAC+) and DigiAMP+, with both options...
dtoverlay=iqaudio-dacplus,unmute_amp
"one-shot" unmute when kernel module loads.
dtoverlay=iqaudio-dacplus,auto_mute_amp
Unmute amp when ALSA device opened by a client. Mute, with 5 second delay
when ALSA device closed. (Re-opening the device within the 5 second close
window, will cancel mute.)
Revision 4, using gpiod.
Revision 5, clean-up formatting before adding mute code.
- Convert tab plus 4 space formatting to 2x tab
- Remove '// NOT USED' commented code
Revision 6, don't attempt to "one-shot" unmute amp, unless card is
successfully registered.
Signed-off-by: DigitalDreamtime <clive.messer@digitaldreamtime.co.uk>
ASoC: iqaudio-dac: fix S24_LE format
Remove set_bclk_ratio call so 24-bit data is transmitted in
24 bclk cycles.
Signed-off-by: Matthias Reichl <hias@horus.com>
ASoC: iqaudio-dac: use modern dai_link style
Signed-off-by: Matthias Reichl <hias@horus.com>
Added support for HiFiBerry DAC+
The driver is based on the HiFiBerry DAC driver. However HiFiBerry DAC+ uses
a different codec chip (PCM5122), therefore a new driver is necessary.
Add support for the HiFiBerry DAC+ Pro.
The HiFiBerry DAC+ and DAC+ Pro products both use the existing bcm sound driver with the DAC+ Pro having a special clock device driver representing the two high precision oscillators.
An addition bug fix is included for the PCM512x codec where by the physical size of the sample frame is used in the calculation of the LRCK divisor as it was found to be wrong when using 24-bit depth sample contained in a little endian 4-byte sample frame.
Limit PCM512x "Digital" gain to 0dB by default with HiFiBerry DAC+
24db_digital_gain DT param can be used to specify that PCM512x
codec "Digital" volume control should not be limited to 0dB gain,
and if specified will allow the full 24dB gain.
Add dt param to force HiFiBerry DAC+ Pro into slave mode
"dtoverlay=hifiberry-dacplus,slave"
Add 'slave' param to use HiFiBerry DAC+ Pro in slave mode,
with Pi as master for bit and frame clock.
Signed-off-by: DigitalDreamtime <clive.messer@digitaldreamtime.co.uk>
Fixed a bug when using 352.8kHz sample rate
Signed-off-by: Daniel Matuschek <daniel@hifiberry.com>
ASoC: pcm512x: revert downstream changes
This partially reverts commit 185ea05465
which was added by https://github.com/raspberrypi/linux/pull/1152
The downstream pcm512x changes caused a regression, it broke normal
use of the 24bit format with the codec, eg when using simple-audio-card.
The actual bug with 24bit playback is the incorrect usage
of physical_width in various drivers in the downstream tree
which causes 24bit data to be transmitted with 32 clock
cycles. So it's not the pcm512x that needs fixing, it's the
soundcard drivers.
Signed-off-by: Matthias Reichl <hias@horus.com>
ASoC: hifiberry_dacplus: fix S24_LE format
Remove set_bclk_ratio call so 24-bit data is transmitted in
24 bclk cycles.
Signed-off-by: Matthias Reichl <hias@horus.com>
ASoC: hifiberry_dacplus: transmit S24_LE with 64 BCLK cycles
Signed-off-by: Matthias Reichl <hias@horus.com>
hifiberry_dacplus: switch to snd_soc_dai_set_bclk_ratio
Signed-off-by: Matthias Reichl <hias@horus.com>
ASoC: hifiberry_dacplus: use modern dai_link style
Signed-off-by: Hui Wang <hui.wang@canonical.com>
Add driver for rpi-proto
Forward port of 3.10.x driver from https://github.com/koalo
We are using a custom board and would like to use rpi 3.18.x
kernel. Patch works fine for our embedded system.
URL to the audio chip:
http://www.mikroe.com/add-on-boards/audio-voice/audio-codec-proto/
Playback tested with devicetree enabled.
Signed-off-by: Waldemar Brodkorb <wbrodkorb@conet.de>
ASoC: rpi-proto: use modern dai_link style
Signed-off-by: Hui Wang <hui.wang@canonical.com>
Add Support for JustBoom Audio boards
justboom-dac: Adjust for ALSA API change
As of 4.4, snd_soc_limit_volume now takes a struct snd_soc_card *
rather than a struct snd_soc_codec *.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
ASoC: justboom-dac: fix S24_LE format
Remove set_bclk_ratio call so 24-bit data is transmitted in
24 bclk cycles.
Also remove hw_params as it's no longer needed.
Signed-off-by: Matthias Reichl <hias@horus.com>
ASoC: justboom-dac: use modern dai_link style
Signed-off-by: Matthias Reichl <hias@horus.com>
New AudioInjector.net Pi soundcard with low jitter audio in and out.
Contains the sound/soc/bcm ALSA machine driver and necessary alterations to the Kconfig and Makefile.
Adds the dts overlay and updates the Makefile and README.
Updates the relevant defconfig files to enable building for the Raspberry Pi.
Thanks to Phil Elwell (pelwell) for the review, simple-card concepts and discussion. Thanks to Clive Messer for overlay naming suggestions.
Added support for headphones, microphone and bclk_ratio settings.
This patch adds headphone and microphone capability to the Audio Injector sound card. The patch also sets the bit clock ratio for use in the bcm2835-i2s driver. The bcm2835-i2s can't handle an 8 kHz sample rate when the bit clock is at 12 MHz because its register is only 10 bits wide which can't represent the ch2 offset of 1508. For that reason, the rate constraint is added.
ASoC: audioinjector-pi-soundcard: use modern dai_link style
Signed-off-by: Hui Wang <hui.wang@canonical.com>
New driver for RRA DigiDAC1 soundcard using WM8741 + WM8804
ASoC: digidac1-soundcard: use modern dai_link style
Signed-off-by: Hui Wang <hui.wang@canonical.com>
Add support for Dion Audio LOCO DAC-AMP HAT
Using dedicated machine driver and pcm5102a codec driver.
Signed-off-by: DigitalDreamtime <clive.messer@digitaldreamtime.co.uk>
ASoC: dionaudio_loco: use modern dai_link style
Signed-off-by: Hui Wang <hui.wang@canonical.com>
Allo Piano DAC boards: Initial 2 channel (stereo) support (#1645)
Add initial 2 channel (stereo) support for Allo Piano DAC (2.0/2.1) boards,
using allo-piano-dac-pcm512x-audio overlay and allo-piano-dac ALSA ASoC
machine driver.
NB. The initial support is 2 channel (stereo) ONLY!
(The Piano DAC 2.1 will only support 2 channel (stereo) left/right output,
pending an update to the upstream pcm512x codec driver, which will have
to be submitted via upstream. With the initial downstream support,
provided by this patch, the Piano DAC 2.1 subwoofer outputs will
not function.)
Signed-off-by: Baswaraj K <jaikumar@cem-solutions.net>
Signed-off-by: Clive Messer <clive.messer@digitaldreamtime.co.uk>
Tested-by: Clive Messer <clive.messer@digitaldreamtime.co.uk>
ASoC: allo-piano-dac: fix S24_LE format
Remove set_bclk_ratio call so 24-bit data is transmitted in
24 bclk cycles.
Also remove hw_params and ops as they are no longer needed.
Signed-off-by: Matthias Reichl <hias@horus.com>
ASoC: allo-piano-dac: use modern dai_link style
Signed-off-by: Hui Wang <hui.wang@canonical.com>
Add support for Allo Piano DAC 2.1 plus add-on board for Raspberry Pi.
The Piano DAC 2.1 has support for 4 channels with subwoofer.
Signed-off-by: Baswaraj K <jaikumar@cem-solutions.net>
Reviewed-by: Vijay Kumar B. <vijaykumar@zilogic.com>
Reviewed-by: Raashid Muhammed <raashidmuhammed@zilogic.com>
Add clock changes and mute gpios (#1938)
Also improve code style and adhere to ALSA coding conventions.
Signed-off-by: Baswaraj K <jaikumar@cem-solutions.net>
Reviewed-by: Vijay Kumar B. <vijaykumar@zilogic.com>
Reviewed-by: Raashid Muhammed <raashidmuhammed@zilogic.com>
PianoPlus: Dual Mono & Dual Stereo features added (#2069)
allo-piano-dac-plus: Master volume added + fixes
Master volume added, which controls both DACs volumes.
See: https://github.com/raspberrypi/linux/pull/2149
Also fix initial max volume, default mode value, and unmute.
Signed-off-by: allocom <sparky-dev@allo.com>
ASoC: allo-piano-dac-plus: fix S24_LE format
Remove set_bclk_ratio call so 24-bit data is transmitted in
24 bclk cycles.
Signed-off-by: Matthias Reichl <hias@horus.com>
sound: bcm: Fix memset dereference warning
This warning appears with GCC 6.4.0 from toolchains.bootlin.com:
../sound/soc/bcm/allo-piano-dac-plus.c: In function ‘snd_allo_piano_dac_init’:
../sound/soc/bcm/allo-piano-dac-plus.c:711:30: warning: argument to ‘sizeof’ in ‘memset’ call is the same expression as the destination; did you mean to dereference it? [-Wsizeof-pointer-memaccess]
memset(glb_ptr, 0x00, sizeof(glb_ptr));
^
Suggested-by: Phil Elwell <phil@raspberrypi.org>
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
ASoC: allo-piano-dac-plus: use modern dai_link style
Signed-off-by: Hui Wang <hui.wang@canonical.com>
Add support for Allo Boss DAC add-on board for Raspberry Pi. (#1924)
Signed-off-by: Baswaraj K <jaikumar@cem-solutions.net>
Reviewed-by: Deepak <deepak@zilogic.com>
Reviewed-by: BabuSubashChandar <babusubashchandar@zilogic.com>
Add support for new clock rate and mute gpios.
Signed-off-by: Baswaraj K <jaikumar@cem-solutions.net>
Reviewed-by: Deepak <deepak@zilogic.com>
Reviewed-by: BabuSubashChandar <babusubashchandar@zilogic.com>
ASoC: allo-boss-dac: fix S24_LE format
Remove set_bclk_ratio call so 24-bit data is transmitted in
24 bclk cycles.
Signed-off-by: Matthias Reichl <hias@horus.com>
ASoC: allo-boss-dac: transmit S24_LE with 64 BCLK cycles
Signed-off-by: Matthias Reichl <hias@horus.com>
allo-boss-dac: switch to snd_soc_dai_set_bclk_ratio
Signed-off-by: Matthias Reichl <hias@horus.com>
ASoC: allo-boss-dac: use modern dai_link style
Signed-off-by: Hui Wang <hui.wang@canonical.com>
Support for Blokas Labs pisound board
Pisound dynamic overlay (#1760)
Restructuring pisound-overlay.dts, so it can be loaded and unloaded dynamically using dtoverlay.
Print a logline when the kernel module is removed.
pisound improvements:
* Added a writable sysfs object to enable scripts / user space software
to blink MIDI activity LEDs for variable duration.
* Improved hw_param constraints setting.
* Added compatibility with S16_LE sample format.
* Exposed some simple placeholder volume controls, so the card appears
in volumealsa widget.
Add missing SND_PISOUND selects dependency to SND_RAWMIDI
Without it the Pisound module fails to compile.
See https://github.com/raspberrypi/linux/issues/2366
Updates for Pisound module code:
* Merged 'Fix a warning in DEBUG builds' (1c8b82b).
* Updating some strings and copyright information.
* Fix for handling high load of MIDI input and output.
* Use dual rate oversampling ratio for 96kHz instead of single
rate one.
Signed-off-by: Giedrius Trainavicius <giedrius@blokas.io>
Fixing memset call in pisound.c
Signed-off-by: Giedrius Trainavicius <giedrius@blokas.io>
Fix for Pisound's MIDI Input getting blocked for a while in rare cases.
There was a possible race condition which could lead to Input's FIFO queue
to be underflown, causing high amount of processing in the worker thread for
some period of time.
Signed-off-by: Giedrius Trainavicius <giedrius@blokas.io>
Fix for Pisound kernel module in Real Time kernel configuration.
When handler of data_available interrupt is fired, queue_work ends up
getting called and it can block on a spin lock which is not allowed in
interrupt context. The fix was to run the handler from a thread context
instead.
Pisound: Remove spinlock usage around spi_sync
ASoC: pisound: use modern dai_link style
Signed-off-by: Hui Wang <hui.wang@canonical.com>
ASoC: pisound: fix the parameter for spi_device_match
Signed-off-by: Hui Wang <hui.wang@canonical.com>
ASoC: Add driver for Cirrus Logic Audio Card
Note: due to problems with deferred probing of regulators
the following softdep should be added to a modprobe.d file
softdep arizona-spi pre: arizona-ldo1
Signed-off-by: Matthias Reichl <hias@horus.com>
ASoC: rpi-cirrus: use modern dai_link style
Signed-off-by: Matthias Reichl <hias@horus.com>
sound: Support for Dion Audio LOCO-V2 DAC-AMP HAT
Signed-off-by: Miquel Blauw <info@dionaudio.nl>
ASoC: dionaudio_loco-v2: fix S24_LE format
Remove set_bclk_ratio call so 24-bit data is transmitted in
24 bclk cycles.
Also remove hw_params and ops as they are no longer needed.
Signed-off-by: Matthias Reichl <hias@horus.com>
ASoC: dionaudio_loco-v2: use modern dai_link style
Signed-off-by: Hui Wang <hui.wang@canonical.com>
Add support for Fe-Pi audio sound card. (#1867)
Fe-Pi Audio Sound Card is based on NXP SGTL5000 codec.
Mechanical specification of the board is the same the Raspberry Pi Zero.
3.5mm jacks for Headphone/Mic, Line In, and Line Out.
Signed-off-by: Henry Kupis <fe-pi@cox.net>
ASoC: fe-pi-audio: use modern dai_link style
Signed-off-by: Hui Wang <hui.wang@canonical.com>
Add support for the AudioInjector.net Octo sound card
AudioInjector Octo: sample rates, regulators, reset
This patch adds new sample rates to the Audioinjector Octo sound card. The
new supported rates are (in kHz) :
96, 48, 32, 24, 16, 8, 88.2, 44.1, 29.4, 22.05, 14.7
Reference the bcm270x DT regulators in the overlay.
This patch adds a reset GPIO for the AudioInjector.net octo sound card.
Audioinjector octo : Make the playback and capture symmetric
This patch ensures that the sample rate and channel count of the audioinjector
octo sound card are symmetric.
audioinjector-octo: Add continuous clock feature
By user request, add a switch to prevent the clocks being stopped when
the stream is paused, stopped or shutdown. Provide access to the switch
by adding a 'non-stop-clocks' parameter to the audioinjector-addons
overlay.
See: https://github.com/raspberrypi/linux/issues/2409
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
sound: Fixes for audioinjector-octo under 4.19
1. Move the DT alias declaration to the I2C shim in the cases
where the shim is enabled. This works around a problem caused by a
4.19 commit [1] that generates DT/OF uevents for I2C drivers.
2. Fix the diagnostics in an error path of the soundcard driver to
correctly identify the reason for the failure to load.
3. Move the declaration of the clock node in the overlay outside
the I2C node to avoid warnings.
4. Sort the overlay nodes so that dependencies are only to earlier
fragments, in an attempt to get runtime dtoverlay application to
work (it still doesn't...)
See: https://github.com/Audio-Injector/Octo/issues/14
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
[1] af503716ac ("i2c: core: report OF style module alias for devices registered via OF")
ASoC: audioinjector-octo-soundcard: use modern dai_link style
Signed-off-by: Hui Wang <hui.wang@canonical.com>
Driver support for Google voiceHAT soundcard.
ASoC: googlevoicehat-codec: Use correct device when grabbing GPIO
The fixup for the VoiceHAT in 4.18 incorrectly tried to find the
sdmode GPIO pin under the card device, not the codec device.
This failed, and therefore caused the device probe to fail.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>
ASoC: googlevoicehat-codec: Reformat for kernel coding standards
Fix all whitespace, indentation, and bracing errors.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>
ASoC: googlevoicehat-codec: Make driver function structure const
Make voicehat_component_driver a const structure.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>
ASoC: googlevoicehat-codec: Only convert from ms to jiffies once
Minor optimisation and allows to become checkpatch clean.
A msec value is read out of DT or from a define, and convert once to
jiffies, rather than every time that it is used.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>
Driver and overlay for Allo Katana DAC
Allo Katana DAC: Updated default values
Signed-off-by: Jaikumar <jaikumar@cem-solutions.com>
Added mute stream func
Signed-off-by: Jaikumar <jaikumar@cem-solutions.net>
codecs: Correct Katana minimum volume
Update Katana minimum volume to get the exact 0.5 dB value in each step.
Signed-off-by: Sudeep Kumar <sudeepkumar@cem-solutions.net>
ASoC: Add generic RPI driver for simple soundcards.
The RPI simple sound card driver provides a generic ALSA SOC card driver
supporting a variety of Pi HAT soundcards. The intention is to avoid
the duplication of code for cards that can't be fully supported by
the soc simple/graph cards but are otherwise almost identical.
This initial commit adds support for the ADAU1977 ADC, Google VoiceHat,
HifiBerry AMP, HifiBerry DAC and RPI DAC.
Signed-off-by: Tim Gover <tim.gover@raspberrypi.org>
ASoC: Use correct card name in rpi-simple driver
Use the specific card name from drvdata instead of the snd_rpi_simple
rpi-simple-soundcard: Use nicer driver name "RPi-simple"
Rename the driver from "RPI simple soundcard" to "RPi-simple" so that
the driver name won't be mangled allowing to be used unaltered as the
card conf filename.
ASoC: rpi-simple-soundcard: use modern dai_link style
Signed-off-by: Hui Wang <hui.wang@canonical.com>
ASoC: Add Kconfig and Makefile for sound/soc/bcm
Signed-off-by: popcornmix <popcornmix@gmail.com>
ASoC: Create a generic Pi Hat WM8804 driver
Reduce the amount of duplicated code by creating a generic driver for
Pi Hat digi cards using the WM8804 codec.
This replaces the
Allo DigiOne, Hifiberry Digi/Pro, JustBoom Digi and IQAudIO Digi
dedicate soundcard drivers with a generic driver.
There are no significant changes to the runtime behavior of the drivers
and end users should not have to change any configuration settings
after upgrading.
Minor changes
* Check the return value of snd_soc_component_update_bits
* Added some pr_debug tracing
* Various checkpatch tidyups
* Updated allodigi-one to use use 128FS at > 96 Khz. This appears to
be an omission in the original driver code so followed the Hifiberry
DAC driver approach.
ASoC: rpi-wm8804-soundcard: use modern dai_link style
Signed-off-by: Matthias Reichl <hias@horus.com>
rpi-wm8804-soundcard: drop PWRDN register writes
Since kernel 4.0 the PWRDN register bits are under DAPM
control from the wm8804 driver.
Drop code that modifies that register to avoid interfering
with DAPM.
Signed-off-by: Matthias Reichl <hias@horus.com>
rpi-wm8804-soundcard: configure wm8804 clocks only on rate change
This should avoid clicks when stopping and immediately afterwards
starting a stream with the same samplerate as before.
Signed-off-by: Matthias Reichl <hias@horus.com>
rpi-wm8804-soundcard: Fixed MCLKDIV for Allo Digione
The Allo Digione board wants a fixed MCLKDIV of 256.
See: https://github.com/raspberrypi/linux/issues/3296
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
ASoC: Add support for AudioSense-Pi add-on soundcard
AudioSense-Pi is a RPi HAT based on a TI's TLV320AIC32x4 stereo codec
This hardware provides multiple audio I/O capabilities to the RPi.
The codec connects to the RPi's SoC through the I2S Bus.
The following devices can be connected through a 3.5mm jack
1. Line-In: Plain old audio in from mobile phones, PCs, etc.,
2. Mic-In: Connect a microphone
3. Line-Out: Connect the output to a speaker
4. Headphones: Connect a Headphone w or w/o microphones
Multiple Inputs:
It supports the following combinations
1. Two stereo Line-Inputs and a microphone
2. One stereo Line-Input and two microphones
3. Two stereo Line-Inputs, a microphone and
one mono line-input (with h/w hack)
4. One stereo Line-Input, two microphones and
one mono line-input (with h/w hack)
Multiple Outputs:
Audio output can be routed to the headphones or
speakers (with additional hardware)
Signed-off-by: b-ak <anur.bhargav@gmail.com>
ASoC: audiosense-pi: use modern dai_link style
Signed-off-by: Hui Wang <hui.wang@canonical.com>
Added driver for the HiFiBerry DAC+ ADC (#2694)
Signed-off-by: Daniel Matuschek <daniel@hifiberry.com>
hifiberry_dacplusadc: switch to snd_soc_dai_set_bclk_ratio
Signed-off-by: Matthias Reichl <hias@horus.com>
ASoC: hifiberry_dacplusadc: fix DAI link setup
The driver only defines a single DAI link and the code that tries
to setup the second (non-existent) DAI link looks wrong - using dmic
as a CPU/platform driver doesn't make any sense.
The DT overlay doesn't define a dmic property, so the code was never
executed (otherwise it would have resulted in a memory corruption).
So drop the offending code to prevent issues if a dmic property
should be added to the DT overlay.
Signed-off-by: Matthias Reichl <hias@horus.com>
ASoC: hifiberry_dacplusadc: use modern dai_link style
Signed-off-by: Matthias Reichl <hias@horus.com>
Audiophonics I-Sabre 9038Q2M DAC driver
Signed-off-by: Audiophonics <contact@audiophonics.fr>
ASoC: i-sabre-q2m: use modern dai_link style
Signed-off-by: Hui Wang <hui.wang@canonical.com>
Added IQaudIO Pi-Codec board support (#2969)
Add support for the IQaudIO Pi-Codec board.
Signed-off-by: Gordon <gordon@iqaudio.com>
Fixed 48k timing issue
ASoC: iqaudio-codec: use modern dai_link style
Signed-off-by: Hui Wang <hui.wang@canonical.com>
adds the Hifiberry DAC+ADC PRO version
This adds the driver for the DAC+ADC PRO version of the Hifiberry soundcard with software controlled PCM1863 ADC
Signed-off-by: Joerg Schambacher joerg@i2audio.com
Add Hifiberry DAC+DSP soundcard driver (#3224)
Adds the driver for the Hifiberry DAC+DSP. It supports capture and
playback depending on the DSP firmware.
Signed-off-by: Joerg Schambacher <joerg@i2audio.com>
Allow simultaneous use of JustBoom DAC and Digi
Signed-off-by: Johannes Krude <johannes@krude.de>
Pisound: MIDI communication fixes for scaled down CPU.
* Increased maximum SPI communication speed to avoid running too slow
when the CPU is scaled down and losing MIDI data.
* Keep track of buffer usage in millibytes for higher precision.
Signed-off-by: Giedrius Trainavičius <giedrius@blokas.io>
sound: Add the HiFiBerry DAC+HD version
This adds the driver for the DAC+HD version supporting HiFiBerry's
PCM179x based DACs. It also adds PLL control for clock generation.
Signed-off-by: Joerg Schambacher <joerg@i2audio.com>
Fix master mode settings of HiFiBerry DAC+ADC PRO card (#3424)
This patch fixes the board DAI setting when in master-mode.
Wrong setting could have caused random pop noise.
Signed-off-by: Joerg Schambacher <joerg@i2audio.com>
adds LED OFF feature to HiFiBerry DAC+ADC PRO sound card
This adds a DT overlay parameter 'leds_off' which allows
to switch off the onboard activity LEDs at all times
which has been requested by some users.
Signed-off-by: Joerg Schambacher <joerg@i2audio.com>
adds LED OFF feature to HiFiBerry DAC+ADC sound card
This adds a DT overlay parameter 'leds_off' which allows
to switch off the onboard activity LEDs at all times
which has been requested by some users.
Signed-off-by: Joerg Schambacher <joerg@i2audio.com>
adds LED OFF feature to HiFiBerry DAC+/DAC+PRO sound cards
This adds a DT overlay parameter 'leds_off' which allows
to switch off the onboard activity LEDs at all times
which has been requested by some users.
Signed-off-by: Joerg Schambacher <joerg@i2audio.com>
pisound: Added reading Pisound board hardware revision and exposing it (#3425)
pisound: Added reading Pisound board hardware revision and exposing it in kernel log and sysfs file:
/sys/kernel/pisound/hw_version
Signed-off-by: Giedrius <giedrius@blokas.io>
Added driver for HiFiBerry Amp amplifier add-on board
The driver contains a low-level hardware driver for the TAS5713 and the
drivers for the Raspberry Pi I2S subsystem.
TAS5713: return error if initialisation fails
Existing TAS5713 driver logs errors during initialisation, but does not return
an error code. Therefore even if initialisation fails, the driver will still be
loaded, but won't work. This patch fixes this. I2C communication error will now
reported correctly by a non-zero return code.
HiFiBerry Amp: fix device-tree problems
Some code to load the driver based on device-tree-overlays was missing. This is added by this patch.
According to 5713 pdf doc CLOCK_CTRL is a readonly status register, and it behaves so. Remove useless setting
sound: pcm512x-codec: Adding 352.8kHz samplerate support
sound/soc: only first codec is master in multicodec setup
When using multiple codecs, at most one codec should generate the master
clock. All codecs except the first are therefore configured for slave
mode.
Signed-off-by: Johannes Krude <johannes@krude.de>
ASoC: Fix snd_soc_get_pcm_runtime usage
Commit [1] changed the snd_soc_get_pcm_runtime to take a dai_link
pointer instead of a string. Patch up the downstream drivers to use
the modified API.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
[1] 4468189ff3 ("ASoC: soc-core: find rtd via dai_link pointer at snd_soc_get_pcm_runtime()")
Add support for the AudioInjector.net Isolated sound card
This patch adds support for the Audio Injector Isolated sound card.
Signed-off-by: Matt Flax <flatmax@flatmax.org>
Add support for merus-amp soundcard and ma120x0p codec
Add 96KHz rate support to MA120X0P codec and make enable and mute gpio
pins optional.
Signed-off-by: AMuszkat <ariel.muszkat@gmail.com>
Fixes a problem with clock settings of HiFiBerry DAC+ADC PRO (#3545)
This patch fixes a problem of the re-calculation of
i2s-clock and -parameter settings when only the ADC is activated.
Signed-off-by: Joerg Schambacher <joerg@i2audio.com>
configs: Enable the AD193x codecs
See: https://github.com/raspberrypi/linux/issues/2850
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
Switch to snd_soc_dai_set_bclk_ratio
Replaces obsolete function snd_soc_dai_set_tdm_slot
Signed-off-by: Joerg Schambacher <joerg@i2audio.com>
Enhances the DAC+ driver to control the optional headphone amplifier
Probes on the I2C bus for TPA6130A2, if successful, it sets DT-parameter
'status' from 'disabled' to 'okay' using change_sets to enable
the headphone control.
Signed-off-by: Joerg Schambacher joerg@i2audio.com
Update Allo Piano Dac Driver
Add unique names to the individual dac coded drivers
Remove some of the codec controls that are not used.
Signed-off-by: Paul Hermann <paul@picoreplayer.org>
Fixes an onboard clock detection problem of the PRO versions
Increasing the sleep time after clock selection to 3-4ms
allows the correct detection of all combinations of DAC+ Pro
and DAC+ADC Pro sound cards and the various PI revisions.
Signed-off-by: Joerg Schambacher <joerg@hifiberry.com>
ASoC:ma120x0p: Increase maximum sample rate to 192KHz
Change the maximum sample rate for the amplifier to
192KHz as given in the Infineon specification.
Signed-off-by: Joerg Schambacher <joerg@hifiberry.com>
ASoC: ma120x0p: Remove unnecessary const specifier
Clang warns:
sound/soc/codecs/ma120x0p.c:891:14: warning: duplicate 'const' declaration specifier [-Wduplicate-decl-specifier]
static const SOC_VALUE_ENUM_SINGLE_DECL(pwr_mode_ctrl,
^
./include/sound/soc.h:362:2: note: expanded from macro 'SOC_VALUE_ENUM_SINGLE_DECL'
SOC_VALUE_ENUM_DOUBLE_DECL(name, xreg, xshift, xshift, xmask, xtexts, xvalues)
^
./include/sound/soc.h:359:2: note: expanded from macro 'SOC_VALUE_ENUM_DOUBLE_DECL'
const struct soc_enum name = SOC_VALUE_ENUM_DOUBLE(xreg, xshift_l, xshift_r, xmask, \
^
1 warning generated.
SOC_VALUE_ENUM_DOUBLE_DECL already has a const specifier. Remove the duplicate
const to clean up the warning.
Fixes: 42444979e7 ("Add support for all the downstream rpi sound card drivers")
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
ASoC: bcm: allo-piano-dac-plus: Remove unnecessary const specifiers
Clang warns:
sound/soc/bcm/allo-piano-dac-plus.c:66:14: warning: duplicate 'const' declaration specifier [-Wduplicate-decl-specifier]
static const SOC_ENUM_SINGLE_DECL(allo_piano_mode_enum,
^
./include/sound/soc.h:355:2: note: expanded from macro 'SOC_ENUM_SINGLE_DECL'
SOC_ENUM_DOUBLE_DECL(name, xreg, xshift, xshift, xtexts)
^
./include/sound/soc.h:352:2: note: expanded from macro 'SOC_ENUM_DOUBLE_DECL'
const struct soc_enum name = SOC_ENUM_DOUBLE(xreg, xshift_l, xshift_r, \
^
sound/soc/bcm/allo-piano-dac-plus.c:75:14: warning: duplicate 'const' declaration specifier [-Wduplicate-decl-specifier]
static const SOC_ENUM_SINGLE_DECL(allo_piano_dual_mode_enum,
^
./include/sound/soc.h:355:2: note: expanded from macro 'SOC_ENUM_SINGLE_DECL'
SOC_ENUM_DOUBLE_DECL(name, xreg, xshift, xshift, xtexts)
^
./include/sound/soc.h:352:2: note: expanded from macro 'SOC_ENUM_DOUBLE_DECL'
const struct soc_enum name = SOC_ENUM_DOUBLE(xreg, xshift_l, xshift_r, \
^
sound/soc/bcm/allo-piano-dac-plus.c:96:14: warning: duplicate 'const' declaration specifier [-Wduplicate-decl-specifier]
static const SOC_ENUM_SINGLE_DECL(allo_piano_enum,
^
./include/sound/soc.h:355:2: note: expanded from macro 'SOC_ENUM_SINGLE_DECL'
SOC_ENUM_DOUBLE_DECL(name, xreg, xshift, xshift, xtexts)
^
./include/sound/soc.h:352:2: note: expanded from macro 'SOC_ENUM_DOUBLE_DECL'
const struct soc_enum name = SOC_ENUM_DOUBLE(xreg, xshift_l, xshift_r, \
^
3 warnings generated.
SOC_VALUE_ENUM_DOUBLE_DECL already has a const specifier. Remove the duplicate
const specifiers to clean up the warnings.
Fixes: 42444979e7 ("Add support for all the downstream rpi sound card drivers")
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
rpi-simple-soundcard: Add Dion Audio KIWI streamer
Signed-off-by: Miquel Blauw <miquelblauw@hotmail.com>
rpi-simple-soundcard: adds definitions for the HiFiBerry AMP3 card
Uses Infineon MA120x0 amplifier and supports full sample rate of 192ksps.
Signed-off-by: Joerg Schambacher <joerg@hifiberry.com>
sound: soc: bcm: Added Sound card driver for Dacberry400 Audio card for Raspberry Pi 400
Added Sound card driver for DACberry400 Audio card.
Signed-off-by: Ashish Vara <ashishhvara@gmail.com>
ASoC:ma120x0p: Corrects the volume level display
Fixes the wrongly changed 'limiter volume' display back to -50dB minimum
and sets the correct minimum volume level to -144dB to be aligned with
the controls and display in alsamixer etc.
Signed-off-by: Joerg Schambacher <joerg@hifiberry.com>
ASoC: bcm: Fix Rpi-PROTO and audioinjector.net Pi
As of kernel 5.19 the WM8731 driver has separate I2C and SPI support
modules. Change the Kconfig definitions for the audioinjector.net Pi
and Rpi-PROTO soundcards to select SND_SOC_WM8731_I2C.
See: https://github.com/raspberrypi/linux/issues/5364
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
ASoC: adau1977: Add correct compatible strings
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
ASoC: bcm2835-i2s: Use phys addresses for DAI DMA
Contrary to what struct snd_dmaengine_dai_dma_data suggests, the
configuration of addresses of DMA slave interfaces should be done in
CPU physical addresses.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
rpi sound cards: Fix Codec Zero rate switching
The Raspberry Pi Codec Zero (and IQaudIO Codec) don't notify the DA7213
codec when it needs to change PLL frequencies. As a result, audio can
be played at the wrong rate - play a 48kHz sound immediately after a
44.1kHz sound to see the effect, but in some configurations the codec
can lock into the wrong state and always get some rates wrong.
Add the necessary notification to fix the issue.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
ASoC: dwc: Support set_bclk_ratio
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
ASoC: dwc: Add DMACR handling
Add control of the DMACR register, which is required for paced DMA
(i.e. DREQ) support.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
ASOC: dwc: Improve DMA shutdown
Disabling the I2S interface with outstanding transfers prevents the
DMAC from shutting down, so keep it partially active after a stop.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
ASOC: dwc: Fix 16-bit audio handling
IMO the Synopsys datasheet could be clearer in this area, but it seems
that the DMA data ports (DMATX and DMARX) expect left and right samples
in alternate writes; if a stereo pair is pushed in a single 32-bit
write, the upper half is ignored, leading to double speed audio with a
confused stereo image. Make sure the necessary changes happen by
updating the DMA configuration data in the hw_params method.
The set_bclk_ratio change was made at a time when it looked like it
could be causing an error, but I think the division of responsibilities
is clearer this way (and the kernel log clearer without the info-level
message).
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
ASoC: bcm: Remove dependency on BCM2835 I2S
These soundcard drivers don't rely on a specific I2S interface, so
remove the dependency declarations.
See: https://github.com/raspberrypi/linux-2712/issues/111
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
ASoC: bcm: audioinjector_octo: Add soundcard "owner"
See: https://github.com/raspberrypi/linux/issues/5697
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
Pisound: Don't export the button GPIO via sysfs GPIO class.
Signed-off-by: Giedrius Trainavičius <giedrius@blokas.io>
Pisound: Read out the SPI speed to use from the Device Tree.
Signed-off-by: Giedrius Trainavičius <giedrius@blokas.io>
ASoC: DACplus - fix 16bit sample support in clock consumer mode
The former code did not adjust the physical sample width when
in clock consumer mode and has taken the fixed 32 bit default.
This has caused the audio to be played at half its frequency due to
the fixed bclk_ratio of 64.
Signed-off-by: Joerg Schambacher <joerg@hifiberry.com>
ASoC: adds support for AMP4 Pro to the DAC Plus driver
The AMP4 Pro is a I2S master mode capable amplifier with
clean onboard clock generators.
We can share the card driver between TAS575x amplifiers
and the PCM512x DACs as they are SW compatible.
From a HW perspective though we need to limit the sample
rates to the standard audio rates to avoid running the
onboard clocks through the PLL. Using the PLL would require
even a different HW.
DAI/stream name are also set accordingly to allow the user
a convenient identification of the soundcard
Needs the pcm512x driver with TAS575x support (already in
upstream kernel).
Signed-off-by: Joerg Schambacher <joerg@hifiberry.com>
ASoC: DACplusADCPro - fix 16bit sample support in clock consumer mode
The former code did not adjust the physical sample width when in
clock consumer mode and has taken the fixed 32 bit default. This
has caused the audio to be played at half its frequency due to
the fixed bclk_ratio of 64.
Problem appears only on PI5 as on the former PIs the I2S module
did simply run at fixed 64x rate.
Signed-off-by: Joerg Schambacher <joerg@hifiberry.com>
Impliment driver support for Interlude Audio Digital Hat
Implementing driver support for
Interlude audio's WM8805 based digital hat
by leveraging existing drivers
ASOc: Add HiFiBerry DAC8X to the simple card driver
Defines the settings for the 8 channel version of the standard
DAC by overwriting the number of channels in the DAI defs.
It can run in 8ch mode only on PI5 using the 4 lane data output
of the designware I2S0 module.
Signed-off-by: j-schambacher <joerg@hifiberry.com>
ASoC: bcm: Use the correct sample width value
ALSA's concept of the physical width of a sample is how much memory it
occupies, including any padding. This not the same as the count of bits
of actual sample content. In particular, S24_LE has a width of 24 bits
but a physical width of 32 bits because there is a byte of padding with
each sample.
When calculating bclk_ratio, etc., it is width that matters, not
physical width. Correct the error that has been replicated across the
drivers for many Raspberry Pi-compatible soundcards.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
ASoC: dwc: Correct channel count reporting
The DWC I2S driver treats the channel count register values as if they
encode a power of two (2, 4, 8, 16), but they actually encode a
multiple of 2 (2, 4, 6, 8).
Also improve the error message when asked for an unsupported number
of channels.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
ASoC: Fix 16bit sample support for Hifiberry DACplusADC
Same issue as #5919.
'width' needs to be set independent of clocking mode.
Signed-off-by: j-schambacher <joerg@hifiberry.com>
allo-boss-dac mute output when changing parameters
Since I noticed that sometimes changing sample rates causes some digital
quirks and noises, I've changed the function to mute the output before
performing the changes and then unmute it when an error occurs or the
parameters got set.
Signed-off-by: Alessandro Marcon <marconalessandro04@gmail.com>
ASoC: bcm: Use power-of-2 bclk_ratios
The soundcard drivers originally used snd_pcm_format_physical_width,
but a later commit changed that to snd_pcm_format_width because the
in-memory sample storage width should not be a factor in determining
the bclk_ratio. However, the physical width rounds the sample bits up
to the nearest power of 2, which makes it easier to find integer clock
divisors.
Restore the old behaviour, but with an implementation that makes it
clear what is going on.
See: https://github.com/raspberrypi/linux/issues/6104
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
ASoC: bcm: Add "owner" info for more soundcards
See: https://github.com/raspberrypi/linux/issues/5697
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
ASoC: da7213: Add a set_bclk_ratio method
Following [1], it becomes harder for the CPU DAI to know the correct
BCLK ratio. We can either bake the same knowledge into the sound card
driver, or implement and use set_bclk_ratio on the codec. This commit
does the latter.
[1] commit c89e652e84 ("ASoC: da7213: Add support for mono, set
frame width to 32 when possible")
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
iqaudio-codec: Use the codec's new set_bclk_ratio
To ensure that the CPU DAI and codec agree over the BCLK ratio, impose
a fixed value of 64 on both of them.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
ASoC: add driver for new HiFiBerry ADC only board(s)
Adds the driver for the soon to be released first ADC only board.
It includes the same ADC controls as used by the DAC+ADC Pro driver.
Signed-off-by: j-schambacher <joerg@hifiberry.com>
ASoC: add HiFiBerry ADC8x 8-channel ADC to simple-card-driver
Definitions for the 8 channel ADC card. The card uses only
HW-controlled devices which allows the uses of the 'dummy-dai'.
It will run only on a PI5 as it requires the designware I2S0 module.
The necessary output lanes I2S0_DI[0..3] are claimed from within the
DT overlay.
Signed-off-by: j-schambacher <joerg@hifiberry.com>
sound/soc: dwc-i2s: choose FIFO thresholds based on DMA burst constraints
Valid ranges for the I2S peripheral's FIFO configuration include a depth
of 16 - unconditionally setting the burst length to 16 with a fifo
threshold of size/2 will cause under/overflows.
For DMA engines with restricted capabilities the requested burst length
and FIFO thresholds need to be adjusted downward accordingly.
Both the RX and TX FIFOs operate on "less-than" thresholds. Setting the
TX threshold to fifo_size minus burst means the FIFO is kept nearly-full.
Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
ASoC: allo-piano-dac-plus: Fix volume limit locking
Calling snd_soc_limit_volume from within a kcontrol put handler seems
to cause a deadlock as it attempts to claim a write lock that is already
held. Call snd_soc_limit_volume from the main initialisation code
instead, to avoid the recursive locking.
See: https://github.com/raspberrypi/linux/issues/6527
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
ASoC: allo-piano-dac-plus: Suppress -517 errors
Use dev_err_probe to simplify the code and suppress EPROBE_DEFER errors.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
soc: pcm3168a: Add DT binding to force clock consumer mode
ASoC cannot configure the codec correctly when the ADC and DAC share clock
lines and one of them is the clock producer. Add a DT binding that
overrides ASoC and forces the component into clock consumer mode.
Signed-off-by: Stephen Gordon <gordoste@iinet.net.au>
ASoC: pcm512x: Demote "No SCLK" to debug level
Designing a PCM512X-based soundcard with no external SCLK is a valid
choice supported by the driver. Don't alarm users with messages that
say "No SCLK, using BCLK: -2" - reclassify them as debug information.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
ASoC: allo-piano-dac-plus: Fix volume limiting
Controls which only exist when snd_soc_register_card returns can't be
modified before then. Move the setting of volume limits to just before
the end of the probe function.
Link: https://github.com/raspberrypi/linux/issues/6527
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
ASoC: allo-piano-dac-plus: Remove pointless code
The codec control Digital Playback Volume is one of the controls deleted
by the allo-piano-dac-plus driver. It is effectively replaced by the
soundcard controls Master Playback Volume and Subwoofer Playback Volume.
Delete the code that sets the volume limit on those codec controls - the
limits on the soundcard volume controls are sufficient.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
ASoC: adds ADC8x support to the Hifiberry DAC8x
The driver probes for the ADC8x which can be stacked on top
of the DAC8x. It enables a symmetric 8 channel capture using
the dummy-dai.
Signed-off-by: j-schambacher <joerg@hifiberry.com>
sound: soc: raspberrypi: RP1 Audio Out driver as an ASOC DAI
Only 48000Hz stereo 16-bit output is currently supported.
It requires some additional OF plumbing to connect it to a
"dummy" codec and generic sound card.
Signed-off-by: Nick Hollinghurst <nick.hollinghurst@raspberrypi.com>
Adding Pimidi kernel module.
Signed-off-by: Giedrius Trainavičius <giedrius@blokas.io>
Adding Pisound Micro kernel module
Signed-off-by: Giedrius Trainavičius <giedrius@blokas.io>
Pisound Micro: Fix for MIDI output under full load.
This fixes MIDI output of Pisound Micro after running for a while under
full load and increases timing stability.
Signed-off-by: Giedrius Trainavičius <giedrius@blokas.io>
ALSA: korg1212: replace del_timer with timer_delete
pisound-micro: Added pin_pull and pin_b_pull sysfs attributes for Pisound Micro.
These attributes are available only for GPIO input and Encoder elements.
Signed-off-by: Giedrius <giedrius@blokas.io>
Pisound Micro: Workaround for snd_soc_dai_set_tdm_slot with slots=0
Even though it's documented that specifying slots=0 can be used to disable
the TDM mode, error checking introduced in 6.12.31 version broke this,
therefore, for the time being, a workaround is to provide a xlate_tdm_slot_mask
operation implementation to return 0 instead of -EINVAL as it does in case
slots argument is 0.
Signed-off-by: Giedrius Trainavičius <giedrius@blokas.io>
Upstream chose to use BTN_DPAD_* and BTN_SELECT, whilst downstream
had used KEY_*.
Revert to the downstream map to avoid any regressions.
(Ideally this would be read from DT)
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
This patch adds the compatible string for the Sense HAT device to
the list of compatible strings in the simple_mfd_i2c driver so that
it can match against the device and load its children and their drivers
Co-developed-by: Mwesigwa Guma <mguma@redhat.com>
Signed-off-by: Mwesigwa Guma <mguma@redhat.com>
Co-developed-by: Joel Savitz <jsavitz@redhat.com>
Signed-off-by: Joel Savitz <jsavitz@redhat.com>
Signed-off-by: Charles Mirabile <cmirabil@redhat.com>
mfd: Add rpi_sense_core of compatible string
rpisense-fb: Set pseudo_pallete to prevent crash on fbcon takeover
Signed-off-by: Serge Schneider <serge@raspberrypi.com>
rpisense-fb: Add explicit fb_deferred_io_mmap hook
As of commit [1], introduced in 5.18, fbdev drivers that use
deferred IO and need mmap support must include an explicit fb_mmap
pointer to the fb_deferred_io_mmap.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
[1] 5905585103 ("fbdev: Put mmap for deferred I/O into drivers")
drivers: Remove downstream SenseHAT core and joystick drivers
Parts of a SenseHAT driver have been submitted upstream using the
simple-i2c-mfd framework. The joystick driver has been merged.
It's been noted that there are several issues with the downstream
joystick and core drivers, so remove these in favour of the upstream
approach, and fix up the FB driver to match.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
Provide a __copy_from_user that uses memcpy. On BCM2708, use
optimised memcpy/memmove/memcmp/memset implementations.
arch/arm: Add mmiocpy/set aliases for memcpy/set
See: https://github.com/raspberrypi/linux/issues/1082
copy_from_user: CPU_SW_DOMAIN_PAN compatibility
The downstream copy_from_user acceleration must also play nice with
CONFIG_CPU_SW_DOMAIN_PAN.
See: https://github.com/raspberrypi/linux/issues/1381
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
Fix copy_from_user if BCM2835_FAST_MEMCPY=n
The change which introduced CONFIG_BCM2835_FAST_MEMCPY unconditionally
changed the behaviour of arm_copy_from_user. The page pinning code
is not safe on ARMv7 if LPAE & high memory is enabled and causes
crashes which look like PTE corruption.
Make __copy_from_user_memcpy conditional on CONFIG_2835_FAST_MEMCPY=y
which is really an ARMv6 / Pi1 optimization and not necessary on newer
ARM processors.
arm: fix mmap unlocks in uaccess_with_memcpy.c
This is a regression that was added with the commit 192a4e923e as of rpi-5.8.y, since that is when the move to the mmap locking API was introduced - d8ed45c5dc
The issue is that when the patch to improve performance for the __copy_to_user and __copy_from_user functions were added for the Raspberry Pi, some of the mmaps were incorrectly mapped to write instead of read. This would cause a verity of issues, and in my case, prevent the booting of a squashfs filesystem on rpi-5.8-y and above. An example of the panic you would see from this can be seen at https://pastebin.com/raw/jBz5xCzL
Signed-off-by: Christian Lamparter <chunkeey@gmail.com>
Signed-off-by: Christopher Blake <chrisrblake93@gmail.com>
arch/arm: Add __memset alias to memset_rpi.S
memset_rpi.S is an optimised memset implementation, but doesn't define
__memset (which was just added to memset.S). As a result, building
for the BCM2835 platform causes a link failure.
Add __memset as yet another alias to our common implementation.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
arm: Fix custom rpi __memset32 and __memset64
See: https://github.com/raspberrypi/linux/issues/4798
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
arm: Fix annoying .eh_frame section warnings
Replace the cfi directives with the UNWIND equivalents. This prevents
the .eh_frame section from being created, eliminating the warnings.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
The "input" trigger makes the associated GPIO an input. This is to support
the Raspberry Pi PWR LED, which is driven by external hardware in normal use.
N.B. pwr_led is not available on Model A or B boards.
leds-gpio: Implement the brightness_get method
The power LED uses some clever logic that means it is driven
by a voltage measuring circuit when configured as input, otherwise
it is driven by the GPIO output value. This patch wires up the
brightness_get method for leds-gpio so that user-space can monitor
the LED value via /sys/class/gpio/led1/brightness. Using the input
trigger this returns an indication of the system power health,
otherwise it is just whatever value the trigger has written most
recently.
See: https://github.com/raspberrypi/linux/issues/1064
Support booting without Device Tree.
Turn on USB power.
Load driver early because of lacking support for deferred probing
in many drivers.
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
firmware: bcm2835: Don't turn on USB power
The raspberrypi-power driver is now used to turn on USB power.
This partly reverts commit:
firmware: bcm2835: Support ARCH_BCM270x
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
Based on bcm2835-gpiomem.
We allow export of the "GPIO registers" to userspace via a chardev as
this allows for finer access control (e.g. users must be group gpio, root
not required).
This driver allows access to either rp1-gpiomem or gpiomem, depending on
which nodes are populated in devicetree.
RP1 has a different look-and-feel to BCM283x SoCs as it has split ranges
for IO controls and the parallel registered OE/IN/OUT access. To handle
this, the driver concatenates the ranges for an IO bank and the
corresponding RIO instance into a contiguous buffer.
Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
Add module for accessing the mailbox property channel through
/dev/vcio. Was previously in bcm2708-vcio.
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
char: vcio: Add compat ioctl handling
There was no compat ioctl handler, so 32 bit userspace on a
64 bit kernel failed as IOCTL_MBOX_PROPERTY used the size
of char*.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>
char: vcio: Fail probe if rpi_firmware is not found.
Device Tree is now the only supported config mechanism, therefore
uncomment the block of code that fails the probe if the
firmware node can't be found.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>
drivers: char: vcio: Use common compat header
The definition of compat_ptr is now common for most platforms, but
requires the inclusion of <linux/compat.h>.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
char: vcio: Rewrite as a firmware node child
The old vcio driver is a simple character device that manually locates
the firmware driver. Initialising it before the firmware driver causes
a failure, and no retries are attempted.
Rewrite vcio as a platform driver that depends on a DT node for its
instantiation and the location of the firmware driver, making use of
the miscdevice framework to reduce the code size.
N.B. Using miscdevice changes the udev SUBSYSTEM string, so a change
to the companion udev rule is required in order to continue to set
the correct device permissions, e.g.:
KERNEL="vcio", GROUP="video", MODE="0660"
See: https://github.com/raspberrypi/linux/issues/4620
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
i2c-bcm2708: fixed baudrate
Fixed issue where the wrong CDIV value was set for baudrates below 3815 Hz (for 250MHz bus clock).
In that case the computed CDIV value was more than 0xffff. However the CDIV register width is only 16 bits.
This resulted in incorrect setting of CDIV and higher baudrate than intended.
Example: 3500Hz -> CDIV=0x11704 -> CDIV(16bit)=0x1704 -> 42430Hz
After correction: 3500Hz -> CDIV=0x11704 -> CDIV(16bit)=0xffff -> 3815Hz
The correct baudrate is shown in the log after the cdiv > 0xffff correction.
Perform I2C combined transactions when possible
Perform I2C combined transactions whenever possible, within the
restrictions of the Broadcomm Serial Controller.
Disable DONE interrupt during TA poll
Prevent interrupt from being triggered if poll is missed and transfer
starts and finishes.
i2c: Make combined transactions optional and disabled by default
i2c: bcm2708: add device tree support
Add DT support to driver and add to .dtsi file.
Setup pins in .dts file.
i2c is disabled by default.
Signed-off-by: Noralf Tronnes <notro@tronnes.org>
bcm2708: don't register i2c controllers when using DT
The devices for the i2c controllers are in the Device Tree.
Only register devices when not using DT.
Signed-off-by: Noralf Tronnes <notro@tronnes.org>
I2C: Only register the I2C device for the current board revision
i2c_bcm2708: Fix clock reference counting
Fix grabbing lock from atomic context in i2c driver
2 main changes:
- check for timeouts in the bcm2708_bsc_setup function as indicated by this comment:
/* poll for transfer start bit (should only take 1-20 polls) */
This implies that the setup function can now fail so account for this everywhere it's called
- Removed the clk_get_rate call from inside the setup function as it locks a mutex and that's not ok since we call it from under a spin lock.
i2c-bcm2708: When using DT, leave the GPIO setup to pinctrl
i2c-bcm2708: Increase timeouts to allow larger transfers
Use the timeout value provided by the I2C_TIMEOUT ioctl when waiting
for completion. The default timeout is 1 second.
See: https://github.com/raspberrypi/linux/issues/260
i2c-bcm2708/BCM270X_DT: Add support for I2C2
The third I2C bus (I2C2) is normally reserved for HDMI use. Careless
use of this bus can break an attached display - use with caution.
It is recommended to disable accesses by VideoCore by setting
hdmi_ignore_edid=1 or hdmi_edid_file=1 in config.txt.
The interface is disabled by default - enable using the
i2c2_iknowwhatimdoing DT parameter.
bcm2708-spi: Don't use static pin configuration with DT
Also remove superfluous error checking - the SPI framework ensures the
validity of the chip_select value.
i2c-bcm2708: Remove non-DT support
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
Set the BSC_CLKT clock streching timeout to 35ms as per SMBus specs.
Fixes i2c_bcm2708: Write to FIFO correctly - v2 (#1574)
* i2c: fix i2c_bcm2708: Clear FIFO before sending data
Make sure FIFO gets cleared before trying to send
data in case of a repeated start (COMBINED=Y).
* i2c: fix i2c_bcm2708: Only write to FIFO when not full
Check if FIFO can accept data before writing.
To avoid a peripheral read on the last iteration of a loop,
both bcm2708_bsc_fifo_fill and ~drain are changed as well.
Signed-off-by: Luke Wren <wren6991@gmail.com>
MISC: bcm2835: smi: use clock manager and fix reload issues
Use clock manager instead of self-made clockmanager.
Also fix some error paths that showd up during development
(especially missing release of dma resources on rmmod)
Signed-off-by: Martin Sperl <kernel@martin.sperl.org>
bcm2835_smi: re-add dereference to fix DMA transfers
bcm2835_smi_dev: Fix handling of word-odd lengths
The read and write functions did not use the correct pointer offset
when dealing with an odd number of bytes after a DMA transfer. Also,
only handle the remaining odd bytes if the DMA transfer completed
successfully.
Submitted-by: @madimario (GitHub)
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
misc: bcm2835_smi: Use proper enum types for dma_{,un}map_single()
Clang warns:
drivers/misc/bcm2835_smi.c:692:4: warning: implicit conversion from enumeration type 'enum dma_transfer_direction' to different enumeration type 'enum dma_data_direction' [-Wenum-conversion]
DMA_MEM_TO_DEV);
^~~~~~~~~~~~~~~
./include/linux/dma-mapping.h:406:66: note: expanded from macro 'dma_map_single'
#define dma_map_single(d, a, s, r) dma_map_single_attrs(d, a, s, r, 0)
~~~~~~~~~~~~~~~~~~~~ ^
drivers/misc/bcm2835_smi.c:705:35: warning: implicit conversion from enumeration type 'enum dma_transfer_direction' to different enumeration type 'enum dma_data_direction' [-Wenum-conversion]
(inst->dev, phy_addr, n_bytes, DMA_MEM_TO_DEV);
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~
./include/linux/dma-mapping.h:407:70: note: expanded from macro 'dma_unmap_single'
#define dma_unmap_single(d, a, s, r) dma_unmap_single_attrs(d, a, s, r, 0)
~~~~~~~~~~~~~~~~~~~~~~ ^
drivers/misc/bcm2835_smi.c:751:12: warning: implicit conversion from enumeration type 'enum dma_transfer_direction' to different enumeration type 'enum dma_data_direction' [-Wenum-conversion]
DMA_DEV_TO_MEM);
^~~~~~~~~~~~~~~
./include/linux/dma-mapping.h:406:66: note: expanded from macro 'dma_map_single'
#define dma_map_single(d, a, s, r) dma_map_single_attrs(d, a, s, r, 0)
~~~~~~~~~~~~~~~~~~~~ ^
drivers/misc/bcm2835_smi.c:761:50: warning: implicit conversion from enumeration type 'enum dma_transfer_direction' to different enumeration type 'enum dma_data_direction' [-Wenum-conversion]
dma_unmap_single(inst->dev, phy_addr, n_bytes, DMA_DEV_TO_MEM);
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~
./include/linux/dma-mapping.h:407:70: note: expanded from macro 'dma_unmap_single'
#define dma_unmap_single(d, a, s, r) dma_unmap_single_attrs(d, a, s, r, 0)
~~~~~~~~~~~~~~~~~~~~~~ ^
4 warnings generated.
Use the proper enumerated type to clear up the warning. There is not
actually a bug here because the enumerated types have the same integer
value:
DMA_MEM_TO_DEV = DMA_TO_DEVICE = 1
DMA_DEV_TO_MEM = DMA_FROM_DEVICE = 2
Fixes: 93254d0f7b ("Add SMI driver")
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
bcm2835-smi: Use phys addresses for slave DMA config
Contrary to what struct snd_dmaengine_dai_dma_data suggests, the
configuration of addresses of DMA slave interfaces should be done in
CPU physical addresses.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
Signed-off-by: popcornmix <popcornmix@gmail.com>
BCM270x: Move vc_mem
Make the vc_mem module available for ARCH_BCM2835 by moving it.
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
char: vc_mem: Fix up compat ioctls for 64bit kernel
compat_ioctl wasn't defined, so 32bit user/64bit kernel
always failed.
VC_MEM_IOC_MEM_PHYS_ADDR was defined with parameter size
unsigned long, so the ioctl cmd changes between sizes.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>
char: vc_mem: Fix all coding style issues.
Cleans up all checkpatch errors in vc_mem.c and vc_mem.h
No functional change to the code.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>
char: vc_mem: Delete dead code
There are no error exists once device_create has succeeded, and
therefore no need to call device_destroy from vc_mem_init.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
char: broadcom: vc_mem: Fix preprocessor conditional
Signed-off-by: Alexander Winkowski <dereference23@outlook.com>
vc_mem: Add the DMA memcpy support from bcm2708_fb
bcm2708_fb is disabled by the vc4-kms-v3d overlay, which means that the
DMA memcpy support it provides is not available to allow vclog to read
the VC logs from the top 16MB on Pi 2 and Pi 3. Add the code to the
vc_mem driver, which will still be enabled.
It ought to be possible to do a proper DMA_MEM_TO_MEM copy via the
generic DMA customer API, but that can be a later step.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
See https://github.com/raspberrypi/linux/issues/5019
If an SD card has degraded performance such that IO operations time out
then the MMC block layer will leak SG DMA mappings in the swiotlb during
recovery. It retries the same SG and this causes the leak, as it is
mapped twice - once in sdhci_pre_req() and again during single-block
reads in sdhci_prepare_data().
Resetting the card (including power-cycling if a regulator for vmmc is
present) ought to be enough to recover a stuck state, so for now don't
try single-block reads in the recovery path.
Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
mmc: Disable CMD23 transfers on all cards
Pending wire-level investigation of these types of transfers
and associated errors on bcm2835-mmc, disable for now. Fallback of
CMD18/CMD25 transfers will be used automatically by the MMC layer.
Reported/Tested-by: Gellert Weisz <gellert@raspberrypi.org>
mmc: bcm2835-mmc: enable DT support for all architectures
Both ARCH_BCM2835 and ARCH_BCM270x are built with OF now.
Enable Device Tree support for all architectures.
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
mmc: bcm2835-mmc: fix probe error handling
Probe error handling is broken in several places.
Simplify error handling by using device managed functions.
Replace pr_{err,info} with dev_{err,info}.
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
bcm2835-mmc: Add locks when accessing sdhost registers
bcm2835-mmc: Add range of debug options for slowing things down
bcm2835-mmc: Add option to disable some delays
bcm2835-mmc: Add option to disable MMC_QUIRK_BLK_NO_CMD23
bcm2835-mmc: Default to disabling MMC_QUIRK_BLK_NO_CMD23
bcm2835-mmc: Adding overclocking option
Allow a different clock speed to be substitued for a requested 50MHz.
This option is exposed using the "overclock_50" DT parameter.
Note that the mmc interface is restricted to EVEN integer divisions of
250MHz, and the highest sensible option is 63 (250/4 = 62.5), the
next being 125 (250/2) which is much too high.
Use at your own risk.
bcm2835-mmc: Round up the overclock, so 62 works for 62.5Mhz
Also only warn once for each overclock setting.
mmc: bcm2835-mmc: Make available on ARCH_BCM2835
Make the bcm2835-mmc driver available for use on ARCH_BCM2835.
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
BCM270x_DT: add bcm2835-mmc entry
Add Device Tree entry for bcm2835-mmc.
In non-DT mode, don't add the device in the board file.
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
bcm2835-mmc: Don't overwrite MMC capabilities from DT
bcm2835-mmc: Don't override bus width capabilities from devicetree
Take out the force setting of the MMC_CAP_4_BIT_DATA host capability
so that the result read from devicetree via mmc_of_parse() is
preserved.
bcm2835-mmc: Only claim one DMA channel
With both MMC controllers enabled there are few DMA channels left. The
bcm2835-mmc driver only uses DMA in one direction at a time, so it
doesn't need to claim two channels.
See: https://github.com/raspberrypi/linux/issues/1327
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
bcm2835-mmc: New timer API
mmc: bcm2835-mmc: Support underclocking
Support underclocking of the SD bus using the max-frequency DT property
(which currently has no DT parameter). The sd_overclock parameter
already provides another way to achieve the same thing which should be
equivalent in end result, but it is a bug not to support max-frequency
as well.
See: https://github.com/raspberrypi/linux/issues/2350
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
mmc/bcm2835: Recover from MMC_SEND_EXT_CSD
If the user issues an "mmc extcsd read", the SD controller receives
what it thinks is a SEND_IF_COND command with an unexpected data block.
The resulting operations leave the FSM stuck in READWAIT, a state which
persists until the MMC framework resets the controller, by which point
the root filesystem is likely to have been unmounted.
A less heavyweight solution is to detect the condition and nudge the
FSM by asserting the (self-clearing) FORCE_DATA_MODE bit.
N.B. This workaround was essentially discovered by accident and without
a full understanding the inner workings of the controller, so it is
fortunate that the "fix" only modifies error paths.
See: https://github.com/raspberrypi/linux/issues/2728
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
bcm2835-mmc: Fix DMA channel leak
The BCM2835 MMC host driver requests a DMA channel on probe but neglects
to release the channel in the probe error path and on driver unbind.
I'm seeing this happen on every boot of the Compute Module 3: On first
driver probe, DMA channel 2 is allocated and then leaked with a "could
not get clk, deferring probe" message. On second driver probe, channel 4
is allocated.
Fix it.
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Cc: Frank Pavlic <f.pavlic@kunbus.de>
bcm2835-mmc: Fix struct mmc_host leak on probe
The BCM2835 MMC host driver requests the bus address of the host's
register map on probe. If that fails, the driver leaks the struct
mmc_host allocated earlier.
Fix it.
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Cc: Frank Pavlic <f.pavlic@kunbus.de>
bcm2835-mmc: Fix duplicate free_irq() on remove
The BCM2835 MMC host driver requests its interrupt as a device-managed
resource, so the interrupt is automatically freed after the driver is
unbound.
However on driver unbind, bcm2835_mmc_remove() frees the interrupt
explicitly to avoid invocation of the interrupt handler after driver
structures have been torn down.
The interrupt is thus freed twice, leading to a WARN splat in
__free_irq(). Fix by not requesting the interrupt as a device-managed
resource.
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Cc: Frank Pavlic <f.pavlic@kunbus.de>
bcm2835-mmc: Handle mmc_add_host() errors
The BCM2835 MMC host driver calls mmc_add_host() but doesn't check its
return value. Errors occurring in that function are therefore not
handled. Fix it.
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Cc: Frank Pavlic <f.pavlic@kunbus.de>
bcm2835-mmc: Deduplicate reset of driver data on remove
The BCM2835 MMC host driver sets the device's driver data pointer to
NULL on ->remove() even though the driver core subsequently does the
same in __device_release_driver(). Drop the duplicate assignment.
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Cc: Frank Pavlic <f.pavlic@kunbus.de>
bcm2835_mmc: Remove vestigial threaded IRQ
With SDIO processing now managed by the MMC framework with a
workqueue, the bcm2835_mmc driver no longer needs a threaded
IRQ.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
Add missing dma_unmap_sg calls to free relevant swiotlb bounce buffers.
This prevents DMA leaks.
Signed-off-by: Yaroslav Rosomakho <yaroslavros@gmail.com>
Limit max_req_size under arm64 (or any other platform that uses swiotlb) to prevent potential buffer overflow due to bouncing.
Signed-off-by: Yaroslav Rosomakho <yaroslavros@gmail.com>
mmc: sdhci: Silence MMC warnings
When the MMC isn't plugged in, the driver will spam the console which is
pretty annoying when using NFS.
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
mmc: sdhci-iproc: Fix vmmc regulators on iProc
The Linux support for controlling card power via regulators appears to
be contentious. I would argue that the default behaviour is contrary to
the SDHCI spec - turning off the power writes a reserved value to the
SD Bus Voltage Select field of the Power Control Register, which
seems to kill the Arasan/iProc controller - but fortunately there is a
hook in sdhci_ops to override the behaviour. Borrow the implementation
from sdhci_arasan_set_power.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
bcm2835-mmc: uninitialized_var is no more
Revert "mmc: sdhci-iproc: Fix vmmc regulators on iProc"
This reverts commit aed19399a0.
Commit 6c92ae1e45 ("mmc: sdhci: Introduce sdhci_set_power_and_bus_voltage()")
introduced a generic helper that does the same thing so use that instead in
the following commit.
Signed-off-by: Juerg Haefliger <juergh@canonical.com>
mmc: sdhci-iproc: Fix vmmc regulators (pre-bcm2711)
The Linux support for controlling card power via regulators appears to
be contentious. I would argue that the default behaviour is contrary to
the SDHCI spec - turning off the power writes a reserved value to the
SD Bus Voltage Select field of the Power Control Register, which
seems to kill the Arasan/iProc controller - but fortunately there is a
hook in sdhci_ops to override the behaviour.
Signed-off-by: Juerg Haefliger <juergh@canonical.com>
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
bcm2835-mmc: Honor return value of mmc_of_parse()
bcm2835_mmc_probe() ignores errors returned by mmc_of_parse() and in
particular ignores -EPROBE_DEFER, which may be returned if the power
sequencing driver configured in the devicetree is compiled as a module.
The user-visible result is that access to the SDIO device fails because
its power sequencing requirements have not been observed. Fix it.
Signed-off-by: Lukas Wunner <lukas@wunner.de>
bcm2835-mmc: Use phys addresses for slave DMA config
Contrary to what struct snd_dmaengine_dai_dma_data suggests, the
configuration of addresses of DMA slave interfaces should be done in
CPU physical addresses.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
Add support for DMA controller of BCM2708 as used in the Raspberry Pi.
Currently it only supports cyclic DMA.
Signed-off-by: Florian Meier <florian.meier@koalo.de>
dmaengine: expand functionality by supporting scatter/gather transfers sdhci-bcm2708 and dma.c: fix for LITE channels
DMA: fix cyclic LITE length overflow bug
dmaengine: bcm2708: Remove chancnt affectations
Mirror bcm2835-dma.c commit 9eba5536a7:
chancnt is already filled by dma_async_device_register, which uses the channel
list to know how much channels there is.
Since it's already filled, we can safely remove it from the drivers' probe
function.
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
dmaengine: bcm2708: overwrite dreq only if it is not set
dreq is set when the DMA channel is fetched from Device Tree.
slave_id is set using dmaengine_slave_config().
Only overwrite dreq with slave_id if it is not set.
dreq/slave_id in the cyclic DMA case is not touched, because I don't
have hardware to test with.
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
dmaengine: bcm2708: do device registration in the board file
Don't register the device in the driver. Do it in the board file.
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
dmaengine: bcm2708: don't restrict DT support to ARCH_BCM2835
Both ARCH_BCM2835 and ARCH_BCM270x are built with OF now.
Add Device Tree support to the non ARCH_BCM2835 case.
Use the same driver name regardless of architecture.
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
BCM270x_DT: add bcm2835-dma entry
Add Device Tree entry for bcm2835-dma.
The entry doesn't contain any resources since they are handled
by the arch/arm/mach-bcm270x/dma.c driver.
In non-DT mode, don't add the device in the board file.
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
bcm2708-dmaengine: Add debug options
BCM270x: Add memory and irq resources to dmaengine device and DT
Prepare for merging of the legacy DMA API arch driver dma.c
with bcm2708-dmaengine by adding memory and irq resources both
to platform file device and Device Tree node.
Don't use BCM_DMAMAN_DRIVER_NAME so we don't have to include mach/dma.h
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
dmaengine: bcm2708: Merge with arch dma.c driver and disable dma.c
Merge the legacy DMA API driver with bcm2708-dmaengine.
This is done so we can use bcm2708_fb on ARCH_BCM2835 (mailbox
driver is also needed).
Changes to the dma.c code:
- Use BIT() macro.
- Cutdown some comments to one line.
- Add mutex to vc_dmaman and use this, since the dev lock is locked
during probing of the engine part.
- Add global g_dmaman variable since drvdata is used by the engine part.
- Restructure for readability:
vc_dmaman_chan_alloc()
vc_dmaman_chan_free()
bcm_dma_chan_free()
- Restructure bcm_dma_chan_alloc() to simplify error handling.
- Use device irq resources instead of hardcoded bcm_dma_irqs table.
- Remove dev_dmaman_register() and code it directly.
- Remove dev_dmaman_deregister() and code it directly.
- Simplify bcm_dmaman_probe() using devm_* functions.
- Get dmachans from DT if available.
- Keep 'dma.dmachans' module argument name for backwards compatibility.
Make it available on ARCH_BCM2835 as well.
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
dmaengine: bcm2708: set residue_granularity field
bcm2708-dmaengine supports residue reporting at burst level
but didn't report this via the residue_granularity field.
Without this field set properly we get playback issues with I2S cards.
dmaengine: bcm2708-dmaengine: Fix memory leak when stopping a running transfer
bcm2708-dmaengine: Use more DMA channels (but not 12)
1) Only the bcm2708_fb drivers uses the legacy DMA API, and
it requires a BULK-capable channel, so all other types
(FAST, NORMAL and LITE) can be made available to the regular
DMA API.
2) DMA channels 11-14 share an interrupt. The driver can't
handle this, so don't use channels 12-14 (12 was used, probably
because it appears to have an interrupt, but in reality that
interrupt is for activity on ANY channel). This may explain
a lockup encountered when running out of DMA channels.
The combined effect of this patch is to leave 7 DMA channels
available + channel 0 for bcm2708_fb via the legacy API.
See: https://github.com/raspberrypi/linux/issues/1110https://github.com/raspberrypi/linux/issues/1108
dmaengine: bcm2708: Make legacy API available for bcm2835-dma
bcm2708_fb uses the legacy DMA API, so in order to start using
bcm2835-dma, bcm2835-dma has to support the legacy API. Make this
possible by exporting bcm_dmaman_probe() and bcm_dmaman_remove().
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
dmaengine: bcm2708: Change DT compatible string
Both bcm2835-dma and bcm2708-dmaengine have the same compatible string.
So change compatible to "brcm,bcm2708-dma".
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
dmaengine: bcm2708: Remove driver but keep legacy API
Dropping non-DT support means we don't need this driver,
but we still need the legacy DMA API.
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
bcm2708-dmaengine - Fix arm64 portability/build issues
dma-bcm2708: Fix module compilation of CONFIG_DMA_BCM2708
bcm2708-dmaengine.c defines functions like bcm_dma_start which are
defined as well in dma-bcm2708.h as inline versions when
CONFIG_DMA_BCM2708 is not defined. This works fine when
CONFIG_DMA_BCM2708 is built in, but when it is selected as module build
fails with redefinition errors because in the build system when
CONFIG_DMA_BCM2708 is selected as module, the macro becomes
CONFIG_DMA_BCM2708_MODULE.
This patch makes the header use CONFIG_DMA_BCM2708_MODULE too when
available.
Fixes https://github.com/raspberrypi/linux/issues/2056
Signed-off-by: Andrei Gherzan <andrei@gherzan.com>
bcm2708-dmaengine: Use platform_get_irq
The platform driver framework no longer creates IRQ resources for
platform devices because they are expected to use platform_get_irq.
This causes the bcm2808_fb acceleration to fail.
Fix the problem by calling platform_get_irq as intended.
See: https://github.com/raspberrypi/linux/issues/5131
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
Signed-off-by: popcornmix <popcornmix@gmail.com>
bcm2708_fb : Implement blanking support using the mailbox property interface
bcm2708_fb: Add pan and vsync controls
bcm2708_fb: DMA acceleration for fb_copyarea
Based on http://www.raspberrypi.org/phpBB3/viewtopic.php?p=62425#p62425
Also used Simon's dmaer_master module as a reference for tweaking DMA
settings for better performance.
For now busylooping only. IRQ support might be added later.
With non-overclocked Raspberry Pi, the performance is ~360 MB/s
for simple copy or ~260 MB/s for two-pass copy (used when dragging
windows to the right).
In the case of using DMA channel 0, the performance improves
to ~440 MB/s.
For comparison, VFP optimized CPU copy can only do ~114 MB/s in
the same conditions (hindered by reading uncached source buffer).
Signed-off-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
bcm2708_fb: report number of dma copies
Add a counter (exported via debugfs) reporting the
number of dma copies that the framebuffer driver
has done, in order to help evaluate different
optimization strategies.
Signed-off-by: Luke Diamand <luked@broadcom.com>
bcm2708_fb: use IRQ for DMA copies
The copyarea ioctl() uses DMA to speed things along. This
was busy-waiting for completion. This change supports using
an interrupt instead for larger transfers. For small
transfers, busy-waiting is still likely to be faster.
Signed-off-by: Luke Diamand <luke@diamand.org>
bcm2708: Make ioctl logging quieter
video: fbdev: bcm2708_fb: Don't panic on error
No need to panic the kernel if the video driver fails.
Just print a message and return an error.
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
fbdev: bcm2708_fb: Add ARCH_BCM2835 support
Add Device Tree support.
Pass the device to dma_alloc_coherent() in order to get the
correct bus address on ARCH_BCM2835.
Use the new DMA legacy API header file.
Including <mach/platform.h> is not necessary.
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
BCM270x_DT: Add bcm2708-fb device
Add bcm2708-fb to Device Tree and don't add the
platform device when booting in DT mode.
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
Cleanup of bcm2708_fb file to kernel coding standards
Some minor change to function - remove a use of
in_atomic, plus replacing various debug messages
that manually specify the function name with
("%s",.__func__)
Signed-off-by: James Hughes <james.hughes@raspberrypi.org>
video: bcm2708_fb: Try allocating on the ARM and passing to VPU
Currently the VPU allocates the contiguous buffer for the
framebuffer.
Try an alternate path first where we use dma_alloc_coherent
and pass the buffer to the VPU. Should the VPU firmware not
support that path, then free the buffer and revert to the
old behaviour of using the VPU allocation.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>
Pulled in the multi frame buffer support from the Pi3 repo
fbdev: add FBIOCOPYAREA ioctl
Based on the patch authored by Ali Gholami Rudi at
https://lkml.org/lkml/2009/7/13/153
Provide an ioctl for userspace applications, but only if this operation
is hardware accelerated (otherwide it does not make any sense).
Signed-off-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
bcm2708_fb: Add ioctl for reading gpu memory through dma
video: bcm2708_fb: Add compat_ioctl support.
When using a 64 bit kernel with 32 bit userspace we need
compat ioctl handling for FBIODMACOPY as one of the
parameters is a pointer.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>
video: fbdev: bcm2708_fb: Use common compat header
The definition of compat_ptr is now common for most platforms, but
requires the inclusion of <linux/compat.h>.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
video: bcm2708_fb: Disable FB if no displays found
If the firmware hasn't detected a display, the driver would assume
one display was available, but because it had failed to retrieve the
display size it would try to allocate a zero-sized buffer.
Avoid the allocation failure by bailing out early if no display is
found.
See: https://github.com/raspberrypi/linux/issues/3598
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
bcm2708_fb: Fix a build warning
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
bcm2708_fb: Explicitly initialise the IOMEM ops
Prior to [1], an fb_ops member of 0 was intepreted as a request for a
default value. This saves source code but requires special handling by
the framework, slowing down all accesses for no runtime benefit.
Use the new __FB_DEFAULT_ macros to explicitly select default handlers
in the bcm2708_fb driver. Also remove the pointless wrappers around
cfb_fillrect and cfb_imageblit - call them directly.
Link: https://forums.raspberrypi.com/viewtopic.php?p=2286016#p2286016
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
[1] 8813e86f6d ("fbdev: Remove default file-I/O implementations")
Signed-off-by: popcornmix <popcornmix@gmail.com>
usb: dwc: fix lockdep false positive
Signed-off-by: Kari Suvanto <karis79@gmail.com>
usb: dwc: fix inconsistent lock state
Signed-off-by: Kari Suvanto <karis79@gmail.com>
Add FIQ patch to dwc_otg driver. Enable with dwc_otg.fiq_fix_enable=1. Should give about 10% more ARM performance.
Thanks to Gordon and Costas
Avoid dynamic memory allocation for channel lock in USB driver. Thanks ddv2005.
Add NAK holdoff scheme. Enabled by default, disable with dwc_otg.nak_holdoff_enable=0. Thanks gsh
Make sure we wait for the reset to finish
dwc_otg: fix bug in dwc_otg_hcd.c resulting in silent kernel
memory corruption, escalating to OOPS under high USB load.
dwc_otg: Fix unsafe access of QTD during URB enqueue
In dwc_otg_hcd_urb_enqueue during qtd creation, it was possible that the
transaction could complete almost immediately after the qtd was assigned
to a host channel during URB enqueue, which meant the qtd pointer was no
longer valid having been completed and removed. Usually, this resulted in
an OOPS during URB submission. By predetermining whether transactions
need to be queued or not, this unsafe pointer access is avoided.
This bug was only evident on the Pi model A where a device was attached
that had no periodic endpoints (e.g. USB pendrive or some wlan devices).
dwc_otg: Fix incorrect URB allocation error handling
If the memory allocation for a dwc_otg_urb failed, the kernel would OOPS
because for some reason a member of the *unallocated* struct was set to
zero. Error handling changed to fail correctly.
dwc_otg: fix potential use-after-free case in interrupt handler
If a transaction had previously aborted, certain interrupts are
enabled to track error counts and reset where necessary. On IN
endpoints the host generates an ACK interrupt near-simultaneously
with completion of transfer. In the case where this transfer had
previously had an error, this results in a use-after-free on
the QTD memory space with a 1-byte length being overwritten to
0x00.
dwc_otg: add handling of SPLIT transaction data toggle errors
Previously a data toggle error on packets from a USB1.1 device behind
a TT would result in the Pi locking up as the driver never handled
the associated interrupt. Patch adds basic retry mechanism and
interrupt acknowledgement to cater for either a chance toggle error or
for devices that have a broken initial toggle state (FT8U232/FT232BM).
dwc_otg: implement tasklet for returning URBs to usbcore hcd layer
The dwc_otg driver interrupt handler for transfer completion will spend
a very long time with interrupts disabled when a URB is completed -
this is because usb_hcd_giveback_urb is called from within the handler
which for a USB device driver with complicated processing (e.g. webcam)
will take an exorbitant amount of time to complete. This results in
missed completion interrupts for other USB packets which lead to them
being dropped due to microframe overruns.
This patch splits returning the URB to the usb hcd layer into a
high-priority tasklet. This will have most benefit for isochronous IN
transfers but will also have incidental benefit where multiple periodic
devices are active at once.
dwc_otg: fix NAK holdoff and allow on split transactions only
This corrects a bug where if a single active non-periodic endpoint
had at least one transaction in its qh, on frnum == MAX_FRNUM the qh
would get skipped and never get queued again. This would result in
a silent device until error detection (automatic or otherwise) would
either reset the device or flush and requeue the URBs.
Additionally the NAK holdoff was enabled for all transactions - this
would potentially stall a HS endpoint for 1ms if a previous error state
enabled this interrupt and the next response was a NAK. Fix so that
only split transactions get held off.
dwc_otg: Call usb_hcd_unlink_urb_from_ep with lock held in completion handler
usb_hcd_unlink_urb_from_ep must be called with the HCD lock held. Calling it
asynchronously in the tasklet was not safe (regression in
c4564d4a1a).
This change unlinks it from the endpoint prior to queueing it for handling in
the tasklet, and also adds a check to ensure the urb is OK to be unlinked
before doing so.
NULL pointer dereference kernel oopses had been observed in usb_hcd_giveback_urb
when a USB device was unplugged/replugged during data transfer. This effect
was reproduced using automated USB port power control, hundreds of replug
events were performed during active transfers to confirm that the problem was
eliminated.
USB fix using a FIQ to implement split transactions
This commit adds a FIQ implementaion that schedules
the split transactions using a FIQ so we don't get
held off by the interrupt latency of Linux
dwc_otg: fix device attributes and avoid kernel warnings on boot
dcw_otg: avoid logging function that can cause panics
See: https://github.com/raspberrypi/firmware/issues/21
Thanks to cleverca22 for fix
dwc_otg: mask correct interrupts after transaction error recovery
The dwc_otg driver will unmask certain interrupts on a transaction
that previously halted in the error state in order to reset the
QTD error count. The various fine-grained interrupt handlers do not
consider that other interrupts besides themselves were unmasked.
By disabling the two other interrupts only ever enabled in DMA mode
for this purpose, we can avoid unnecessary function calls in the
IRQ handler. This will also prevent an unneccesary FIQ interrupt
from being generated if the FIQ is enabled.
dwc_otg: fiq: prevent FIQ thrash and incorrect state passing to IRQ
In the case of a transaction to a device that had previously aborted
due to an error, several interrupts are enabled to reset the error
count when a device responds. This has the side-effect of making the
FIQ thrash because the hardware will generate multiple instances of
a NAK on an IN bulk/interrupt endpoint and multiple instances of ACK
on an OUT bulk/interrupt endpoint. Make the FIQ mask and clear the
associated interrupts.
Additionally, on non-split transactions make sure that only unmasked
interrupts are cleared. This caused a hard-to-trigger but serious
race condition when you had the combination of an endpoint awaiting
error recovery and a transaction completed on an endpoint - due to
the sequencing and timing of interrupts generated by the dwc_otg core,
it was possible to confuse the IRQ handler.
Fix function tracing
dwc_otg: whitespace cleanup in dwc_otg_urb_enqueue
dwc_otg: prevent OOPSes during device disconnects
The dwc_otg_urb_enqueue function is thread-unsafe. In particular the
access of urb->hcpriv, usb_hcd_link_urb_to_ep, dwc_otg_urb->qtd and
friends does not occur within a critical section and so if a device
was unplugged during activity there was a high chance that the
usbcore hub_thread would try to disable the endpoint with partially-
formed entries in the URB queue. This would result in BUG() or null
pointer dereferences.
Fix so that access of urb->hcpriv, enqueuing to the hardware and
adding to usbcore endpoint URB lists is contained within a single
critical section.
dwc_otg: prevent BUG() in TT allocation if hub address is > 16
A fixed-size array is used to track TT allocation. This was
previously set to 16 which caused a crash because
dwc_otg_hcd_allocate_port would read past the end of the array.
This was hit if a hub was plugged in which enumerated as addr > 16,
due to previous device resets or unplugs.
Also add #ifdef FIQ_DEBUG around hcd->hub_port_alloc[], which grows
to a large size if 128 hub addresses are supported. This field is
for debug only for tracking which frame an allocate happened in.
dwc_otg: make channel halts with unknown state less damaging
If the IRQ received a channel halt interrupt through the FIQ
with no other bits set, the IRQ would not release the host
channel and never complete the URB.
Add catchall handling to treat as a transaction error and retry.
dwc_otg: fiq_split: use TTs with more granularity
This fixes certain issues with split transaction scheduling.
- Isochronous multi-packet OUT transactions now hog the TT until
they are completed - this prevents hubs aborting transactions
if they get a periodic start-split out-of-order
- Don't perform TT allocation on non-periodic endpoints - this
allows simultaneous use of the TT's bulk/control and periodic
transaction buffers
This commit will mainly affect USB audio playback.
dwc_otg: fix potential sleep while atomic during urb enqueue
Fixes a regression introduced with eb1b482a. Kmalloc called from
dwc_otg_hcd_qtd_add / dwc_otg_hcd_qtd_create did not always have
the GPF_ATOMIC flag set. Force this flag when inside the larger
critical section.
dwc_otg: make fiq_split_enable imply fiq_fix_enable
Failing to set up the FIQ correctly would result in
"IRQ 32: nobody cared" errors in dmesg.
dwc_otg: prevent crashes on host port disconnects
Fix several issues resulting in crashes or inconsistent state
if a Model A root port was disconnected.
- Clean up queue heads properly in kill_urbs_in_qh_list by
removing the empty QHs from the schedule lists
- Set the halt status properly to prevent IRQ handlers from
using freed memory
- Add fiq_split related cleanup for saved registers
- Make microframe scheduling reclaim host channels if
active during a disconnect
- Abort URBs with -ESHUTDOWN status response, informing
device drivers so they respond in a more correct fashion
and don't try to resubmit URBs
- Prevent IRQ handlers from attempting to handle channel
interrupts if the associated URB was dequeued (and the
driver state was cleared)
dwc_otg: prevent leaking URBs during enqueue
A dwc_otg_urb would get leaked if the HCD enqueue function
failed for any reason. Free the URB at the appropriate points.
dwc_otg: Enable NAK holdoff for control split transactions
Certain low-speed devices take a very long time to complete a
data or status stage of a control transaction, producing NAK
responses until they complete internal processing - the USB2.0
spec limit is up to 500mS. This causes the same type of interrupt
storm as seen with USB-serial dongles prior to c8edb238.
In certain circumstances, usually while booting, this interrupt
storm could cause SD card timeouts.
dwc_otg: Fix for occasional lockup on boot when doing a USB reset
dwc_otg: Don't issue traffic to LS devices in FS mode
Issuing low-speed packets when the root port is in full-speed mode
causes the root port to stop responding. Explicitly fail when
enqueuing URBs to a LS endpoint on a FS bus.
Fix ARM architecture issue with local_irq_restore()
If local_fiq_enable() is called before a local_irq_restore(flags) where
the flags variable has the F bit set, the FIQ will be erroneously disabled.
Fixup arch_local_irq_restore to avoid trampling the F bit in CPSR.
Also fix some of the hacks previously implemented for previous dwc_otg
incarnations.
dwc_otg: fiq_fsm: Base commit for driver rewrite
This commit removes the previous FIQ fixes entirely and adds fiq_fsm.
This rewrite features much more complete support for split transactions
and takes into account several OTG hardware bugs. High-speed
isochronous transactions are also capable of being performed by fiq_fsm.
All driver options have been removed and replaced with:
- dwc_otg.fiq_enable (bool)
- dwc_otg.fiq_fsm_enable (bool)
- dwc_otg.fiq_fsm_mask (bitmask)
- dwc_otg.nak_holdoff (unsigned int)
Defaults are specified such that fiq_fsm behaves similarly to the
previously implemented FIQ fixes.
fiq_fsm: Push error recovery into the FIQ when fiq_fsm is used
If the transfer associated with a QTD failed due to a bus error, the HCD
would retry the transfer up to 3 times (implementing the USB2.0
three-strikes retry in software).
Due to the masking mechanism used by fiq_fsm, it is only possible to pass
a single interrupt through to the HCD per-transfer.
In this instance host channels would fall off the radar because the error
reset would function, but the subsequent channel halt would be lost.
Push the error count reset into the FIQ handler.
fiq_fsm: Implement timeout mechanism
For full-speed endpoints with a large packet size, interrupt latency
runs the risk of the FIQ starting a transaction too late in a full-speed
frame. If the device is still transmitting data when EOF2 for the
downstream frame occurs, the hub will disable the port. This change is
not reflected in the hub status endpoint and the device becomes
unresponsive.
Prevent high-bandwidth transactions from being started too late in a
frame. The mechanism is not guaranteed: a combination of bit stuffing
and hub latency may still result in a device overrunning.
fiq_fsm: fix bounce buffer utilisation for Isochronous OUT
Multi-packet isochronous OUT transactions were subject to a few bounday
bugs. Fix them.
Audio playback is now much more robust: however, an issue stands with
devices that have adaptive sinks - ALSA plays samples too fast.
dwc_otg: Return full-speed frame numbers in HS mode
The frame counter increments on every *microframe* in high-speed mode.
Most device drivers expect this number to be in full-speed frames - this
caused considerable confusion to e.g. snd_usb_audio which uses the
frame counter to estimate the number of samples played.
fiq_fsm: save PID on completion of interrupt OUT transfers
Also add edge case handling for interrupt transports.
Note that for periodic split IN, data toggles are unimplemented in the
OTG host hardware - it unconditionally accepts any PID.
fiq_fsm: add missing case for fiq_fsm_tt_in_use()
Certain combinations of bitrate and endpoint activity could
result in a periodic transaction erroneously getting started
while the previous Isochronous OUT was still active.
fiq_fsm: clear hcintmsk for aborted transactions
Prevents the FIQ from erroneously handling interrupts
on a timed out channel.
fiq_fsm: enable by default
fiq_fsm: fix dequeues for non-periodic split transactions
If a dequeue happened between the SSPLIT and CSPLIT phases of the
transaction, the HCD would never receive an interrupt.
fiq_fsm: Disable by default
fiq_fsm: Handle HC babble errors
The HCTSIZ transfer size field raises a babble interrupt if
the counter wraps. Handle the resulting interrupt in this case.
dwc_otg: fix interrupt registration for fiq_enable=0
Additionally make the module parameter conditional for wherever
hcd->fiq_state is touched.
fiq_fsm: Enable by default
dwc_otg: Fix various issues with root port and transaction errors
Process the host port interrupts correctly (and don't trample them).
Root port hotplug now functional again.
Fix a few thinkos with the transaction error passthrough for fiq_fsm.
fiq_fsm: Implement hack for Split Interrupt transactions
Hubs aren't too picky about which endpoint we send Control type split
transactions to. By treating Interrupt transfers as Control, it is
possible to use the non-periodic queue in the OTG core as well as the
non-periodic FIFOs in the hub itself. This massively reduces the
microframe exclusivity/contention that periodic split transactions
otherwise have to enforce.
It goes without saying that this is a fairly egregious USB specification
violation, but it works.
Original idea by Hans Petter Selasky @ FreeBSD.org.
dwc_otg: FIQ support on SMP. Set up FIQ stack and handler on Core 0 only.
dwc_otg: introduce fiq_fsm_spin(un|)lock()
SMP safety for the FIQ relies on register read-modify write cycles being
completed in the correct order. Several places in the DWC code modify
registers also touched by the FIQ. Protect these by a bare-bones lock
mechanism.
This also makes it possible to run the FIQ and IRQ handlers on different
cores.
fiq_fsm: fix build on bcm2708 and bcm2709 platforms
dwc_otg: put some barriers back where they should be for UP
bcm2709/dwc_otg: Setup FIQ on core 1 if >1 core active
dwc_otg: fixup read-modify-write in critical paths
Be more careful about read-modify-write on registers that the FIQ
also touches.
Guard fiq_fsm_spin_lock with fiq_enable check
fiq_fsm: Falling out of the state machine isn't fatal
This edge case can be hit if the port is disabled while the FIQ is
in the middle of a transaction. Make the effects less severe.
Also get rid of the useless return value.
squash: dwc_otg: Allow to build without SMP
usb: core: make overcurrent messages more prominent
Hub overcurrent messages are more serious than "debug". Increase loglevel.
usb: dwc_otg: Don't use dma_to_virt()
Commit 6ce0d20 changes dma_to_virt() which breaks this driver.
Open code the old dma_to_virt() implementation to work around this.
Limit the use of __bus_to_virt() to cases where transfer_buffer_length
is set and transfer_buffer is not set. This is done to increase the
chance that this driver will also work on ARCH_BCM2835.
transfer_buffer should not be NULL if the length is set, but the
comment in the code indicates that there are situations where this
might happen. drivers/usb/isp1760/isp1760-hcd.c also has a similar
comment pointing to a possible: 'usb storage / SCSI bug'.
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
dwc_otg: Fix crash when fiq_enable=0
dwc_otg: fiq_fsm: Make high-speed isochronous strided transfers work properly
Certain low-bandwidth high-speed USB devices (specialist audio devices,
compressed-frame webcams) have packet intervals > 1 microframe.
Stride these transfers in the FIQ by using the start-of-frame interrupt
to restart the channel at the right time.
dwc_otg: Force host mode to fix incorrect compute module boards
dwc_otg: Add ARCH_BCM2835 support
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
dwc_otg: Simplify FIQ irq number code
Dropping ATAGS means we can simplify the FIQ irq number code.
Also add error checking on the returned irq number.
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
dwc_otg: Remove duplicate gadget probe/unregister function
dwc_otg: Properly set the HFIR
Douglas Anderson reported:
According to the most up to date version of the dwc2 databook, the FRINT
field of the HFIR register should be programmed to:
* 125 us * (PHY clock freq for HS) - 1
* 1000 us * (PHY clock freq for FS/LS) - 1
This is opposed to older versions of the doc that claimed it should be:
* 125 us * (PHY clock freq for HS)
* 1000 us * (PHY clock freq for FS/LS)
and reported lower timing jitter on a USB analyser
dcw_otg: trim xfer length when buffer larger than allocated size is received
dwc_otg: Don't free qh align buffers in atomic context
dwc_otg: Enable the hack for Split Interrupt transactions by default
dwc_otg.fiq_fsm_mask=0xF has long been a suggestion for users with audio stutters or other USB bandwidth issues.
So far we are aware of many success stories but no failure caused by this setting.
Make it a default to learn more.
See: https://www.raspberrypi.org/forums/viewtopic.php?f=28&t=70437
Signed-off-by: popcornmix <popcornmix@gmail.com>
dwc_otg: Use kzalloc when suitable
dwc_otg: Pass struct device to dma_alloc*()
This makes it possible to get the bus address from Device Tree.
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
dwc_otg: fix summarize urb->actual_length for isochronous transfers
Kernel does not copy input data of ISO transfers to userspace
if actual_length is set only in ISO transfers and not summarized
in urb->actual_length. Fixesraspberrypi/linux#903
fiq_fsm: Use correct states when starting isoc OUT transfers
In fiq_fsm_start_next_periodic() if an isochronous OUT transfer
was selected, no regard was given as to whether this was a single-packet
transfer or a multi-packet staged transfer.
For single-packet transfers, this had the effect of repeatedly sending
OUT packets with bogus data and lengths.
Eventually if the channel was repeatedly enabled enough times, this
would lock up the OTG core and no further bus transfers would happen.
Set the FSM state up properly if we select a single-packet transfer.
Fixes https://github.com/raspberrypi/linux/issues/1842
dwc_otg: make nak_holdoff work as intended with empty queues
If URBs reading from non-periodic split endpoints were dequeued and
the last transfer from the endpoint was a NAK handshake, the resulting
qh->nak_frame value was stale which would result in unnecessarily long
polling intervals for the first subsequent transfer with a fresh URB.
Fixup qh->nak_frame in dwc_otg_hcd_urb_dequeue and also guard against
a case where a single URB is submitted to the endpoint, a NAK was
received on the transfer immediately prior to receiving data and the
device subsequently resubmits another URB past the qh->nak_frame interval.
Fixes https://github.com/raspberrypi/linux/issues/1709
dwc_otg: fix split transaction data toggle handling around dequeues
See https://github.com/raspberrypi/linux/issues/1709
Fix several issues regarding endpoint state when URBs are dequeued
- If the HCD is disconnected, flush FIQ-enabled channels properly
- Save the data toggle state for bulk endpoints if the last transfer
from an endpoint where URBs were dequeued returned a data packet
- Reset hc->start_pkt_count properly in assign_and_init_hc()
dwc_otg: fix several potential crash sources
On root port disconnect events, the host driver state is cleared and
in-progress host channels are forcibly stopped. This doesn't play
well with the FIQ running in the background, so:
- Guard the disconnect callback with both the host spinlock and FIQ
spinlock
- Move qtd dereference in dwc_otg_handle_hc_fsm() after the early-out
so we don't dereference a qtd that has gone away
- Turn catch-all BUG()s in dwc_otg_handle_hc_fsm() into warnings.
dwc_otg: delete hcd->channel_lock
The lock serves no purpose as it is only held while the HCD spinlock
is already being held.
dwc_otg: remove unnecessary dma-mode channel halts on disconnect interrupt
Host channels are already halted in kill_urbs_in_qh_list() with the
subsequent interrupt processing behaving as if the URB was dequeued
via HCD callback.
There's no need to clobber the host channel registers a second time
as this exposes races between the driver and host channel resulting
in hcd->free_hc_list becoming corrupted.
dwcotg: Allow to build without FIQ on ARM64
Signed-off-by: popcornmix <popcornmix@gmail.com>
dwc_otg: make periodic scheduling behave properly for FS buses
If the root port is in full-speed mode, transfer times at 12mbit/s
would be calculated but matched against high-speed quotas.
Reinitialise hcd->frame_usecs[i] on each port enable event so that
full-speed bandwidth can be tracked sensibly.
Also, don't bother using the FIQ for transfers when in full-speed
mode - at the slower bus speed, interrupt frequency is reduced by
an order of magnitude.
Related issue: https://github.com/raspberrypi/linux/issues/2020
dwc_otg: fiq_fsm: Make isochronous compatibility checks work properly
Get rid of the spammy printk and local pointer mangling.
Also, there is a nominal benefit for using fiq_fsm for isochronous
transfers in FS mode (~1.1k IRQs per second vs 2.1k IRQs per second)
so remove the root port speed check.
dwc_otg: add module parameter int_ep_interval_min
Add a module parameter (defaulting to ignored) that clamps the polling rate
of high-speed Interrupt endpoints to a minimum microframe interval.
The parameter is modifiable at runtime as it is used when activating new
endpoints (such as on device connect).
dwc_otg: fiq_fsm: Add non-periodic TT exclusivity constraints
Certain hub types do not discriminate between pipe direction (IN or OUT)
when considering non-periodic transfers. Therefore these hubs get confused
if multiple transfers are issued in different directions with the same
device address and endpoint number.
Constrain queuing non-periodic split transactions so they are performed
serially in such cases.
Related: https://github.com/raspberrypi/linux/issues/2024
dwc_otg: Fixup change to DRIVER_ATTR interface
dwc_otg: Fix compilation warnings
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
USB_DWCOTG: Disable building dwc_otg as a module (#2265)
When dwc_otg is built as a module, build will fail with the following
error:
ERROR: "DWC_TASK_HI_SCHEDULE" [drivers/usb/host/dwc_otg/dwc_otg.ko] undefined!
scripts/Makefile.modpost:91: recipe for target '__modpost' failed
make[1]: *** [__modpost] Error 1
Makefile:1199: recipe for target 'modules' failed
make: *** [modules] Error 2
Even if the error is solved by including the missing
DWC_TASK_HI_SCHEDULE function, the kernel will panic when loading
dwc_otg.
As a workaround, simply prevent user from building dwc_otg as a module
as the current kernel does not support it.
See: https://github.com/raspberrypi/linux/issues/2258
Signed-off-by: Malik Olivier Boussejra <malik@boussejra.com>
dwc_otg: New timer API
dwc_otg: Fix removed ACCESS_ONCE->READ_ONCE
dwc_otg: don't unconditionally force host mode in dwc_otg_cil_init()
Add the ability to disable force_host_mode for those that want to use
dwc_otg in both device and host modes.
dwc_otg: Fix a regression when dequeueing isochronous transfers
In 282bed95 (dwc_otg: make nak_holdoff work as intended with empty queues)
the dequeue mechanism was changed to leave FIQ-enabled transfers to run
to completion - to avoid leaving hub TT buffers with stale packets lying
around.
This broke FIQ-accelerated isochronous transfers, as this then meant that
dozens of transfers were performed after the dequeue function returned.
Restore the state machine fence for isochronous transfers.
fiq_fsm: rewind DMA pointer for OUT transactions that fail (#2288)
See: https://github.com/raspberrypi/linux/issues/2140
dwc_otg: add smp_mb() to prevent driver state corruption on boot
Occasional crashes have been seen where the FIQ code dereferences
invalid/random pointers immediately after being set up, leading to
panic on boot.
The crash occurs as the FIQ code races against hcd_init_fiq() and
the hcd_init_fiq() code races against the outstanding memory stores
from dwc_otg_hcd_init(). Use explicit barriers after touching
driver state.
usb: dwc_otg: fix memory corruption in dwc_otg driver
[Upstream commit 51b1b64917]
The move from the staging tree to the main tree exposed a
longstanding memory corruption bug in the dwc2 driver. The
reordering of the driver initialization caused the dwc2 driver
to corrupt the initialization data of the sdhci driver on the
Raspberry Pi platform, which made the bug show up.
The error is in calling to_usb_device(hsotg->dev), since ->dev
is not a member of struct usb_device. The easiest fix is to
just remove the offending code, since it is not really needed.
Thanks to Stephen Warren for tracking down the cause of this.
Reported-by: Andre Heider <a.heider@gmail.com>
Tested-by: Stephen Warren <swarren@wwwdotorg.org>
Signed-off-by: Paul Zimmerman <paulz@synopsys.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[lukas: port from upstream dwc2 to out-of-tree dwc_otg driver]
Signed-off-by: Lukas Wunner <lukas@wunner.de>
usb: dwb_otg: Fix unreachable switch statement warning
This warning appears with GCC 7.3.0 from toolchains.bootlin.com:
../drivers/usb/host/dwc_otg/dwc_otg_fiq_fsm.c: In function ‘fiq_fsm_update_hs_isoc’:
../drivers/usb/host/dwc_otg/dwc_otg_fiq_fsm.c:595:61: warning: statement will never be executed [-Wswitch-unreachable]
st->hctsiz_copy.b.xfersize = nrpackets * st->hcchar_copy.b.mps;
~~~~~~~~~~~~~~~~~^~~~
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
dwc_otg: fiq_fsm: fix incorrect DMA register offset calculation
Rationalise the offset and update all call sites.
Fixes https://github.com/raspberrypi/linux/issues/2408
dwc_otg: fix bug with port_addr assignment for single-TT hubs
See https://github.com/raspberrypi/linux/issues/2734
The "Hub Port" field in the split transaction packet was always set
to 1 for single-TT hubs. The majority of single-TT hub products
apparently ignore this field and broadcast to all downstream enabled
ports, which masked the issue. A subset of hub devices apparently
need the port number to be exact or split transactions will fail.
usb: dwc_otg: Clean up build warnings on 64bit kernels
No functional changes. Almost all are changes to logging lines.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>
usb: dwc_otg: Use dma allocation for mphi dummy_send buffer
The FIQ driver used a kzalloc'ed buffer for dummy_send,
passing a kernel virtual address to the hardware block.
The buffer is only ever used for a dummy read, so it
should be harmless, but there is the chance that it will
cause exceptions.
Use a dma allocation so that we have a genuine bus address,
and read from that.
Free the allocation when done for good measure.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>
dwc_otg: only do_split when we actually need to do a split
The previous test would fail if the root port was in fullspeed mode
and there was a hub between the FS device and the root port. While
the transfer worked, the schedule mangling performed for high-speed
split transfers would break leading to an 8ms polling interval.
dwc_otg: fix locking around dequeueing and killing URBs
kill_urbs_in_qh_list() is practically only ever called with the fiq lock
already held, so don't spinlock twice in the case where we need to cancel
an isochronous transfer.
Also fix up a case where the global interrupt register could be read with
the fiq lock not held.
Fixes the deadlock seen in https://github.com/raspberrypi/linux/issues/2907
ARM64/DWC_OTG: Port dwc_otg driver to ARM64
In ARM64, the FIQ mechanism used by this driver is not current
implemented. As a workaround, reqular IRQ is used instead
of FIQ.
In a separate change, the IRQ-CPU mapping is round robined
on ARM64 to increase concurrency and allow multiple interrupts
to be serviced at a time. This reduces the need for FIQ.
Tests Run:
This mechanism is most likely to break when multiple USB devices
are attached at the same time. So the system was tested under
stress.
Devices:
1. USB Speakers playing back a FLAC audio through VLC
at 96KHz.(Higher then typically, but supported on my speakers).
2. sftp transferring large files through the buildin ethernet
connection which is connected through USB.
3. Keyboard and mouse attached and being used.
Although I do occasionally hear some glitches, the music seems to
play quite well.
Signed-off-by: Michael Zoran <mzoran@crowfest.net>
usb: dwc_otg: Clean up interrupt claiming code
The FIQ/IRQ interrupt number identification code is scattered through
the dwc_otg driver. Rationalise it, simplifying the code and solving
an existing issue.
See: https://github.com/raspberrypi/linux/issues/2612
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
dwc_otg: Choose appropriate IRQ handover strategy
2711 has no MPHI peripheral, but the ARM Control block can fake
interrupts. Use the size of the DTB "mphi" reg block to determine
which is required.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
usb: host: dwc_otg: fix compiling in separate directory
The dwc_otg Makefile does not respect the O=path argument correctly:
include paths in CFLAGS are given relatively to object path, not source
path. Compiling in a separate directory yields #include errors.
Signed-off-by: Marek Behún <marek.behun@nic.cz>
dwc_otg: use align_buf for small IN control transfers (#3150)
The hardware will do a 4-byte write to memory on any IN packet received
that is between 1 and 3 bytes long. This tramples memory in the uvcvideo
driver, as it uses a sequence of 1- and 2-byte control transfers to
query the min/max/range/step of each individual camera control and
gives us buffers that are offsets into a struct.
Catch small control transfers in the data phase and use the align_buf
to bounce the correct number of bytes into the URB's buffer.
In general, short packets on non-control endpoints should be OK as URBs
should have enough buffer space for a wMaxPacket size transfer.
See: https://github.com/raspberrypi/linux/issues/3148
Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>
dwc_otg: Declare DMA capability with HCD_DMA flag
Following [1], USB controllers have to declare DMA capabilities in
order for them to be used by adding the HCD_DMA flag to their hc_driver
struct.
[1] 7b81cb6bdd ("usb: add a HCD_DMA flag instead of guestimating DMA capabilities")
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
dwc_otg: checking the urb->transfer_buffer too early (#3332)
After enable the HIGHMEM and VMSPLIT_3G, the dwc_otg driver doesn't
work well on Pi2/3 boards with 1G physical ram. Users experience
the failure when copying a file of 600M size to the USB stick. And
at the same time, the dmesg shows:
usb 1-1.1.2: reset high-speed USB device number 8 using dwc_otg
sd 0:0:0:0: [sda] tag#0 FAILED Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK
blk_update_request: I/O error, dev sda, sector 3024048 op 0x1:(WRITE) flags 0x4000 phys_seg 15 prio class 0
When this happens, the sg_buf sent to the driver is located in the
highmem region, the usb_sg_init() in the core/message.c will leave
transfer_buffer to NULL if the sg_buf is in highmem, but in the
dwc_otg driver, it returns -EINVAL unconditionally if transfer_buffer
is NULL.
The driver can handle the situation of buffer to be NULL, if it is in
DMA mode, it will convert an address from transfer_dma.
But if the conversion fails or it is in the PIO mode, we should check
buffer and return -EINVAL if it is NULL.
BugLink: https://bugs.launchpad.net/bugs/1852510
Signed-off-by: Hui Wang <hui.wang@canonical.com>
dwc_otg: constrain endpoint max packet and transfer size on split IN
The hcd would unconditionally set the transfer length to the endpoint
packet size for non-isoc IN transfers. If the remaining buffer length
was less than the length of returned data, random memory would get
scribbled over, with bad effects if it crossed a page boundary.
Force a babble error if this happens by limiting the max transfer size
to the available buffer space. DMA will stop writing to memory on a
babble condition.
The hardware expects xfersize to be an integer multiple of maxpacket
size, so override hcchar.b.mps as well.
Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>
dwc_otg: fiq_fsm: pause when cancelling split transactions
Non-periodic splits will DMA to/from the driver-provided transfer_buffer,
which may be freed immediately after the dequeue call returns. Block until
we know the transfer is complete.
A similar delay is needed when cleaning up disconnects, as the FIQ could
have started a periodic transfer in the previous microframe to the one
that triggered a disconnect.
Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>
dwc_otg: fiq_fsm: add a barrier on entry into FIQ handler(s)
On BCM2835, there is no hardware guarantee that multiple outstanding
reads to different peripherals will complete in-order. The FIQ code
uses peripheral reads without barriers for performance, so in the case
where a read to a slow peripheral was issued immediately prior to FIQ
entry, the first peripheral read that the FIQ did could end up with
wrong read data returned.
Add dsb(sy) on entry so that all outstanding reads are retired.
The FIQ only issues reads to the dwc_otg core, so per-read barriers
in the handler itself are not required.
On BCM2836 and BCM2837 the barrier is not strictly required due to
differences in how the peripheral bus is implemented, but having
arch-specific handlers that introduce different latencies is risky.
Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>
dwc_otg: whitelist_table is now productlist_table
dwc_otg: initialise sched_frame for periodic QHs that were parked
If a periodic QH has no remaining QTDs, then it is removed from all
periodic schedules. When re-adding, initialise the sched_frame and
start_split_frame from the current value of the frame counter.
See https://bugs.launchpad.net/raspbian/+bug/1819560
and
https://github.com/raspberrypi/linux/issues/3883
Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
dwc_otg: Minimise header and fix build warnings
Delete a large amount of unused declaration from "usb.h", some of which
were causing build warnings, and get the module building cleanly.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
dwc-otg: fix clang -Wignored-attributes warning
warning: attribute declaration must precede definition
dwc-otg: fix clang -Wsometimes-uninitialized warning
warning: variable 'retval' is used uninitialized whenever 'if' condition is false
dwc-otg: fix clang -Wpointer-bool-conversion warning
warning: address of array 'desc->wMaxPacketSize' will always evaluate to 'true'
The wMaxPacketSize field is actually a two element array which content should
be accessed via the UGETW macro.
dwc_otg: fix an undeclared variable
Replace an undeclared variable used by DWC_DEBUGPL with the real endpoint address. DWC_DEBUGPL does nothing with DEBUG undefined so it did not go wrong before.
Signed-off-by: Zixuan Wang <wangzixuan@sjtu.edu.cn>
dwc_otg: Update NetBSD usb.h header licence
NetBSD have changed their licensing requirements such that the 2-clause
licence is preferred. Update usb.h in the downstream dwc_otg code
accordingly.
See https://www.netbsd.org/about/redistribution.html for more
information.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
dwc_otg: pay attention to qh->interval when rescheduling periodic queues
A regression introduced in https://github.com/raspberrypi/linux/pull/3887
meant that if the newly scheduled transfer immediately returned data, and
the driver resubmitted a single URB after every transfer, then the effective
polling interval would end up being approx 1ms.
Use the larger of SCHEDULE_SLOP or the configured endpoint interval.
Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
drivers: dwc_otg: Fix fallthrough warnings
Signed-off-by: Alexander Winkowski <dereference23@outlook.com>
drivers: usb: dwc_otg: fix reference passing when checking bandwidth
The pointer (struct usb_host_endpoint *)->hcpriv should contain a
reference to dwc_otg_qh_t if the driver has already seen a URB submitted
to this endpoint.
It then checks whether the qh exists and is already in a schedule in
order to decide whether to allocate periodic bandwidth or not. Passing a
pointer to an offset inside of struct usb_host_endpoint instead of just
the pointer means it dereferences bogus addresses.
Rationalise (delete) a variable while we're at it.
See https://github.com/raspberrypi/linux/issues/5189
Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
drivers: dwc_otg: stop GCC from patching FIQ functions
Configuring GCC to use task stack protector canaries means it will
insert calls to check functions in FIQ code. This is bad, as a) the
FIQ's stack is banked and b) the failure invokes __stack_chk_fail which
eventually tries to call printk(). Printing to the console inside the
FIQ is generally fatal.
Add CFLAGS to stop this happening in FIQ code.
Also catch one function where notrace wasn't specified.
Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
dwc_otg: Avoid the use of align_buf for short packets
Recent kernels (from 6.5) fail to boot on Pi0-3.
This has been tracked down to the call to:
ret = usb_get_std_status(hdev, USB_RECIP_DEVICE, 0, &hubstatus);
returning garbage in hubstatus (it gets the uninitialised contents of
a kmalloc buffer that is not overwritten as expected).
As we don't have strong evidence that this code path has ever worked,
and it is causing a clear problem currently, lets disable it to
allow wider use of newer kernels.
Signed-off-by: Dom Cobley <popcornmix@gmail.com>
drivers: dwc_otg: use C11 style variable array declarations
The kernel C standard changed in 5.18.
Remove a layer of indirection around the FIQ bounce buffers, be consistent
with pointers to FIQ bounce buffers, and remove open-coded 32-bit clamping
of DMA addresses.
Also remove a pointless fiq_state initialisation loop.
Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
drivers: dwc_otg: move FIQ locking functions to header file
Also declare as static inline, as they should be.
Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
drivers: dwc_otg: add ticket-based spinlock for ARM64
The ARM64 architecture uses qspinlock which has a fast and slow path.
This isn't ideal for all claimers of a lock operating in interrupt
context. Add a ticket-based lock similar to the armv6/7 implementation.
Based on an upstream patch that was abandoned in favour of qspinlock.
Link: https://patchwork.kernel.org/project/linux-arm-kernel/patch/1381330468-32625-2-git-send-email-will.deacon@arm.com/
Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
drivers: dwc_otg: reduce loglevel for probe messages
Warning on normal behaviour isn't sensible and is spammy. Demote to info.
Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
drivers: dwc_otg: don't call disable_irq on the fake FIQ
The local spinlock protects the handlers from racing against each other
on separate cores, hard IRQs don't preempt each other, and
disabling/enabling the interrupt is more expensive than letting the fake
FIQ contend the spinlock.
So turn local_fiq_en/disable into no-ops.
Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
Signed-off-by: popcornmix <popcornmix@gmail.com>
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
bcm2709: Drop platform smp and timer init code
irq-bcm2836 handles this through these functions:
bcm2835_init_local_timer_frequency()
bcm2836_arm_irqchip_smp_init()
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
bcm270x: Use watchdog for reboot/poweroff
The watchdog driver already has support for reboot/poweroff.
Make use of this and remove the code from the platform files.
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
board_bcm2835: Remove coherent dma pool increase - API has gone
Under some circumstances on BCM283x processors data loss can be
observed - a single byte missing from the TX output stream. These bytes
are always the last byte of a batch of 8 written from pl011_tx_chars
when from_irq is true, meaning that the FIFO full flag is not checked
before writing.
The transmit optimisation relies on the FIFO being half-empty when the
TX interrupt is raised. Instrumenting the driver further showed that
the failure case correlated with the TX FIFO full flag being set at the
point where the last byte was written to the data register, which
explains the data loss but not how the FIFO appeared to be prematurely
full. A possible explanation is that a FIFO write was in flight at the
time the interrupt was raised, but as yet there is no hypothesis as to
how this might occur.
In the absence of a clear understanding of the failure mechanism, avoid
the problem by checking the FIFO levels before writing the last byte of
the group, which will have minimal performance impact.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
The BCM2835 PL011 implementation seems to have a bug that can lead to a
transmission lockup if CTS changes frequently. A workaround was added to
the driver with a vendor-specific flag to enable it, but this flag is
currently not set for ARM implementations.
Add a "cts-event-workaround" property to Pi DTBs and use the presence
of that property to force the flag to be enabled in the driver.
See: https://github.com/raspberrypi/linux/issues/1280
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
The pl011 register accessor functions use the _relaxed versions of the
standard readl() and writel() functions, meaning that there are no
automatic memory barriers. When polling a FIFO status register to check
for fullness, it is necessary to ensure that any outstanding writes have
completed; otherwise the flags are effectively stale, making it possible
that the next write is to a full FIFO.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
The syscon node defines a register range that duplicates that used by
the local_intc node on bcm2836/7. Since irq-bcm2835 and irq-bcm2836 are
built in and always present together (both drivers are enabled by
CONFIG_ARCH_BCM2835), it is possible to replace the syscon usage with a
global variable that simplifies the code. Doing so does lose the
locking provided by regmap, but as only one side is using the regmap
interface (irq-bcm2835 uses readl and write) there is no loss of
atomicity.
See: https://github.com/raspberrypi/firmware/issues/926
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
This adds a debug module parameter to aid in debugging transfer issues
by printing info to the kernel log. When enabled, status values are
collected in the interrupt routine and msg info in
bcm2835_i2c_start_transfer(). This is done in a way that tries to avoid
affecting timing. Having printk in the isr can mask issues.
debug values (additive):
1: Print info on error
2: Print info on all transfers
3: Print messages before transfer is started
The value can be changed at runtime:
/sys/module/i2c_bcm2835/parameters/debug
Example output, debug=3:
[ 747.114448] bcm2835_i2c_xfer: msg(1/2) write addr=0x54, len=2 flags= [i2c1]
[ 747.114463] bcm2835_i2c_xfer: msg(2/2) read addr=0x54, len=32 flags= [i2c1]
[ 747.117809] start_transfer: msg(1/2) write addr=0x54, len=2 flags= [i2c1]
[ 747.117825] isr: remain=2, status=0x30000055 : TA TXW TXD TXE [i2c1]
[ 747.117839] start_transfer: msg(2/2) read addr=0x54, len=32 flags= [i2c1]
[ 747.117849] isr: remain=32, status=0xd0000039 : TA RXR TXD RXD [i2c1]
[ 747.117861] isr: remain=20, status=0xd0000039 : TA RXR TXD RXD [i2c1]
[ 747.117870] isr: remain=8, status=0x32 : DONE TXD RXD [i2c1]
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
Christopher Alexander Tobias Schulze - May 2, 2015, 11:57 a.m.
This patch fixes a problem with VFP state save and restore related
to exception handling (panic with message "BUG: unsupported FP
instruction in kernel mode") present on VFP11 floating point units
(as used with ARM1176JZF-S CPUs, e.g. on first generation Raspberry
Pi boards). This patch was developed and discussed on
https://github.com/raspberrypi/linux/issues/859
A precondition to see the crashes is that floating point exception
traps are enabled. In this case, the VFP11 might determine that a FPU
operation needs to trap at a point in time when it is not possible to
signal this to the ARM11 core any more. The VFP11 will then set the
FPEXC.EX bit and store the trapped opcode in FPINST. (In some cases,
a second opcode might have been accepted by the VFP11 before the
exception was detected and could be reported to the ARM11 - in this
case, the VFP11 also sets FPEXC.FP2V and stores the second opcode in
FPINST2.)
If FPEXC.EX is set, the VFP11 will "bounce" the next FPU opcode issued
by the ARM11 CPU, which will be seen by the ARM11 as an undefined opcode
trap. The VFP support code examines the FPEXC.EX and FPEXC.FP2V bits
to decide what actions to take, i.e., whether to emulate the opcodes
found in FPINST and FPINST2, and whether to retry the bounced instruction.
If a user space application has left the VFP11 in this "pending trap"
state, the next FPU opcode issued to the VFP11 might actually be the
VSTMIA operation vfp_save_state() uses to store the FPU registers
to memory (in our test cases, when building the signal stack frame).
In this case, the kernel crashes as described above.
This patch fixes the problem by making sure that vfp_save_state() is
always entered with FPEXC.EX cleared. (The current value of FPEXC has
already been saved, so this does not corrupt the context. Clearing
FPEXC.EX has no effects on FPINST or FPINST2. Also note that many
callers already modify FPEXC by setting FPEXC.EN before invoking
vfp_save_state().)
This patch also addresses a second problem related to FPEXC.EX: After
returning from signal handling, the kernel reloads the VFP context
from the user mode stack. However, the current code explicitly clears
both FPEXC.EX and FPEXC.FP2V during reload. As VFP11 requires these
bits to be preserved, this patch disables clearing them for VFP
implementations belonging to architecture 1. There should be no
negative side effects: the user can set both bits by executing FPU
opcodes anyway, and while user code may now place arbitrary values
into FPINST and FPINST2 (e.g., non-VFP ARM opcodes) the VFP support
code knows which instructions can be emulated, and rejects other
opcodes with "unhandled bounce" messages, so there should be no
security impact from allowing reloading FPEXC.EX and FPEXC.FP2V.
Signed-off-by: Christopher Alexander Tobias Schulze <cat.schulze@alice-dsl.net>
reboot: Use power off rather than busy spinning when halt is requested
Busy spinning after halt is dumb
We've previously applied this patch to arch/arm
but it is currenltly missing in arch/arm64
Pi4 after "sudo halt" uses 520mA
Pi4 after "sudo shutdown now" uses 310mA
Make them both use the lower powered option
Signed-off-by: Dom Cobley <popcornmix@gmail.com>
The Raspberry Pi firmware looks at the RSTS register to know which
partition to boot from. The reboot syscall command
LINUX_REBOOT_CMD_RESTART2 supports passing in a string argument.
Add support for passing in a partition number 0..63 to boot from.
Partition 63 is a special partiton indicating halt.
If the partition doesn't exist, the firmware falls back to partition 0.
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
Load driver early since at least bcm2708_fb doesn't support deferred
probing and even if it did, we don't want the video driver deferred.
Support the legacy DMA API which is needed by bcm2708_fb.
Don't mask out channel 2.
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
bcm2835-dma: Add support for per-channel flags
Add the ability to interpret the high bits of the dreq specifier as
flags to be included in the DMA_CS register. The motivation for this
change is the ability to set the DISDEBUG flag for SD card transfers
to avoid corruption when using the VPU debugger.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
bcm2835-dma: Add proper 40-bit DMA support
BCM2711 has 4 DMA channels with a 40-bit address range, allowing them
to access the full 4GB of memory on a Pi 4.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
bcm2835-dma: Derive slave DMA addresses correctly
Slave addresses for DMA are meant to be supplied as physical addresses
(contrary to what struct snd_dmaengine_dai_dma_data does). It is up to
the DMA controller driver to perform the translation based on its own
view of the world, as described in Device Tree.
Now that the Pi Device Trees have the correct peripheral mappings,
replace the hacky address munging with phys_to_dma().
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
bcm2835-dma: Add NO_WAIT_RESP flag
Use bit 27 of the dreq value (the second cell of the DT DMA descriptor)
to request that the WAIT_RESP bit is not set.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
bcm2835-dma: Advertise the full DMA range
Unless the DMA mask is set wider than 32 bits, DMA mapping will use a
bounce buffer.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
bcm2835-dma: only reserve channel 0 if legacy dma driver is enabled
If CONFIG_DMA_BCM2708 isn't enabled there's no need to mask out
one of the already scarce DMA channels.
Signed-off-by: Matthias Reichl <hias@horus.com>
bcm2835-dma: Avoid losing CS flags after interrupt
Signed-off-by: Dom Cobley <popcornmix@gmail.com>
bcm2835-dma: Add bcm2835-dma: Add DMA_WIDE_SOURCE and DMA_WIDE_DEST flags
Use (reserved) bits 24 and 25 of the dreq value
(the second cell of the DT DMA descriptor) to request
that wide source reads or wide dest writes are required
Signed-off-by: Dom Cobley <popcornmix@gmail.com>
dmaengine: bcm2835: Fix position reporting for 40 bits channels
For 40 bits channels, the position is reported by reading the upper byte
in the SRCI/DESTI registers. However the driver adds that upper byte
with an 8-bits left shift, while it should be 32.
Fixes: 9a52a99183 ("bcm2835-dma: Add proper 40-bit DMA support")
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
dmaengine: bcm2835: Use to_bcm2711_cbaddr where relevant
bcm2711_dma40_memcpy has some code strictly equivalent to the
to_bcm2711_cbaddr() function. Let's use it instead.
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
dmaengine: bcm2835: Fix descriptors usage for 40-bits channels
The bcm2835_dma_create_cb_chain() function is in charge of building up
the descriptors chain for a given transfer.
It was initially supporting only the BCM2835-style DMA controller, and
was later expanded to support controllers with 40-bits channels that use
a different descriptor layout.
However, some part of the function only use the old style descriptor,
even when building a chain of new-style descriptors, resulting in weird
bugs.
Fixes: 9a52a99183 ("bcm2835-dma: Add proper 40-bit DMA support")
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
bcm2835-dma: Fix WAIT_RESP on memcpy
It goes in info not extra
Signed-off-by: Dom Cobley <popcornmix@gmail.com>
bcm2835-dma: Fix dma_abort for 40-bit channels
It wasn't aborting the transfer and caused stop/start
of hdmi audio dma to be unreliable.
New sequence approved by Broadcom.
Signed-off-by: Dom Cobley <popcornmix@gmail.com>
bcm2835-dma: Fix dma_abort for non-40bit channels
The sequence we were doing was not safe.
Clearing CS meant BCM2835_DMA_WAIT_FOR_WRITES was cleared
and so polling BCM2835_DMA_WAITING_FOR_WRITES has no benefit
Broadcom have provided a recommended sequence to abort
a dma lite channel, so switch to that.
Signed-off-by: Dom Cobley <popcornmix@gmail.com>
bcm2835-dma: Support dma flags for multi-beat burst
Add a control bit to enable a multi-beat burst on a DMA.
This improves DMA performance and is required for HDMI audio.
Signed-off-by: Dom Cobley <popcornmix@gmail.com>
bcm2835-dma: Need to keep PROT bits set in CS on 40bit controller
Resetting them to zero puts DMA channel into secure mode
which makes further accesses impossible
Signed-off-by: Dom Cobley <popcornmix@gmail.com>
dmaengine: bcm2835: Delete vestigial code
The dedicated dma40 memcpy code is no longer used, and without a
prototype the kernel build fails. Delete it.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
An alternative strategy would be to use "rpi,spidev" instead, but that
would require many Raspberry Pi Device Tree changes.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
Add a duplicate irq range with an offset on the hwirq's so the
driver can detect that enable_fiq() is used.
Tested with downstream dwc_otg USB controller driver.
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Acked-by: Stephen Warren <swarren@wwwdotorg.org>
Signed-off-by: popcornmix <popcornmix@gmail.com>
SQUASH: smsc95xx: Use dev_mod_addr to set MAC addr
Since adeef3e321 ("net: constify netdev->dev_addr") it has been
illegal to write to the dev_addr MAC address field. Later commits
have added explicit checks that it hasn't been modified by nefarious
means. The dev_addr_mod helper function is the accepted way to change
the dev_addr field, so use it.
Squash with 96c1def63ee1 ("Allow mac address to be set in smsc95xx").
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
DSI0 can take the clock from either PLLA or PLLD. PLLA is
the default muxing, but PLLD is considered the more stable.
Switch to using PLLD.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
This is controlled by firmware, see clk-raspberrypi.c
Signed-off-by: popcornmix <popcornmix@gmail.com>
clk-bcm2835: Remove VEC clock support
Signed-off-by: Dom Cobley <popcornmix@gmail.com>
This falls under the same "we can reprogram glitch-free as long as we
pause generation" rule as updating the div/frac fields. This can be
used for runtime reclocking of V3D to manage power leakage.
Signed-off-by: Eric Anholt <eric@anholt.net>
The VPU is responsible for managing the core clock, usually under
direction from the bcm2835-cpufreq driver but not via the clk-bcm2835
driver. Since the core frequency can change without warning, it is
safer to report the maximum clock rate to users of the core clock -
I2C, SPI and the mini UART - to err on the safe side when calculating
clock divisors.
If the DT node for the clock driver includes a reference to the
firmware node, use the firmware API to query the maximum core clock
instead of reading the divider registers.
Prior to this patch, a "100KHz" I2C bus was sometimes clocked at about
160KHz. In particular, switching to the 4.9 kernel was likely to break
SenseHAT usage on a Pi3.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
clk: bcm2835: Pass DT node to rpi_firmware_get
The fw_node pointer has already been retrieved, and using it allows
us to remove a downstream patch to the firmware driver.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
The claim-clocks property can be used to prevent PLLs and dividers
from being marked as critical. It contains a vector of clock IDs,
as defined by dt-bindings/clock/bcm2835.h.
Use this mechanism to claim PLLD_DSI0, PLLD_DSI1, PLLH_AUX and
PLLH_PIX for the vc4_kms_v3d driver.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
The VPU configures and relies on several PLLs and dividers. Mark all
enabled dividers and their PLLs as CRITICAL to prevent the kernel from
switching them off.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
so that special/critical clocks can get enabled early on in the
boot process avoiding the risk of disabling a clock, pll_divider
or pll when a claiming driver fails to install propperly - maybe it needs to defer.
Signed-off-by: Martin Sperl <kernel@martin.sperl.org>
clk: clk-bcm2835: Use %zd when printing size_t
The debug text for how many clocks have been registered
uses "%d" with a size_t. Correct it to "%zd".
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>
Initialise rpi-firmware before clk-bcm2835
The IMA (Integrity Measurement Architecture) looks for a TPM (Trusted
Platform Module) having been registered when it initialises; otherwise
it assumes there is no TPM. It has been observed on BCM2835 that IMA
is initialised before TPM, and that initialising the BCM2835 clock
driver before the firmware driver has the effect of reversing this
order.
Change the firmware driver to initialise at core_initcall, delaying the
BCM2835 clock driver to postcore_initcall.
See: https://github.com/raspberrypi/linux/issues/3291https://github.com/raspberrypi/linux/pull/3297
Signed-off-by: Luke Hinds <lhinds@redhat.com>
Co-authored-by: Phil Elwell <phil@raspberrypi.org>
clk-bcm2835: use subsys_initcall for the clock driver when IMA is enabled
Co-authored-by: Davide Scovotto <scovottodavide@gmail.com>
Co-developed-by: Davide Scovotto <scovottodavide@gmail.com>
Signed-off-by: Davide Scovotto <scovottodavide@gmail.com>
Signed-off-by: Alberto Solavagione <albertosolavagione30@gmail.com>
[1] Removed the sc16is752-spi1 overlay, replacing it with an entry in
overlay_map.dts that invokes sc16is75x-spi with specific parameters.
This does not work because it fails to configure the SPI1 interface.
Most such overlays would require the respective SPI interface to have
already been configured using one of the spi<n>-<m> overlays, but it is
not possible to do that using overlay_map, and it is unreasonable to
suddenly impose that requirement on users.
Work around that specific problem by adding an extra parameter to
sc16is75x to configure SPI1. It's not ideal, but better than a complete
dedicated overlay.
Link: https://github.com/raspberrypi/linux/issues/6962
Fixes: ce20a8fdbf ("overlays: sc16is75x: Add generic SPI overlay")
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
[1] commit ce20a8fdbf ("overlays: sc16is75x: Add generic SPI overlay")
Add the bare minimum needed to boot BCM2708 from a Device Tree.
Signed-off-by: Noralf Tronnes <notro@tronnes.org>
BCM2708: DT: change 'axi' nodename to 'soc'
Change DT node named 'axi' to 'soc' so it matches ARCH_BCM2835.
The VC4 bootloader fills in certain properties in the 'axi' subtree,
but since this is part of an upstreaming effort, the name is changed.
Signed-off-by: Noralf Tronnes notro@tronnes.org
BCM2708_DT: Correct length of the peripheral space
Use dts-dirs feature for overlays.
The kernel makefiles have a dts-dirs target that is for vendor subdirectories.
Using this fixes the install_dtbs target, which previously did not install the overlays.
BCM270X_DT: configure I2S DMA channels
Signed-off-by: Matthias Reichl <hias@horus.com>
BCM270X_DT: switch to bcm2835-i2s
I2S soundcard drivers with proper devicetree support (i.e. not linking
to the cpu_dai/platform via name but to cpu/platform via of_node)
will work out of the box without any modifications.
When the kernel is compiled without devicetree support the platform
code will instantiate the bcm2708-i2s driver and I2S soundcard drivers
will link to it via name, as before.
Signed-off-by: Matthias Reichl <hias@horus.com>
SDIO-overlay: add poll_once-boolean parameter
Add paramter to toggle sdio-device-polling
done every second or once at boot-time.
Signed-off-by: Patrick Boettcher <patrick.boettcher@posteo.de>
BCM270X_DT: Make mmc overlay compatible with current firmware
The original DT overlay logic followed a merge-then-patch procedure,
i.e. parameters are applied to the loaded overlay before the overlay
is merged into the base DTB. This sequence has been changed to
patch-then-merge, in order to support parameterised node names, and
to protect against bad overlays. As a result, overrides (parameters)
must only target labels in the overlay, but the overlay can obviously target nodes in the base DTB.
mmc-overlay.dts (that switches back to the original mmc sdcard
driver) is the only overlay violating that rule, and this patch
fixes it.
bcm270x_dt: Use the sdhost MMC controller by default
The "mmc" overlay reverts to using the other controller.
squash: Add cprman to dt
BCM270X_DT: Use clk_core for I2C interfaces
BCM270X_DT: Use bcm283x.dtsi, bcm2835.dtsi and bcm2836.dtsi
The mainline Device Tree files are quite close to downstream now.
Let's use bcm283x.dtsi, bcm2835.dtsi and bcm2836.dtsi as base files
for our dts files.
Mainline dts files are based on these files:
bcm2835-rpi.dtsi
bcm2835.dtsi bcm2836.dtsi
bcm283x.dtsi
Current downstream are based on these:
bcm2708.dtsi bcm2709.dtsi bcm2710.dtsi
bcm2708_common.dtsi
This patch introduces this dependency:
bcm2708.dtsi bcm2709.dtsi
bcm2708-rpi.dtsi
bcm270x.dtsi
bcm2835.dtsi bcm2836.dtsi
bcm283x.dtsi
And:
bcm2710.dtsi
bcm2708-rpi.dtsi
bcm270x.dtsi
bcm283x.dtsi
bcm270x.dtsi contains the downstream bcm283x.dtsi diff.
bcm2708-rpi.dtsi is the downstream version of bcm2835-rpi.dtsi.
Other changes:
- The led node has moved from /soc/leds to /leds. This is not a problem
since the label is used to reference it.
- The clk_osc reg property changes from 6 to 3.
- The gpu nodes has their interrupt property set in the base file.
- the clocks label does not point to the /clocks node anymore, but
points to the cprman node. This is not a problem since the overlays
that use the clock node refer to it directly: target-path = "/clocks";
- some nodes now have 2 labels since mainline and downstream differs in
this respect: cprman/clocks, spi0/spi, gpu/vc4.
- some nodes doesn't have an explicit status = "okay" since they're not
disabled in the base file: watchdog and random.
- gpiomem doesn't need an explicit status = "okay".
- bcm2708-rpi-cm.dts got the hpd-gpios property from bcm2708_common.dtsi,
it's now set directly in that file.
- bcm2709-rpi-2-b.dts has the timer node moved from /soc/timer to /timer.
- Removed clock-frequency property on the bcm{2709,2710}.dtsi timer nodes.
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
BCM270X_DT: Use raspberrypi-power to turn on USB power
Use the raspberrypi-power driver to turn on USB power.
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
BCM270X_DT: Add a .dtbo target, use for overlays
Change the filenames and extensions to keep the pre-DDT style of
overlay (<name>-overlay.dtb) distinct from new ones that use a
different style of local fixups (<name>.dtbo), and to match other
platforms.
The RPi firmware uses the DDTK trailer atom to choose which type of
overlay to use for each kernel.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
BCM270X_DT: Don't generate "linux,phandle" props
The EPAPR standard says to use "phandle" properties to store phandles,
rather than the deprecated "linux,phandle" version. By default, dtc
generates both, but adding "-H epapr" causes it to only generate
"phandle"s, saving some space and clutter.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
BCM270X_DT: Add overlay for enc28j60 on SPI2
Works on SPI2 for compute module
BCM270X_DT: Add midi-uart0 overlay
MIDI requires 31.25kbaud, a baudrate unsupported by Linux. The
midi-uart0 overlay configures uart0 (ttyAMA0) to use a fake clock
so that requesting 38.4kbaud actually gets 31.25kbaud.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
BCM270X_DT: Add i2c-sensor overlay
The i2c-sensor overlay is a container for various pressure and
temperature sensors, currently bmp085 and bmp280. The standalone
bmp085_i2c-sensor overlay is now deprecated.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
BCM270X_DT: overlays/*-overlay.dtb -> overlays/*.dtbo (#1752)
We now create overlays as .dtbo files.
build: support for .dtbo files for dtb overlays
Kernel 4.4.6+ on RaspberryPi support .dtbo files for overlays, instead of .dtb.
Patch the kernel, which has faulty rules to generate .dtbo the way yocto does
Signed-off-by: Herve Jourdain <herve.jourdain@neuf.fr>
Signed-off-by: Khem Raj <raj.khem@gmail.com>
BCM270X: Drop position requirement for CMA in VC4 overlay.
No longer necessary since 2aefcd5761,
and will probably let peeople that want to choose a larger CMA
allocation (particularly on pi0/1).
Signed-off-by: Eric Anholt <eric@anholt.net>
BCM270X_DT: RPi Device Tree tidy
Use the upstream sdhost node, add thermal-zones, and factor out some
common elements.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
kbuild: Silence unhelpful DTC warnings
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
BCM270X_DT: DT build rules no longer arch-specific
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
kbuild: Silence unavoidable dtc overlay warnings
Much effort has been put into finding ways to avoid warnings from dtc
about overlays, usually to do with the presence of #address-cells and
size-cells, but not exclusively so. Since the issues being warned about
are harmless, suppress the warnings to declutter the build output and
to avoid alarming users.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
overlays: Suppress another dtc warning
I'm sure the dtc warnings mean well, but overlays don't have enough
context for the checkers to give meaningful results.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
overlays: Use dtbs-list for overlay installation
Update the overlay build rules to use the dtbs-list mechanism. Also
include the README, and don't set the executable bits.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
Margins may be implemented by scaling the planes, but as there
is no way of intercepting the set_property for a standard property,
and all planes are checked in drm_atomic_check_only before the
connectors, there's now way to add the planes into the state
from the driver.
If the margin properties change, add all corresponding planes to
the state.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
The stuff never really worked, and leads to lots of fun because it
out-of-order frees atomic states. Which upsets KASAN, among other
things.
For async updates we now have a more solid solution with the
->atomic_async_check and ->atomic_async_commit hooks. Support for that
for msm and vc4 landed. nouveau and i915 have their own commit
routines, doing something similar.
For everyone else it's probably better to remove the use-after-free
bug, and encourage folks to use the async support instead. The
affected drivers which register a legacy cursor plane and don't either
use the new async stuff or their own commit routine are: amdgpu,
atmel, mediatek, qxl, rockchip, sti, sun4i, tegra, virtio, and vmwgfx.
Inspired by an amdgpu bug report.
v2: Drop RFC, I think with amdgpu converted over to use
atomic_async_check/commit done in
commit 674e78acae
Author: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Date: Wed Dec 5 14:59:07 2018 -0500
drm/amd/display: Add fast path for cursor plane updates
we don't have any driver anymore where we have userspace expecting
solid legacy cursor support _and_ they are using the atomic helpers in
their fully glory. So we can retire this.
v3: Paper over msm and i915 regression. The complete_all is the only
thing missing afaict.
v4: Rebased on recent kernel, added extra link for vc4 bug.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=199425
Link: https://lore.kernel.org/all/20220221134155.125447-9-maxime@cerno.tech/
Cc: mikita.lipski@amd.com
Cc: Michel Dänzer <michel@daenzer.net>
Cc: harry.wentland@amd.com
Cc: Rob Clark <robdclark@gmail.com>
Cc: "Kazlauskas, Nicholas" <nicholas.kazlauskas@amd.com>
Tested-by: Maxime Ripard <maxime@cerno.tech>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
mipi_dsi_attach can fail due to resources not being available
yet, therefore do not log error messages should they occur.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
The MIPI_DSI_MODE_* flags have fairly terse descriptions and no reference
to the DSI specification as to their exact meaning. Usage has therefore
been rather fluid.
Extend the descriptions and provide references to the part of the
MIPI DSI specification regarding what they mean.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
drm_crtc_legacy_gamma_set updates the gamma_lut blob unconditionally,
which leads to unnecessary reprogramming of hardware.
Check whether the blob contents has actually changed before
signalling that it has been updated.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
CI builds are done with debug info disabled, but this removes some
members from struct module. This causes builds to fail if there is an
ABI reference for the current ABI.
Define these members unconditionally, so that there is no ABI change.
When symbols from overlays are added to the live tree their paths must
be rebased. The translated symbol is normally the result of joining
the fragment-relative path (with a leading "/") to the target path
(either copied directly from the "target-path" property or resolved
from the phandle). This translation fails when the target is the root
node (a common case for Raspberry Pi overlays) because the resulting
path starts with a double slash. For example, if target-path is "/" and
the fragment adds a node called "newnode", the label associated with
that node will be assigned the path "//newnode", which can't be found
in the tree.
Fix the failure case by explicitly replacing a target path of "/" with
an empty string.
Fixes: d1651b03c2 ("of: overlay: add overlay symbols to live device tree")
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
Use GitHubs issue form for bug reports.
- modern look
- user don't need to mess with given markdown parts while filling the issue template
Setup config.yml for general questions and problems with the Raspbian distribution packages.
Update issue templates (#2736)
Adding Pi 5 as a device to bug_report.yml
Update the Issue template
* Update config.yml - Raspbian -> Raspberry Pi OS
* Update config.yml
* .org to .com
* Update forum URL
Add Pi 500 and CM5 as a device to bug_report.yml
Signed-off-by: Nicolai Buchwitz <nb@tipi-net.de>
This is a copy of README with the tags added.
You can not delete the file README as then checkpatch complains
you aren't in a kernel tree.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
README: Show rpi-6.5.y build status
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
README: show rpi-6.6.y build status
Replace rpi-6.5.y with rpi-6.6.y in the build status list.
README: show rpi-6.12.y build status
Remove rpi-5.15.y build status since it doesn't appear to be built anymore, and add rpi-6.12.y build status.
This is currently running on defaults, so the --strict desired
for media drivers and similar won't be observed. That may be
possible to add later.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
.github: Add Github Workflow for KUnit
Now that we have some KUnit coverage, let's add a github actions file to
run them on each push or pull request.
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
.github/workflows: Add dtoverlaycheck workflow
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
.github/workflows: Create workflow to CI kernel builds
Builds the bcmrpi, bcm2709, bcm2711, and bcm2835 32 bit kernels,
and defconfig and bcm2711 64bit kernels, saving the artifacts for
7 days.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
.github: Skip broken Generic DRM/KMS Unit Tests
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
.github/workflows: Set warnings-as-errors for builds
To avoid code with build warnings being introduced into the tree, force
CONFIG_WERROR=y in the build workflow.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
.github/workflows: Correct kernel builds artifacts
Modify the kernel build workflow to create artifacts with the correct
names and structure, both as an example of what we expect and in case
anyone wants to use the output.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
.github/workflows: Switch to a matrix build
Remove the per-build duplication by putting build parameters in a
matrix.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
.github/workflows: Retain artifacts for 90 days
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
.github/workflows: Add a bcm2712 build configuration
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
Update kernel-build.yml to use node.js 20
Upgrade the actions to v4 to get rid of the warning about migrating from node.js 16.
Update kunit.yml to use node.js 20
Bump actions/checkout to v4.
Update dtoverlaycheck.yml to node.js 20
.github/workflows: More jobs for kernel builds
Using the "cores * 1.5" heuristic, configure the kernel builds for the
4-core GitHub-hosted runners.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
workflows: Add arm64 bcm2711_rt build
Add a Github CI workflow bcm2711_rt_defconfig
Signed-off-by: Tim Gover <tim.gover@raspberrypi.com>
workflows: Remove the ARCH=arm bcm2711 build
As we will be moving Pi 4 support to kernel8.img only and dropping
kernel7l.img, the ARCH=arm bcm2711 defconfig has been deleted.
Remove the corresponding autobuild.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
workflows: Use bcm2709_defconfig for dtoverlaycheck
Now that ARCH=arm bcm2711_defconfig has been deleted, update
dtoverlaycheck to use bcm2709_defconfig.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
kunit: Use ubuntu-22.04 for arm64
There's a bug in the version of qemu used by Ubuntu 24.04 that kills
the arm64 KUnit test. Revert to Ubuntu 22.04 just for that test,
until ubuntu-latest updates to qemu 9.2.0+.
Link: https://bugzilla.suse.com/show_bug.cgi?id=1236310
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
workflows: Switch to overlaycheck's thorough mode
Now that the current trees are passing the thorough/try-all mode of
overlaycheck (mainly by excluding trying to apply the vl805 overlay
on a CM4S), use it in the build checks.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
commit a3c4a0a42e upstream.
When scheduling the deferred balance callbacks, check SCX_RQ_BAL_CB_PENDING
instead of SCX_RQ_BAL_PENDING. This way schedule_deferred() properly tests
whether there is already a pending request for queue_balance_callback() to
be invoked at the end of .balance().
Fixes: a8ad873113 ("sched_ext: defer queue_balance_callback() until after ops.dispatch")
Signed-off-by: Emil Tsalapatis <emil@etsalapatis.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 05e63305c8 upstream.
If we load a BPF scheduler while another scheduler is already running,
alloc_kick_pseqs() would be called again, overwriting the previously
allocated arrays.
Fix by moving the alloc_kick_pseqs() call after the scx_enable_state()
check, ensuring that the arrays are only allocated when a scheduler can
actually be loaded.
Fixes: 14c1da3895 ("sched_ext: Allocate scx_kick_cpus_pnt_seqs lazily using kvzalloc()")
Signed-off-by: Andrea Righi <arighi@nvidia.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Below is a patch for 6.12.58+ and 6.17.8+ stable branches only.
Upstream does not need this.
Signed-off-by: Jari Ruusu <jariruusu@protonmail.com>
Fixes: da7e8b3823 ("tty/vt: Add missing return value for VT_RESIZE in vt_ioctl()")
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit cfa0904a35 ]
[why]
1. With allow_0_dtb_clk enabled, the time required to latch DTBCLK to 600 MHz
depends on the SMU. If DTBCLK is not latched to 600 MHz before set_mode completes,
gating DTBCLK causes the DP2 sink to lose its clock source.
2. The existing DTBCLK gating sequence ungates DTBCLK based on both pix_clk and ref_dtbclk,
but gates DTBCLK when either pix_clk or ref_dtbclk is zero.
pix_clk can be zero outside the set_mode sequence before DTBCLK is properly latched,
which can lead to DTBCLK being gated by mistake.
[how]
Consider both pixel_clk and ref_dtbclk when determining when it is safe to gate DTBCLK;
this is more accurate.
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4701
Fixes: 5949e7c489 ("drm/amd/display: Enable Dynamic DTBCLK Switch")
Reviewed-by: Charlene Liu <charlene.liu@amd.com>
Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Fangzhi Zuo <Jerry.Zuo@amd.com>
Signed-off-by: Roman Li <roman.li@amd.com>
Tested-by: Dan Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit d04eb0c402780ca037b62a6aecf23b863545ebca)
Cc: stable@vger.kernel.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 678e1cc2f4 ]
xfs/286 produced this report on my test fleet:
==================================================================
BUG: KFENCE: out-of-bounds read in memcpy_orig+0x54/0x110
Out-of-bounds read at 0xffff88843fe9e038 (184B right of kfence-#184):
memcpy_orig+0x54/0x110
xrep_symlink_salvage_inline+0xb3/0xf0 [xfs]
xrep_symlink_salvage+0x100/0x110 [xfs]
xrep_symlink+0x2e/0x80 [xfs]
xrep_attempt+0x61/0x1f0 [xfs]
xfs_scrub_metadata+0x34f/0x5c0 [xfs]
xfs_ioc_scrubv_metadata+0x387/0x560 [xfs]
xfs_file_ioctl+0xe23/0x10e0 [xfs]
__x64_sys_ioctl+0x76/0xc0
do_syscall_64+0x4e/0x1e0
entry_SYSCALL_64_after_hwframe+0x4b/0x53
kfence-#184: 0xffff88843fe9df80-0xffff88843fe9dfea, size=107, cache=kmalloc-128
allocated by task 3470 on cpu 1 at 263329.131592s (192823.508886s ago):
xfs_init_local_fork+0x79/0xe0 [xfs]
xfs_iformat_local+0xa4/0x170 [xfs]
xfs_iformat_data_fork+0x148/0x180 [xfs]
xfs_inode_from_disk+0x2cd/0x480 [xfs]
xfs_iget+0x450/0xd60 [xfs]
xfs_bulkstat_one_int+0x6b/0x510 [xfs]
xfs_bulkstat_iwalk+0x1e/0x30 [xfs]
xfs_iwalk_ag_recs+0xdf/0x150 [xfs]
xfs_iwalk_run_callbacks+0xb9/0x190 [xfs]
xfs_iwalk_ag+0x1dc/0x2f0 [xfs]
xfs_iwalk_args.constprop.0+0x6a/0x120 [xfs]
xfs_iwalk+0xa4/0xd0 [xfs]
xfs_bulkstat+0xfa/0x170 [xfs]
xfs_ioc_fsbulkstat.isra.0+0x13a/0x230 [xfs]
xfs_file_ioctl+0xbf2/0x10e0 [xfs]
__x64_sys_ioctl+0x76/0xc0
do_syscall_64+0x4e/0x1e0
entry_SYSCALL_64_after_hwframe+0x4b/0x53
CPU: 1 UID: 0 PID: 1300113 Comm: xfs_scrub Not tainted 6.18.0-rc4-djwx #rc4 PREEMPT(lazy) 3d744dd94e92690f00a04398d2bd8631dcef1954
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.0-4.module+el8.8.0+21164+ed375313 04/01/2014
==================================================================
On further analysis, I realized that the second parameter to min() is
not correct. xfs_ifork::if_bytes is the size of the xfs_ifork::if_data
buffer. if_bytes can be smaller than the data fork size because:
(a) the forkoff code tries to keep the data area as large as possible
(b) for symbolic links, if_bytes is the ondisk file size + 1
(c) forkoff is always a multiple of 8.
Case in point: for a single-byte symlink target, forkoff will be
8 but the buffer will only be 2 bytes long.
In other words, the logic here is wrong and we walk off the end of the
incore buffer. Fix that.
Cc: stable@vger.kernel.org # v6.10
Fixes: 2651923d8d ("xfs: online repair of symbolic links")
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 33ddc796ec ]
The changes modernizes the code by aligning it with current kernel best
practices. It improves code clarity and consistency, as strncpy is deprecated
as explained in Documentation/process/deprecated.rst. This change does
not alter the functionality or introduce any behavioral changes.
Suggested-by: Dave Chinner <david@fromorbit.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
Signed-off-by: Marcelo Moreira <marcelomoreira1905@gmail.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
Stable-dep-of: 678e1cc2f4 ("xfs: fix out of bounds memory read error in symlink repair")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The previous commit bdb596ceb4 ("smb: client: fix potential UAF in
smb2_close_cached_fid()") was an incomplete backport and missed one
kref_put() call in cfids_invalidation_worker() that should have been
converted to close_cached_dir().
Fixes: bdb596ceb4 ("smb: client: fix potential UAF in smb2_close_cached_fid()")"
Signed-off-by: Henrique Carvalho <henrique.carvalho@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit eb6e7f520d ]
On PF passthrough environment, after hibernate and then resume, coralgemm
will cause gpu page fault.
Mode1 reset happens during hibernate, but partition mode is not restored
on resume, register mmCP_HYP_XCP_CTL and mmCP_PSP_XCP_CTL is not right
after resume. When CP access the MQD BO, wrong stride size is used,
this will cause out of bound access on the MQD BO, resulting page fault.
The fix is to ensure gfx_v9_4_3_switch_compute_partition() is called
when resume from a hibernation.
KFD resume is called separately during a reset recovery or resume from
suspend sequence. Hence it's not required to be called as part of
partition switch.
Signed-off-by: Samuel Zhang <guoqing.zhang@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 5d1b32cfe4a676fe552416cb5ae847b215463a1a)
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 953902e4fb ]
If we are logging a new name make sure our inode has the runtime flag
BTRFS_INODE_COPY_EVERYTHING set so that at btrfs_log_inode() we will find
new inode refs/extrefs in the subvolume tree and copy them into the log
tree.
We are currently doing it when adding a new link but we are missing it
when renaming.
An example where this makes a new name not persisted:
1) create symlink with name foo in directory A
2) fsync directory A, which persists the symlink
3) rename the symlink from foo to bar
4) fsync directory A to persist the new symlink name
Step 4 isn't working correctly as it's not logging the new name and also
leaving the old inode ref in the log tree, so after a power failure the
symlink still has the old name of "foo". This is because when we first
fsync directoy A we log the symlink's inode (as it's a new entry) and at
btrfs_log_inode() we set the log mode to LOG_INODE_ALL and then because
we are using that mode and the inode has the runtime flag
BTRFS_INODE_NEEDS_FULL_SYNC set, we clear that flag as well as the flag
BTRFS_INODE_COPY_EVERYTHING. That means the next time we log the inode,
during the rename through the call to btrfs_log_new_name() (calling
btrfs_log_inode_parent() and then btrfs_log_inode()), we will not search
the subvolume tree for new refs/extrefs and jump directory to the
'log_extents' label.
Fix this by making sure we set BTRFS_INODE_COPY_EVERYTHING on an inode
when we are about to log a new name. A test case for fstests will follow
soon.
Reported-by: Vyacheslav Kovalevsky <slava.kovalevskiy.2014@gmail.com>
Link: https://lore.kernel.org/linux-btrfs/ac949c74-90c2-4b9a-b7fd-1ffc5c3175c7@gmail.com/
Reviewed-by: Boris Burkov <boris@bur.io>
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 53afec2c8f ]
The help message incorrectly listed '-t' as the short option for
--threads, but the actual getopt_long configuration uses '-e'.
This mismatch can confuse users and lead to incorrect command-line
usage. This patch updates the usage string to correctly show:
"-e, --threads NRTHR"
to match the implementation.
Note: checkpatch.pl reports a false-positive spelling warning on
'Run', which is intentional.
Link: https://patch.msgid.link/20251106031040.1869-1-zhangchujun@cmss.chinamobile.com
Signed-off-by: Zhang Chujun <zhangchujun@cmss.chinamobile.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 90a88306eb ]
Make knav_dma_open_channel consistently return NULL on error instead
of ERR_PTR. Currently the header include/linux/soc/ti/knav_dma.h
returns NULL when the driver is disabled, but the driver
implementation does not even return NULL or ERR_PTR on failure,
causing inconsistency in the users. This results in a crash in
netcp_free_navigator_resources as followed (trimmed):
Unhandled fault: alignment exception (0x221) at 0xfffffff2
[fffffff2] *pgd=80000800207003, *pmd=82ffda003, *pte=00000000
Internal error: : 221 [#1] SMP ARM
Modules linked in:
CPU: 0 UID: 0 PID: 1 Comm: swapper/0 Not tainted 6.17.0-rc7 #1 NONE
Hardware name: Keystone
PC is at knav_dma_close_channel+0x30/0x19c
LR is at netcp_free_navigator_resources+0x2c/0x28c
[... TRIM...]
Call trace:
knav_dma_close_channel from netcp_free_navigator_resources+0x2c/0x28c
netcp_free_navigator_resources from netcp_ndo_open+0x430/0x46c
netcp_ndo_open from __dev_open+0x114/0x29c
__dev_open from __dev_change_flags+0x190/0x208
__dev_change_flags from netif_change_flags+0x1c/0x58
netif_change_flags from dev_change_flags+0x38/0xa0
dev_change_flags from ip_auto_config+0x2c4/0x11f0
ip_auto_config from do_one_initcall+0x58/0x200
do_one_initcall from kernel_init_freeable+0x1cc/0x238
kernel_init_freeable from kernel_init+0x1c/0x12c
kernel_init from ret_from_fork+0x14/0x38
[... TRIM...]
Standardize the error handling by making the function return NULL on
all error conditions. The API is used in just the netcp_core.c so the
impact is limited.
Note, this change, in effect reverts commit 5b6cb43b4d ("net:
ethernet: ti: netcp_core: return error while dma channel open issue"),
but provides a less error prone implementation.
Suggested-by: Simon Horman <horms@kernel.org>
Suggested-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Nishanth Menon <nm@ti.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://patch.msgid.link/20251103162811.3730055-1-nm@ti.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 5127be409c ]
According to UFS specifications, the power-off sequence for a UFS device
includes:
- Sending an SSU command with Power_Condition=3 and await a response.
- Asserting RST_N low.
- Turning off REF_CLK.
- Turning off VCC.
- Turning off VCCQ/VCCQ2.
As part of ufs shutdown, after the SSU command completion, asserting
hardware reset (HWRST) triggers the device firmware to wake up and
execute its reset routine. This routine initializes hardware blocks and
takes a few milliseconds to complete. During this time, the ICCQ draws a
large current.
This large ICCQ current may cause issues for the regulator which is
supplying power to UFS, because the turn off request from UFS driver to
the regulator framework will be immediately followed by low power
mode(LPM) request by regulator framework. This is done by framework
because UFS which is the only client is requesting for disable. So if
the rail is still in the process of shutting down while ICCQ exceeds LPM
current thresholds, and LPM mode is activated in hardware during this
state, it may trigger an overcurrent protection (OCP) fault in the
regulator.
To prevent this, a 10ms delay is added after asserting HWRST. This
allows the reset operation to complete while power rails remain active
and in high-power mode.
Currently there is no way for Host to query whether the reset is
completed or not and hence this the delay is based on experiments with
Qualcomm UFS controllers across multiple UFS vendors.
Signed-off-by: Nitin Rawat <nitin.rawat@oss.qualcomm.com>
Reviewed-by: Manivannan Sadhasivam <mani@kernel.org>
Link: https://patch.msgid.link/20251012173828.9880-1-nitin.rawat@oss.qualcomm.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit d26e9f669c ]
Since 8b3a087f7f ("ALSA: usb-audio: Unify virtual type units type to
UAC3 values") usb-audio is using UAC3_CLOCK_SOURCE instead of
bDescriptorSubtype, later refactored with e0ccdef926 ("ALSA: usb-audio:
Clean up check_input_term()") into parse_term_uac2_clock_source().
This breaks the clock source selection for at least my
1397:0003 BEHRINGER International GmbH FCA610 Pro.
Fix by using UAC2_CLOCK_SOURCE in parse_term_uac2_clock_source().
Fixes: 8b3a087f7f ("ALSA: usb-audio: Unify virtual type units type to UAC3 values")
Signed-off-by: René Rebe <rene@exactco.de>
Link: https://patch.msgid.link/20251125.154149.1121389544970412061.rene@exactco.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit d52dea485c ]
If user provides a large value (such as 0x80) for parameter
prefetch_mem_region_instance in vm_bind ioctl, it will cause
BIT(prefetch_region) overflow as below:
"
------------[ cut here ]------------
UBSAN: shift-out-of-bounds in drivers/gpu/drm/xe/xe_vm.c:3414:7
shift exponent 128 is too large for 64-bit type 'long unsigned int'
CPU: 8 UID: 0 PID: 53120 Comm: xe_exec_system_ Tainted: G W 6.18.0-rc1-lgci-xe-kernel+ #200 PREEMPT(voluntary)
Tainted: [W]=WARN
Hardware name: ASUS System Product Name/PRIME Z790-P WIFI, BIOS 0812 02/24/2023
Call Trace:
<TASK>
dump_stack_lvl+0xa0/0xc0
dump_stack+0x10/0x20
ubsan_epilogue+0x9/0x40
__ubsan_handle_shift_out_of_bounds+0x10e/0x170
? mutex_unlock+0x12/0x20
xe_vm_bind_ioctl.cold+0x20/0x3c [xe]
...
"
Fix it by validating prefetch_region before the BIT() usage.
v2: Add Closes and Cc stable kernels. (Matt)
Reported-by: Koen Koning <koen.koning@intel.com>
Reported-by: Peter Senna Tschudin <peter.senna@linux.intel.com>
Fixes: dd08ebf6c3 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/6478
Cc: <stable@vger.kernel.org> # v6.8+
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patch.msgid.link/20251112181005.2120521-2-shuicheng.lin@intel.com
(cherry picked from commit 8f565bdd14eec5611cc041dba4650e42ccdf71d9)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
(cherry picked from commit d52dea485c)
Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit c15d5c62ab ]
When a netdev issues a RX async resync request for a TLS connection,
the TLS module handles it by logging record headers and attempting to
match them to the tcp_sn provided by the device. If a match is found,
the TLS module approves the tcp_sn for resynchronization.
While waiting for a device response, the TLS module also increments
rcd_delta each time a new TLS record is received, tracking the distance
from the original resync request.
However, if the device response is delayed or fails (e.g due to
unstable connection and device getting out of tracking, hardware
errors, resource exhaustion etc.), the TLS module keeps logging and
incrementing, which can lead to a WARN() when rcd_delta exceeds the
threshold.
To address this, introduce tls_offload_rx_resync_async_request_cancel()
to explicitly cancel resync requests when a device response failure is
detected. Call this helper also as a final safeguard when rcd_delta
crosses its threshold, as reaching this point implies that earlier
cancellation did not occur.
Signed-off-by: Shahar Shitrit <shshitrit@nvidia.com>
Reviewed-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/1761508983-937977-3-git-send-email-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 34892cfec0 ]
Update tls_offload_rx_resync_async_request_start() and
tls_offload_rx_resync_async_request_end() to get a struct
tls_offload_resync_async parameter directly, rather than
extracting it from struct sock.
This change aligns the function signatures with the upcoming
tls_offload_rx_resync_async_request_cancel() helper, which
will be introduced in a subsequent patch.
Signed-off-by: Shahar Shitrit <shshitrit@nvidia.com>
Reviewed-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/1761508983-937977-2-git-send-email-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 9311e9540a ]
In bareudp.sh, this script uses /bin/sh and it will load another lib.sh
BASH script at the very beginning.
But on some operating systems like Ubuntu, /bin/sh is actually pointed to
DASH, thus it will try to run BASH commands with DASH and consequently
leads to syntax issues:
# ./bareudp.sh: 4: ./lib.sh: Bad substitution
# ./bareudp.sh: 5: ./lib.sh: source: not found
# ./bareudp.sh: 24: ./lib.sh: Syntax error: "(" unexpected
Fix this by explicitly using BASH for bareudp.sh. This fixes test
execution failures on systems where /bin/sh is not BASH.
Reported-by: Edoardo Canepa <edoardo.canepa@canonical.com>
Link: https://bugs.launchpad.net/bugs/2129812
Signed-off-by: Po-Hsu Lin <po-hsu.lin@canonical.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Link: https://patch.msgid.link/20251027095710.2036108-2-po-hsu.lin@canonical.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit fac56c4651 ]
In very rare cases, DFS mounts could end up with SMB sessions without
any IPC connections. These mounts are only possible when having
unexpired cached DFS referrals, hence not requiring any IPC
connections during the mount process.
Try to establish those missing IPC connections when refreshing DFS
referrals. If the server is still rejecting it, then simply ignore
and leave expired cached DFS referral for any potential DFS failovers.
Reported-by: Jay Shin <jaeshin@redhat.com>
Signed-off-by: Paulo Alcantara (Red Hat) <pc@manguebit.org>
Cc: David Howells <dhowells@redhat.com>
Cc: linux-cifs@vger.kernel.org
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit a0b7780602 ]
Commit 995412e23b ("blk-mq: Replace tags->lock with SRCU for tag
iterators") introduced the following regression:
Call trace:
__srcu_read_lock+0x30/0x80 (P)
blk_mq_tagset_busy_iter+0x44/0x300
scsi_host_busy+0x38/0x70
ufshcd_print_host_state+0x34/0x1bc
ufshcd_link_startup.constprop.0+0xe4/0x2e0
ufshcd_init+0x944/0xf80
ufshcd_pltfrm_init+0x504/0x820
ufs_rockchip_probe+0x2c/0x88
platform_probe+0x5c/0xa4
really_probe+0xc0/0x38c
__driver_probe_device+0x7c/0x150
driver_probe_device+0x40/0x120
__driver_attach+0xc8/0x1e0
bus_for_each_dev+0x7c/0xdc
driver_attach+0x24/0x30
bus_add_driver+0x110/0x230
driver_register+0x68/0x130
__platform_driver_register+0x20/0x2c
ufs_rockchip_pltform_init+0x1c/0x28
do_one_initcall+0x60/0x1e0
kernel_init_freeable+0x248/0x2c4
kernel_init+0x20/0x140
ret_from_fork+0x10/0x20
Fix this regression by making scsi_host_busy() check whether the SCSI
host tag set has already been initialized. tag_set->ops is set by
scsi_mq_setup_tags() just before blk_mq_alloc_tag_set() is called. This
fix is based on the assumption that scsi_host_busy() and
scsi_mq_setup_tags() calls are serialized. This is the case in the UFS
driver.
Reported-by: Sebastian Reichel <sebastian.reichel@collabora.com>
Closes: https://lore.kernel.org/linux-block/pnezafputodmqlpumwfbn644ohjybouveehcjhz2hmhtcf2rka@sdhoiivync4y/
Cc: Ming Lei <ming.lei@redhat.com>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Tested-by: Sebastian Reichel <sebastian.reichel@collabora.com>
Link: https://patch.msgid.link/20251007214800.1678255-1-bvanassche@acm.org
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit a8ad873113 ]
The sched_ext code calls queue_balance_callback() during enqueue_task()
to defer operations that drop multiple locks until we can unpin them.
The call assumes that the rq lock is held until the callbacks are
invoked, and the pending callbacks will not be visible to any other
threads. This is enforced by a WARN_ON_ONCE() in rq_pin_lock().
However, balance_one() may actually drop the lock during a BPF dispatch
call. Another thread may win the race to get the rq lock and see the
pending callback. To avoid this, sched_ext must only queue the callback
after the dispatch calls have completed.
CPU 0 CPU 1 CPU 2
scx_balance()
rq_unpin_lock()
scx_balance_one()
|= IN_BALANCE scx_enqueue()
ops.dispatch()
rq_unlock()
rq_lock()
queue_balance_callback()
rq_unlock()
[WARN] rq_pin_lock()
rq_lock()
&= ~IN_BALANCE
rq_repin_lock()
Changelog
v2-> v1 (https://lore.kernel.org/sched-ext/aOgOxtHCeyRT_7jn@gpd4)
- Fixed explanation in patch description (Andrea)
- Fixed scx_rq mask state updates (Andrea)
- Added Reviewed-by tag from Andrea
Reported-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Emil Tsalapatis (Meta) <emil@etsalapatis.com>
Reviewed-by: Andrea Righi <arighi@nvidia.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 14c1da3895 ]
On systems with >4096 CPUs, scx_kick_cpus_pnt_seqs allocation fails during
boot because it exceeds the 32,768 byte percpu allocator limit.
Restructure to use DEFINE_PER_CPU() for the per-CPU pointers, with each CPU
pointing to its own kvzalloc'd array. Move allocation from boot time to
scx_enable() and free in scx_disable(), so the O(nr_cpu_ids^2) memory is only
consumed when sched_ext is active.
Use RCU to guard against racing with free. Arrays are freed via call_rcu()
and kick_cpus_irq_workfn() uses rcu_dereference_bh() with a NULL check.
While at it, rename to scx_kick_pseqs for brevity and update comments to
clarify these are pick_task sequence numbers.
v2: RCU protect scx_kick_seqs to manage kick_cpus_irq_workfn() racing
against disable as per Andrea.
v3: Fix bugs notcied by Andrea.
Reported-by: Phil Auld <pauld@redhat.com>
Link: http://lkml.kernel.org/r/20251007133523.GA93086@pauld.westford.csb
Cc: Andrea Righi <arighi@nvidia.com>
Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com>
Reviewed-by: Phil Auld <pauld@redhat.com>
Reviewed-by: Andrea Righi <arighi@nvidia.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 14b46ba92b ]
Commit 69896119dc ("MIPS: vdso: Switch to generic storage
implementation") switches to a generic vdso storage, which increases
the number of data pages from 1 to 4. But there is only one page
reserved, which causes segementation faults depending where the VDSO
area is randomized to. To fix this use the same size of reservation
and allocation of the VDSO data pages.
Fixes: 69896119dc ("MIPS: vdso: Switch to generic storage implementation")
Reviewed-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Reviewed-by: Huacai Chen <chenhuacai@loongson.cn>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 7b5ab04f03 ]
tk_aux_sysfs_init() returns immediately on error during the auxiliary clock
initialization loop without cleaning up previously allocated kobjects and
sysfs groups.
If kobject_create_and_add() or sysfs_create_group() fails during loop
iteration, the parent kobjects (tko and auxo) and any previously created
child kobjects are leaked.
Fix this by adding proper error handling with goto labels to ensure all
allocated resources are cleaned up on failure. kobject_put() on the
parent kobjects will handle cleanup of their children.
Fixes: 7b95663a3d ("timekeeping: Provide interface to control auxiliary clocks")
Signed-off-by: Malaya Kumar Rout <mrout@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://patch.msgid.link/20251120150213.246777-1-mrout@redhat.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f1f96511b1 ]
Currently cpu-clock event always returns 0 count, e.g.,
perf stat -e cpu-clock -- sleep 1
Performance counter stats for 'sleep 1':
0 cpu-clock # 0.000 CPUs utilized
1.002308394 seconds time elapsed
The root cause is the commit 'bc4394e5e79c ("perf: Fix the throttle
error of some clock events")' adds PERF_EF_UPDATE flag check before
calling cpu_clock_event_update() to update the count, however the
PERF_EF_UPDATE flag is never set when the cpu-clock event is stopped in
counting mode (pmu->dev() -> cpu_clock_event_del() ->
cpu_clock_event_stop()). This leads to the cpu-clock event count is
never updated.
To fix this issue, force to set PERF_EF_UPDATE flag for cpu-clock event
just like what task-clock does.
Fixes: bc4394e5e7 ("perf: Fix the throttle error of some clock events")
Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Link: https://patch.msgid.link/20251112080526.3971392-1-dapeng1.mi@linux.intel.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 7e4d9120cf ]
Add proper cleanup of ctx->source and fc->source to the
cifs_parse_mount_err error handler. This ensures that memory allocated
for the source strings is correctly freed on all error paths, matching
the cleanup already performed in the success path by
smb3_cleanup_fs_context_contents().
Pointers are also set to NULL after freeing to prevent potential
double-free issues.
This change fixes a memory leak originally detected by syzbot. The
leak occurred when processing Opt_source mount options if an error
happened after ctx->source and fc->source were successfully
allocated but before the function completed.
The specific leak sequence was:
1. ctx->source = smb3_fs_context_fullpath(ctx, '/') allocates memory
2. fc->source = kstrdup(ctx->source, GFP_KERNEL) allocates more memory
3. A subsequent error jumps to cifs_parse_mount_err
4. The old error handler freed passwords but not the source strings,
causing the memory to leak.
This issue was not addressed by commit e8c73eb7db ("cifs: client:
fix memory leak in smb3_fs_context_parse_param"), which only fixed
leaks from repeated fsconfig() calls but not this error path.
Patch updated with minor change suggested by kernel test robot
Reported-by: syzbot+87be6809ed9bf6d718e3@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=87be6809ed9bf6d718e3
Fixes: 24e0a1eff9 ("cifs: switch to new mount api")
Reviewed-by: David Howells <dhowells@redhat.com>
Signed-off-by: Shaurya Rane <ssrane_b23@ee.vjti.ac.in>
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 20d7338f2d ]
The kernel UAPI headers already contain fixed-width integer types, there
is no need to rely on the libc types. There may not be a libc available
or the libc may not provides the <stdint.h>, like for example on nolibc.
This also aligns the header with the rest of the LoongArch UAPI headers.
Fixes: 803b0fc5c3 ("LoongArch: Add process management")
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 807e0d187d ]
In commit 0345691b24 ("tick/rcu: Stop allowing RCU_SOFTIRQ in idle") the
new function report_idle_softirq() was created by breaking code out of the
existing can_stop_idle_tick() for kernels v5.18 and newer.
In doing so, the code essentially went from this form:
if (A) {
static int ratelimit;
if (ratelimit < 10 && !C && A&D) {
pr_warn("NOHZ tick-stop error: ...");
ratelimit++;
}
return false;
}
to a new function:
static bool report_idle_softirq(void)
{
static int ratelimit;
if (likely(!A))
return false;
if (ratelimit < 10)
return false;
...
pr_warn("NOHZ tick-stop error: local softirq work is pending, handler #%02x!!!\n",
pending);
ratelimit++;
return true;
}
commit a7e282c777 ("tick/rcu: Fix bogus ratelimit condition") realized
ratelimit was essentially set to zero instead of ten, and hence *no*
softirq pending messages would ever be issued, but "fixed" it as:
- if (ratelimit < 10)
+ if (ratelimit >= 10)
return false;
However, this fix introduced another issue:
When ratelimit is greater than or equal 10, even if A is true, it will
directly return false. While ratelimit in the original code was only used
to control printing and will not affect the return value.
Restore the original logic and restrict ratelimit to control the printk and
not the return value.
Fixes: 0345691b24 ("tick/rcu: Stop allowing RCU_SOFTIRQ in idle")
Fixes: a7e282c777 ("tick/rcu: Fix bogus ratelimit condition")
Signed-off-by: Wen Yang <wen.yang@linux.dev>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://patch.msgid.link/20251119174525.29470-1-wen.yang@linux.dev
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit e31a11be41 ]
Pause, Asym_Pause and Autoneg bits are not set when pl->supported is
initialized, so these link modes will not work for the fixed-link. This
leads to a TCP performance degradation issue observed on the i.MX943
platform.
The switch CPU port of i.MX943 is connected to an ENETC MAC, this link
is a fixed link and the link speed is 2.5Gbps. And one of the switch
user ports is the RGMII interface, and its link speed is 1Gbps. If the
flow-control of the fixed link is not enabled, we can easily observe
the iperf performance of TCP packets is very low. Because the inbound
rate on the CPU port is greater than the outbound rate on the user port,
the switch is prone to congestion, leading to the loss of some TCP
packets and requiring multiple retransmissions.
Solving this problem should be as simple as setting the Asym_Pause and
Pause bits. The reason why the Autoneg bit needs to be set, Russell
has gave a very good explanation in the thread [1], see below.
"As the advertising and lp_advertising bitmasks have to be non-empty,
and the swphy reports aneg capable, aneg complete, and AN enabled, then
for consistency with that state, Autoneg should be set. This is how it
was prior to the blamed commit."
Fixes: de7d3f87be ("net: phylink: Use phy_caps_lookup for fixed-link configuration")
Link: https://lore.kernel.org/aRjqLN8eQDIQfBjS@shell.armlinux.org.uk # [1]
Signed-off-by: Wei Fang <wei.fang@nxp.com>
Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Link: https://patch.msgid.link/20251117102943.1862680-1-wei.fang@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit d4cd0902c1 ]
With the final call to fput() on a file descriptor, the release action
may be deferred and scheduled on a work queue. The reference count of
that descriptor is still zero and it must not be used. It's possible
that a GPIO change, we want to notify the user-space about, happens
AFTER the reference count on the file descriptor associated with the
character device went down to zero but BEFORE the .release() callback
was called from the workqueue and so BEFORE we unregistered from the
notifier.
Using the regular get_file() routine in this situation triggers the
following warning:
struct file::f_count incremented from zero; use-after-free condition present!
So use the get_file_active() variant that will return NULL on file
descriptors that have been or are being released.
Fixes: 40b7c49950 ("gpio: cdev: put emitting the line state events on a workqueue")
Reported-by: Alexander Sverdlin <alexander.sverdlin@siemens.com>
Closes: https://lore.kernel.org/all/5d605f7fc99456804911403102a4fe999a14cc85.camel@siemens.com/
Tested-by: Alexander Sverdlin <alexander.sverdlin@siemens.com>
Link: https://lore.kernel.org/r/20251117-gpio-cdev-get-file-v1-1-28a16b5985b8@linaro.org
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 7bf3a476ce ]
Miao Wang reported a bug of SO_PEEK_OFF on AF_UNIX SOCK_STREAM
socket.
The unexpected behaviour is triggered when the peek offset is
larger than the recv queue and the thread is unblocked by new
data.
Let's assume a socket which has "aaaa" in the recv queue and
the peek offset is 4.
First, unix_stream_read_generic() reads the offset 4 and skips
the skb(s) of "aaaa" with the code below:
skip = max(sk_peek_offset(sk, flags), 0); /* @skip is 4. */
do {
...
while (skip >= unix_skb_len(skb)) {
skip -= unix_skb_len(skb);
...
skb = skb_peek_next(skb, &sk->sk_receive_queue);
if (!skb)
goto again; /* @skip is 0. */
}
The thread jumps to the 'again' label and goes to sleep since
new data has not arrived yet.
Later, new data "bbbb" unblocks the thread, and the thread jumps
to the 'redo:' label to restart the entire process from the first
skb in the recv queue.
do {
...
redo:
...
last = skb = skb_peek(&sk->sk_receive_queue);
...
again:
if (skb == NULL) {
...
timeo = unix_stream_data_wait(sk, timeo, last,
last_len, freezable);
...
goto redo; /* @skip is 0 !! */
However, the peek offset is not reset in the path.
If the buffer size is 8, recv() will return "aaaabbbb" without
skipping any data, and the final offset will be 12 (the original
offset 4 + peeked skbs' length 8).
After sleeping in unix_stream_read_generic(), we have to fetch the
peek offset again.
Let's move the redo label before mutex_lock(&u->iolock).
Fixes: 9f389e3567 ("af_unix: return data from multiple SKBs on recv() with MSG_PEEK flag")
Reported-by: Miao Wang <shankerwangmiao@gmail.com>
Closes: https://lore.kernel.org/netdev/3B969F90-F51F-4B9D-AB1A-994D9A54D460@gmail.com/
Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Link: https://patch.msgid.link/20251117174740.3684604-2-kuniyu@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f94c1a114a ]
The function devl_rate_nodes_destroy is documented to "Unset parent for
all rate objects". However, it was only calling the driver-specific
`rate_leaf_parent_set` or `rate_node_parent_set` ops and decrementing
the parent's refcount, without actually setting the
`devlink_rate->parent` pointer to NULL.
This leaves a dangling pointer in the `devlink_rate` struct, which cause
refcount error in netdevsim[1] and mlx5[2]. In addition, this is
inconsistent with the behavior of `devlink_nl_rate_parent_node_set`,
where the parent pointer is correctly cleared.
This patch fixes the issue by explicitly setting `devlink_rate->parent`
to NULL after notifying the driver, thus fulfilling the function's
documented behavior for all rate objects.
[1]
repro steps:
echo 1 > /sys/bus/netdevsim/new_device
devlink dev eswitch set netdevsim/netdevsim1 mode switchdev
echo 1 > /sys/bus/netdevsim/devices/netdevsim1/sriov_numvfs
devlink port function rate add netdevsim/netdevsim1/test_node
devlink port function rate set netdevsim/netdevsim1/128 parent test_node
echo 1 > /sys/bus/netdevsim/del_device
dmesg:
refcount_t: decrement hit 0; leaking memory.
WARNING: CPU: 8 PID: 1530 at lib/refcount.c:31 refcount_warn_saturate+0x42/0xe0
CPU: 8 UID: 0 PID: 1530 Comm: bash Not tainted 6.18.0-rc4+ #1 NONE
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
RIP: 0010:refcount_warn_saturate+0x42/0xe0
Call Trace:
<TASK>
devl_rate_leaf_destroy+0x8d/0x90
__nsim_dev_port_del+0x6c/0x70 [netdevsim]
nsim_dev_reload_destroy+0x11c/0x140 [netdevsim]
nsim_drv_remove+0x2b/0xb0 [netdevsim]
device_release_driver_internal+0x194/0x1f0
bus_remove_device+0xc6/0x130
device_del+0x159/0x3c0
device_unregister+0x1a/0x60
del_device_store+0x111/0x170 [netdevsim]
kernfs_fop_write_iter+0x12e/0x1e0
vfs_write+0x215/0x3d0
ksys_write+0x5f/0xd0
do_syscall_64+0x55/0x10f0
entry_SYSCALL_64_after_hwframe+0x4b/0x53
[2]
devlink dev eswitch set pci/0000:08:00.0 mode switchdev
devlink port add pci/0000:08:00.0 flavour pcisf pfnum 0 sfnum 1000
devlink port function rate add pci/0000:08:00.0/group1
devlink port function rate set pci/0000:08:00.0/32768 parent group1
modprobe -r mlx5_ib mlx5_fwctl mlx5_core
dmesg:
refcount_t: decrement hit 0; leaking memory.
WARNING: CPU: 7 PID: 16151 at lib/refcount.c:31 refcount_warn_saturate+0x42/0xe0
CPU: 7 UID: 0 PID: 16151 Comm: bash Not tainted 6.17.0-rc7_for_upstream_min_debug_2025_10_02_12_44 #1 NONE
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014
RIP: 0010:refcount_warn_saturate+0x42/0xe0
Call Trace:
<TASK>
devl_rate_leaf_destroy+0x8d/0x90
mlx5_esw_offloads_devlink_port_unregister+0x33/0x60 [mlx5_core]
mlx5_esw_offloads_unload_rep+0x3f/0x50 [mlx5_core]
mlx5_eswitch_unload_sf_vport+0x40/0x90 [mlx5_core]
mlx5_sf_esw_event+0xc4/0x120 [mlx5_core]
notifier_call_chain+0x33/0xa0
blocking_notifier_call_chain+0x3b/0x50
mlx5_eswitch_disable_locked+0x50/0x110 [mlx5_core]
mlx5_eswitch_disable+0x63/0x90 [mlx5_core]
mlx5_unload+0x1d/0x170 [mlx5_core]
mlx5_uninit_one+0xa2/0x130 [mlx5_core]
remove_one+0x78/0xd0 [mlx5_core]
pci_device_remove+0x39/0xa0
device_release_driver_internal+0x194/0x1f0
unbind_store+0x99/0xa0
kernfs_fop_write_iter+0x12e/0x1e0
vfs_write+0x215/0x3d0
ksys_write+0x5f/0xd0
do_syscall_64+0x53/0x1f0
entry_SYSCALL_64_after_hwframe+0x4b/0x53
Fixes: d755598450 ("devlink: Allow setting parent node of rate objects")
Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Carolina Jubran <cjubran@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/1763381149-1234377-1-git-send-email-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 6010d4d8b5 ]
s32_pmx_gpio_request_enable() does not initialize the newly-allocated
gpio_pin_config::list before adding it to s32_pinctrl::gpio_configs.
This could result in a linked list corruption.
Initialize the new list_head with INIT_LIST_HEAD() to fix this.
Fixes: fd84aaa817 ("pinctrl: add NXP S32 SoC family support")
Signed-off-by: Jared Kangas <jkangas@redhat.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 97ea34defb ]
s32_pinctrl_desc is allocated with devm_kmalloc(), but not all of its
fields are initialized. Notably, num_custom_params is used in
pinconf_generic_parse_dt_config(), resulting in intermittent allocation
errors, such as the following splat when probing i2c-imx:
WARNING: CPU: 0 PID: 176 at mm/page_alloc.c:4795 __alloc_pages_noprof+0x290/0x300
[...]
Hardware name: NXP S32G3 Reference Design Board 3 (S32G-VNP-RDB3) (DT)
[...]
Call trace:
__alloc_pages_noprof+0x290/0x300 (P)
___kmalloc_large_node+0x84/0x168
__kmalloc_large_node_noprof+0x34/0x120
__kmalloc_noprof+0x2ac/0x378
pinconf_generic_parse_dt_config+0x68/0x1a0
s32_dt_node_to_map+0x104/0x248
dt_to_map_one_config+0x154/0x1d8
pinctrl_dt_to_map+0x12c/0x280
create_pinctrl+0x6c/0x270
pinctrl_get+0xc0/0x170
devm_pinctrl_get+0x50/0xa0
pinctrl_bind_pins+0x60/0x2a0
really_probe+0x60/0x3a0
[...]
__platform_driver_register+0x2c/0x40
i2c_adap_imx_init+0x28/0xff8 [i2c_imx]
[...]
This results in later parse failures that can cause issues in dependent
drivers:
s32g-siul2-pinctrl 4009c240.pinctrl: /soc@0/pinctrl@4009c240/i2c0-pins/i2c0-grp0: could not parse node property
s32g-siul2-pinctrl 4009c240.pinctrl: /soc@0/pinctrl@4009c240/i2c0-pins/i2c0-grp0: could not parse node property
[...]
pca953x 0-0022: failed writing register: -6
i2c i2c-0: IMX I2C adapter registered
s32g-siul2-pinctrl 4009c240.pinctrl: /soc@0/pinctrl@4009c240/i2c2-pins/i2c2-grp0: could not parse node property
s32g-siul2-pinctrl 4009c240.pinctrl: /soc@0/pinctrl@4009c240/i2c2-pins/i2c2-grp0: could not parse node property
i2c i2c-1: IMX I2C adapter registered
s32g-siul2-pinctrl 4009c240.pinctrl: /soc@0/pinctrl@4009c240/i2c4-pins/i2c4-grp0: could not parse node property
s32g-siul2-pinctrl 4009c240.pinctrl: /soc@0/pinctrl@4009c240/i2c4-pins/i2c4-grp0: could not parse node property
i2c i2c-2: IMX I2C adapter registered
Fix this by initializing s32_pinctrl_desc with devm_kzalloc() instead of
devm_kmalloc() in s32_pinctrl_probe(), which sets the previously
uninitialized fields to zero.
Fixes: fd84aaa817 ("pinctrl: add NXP S32 SoC family support")
Signed-off-by: Jared Kangas <jkangas@redhat.com>
Tested-by: Jan Petrous (OSS) <jan.petrous@oss.nxp.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 23a5b9b12d ]
Improve the cleanup on releasing PTP resources in error path.
The error case might happen either at the driver probe and PTP
feature initialization or on PTP restart (errors in reset handling, NVM
update etc). In both cases, calls to PF PTP cleanup (ice_ptp_cleanup_pf
function) and 'ps_lock' mutex deinitialization were missed.
Additionally, ptp clock was not unregistered in the latter case.
Keep PTP state as 'uninitialized' on init to distinguish between error
scenarios and to avoid resource release duplication at driver removal.
The consequence of missing ice_ptp_cleanup_pf call is the following call
trace dumped when ice_adapter object is freed (port list is not empty,
as it is required at this stage):
[ T93022] ------------[ cut here ]------------
[ T93022] WARNING: CPU: 10 PID: 93022 at
ice/ice_adapter.c:67 ice_adapter_put+0xef/0x100 [ice]
...
[ T93022] RIP: 0010:ice_adapter_put+0xef/0x100 [ice]
...
[ T93022] Call Trace:
[ T93022] <TASK>
[ T93022] ? ice_adapter_put+0xef/0x100 [ice
33d2647ad4f6d866d41eefff1806df37c68aef0c]
[ T93022] ? __warn.cold+0xb0/0x10e
[ T93022] ? ice_adapter_put+0xef/0x100 [ice
33d2647ad4f6d866d41eefff1806df37c68aef0c]
[ T93022] ? report_bug+0xd8/0x150
[ T93022] ? handle_bug+0xe9/0x110
[ T93022] ? exc_invalid_op+0x17/0x70
[ T93022] ? asm_exc_invalid_op+0x1a/0x20
[ T93022] ? ice_adapter_put+0xef/0x100 [ice
33d2647ad4f6d866d41eefff1806df37c68aef0c]
[ T93022] pci_device_remove+0x42/0xb0
[ T93022] device_release_driver_internal+0x19f/0x200
[ T93022] driver_detach+0x48/0x90
[ T93022] bus_remove_driver+0x70/0xf0
[ T93022] pci_unregister_driver+0x42/0xb0
[ T93022] ice_module_exit+0x10/0xdb0 [ice
33d2647ad4f6d866d41eefff1806df37c68aef0c]
...
[ T93022] ---[ end trace 0000000000000000 ]---
[ T93022] ice: module unloaded
Fixes: e800654e85 ("ice: Use ice_adapter for PTP shared data instead of auxdev")
Signed-off-by: Grzegorz Nitka <grzegorz.nitka@intel.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de>
Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 5b38c22687 ]
Current gu2host handler registered as MSI-X vector 0 and as per bspec for
a msix vector 0 interrupt, the driver must check the legacy registers
190008(TILE_INT_REG), 190060h (GT INTR Identity Reg 0) and other registers
mentioned in "Interrupt Service Routine Pseudocode" otherwise it will block
the next interrupts. To overcome this issue replacing guc2host handler
with legacy xe_irq_handler.
Fixes: da889070be ("drm/xe/irq: Separate MSI and MSI-X flows")
Bspec: 62357
Signed-off-by: Venkata Ramana Nayana <venkata.ramana.nayana@intel.com>
Reviewed-by: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com>
Link: https://patch.msgid.link/20251107083141.2080189-1-venkata.ramana.nayana@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
(cherry picked from commit c34a14bce7090862ebe5a64abe8d85df75e62737)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 5474560381 ]
On PTL, no combo PHY is connected to PORT B. However, PORT B can
still be used for Type-C and will utilize the C20 PHY for eDP
over Type-C. In such configurations, VBTs also enumerate PORT B.
This leads to issues where PORT B is incorrectly identified as using the
C10 PHY, due to the assumption that returning true for PORT B in
intel_encoder_is_c10phy() would not cause problems.
From PTL's perspective, only PORT A/PHY A uses the C10 PHY.
Update the helper intel_encoder_is_c10phy() to return true only for
PORT A/PHY on PTL.
v2: Change the condition code style for ptl/wcl
Bspec: 72571,73944
Fixes: 9d10de78a3 ("drm/i915/wcl: C10 phy connected to port A and B")
Signed-off-by: Dnyaneshwar Bhadane <dnyaneshwar.bhadane@intel.com>
Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com>
Signed-off-by: Suraj Kandpal <suraj.kandpal@intel.com>
Link: https://lore.kernel.org/r/20250922150317.2334680-4-dnyaneshwar.bhadane@intel.com
(cherry picked from commit 8147f7a1c083fd565fb958824f7c552de3b2dc46)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 896f1a2493 ]
The loops in 'qede_tpa_cont()' and 'qede_tpa_end()', iterate
over 'cqe->len_list[]' using only a zero-length terminator as
the stopping condition. If the terminator was missing or
malformed, the loop could run past the end of the fixed-size array.
Add an explicit bound check using ARRAY_SIZE() in both loops to prevent
a potential out-of-bounds access.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Fixes: 55482edc25 ("qede: Add slowpath/fastpath support and enable hardware GRO")
Signed-off-by: Pavel Zhigulin <Pavel.Zhigulin@kaspersky.com>
Link: https://patch.msgid.link/20251113112757.4166625-1-Pavel.Zhigulin@kaspersky.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit db30233361 ]
In file uncore-frequency/uncore-frequency-common.h,
correct all kernel-doc warnings by adding missing leading " *" to some
lines, adding a missing kernel-doc entry, and fixing a name typo.
Warning: uncore-frequency-common.h:50 bad line:
Storage for kobject attribute elc_low_threshold_percent
Warning: uncore-frequency-common.h:52 bad line:
Storage for kobject attribute elc_high_threshold_percent
Warning: uncore-frequency-common.h:54 bad line:
Storage for kobject attribute elc_high_threshold_enable
Warning: uncore-frequency-common.h:92 struct member
'min_freq_khz_kobj_attr' not described in 'uncore_data'
Warning: uncore-frequency-common.h:92 struct member
'die_id_kobj_attr' not described in 'uncore_data'
Fixes: 24b6616355 ("platform/x86/intel-uncore-freq: Add efficiency latency control to sysfs interface")
Fixes: 416de0246f ("platform/x86: intel-uncore-freq: Fix types in sysfs callbacks")
Fixes: 247b43fcd8 ("platform/x86/intel-uncore-freq: Add attributes to show die_id")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Link: https://patch.msgid.link/20251111060938.1998542-1-rdunlap@infradead.org
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 8e0a754b08 ]
Airoha_eth driver forwards offloaded uplink traffic (packets received
on GDM1 and forwarded to GDM{3,4}) to GDM2 in order to apply hw QoS.
This is correct if the device does not support a dedicated GDM2 port.
In this case, in order to enable hw offloading for uplink traffic,
the packets should be sent to GDM{3,4} directly.
Fixes: 9cd451d414 ("net: airoha: Add loopback support for GDM2")
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Link: https://patch.msgid.link/20251113-airoha-hw-offload-gdm2-fix-v1-1-7e4ca300872f@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit bed22c7b90 ]
ret_set_ksft_status() calls ksft_status_merge() with the current return
status and the last one. It treats a non-zero return code from
ksft_status_merge() as an indication that the return status was
overwritten by the last one and therefore overwrites the return message
with the last one.
Currently, ksft_status_merge() returns a non-zero return code even if
the current return status and the last one are equal. This results in
return messages being overwritten which is counter-productive since we
are more interested in the first failure message and not the last one.
Fix by changing ksft_status_merge() to only return a non-zero return
code if the current return status was actually changed.
Add a test case which checks that the first error message is not
overwritten.
Before:
# ./lib_sh_test.sh
[...]
TEST: RET tfail2 tfail -> fail [FAIL]
retmsg=tfail expected tfail2
[...]
# echo $?
1
After:
# ./lib_sh_test.sh
[...]
TEST: RET tfail2 tfail -> fail [ OK ]
[...]
# echo $?
0
Fixes: 596c8819cb ("selftests: forwarding: Have RET track kselftest framework constants")
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Link: https://patch.msgid.link/20251116081029.69112-1-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 159de7a825 ]
Commit 7e091add9c "nvme-auth: update sc_c in host response" added
the sc_c variable to the dhchap queue context structure which is
appropriately set during negotiate and then used in the host response.
This breaks secure concat connections with a Linux target as the target
code wasn't updated at the same time. This patch fixes this by adding a
new sc_c variable to the host hash calculations.
Fixes: 7e091add9c ("nvme-auth: update sc_c in host response")
Tested-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Tested-by: Yi Zhang <yi.zhang@redhat.com>
Reviewed-by: Martin George <marting@netapp.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Signed-off-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 5442a9da69 ]
Commit dc82a33297 ("veth: apply qdisc backpressure on full ptr_ring to
reduce TX drops") introduced a race condition that can lead to a permanently
stalled TXQ. This was observed in production on ARM64 systems (Ampere Altra
Max).
The race occurs in veth_xmit(). The producer observes a full ptr_ring and
stops the queue (netif_tx_stop_queue()). The subsequent conditional logic,
intended to re-wake the queue if the consumer had just emptied it (if
(__ptr_ring_empty(...)) netif_tx_wake_queue()), can fail. This leads to a
"lost wakeup" where the TXQ remains stopped (QUEUE_STATE_DRV_XOFF) and
traffic halts.
This failure is caused by an incorrect use of the __ptr_ring_empty() API
from the producer side. As noted in kernel comments, this check is not
guaranteed to be correct if a consumer is operating on another CPU. The
empty test is based on ptr_ring->consumer_head, making it reliable only for
the consumer. Using this check from the producer side is fundamentally racy.
This patch fixes the race by adopting the more robust logic from an earlier
version V4 of the patchset, which always flushed the peer:
(1) In veth_xmit(), the racy conditional wake-up logic and its memory barrier
are removed. Instead, after stopping the queue, we unconditionally call
__veth_xdp_flush(rq). This guarantees that the NAPI consumer is scheduled,
making it solely responsible for re-waking the TXQ.
This handles the race where veth_poll() consumes all packets and completes
NAPI *before* veth_xmit() on the producer side has called netif_tx_stop_queue.
The __veth_xdp_flush(rq) will observe rx_notify_masked is false and schedule
NAPI.
(2) On the consumer side, the logic for waking the peer TXQ is moved out of
veth_xdp_rcv() and placed at the end of the veth_poll() function. This
placement is part of fixing the race, as the netif_tx_queue_stopped() check
must occur after rx_notify_masked is potentially set to false during NAPI
completion.
This handles the race where veth_poll() consumes all packets, but haven't
finished (rx_notify_masked is still true). The producer veth_xmit() stops the
TXQ and __veth_xdp_flush(rq) will observe rx_notify_masked is true, meaning
not starting NAPI. Then veth_poll() change rx_notify_masked to false and
stops NAPI. Before exiting veth_poll() will observe TXQ is stopped and wake
it up.
Fixes: dc82a33297 ("veth: apply qdisc backpressure on full ptr_ring to reduce TX drops")
Reviewed-by: Toshiaki Makita <toshiaki.makita1@gmail.com>
Signed-off-by: Jesper Dangaard Brouer <hawk@kernel.org>
Link: https://patch.msgid.link/176295323282.307447.14790015927673763094.stgit@firesoul
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit dfe28c4167 ]
The validation of the set(nsh(...)) action is completely wrong.
It runs through the nsh_key_put_from_nlattr() function that is the
same function that validates NSH keys for the flow match and the
push_nsh() action. However, the set(nsh(...)) has a very different
memory layout. Nested attributes in there are doubled in size in
case of the masked set(). That makes proper validation impossible.
There is also confusion in the code between the 'masked' flag, that
says that the nested attributes are doubled in size containing both
the value and the mask, and the 'is_mask' that says that the value
we're parsing is the mask. This is causing kernel crash on trying to
write into mask part of the match with SW_FLOW_KEY_PUT() during
validation, while validate_nsh() doesn't allocate any memory for it:
BUG: kernel NULL pointer dereference, address: 0000000000000018
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 1c2383067 P4D 1c2383067 PUD 20b703067 PMD 0
Oops: Oops: 0000 [#1] SMP NOPTI
CPU: 8 UID: 0 Kdump: loaded Not tainted 6.17.0-rc4+ #107 PREEMPT(voluntary)
RIP: 0010:nsh_key_put_from_nlattr+0x19d/0x610 [openvswitch]
Call Trace:
<TASK>
validate_nsh+0x60/0x90 [openvswitch]
validate_set.constprop.0+0x270/0x3c0 [openvswitch]
__ovs_nla_copy_actions+0x477/0x860 [openvswitch]
ovs_nla_copy_actions+0x8d/0x100 [openvswitch]
ovs_packet_cmd_execute+0x1cc/0x310 [openvswitch]
genl_family_rcv_msg_doit+0xdb/0x130
genl_family_rcv_msg+0x14b/0x220
genl_rcv_msg+0x47/0xa0
netlink_rcv_skb+0x53/0x100
genl_rcv+0x24/0x40
netlink_unicast+0x280/0x3b0
netlink_sendmsg+0x1f7/0x430
____sys_sendmsg+0x36b/0x3a0
___sys_sendmsg+0x87/0xd0
__sys_sendmsg+0x6d/0xd0
do_syscall_64+0x7b/0x2c0
entry_SYSCALL_64_after_hwframe+0x76/0x7e
The third issue with this process is that while trying to convert
the non-masked set into masked one, validate_set() copies and doubles
the size of the OVS_KEY_ATTR_NSH as if it didn't have any nested
attributes. It should be copying each nested attribute and doubling
them in size independently. And the process must be properly reversed
during the conversion back from masked to a non-masked variant during
the flow dump.
In the end, the only two outcomes of trying to use this action are
either validation failure or a kernel crash. And if somehow someone
manages to install a flow with such an action, it will most definitely
not do what it is supposed to, since all the keys and the masks are
mixed up.
Fixing all the issues is a complex task as it requires re-writing
most of the validation code.
Given that and the fact that this functionality never worked since
introduction, let's just remove it altogether. It's better to
re-introduce it later with a proper implementation instead of trying
to fix it in stable releases.
Fixes: b2d0f5d5dc ("openvswitch: enable NSH support")
Reported-by: Junvy Yang <zhuque@tencent.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Reviewed-by: Aaron Conole <aconole@redhat.com>
Link: https://patch.msgid.link/20251112112246.95064-1-i.maximets@ovn.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 6cbab9f0da ]
Add a call to put_pid() corresponding to get_task_pid().
host1x_memory_context_alloc() does not take ownership of the PID so we
need to free it here to avoid leaking.
Signed-off-by: Prateek Agarwal <praagarwal@nvidia.com>
Fixes: e09db97889 ("drm/tegra: Support context isolation")
[mperttunen@nvidia.com: reword commit message]
Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Link: https://patch.msgid.link/20250919-host1x-put-pid-v1-1-19c2163dfa87@nvidia.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 407a06507c ]
The function mlxsw_sp_flower_stats() calls mlxsw_sp_acl_ruleset_get() to
obtain a ruleset reference. If the subsequent call to
mlxsw_sp_acl_rule_lookup() fails to find a rule, the function returns
an error without releasing the ruleset reference, causing a memory leak.
Fix this by using a goto to the existing error handling label, which
calls mlxsw_sp_acl_ruleset_put() to properly release the reference.
Fixes: 7c1b8eb175 ("mlxsw: spectrum: Add support for TC flower offload statistics")
Signed-off-by: Zilin Guan <zilin@seu.edu.cn>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Link: https://patch.msgid.link/20251112052114.1591695-1-zilin@seu.edu.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f796a8dec9 ]
The ethtool tsconfig Netlink path can trigger a null pointer
dereference. A call chain such as:
tsconfig_prepare_data() ->
dev_get_hwtstamp_phylib() ->
vlan_hwtstamp_get() ->
generic_hwtstamp_get_lower() ->
generic_hwtstamp_ioctl_lower()
results in generic_hwtstamp_ioctl_lower() being called with
kernel_cfg->ifr as NULL.
The generic_hwtstamp_ioctl_lower() function does not expect
a NULL ifr and dereferences it, leading to a system crash.
Fix this by adding a NULL check for kernel_cfg->ifr in
generic_hwtstamp_ioctl_lower(). If ifr is NULL, return -EINVAL.
Fixes: 6e9e2eed4f ("net: ethtool: Add support for tsconfig command to get/set hwtstamp config")
Closes: https://lore.kernel.org/cd6a7056-fa6d-43f8-b78a-f5e811247ba8@linux.dev
Signed-off-by: Jiaming Zhang <r772577952@gmail.com>
Reviewed-by: Kory Maincent <kory.maincent@bootlin.com>
Link: https://patch.msgid.link/20251111173652.749159-2-r772577952@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 09782e72ee ]
In fact, it is a multi-threaded MIPS34Kc, not a single-threaded MIPS24Kc.
Fixes: 0ec4887009 ("mips: dts: Add EcoNet DTS with EN751221 and SmartFiber XP8421-B board")
Signed-off-by: Aleksander Jan Bajkowski <olek2@wp.pl>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 97b726eb1d ]
The WMI driver core only supports GUID strings containing only
uppercase characters, however the GUID string used by the
msi-wmi-platform driver contains a single lowercase character.
This prevents the WMI driver core from matching said driver to
its WMI device.
Fix this by turning the lowercase character into a uppercase
character. Also update the WMI driver development guide to warn
about this.
Reported-by: Antheas Kapenekakis <lkml@antheas.dev>
Fixes: 9c0beb6b29 ("platform/x86: wmi: Add MSI WMI Platform driver")
Tested-by: Antheas Kapenekakis <lkml@antheas.dev>
Signed-off-by: Armin Wolf <W_Armin@gmx.de>
Link: https://patch.msgid.link/20251110111253.16204-3-W_Armin@gmx.de
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit c93433fd4e ]
It turns out that the GUID used by the msi-wmi-platform driver
(ABBC0F60-8EA1-11D1-00A0-C90629100000) is not unique, but was instead
copied from the WIndows Driver Samples. This means that this driver
could load on devices from other manufacturers that also copied this
GUID, potentially causing hardware errors.
Prevent this by only loading on devices whitelisted via DMI. The DMI
matches where taken from the msi-ec driver.
Reported-by: Antheas Kapenekakis <lkml@antheas.dev>
Fixes: 9c0beb6b29 ("platform/x86: wmi: Add MSI WMI Platform driver")
Tested-by: Antheas Kapenekakis <lkml@antheas.dev>
Signed-off-by: Armin Wolf <W_Armin@gmx.de>
Link: https://patch.msgid.link/20251110111253.16204-2-W_Armin@gmx.de
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 9b07cdf86a ]
The driver calls fwnode_get_named_child_node() which takes a reference
on the child node, but never releases it, which causes a reference leak.
Fix by using devm_add_action_or_reset() to automatically release the
reference when the device is removed.
Fixes: d5282a5392 ("pinctrl: cs42l43: Add support for the cs42l43")
Suggested-by: Charles Keepax <ckeepax@opensource.cirrus.com>
Signed-off-by: Haotian Zhang <vulab@iscas.ac.cn>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 59630e2ccd ]
Add a check to ensure locally generated packets (skb->sk != NULL) do
not use direct output in tunnel mode, as these packets require proper
L2 header setup that is handled by the normal XFRM processing path.
Fixes: 5eddd76ec2 ("xfrm: fix tunnel mode TX datapath in packet offload mode")
Signed-off-by: Jianbo Liu <jianbol@nvidia.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 61fafbee6c ]
The GSO segmentation functions for ESP tunnel mode
(xfrm4_tunnel_gso_segment and xfrm6_tunnel_gso_segment) were
determining the inner packet's L2 protocol type by checking the static
x->inner_mode.family field from the xfrm state.
This is unreliable. In tunnel mode, the state's actual inner family
could be defined by x->inner_mode.family or by
x->inner_mode_iaf.family. Checking only the former can lead to a
mismatch with the actual packet being processed, causing GSO to create
segments with the wrong L2 header type.
This patch fixes the bug by deriving the inner mode directly from the
packet's inner protocol stored in XFRM_MODE_SKB_CB(skb)->protocol.
Instead of replicating the code, this patch modifies the
xfrm_ip2inner_mode helper function. It now correctly returns
&x->inner_mode if the selector family (x->sel.family) is already
specified, thereby handling both specific and AF_UNSPEC cases
appropriately.
With this change, ESP GSO can use xfrm_ip2inner_mode to get the
correct inner mode. It doesn't affect existing callers, as the updated
logic now mirrors the checks they were already performing externally.
Fixes: 26dbd66eab ("esp: choose the correct inner protocol for GSO on inter address family tunnels")
Signed-off-by: Jianbo Liu <jianbol@nvidia.com>
Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com>
Reviewed-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 082ef944e5 ]
In the output path, xfrm_dev_offload_ok and xfrm_get_inner_ipproto
need to determine the protocol family of the inner packet (skb) before
it gets encapsulated.
In xfrm_dev_offload_ok, the code checked x->inner_mode.family. This is
unreliable because, for states handling both IPv4 and IPv6, the
relevant inner family could be either x->inner_mode.family or
x->inner_mode_iaf.family. Checking only the former can lead to a
mismatch with the actual packet being processed.
In xfrm_get_inner_ipproto, the code checked x->outer_mode.family. This
is also incorrect for tunnel mode, as the inner packet's family can be
different from the outer header's family.
At both of these call sites, the skb variable holds the original inner
packet. The most direct and reliable source of truth for its protocol
family is its destination entry. This patch fixes the issue by using
skb_dst(skb)->ops->family to ensure protocol-specific headers are only
accessed for the correct packet type.
Fixes: 91d8a53db2 ("xfrm: fix offloading of cross-family tunnels")
Fixes: 45a98ef492 ("net/xfrm: IPsec tunnel mode fix inner_ipproto setting in sec_path")
Signed-off-by: Jianbo Liu <jianbol@nvidia.com>
Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com>
Reviewed-by: Zhu Yanjun <yanjun.zhu@linux.dev>
Reviewed-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 369f772299 ]
The pinctrl-rtd driver uses 'devm_regmap_init_mmio', which requires
'REGMAP_MMIO' to be enabled.
Without this selection, the build fails with an undefined reference:
aarch64-none-linux-gnu-ld: drivers/pinctrl/realtek/pinctrl-rtd.o: in
function rtd_pinctrl_probe': pinctrl-rtd.c:(.text+0x5a0): undefined
reference to __devm_regmap_init_mmio_clk'
Fix this by selecting 'REGMAP_MMIO' in the Kconfig.
Fixes: e99ce78030 ("pinctrl: realtek: Add common pinctrl driver for Realtek DHC RTD SoCs")
Signed-off-by: Yu-Chun Lin <eleanor.lin@realtek.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 2050280a4b ]
While the user manual states that the PLL's rate should be between 180
MHz and 3 GHz in the register defninition section, it also says the
actual operating frequency is 22.5792*4 MHz in the PLL features table.
22.5792*4 MHz is one of the actual clock rates that we want and is
is available in the SDM table. Lower the minimum clock rate to 90 MHz
so that both rates in the SDM table can be used.
Fixes: 7cae1e2b55 ("clk: sunxi-ng: Add support for the A523/T527 CCU PLLs")
Reviewed-by: Jernej Skrabec <jernej.skrabec@gmail.com>
Link: https://patch.msgid.link/20251020171059.2786070-7-wens@kernel.org
Signed-off-by: Chen-Yu Tsai <wens@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 5888533c60 ]
The "bus-r-dma" clock in the A523's PRCM clock controller is also
referred to as "DMA_CLKEN_SW" or "DMA ADB400 gating". It is unclear how
this ties into the DMA controller MBUS clock gate; however if the clock
is not enabled, the DMA controller in the MCU block will fail to access
DRAM, even failing to retrieve the DMA descriptors.
Mark this clock as critical. This sort of mirrors what is done for the
main DMA controller's MBUS clock, which has a separate toggle that is
currently left out of the main clock controller driver.
Fixes: 8cea339cfb ("clk: sunxi-ng: add support for the A523/T527 PRCM CCU")
Acked-by: Jernej Skrabec <jernej.skrabec@gmail.com>
Link: https://patch.msgid.link/20251020171059.2786070-6-wens@kernel.org
Signed-off-by: Chen-Yu Tsai <wens@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 1dcf617bec ]
xfrm_state_construct can fail without setting an error if the
requested pcpu_num value is too big. Set err and add an extack message
to avoid confusing userspace.
Fixes: 1ddf9916ac ("xfrm: Add support for per cpu xfrm state handling.")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 7f02285764 ]
In case xfrm_state_migrate fails after calling xfrm_dev_state_add, we
directly release the last reference and destroy the new state, without
calling xfrm_dev_state_delete (this only happens in
__xfrm_state_delete, which we're not calling on this path, since the
state was never added).
Call xfrm_dev_state_delete on error when an offload configuration was
provided.
Fixes: ab244a394c ("xfrm: Migrate offload configuration")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 10deb69864 ]
In commit b441cf3f8c ("xfrm: delete x->tunnel as we delete x"), I
missed the case where state creation fails between full
initialization (->init_state has been called) and being inserted on
the lists.
In this situation, ->init_state has been called, so for IPcomp
tunnels, the fallback tunnel has been created and added onto the
lists, but the user state never gets added, because we fail before
that. The user state doesn't go through __xfrm_state_delete, so we
don't call xfrm_state_delete_tunnel for those states, and we end up
leaking the FB tunnel.
There are several codepaths affected by this: the add/update paths, in
both net/key and xfrm, and the migrate code (xfrm_migrate,
xfrm_state_migrate). A "proper" rollback of the init_state work would
probably be doable in the add/update code, but for migrate it gets
more complicated as multiple states may be involved.
At some point, the new (not-inserted) state will be destroyed, so call
xfrm_state_delete_tunnel during xfrm_state_gc_destroy. Most states
will have their fallback tunnel cleaned up during __xfrm_state_delete,
which solves the issue that b441cf3f8c (and other patches before it)
aimed at. All states (including FB tunnels) will be removed from the
lists once xfrm_state_fini has called flush_work(&xfrm_state_gc_work).
Reported-by: syzbot+999eb23467f83f9bf9bf@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=999eb23467f83f9bf9bf
Fixes: b441cf3f8c ("xfrm: delete x->tunnel as we delete x")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 518919276c ]
The mt8189-pinctrl driver requires to probe that a device tree uses
in the device node the same names than mt8189_pinctrl_register_base_names
array. But they are not matching the required ones in the
"mediatek,mt8189-pinctrl" dt-bindings, leading to possible dtbs check
issues. The mt8189_pinctrl_register_base_names entry order is also
different.
So, align all mt8189_pinctrl_register_base_names entry names and order
on dt-bindings.
Fixes: a3fe1324c3 ("pinctrl: mediatek: Add pinctrl driver for mt8189")
Signed-off-by: Louis-Alexis Eyraud <louisalexis.eyraud@collabora.com>
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 404ee89b40 ]
The mt8196-pinctrl driver requires to probe that a device tree uses
in the device node the same names than mt8196_pinctrl_register_base_names
array. But they are not matching the required ones in the
"mediatek,mt8196-pinctrl" dt-bindings, leading to possible dtbs check
issues.
So, align all mt8196_pinctrl_register_base_names entries on dt-bindings
ones.
Fixes: f7a29377c2 ("pinctrl: mediatek: Add pinctrl driver on mt8196")
Signed-off-by: Louis-Alexis Eyraud <louisalexis.eyraud@collabora.com>
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit 5bab4c8939 upstream.
[Why]
On DCN20 & DCN30, the 6th DPP's & HUBP's are powered on permanently and
cannot be power gated. Thus, when dpp_reset() is invoked for the DPP5,
while it's still powered on, the cached cursor_state
(dpp_base->pos.cur0_ctl.bits.cur0_enable)
and the actual state (CUR0_ENABLE) bit are unsycned. This can cause a
double cursor in full screen with non-native scaling.
[How]
Force disable cursor on DPP5 on plane powerdown for ASICs w/ 6 DPPs/HUBPs.
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4673
Reviewed-by: Aric Cyr <aric.cyr@amd.com>
Signed-off-by: Ivan Lipski <ivan.lipski@amd.com>
Tested-by: Dan Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 79b3c037f972dcb13e325a8eabfb8da835764e15)
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1788ef3072 upstream.
[Why]
Existing routine has two conversion sequence,
pbn_to_kbps and kbps_to_pbn with margin.
Non of those has without-margin calculation.
kbps_to_pbn with margin conversion includes
fec overhead which has already been included in
pbn_div calculation with 0.994 factor considered.
It is a double counted fec overhead factor that causes
potential bw loss.
[How]
Add without-margin calculation.
Fix fec overhead double counted issue.
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3735
Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Fangzhi Zuo <Jerry.Zuo@amd.com>
Signed-off-by: Ivan Lipski <ivan.lipski@amd.com>
Tested-by: Dan Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit e0dec00f3d05e8c0eceaaebfdca217f8d10d380c)
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 71ad9054c1 upstream.
[Why]
When a monitor is booting it's possible that it isn't ready to retrieve
link caps and this can lead to an EDID read failure:
```
[drm:retrieve_link_cap [amdgpu]] *ERROR* retrieve_link_cap: Read receiver caps dpcd data failed.
amdgpu 0000:c5:00.0: [drm] *ERROR* No EDID read.
```
[How]
Rather than msleep once and try a few times, msleep each time. Should
be no changes for existing working monitors, but should correct reading
caps on a monitor that is slow to boot.
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4672
Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>
Signed-off-by: Ivan Lipski <ivan.lipski@amd.com>
Tested-by: Dan Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 669dca37b3348a447db04bbdcbb3def94d5997cc)
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 80d8a9ad15 upstream.
[Why]
Accoreding to CP updated to RS64 on gfx11,
WRITE_DATA with PREEMPTION_META_MEMORY(dst_sel=8) is illegal for CP FW.
That packet is used for MCBP on F32 based system.
So it would lead to incorrect GRBM write and FW is not handling that
extra case correctly.
[How]
With gfx11 rs64 enabled, skip emit de meta data.
Signed-off-by: Yifan Zha <Yifan.Zha@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 8366cd442d226463e673bed5d199df916f4ecbcf)
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 31ab31433c upstream.
During the suspend sequence VPE is already going to be power gated
as part of vpe_suspend(). It's unnecessary to call during calls to
amdgpu_device_set_pg_state().
It actually can expose a race condition with the firmware if s0i3
sequence starts as well. Drop these calls.
Cc: Peyton.Lee@amd.com
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 2a6c826cfeedd7714611ac115371a959ead55bda)
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit fff0c87996 upstream.
With the current fastclose implementation, the mptcp_do_fastclose()
helper is in charge of two distinct actions: send the fastclose reset
and cleanup the subflows.
Formally decouple the two steps, ensuring that mptcp explicitly closes
all the subflows after the mentioned helper.
This will make the upcoming fix simpler, and allows dropping the 2nd
argument from mptcp_destroy_common(). The Fixes tag is then the same as
in the next commit to help with the backports.
Fixes: d21f834855 ("mptcp: use fastclose on more edge scenarios")
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Geliang Tang <geliang@kernel.org>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20251118-net-mptcp-misc-fixes-6-18-rc6-v1-5-806d3781c95f@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4f102d747c upstream.
The rcv window is shared among all the subflows. Currently, MPTCP sync
the TCP-level rcv window with the MPTCP one at tcp_transmit_skb() time.
The above means that incoming data may sporadically observe outdated
TCP-level rcv window and being wrongly dropped by TCP.
Address the issue checking for the edge condition before queuing the
data at TCP level, and eventually syncing the rcv window as needed.
Note that the issue is actually present from the very first MPTCP
implementation, but backports older than the blamed commit below will
range from impossible to useless.
Before:
$ nstat -n; sleep 1; nstat -z TcpExtBeyondWindow
TcpExtBeyondWindow 14 0.0
After:
$ nstat -n; sleep 1; nstat -z TcpExtBeyondWindow
TcpExtBeyondWindow 0 0.0
Fixes: fa3fe2b150 ("mptcp: track window announced to peer")
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20251118-net-mptcp-misc-fixes-6-18-rc6-v1-2-806d3781c95f@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0e4ec14dc1 upstream.
In rare cases, when the test environment is very slow, some userspace
tests can fail because some expected events have not been seen.
Because the tests are expecting a long on-going connection, and they are
not waiting for the end of the transfer, it is fine to have a longer
timeout, and even go over the default one. This connection will be
killed at the end, after the verifications: increasing the timeout
doesn't change anything, apart from avoiding it to end before the end of
the verifications.
To play it safe, all userspace tests not waiting for the end of the
transfer are now having a longer timeout: 2 minutes.
The Fixes commit was making the connection longer, but still, the
default timeout would have stopped it after 1 minute, which might not be
enough in very slow environments.
Fixes: 290493078b ("selftests: mptcp: join: userspace: longer transfer")
Cc: stable@vger.kernel.org
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Reviewed-by: Geliang Tang <geliang@kernel.org>
Link: https://patch.msgid.link/20251118-net-mptcp-misc-fixes-6-18-rc6-v1-9-806d3781c95f@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit fb13c6bb81 upstream.
In rare cases, when the test environment is very slow, some endpoints
tests can fail because some expected events have not been seen.
Because the tests are expecting a long on-going connection, and they are
not waiting for the end of the transfer, it is fine to have a longer
timeout, and even go over the default one. This connection will be
killed at the end, after the verifications: increasing the timeout
doesn't change anything, apart from avoiding it to end before the end of
the verifications.
To play it safe, all endpoints tests not waiting for the end of the
transfer are now having a longer timeout: 2 minutes.
The Fixes commit was making the connection longer, but still, the
default timeout would have stopped it after 1 minute, which might not be
enough in very slow environments.
Fixes: 6457595db9 ("selftests: mptcp: join: endpoints: longer transfer")
Cc: stable@vger.kernel.org
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Reviewed-by: Geliang Tang <geliang@kernel.org>
Link: https://patch.msgid.link/20251118-net-mptcp-misc-fixes-6-18-rc6-v1-8-806d3781c95f@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 17393fa7b7 upstream.
I'm observing very frequent self-tests failures in case of fallback when
running on a CONFIG_PREEMPT kernel.
The root cause is that subflow_sched_work_if_closed() closes any subflow
as soon as it is half-closed and has no incoming data pending.
That works well for regular subflows - MPTCP needs bi-directional
connectivity to operate on a given subflow - but for fallback socket is
race prone.
When TCP peer closes the connection before the MPTCP one,
subflow_sched_work_if_closed() will schedule the MPTCP worker to
gracefully close the subflow, and shortly after will do another schedule
to inject and process a dummy incoming DATA_FIN.
On CONFIG_PREEMPT kernel, the MPTCP worker can kick-in and close the
fallback subflow before subflow_sched_work_if_closed() is able to create
the dummy DATA_FIN, unexpectedly interrupting the transfer.
Address the issue explicitly avoiding closing fallback subflows on when
the peer is only half-closed.
Note that, when the subflow is able to create the DATA_FIN before the
worker invocation, the worker will change the msk state before trying to
close the subflow and will skip the latter operation as the msk will not
match anymore the precondition in __mptcp_close_subflow().
Fixes: f09b0ad55a ("mptcp: close subflow when receiving TCP+FIN")
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20251118-net-mptcp-misc-fixes-6-18-rc6-v1-3-806d3781c95f@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ae15506024 upstream.
The CI reports sporadic failures of the fastclose self-tests. The root
cause is a duplicate reset, not carrying the relevant MPTCP option.
In the failing scenario the bad reset is received by the peer before
the fastclose one, preventing the reception of the latter.
Indeed there is window of opportunity at fastclose time for the
following race:
mptcp_do_fastclose
__mptcp_close_ssk
__tcp_close()
tcp_set_state() [1]
tcp_send_active_reset() [2]
After [1] the stack will send reset to in-flight data reaching the now
closed port. Such reset may race with [2].
Address the issue explicitly sending a single reset on fastclose before
explicitly moving the subflow to close status.
Fixes: d21f834855 ("mptcp: use fastclose on more edge scenarios")
Cc: stable@vger.kernel.org
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/596
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Geliang Tang <geliang@kernel.org>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20251118-net-mptcp-misc-fixes-6-18-rc6-v1-6-806d3781c95f@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5e15395f6d upstream.
mptcp_cleanup_rbuf() needs to know the last most recent, mptcp-level
rcv_wnd sent, and such information is tracked into the msk->old_wspace
field, updated at ack transmission time by mptcp_write_options().
Fallback socket do not add any mptcp options, such helper is never
invoked, and msk->old_wspace value remain stale. That in turn makes
ack generation at recvmsg() time quite random.
Address the issue ensuring mptcp_write_options() is invoked even for
fallback sockets, and just update the needed info in such a case.
The issue went unnoticed for a long time, as mptcp currently overshots
the fallback socket receive buffer autotune significantly. It is going
to change in the near future.
Fixes: e3859603ba ("mptcp: better msk receive window updates")
Cc: stable@vger.kernel.org
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/594
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Geliang Tang <geliang@kernel.org>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20251118-net-mptcp-misc-fixes-6-18-rc6-v1-1-806d3781c95f@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 035bca3f01 upstream.
syzbot reported use-after-free in mptcp_schedule_work() [1]
Issue here is that mptcp_schedule_work() schedules a work,
then gets a refcount on sk->sk_refcnt if the work was scheduled.
This refcount will be released by mptcp_worker().
[A] if (schedule_work(...)) {
[B] sock_hold(sk);
return true;
}
Problem is that mptcp_worker() can run immediately and complete before [B]
We need instead :
sock_hold(sk);
if (schedule_work(...))
return true;
sock_put(sk);
[1]
refcount_t: addition on 0; use-after-free.
WARNING: CPU: 1 PID: 29 at lib/refcount.c:25 refcount_warn_saturate+0xfa/0x1d0 lib/refcount.c:25
Call Trace:
<TASK>
__refcount_add include/linux/refcount.h:-1 [inline]
__refcount_inc include/linux/refcount.h:366 [inline]
refcount_inc include/linux/refcount.h:383 [inline]
sock_hold include/net/sock.h:816 [inline]
mptcp_schedule_work+0x164/0x1a0 net/mptcp/protocol.c:943
mptcp_tout_timer+0x21/0xa0 net/mptcp/protocol.c:2316
call_timer_fn+0x17e/0x5f0 kernel/time/timer.c:1747
expire_timers kernel/time/timer.c:1798 [inline]
__run_timers kernel/time/timer.c:2372 [inline]
__run_timer_base+0x648/0x970 kernel/time/timer.c:2384
run_timer_base kernel/time/timer.c:2393 [inline]
run_timer_softirq+0xb7/0x180 kernel/time/timer.c:2403
handle_softirqs+0x22f/0x710 kernel/softirq.c:622
__do_softirq kernel/softirq.c:656 [inline]
run_ktimerd+0xcf/0x190 kernel/softirq.c:1138
smpboot_thread_fn+0x542/0xa60 kernel/smpboot.c:160
kthread+0x711/0x8a0 kernel/kthread.c:463
ret_from_fork+0x4bc/0x870 arch/x86/kernel/process.c:158
ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245
Cc: stable@vger.kernel.org
Fixes: 3b1d6210a9 ("mptcp: implement and use MPTCP-level retransmission")
Reported-by: syzbot+355158e7e301548a1424@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/netdev/6915b46f.050a0220.3565dc.0028.GAE@google.com/T/#u
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20251113103924.3737425-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit acf5de1b23 upstream.
On physical machine, NUMA node id comes from high bit 44:48 of physical
address. However it is not true on virt machine. With general method, it
comes from ACPI SRAT table.
Here the common function numa_memblks_init() is used to parse NUMA node
information with numa_memblks.
Cc: <stable@vger.kernel.org>
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a6b533adfc upstream.
If there is no valid cache info detected (may happen in virtual machine)
for pci_dfl_cache_line_size, kernel shouldn't panic. Because in the PCI
core it will be evaluated to (L1_CACHE_BYTES >> 2).
Cc: <stable@vger.kernel.org>
Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 677e6123e3 upstream.
The current LoongArch BPF trampoline implementation is incompatible
with tracing functions in kernel modules. This causes several severe
and user-visible problems:
* The `bpf_selftests/module_attach` test fails consistently.
* Kernel lockup when a BPF program is attached to a module function [1].
* Critical kernel modules like WireGuard experience traffic disruption
when their functions are traced with fentry [2].
Given the severity and the potential for other unknown side-effects, it
is safest to disable the feature entirely for now. This patch prevents
the BPF subsystem from allowing trampoline attachments to kernel module
functions on LoongArch.
This is a temporary mitigation until the core issues in the trampoline
code for kernel module handling can be identified and fixed.
[root@fedora bpf]# ./test_progs -a module_attach -v
bpf_testmod.ko is already unloaded.
Loading bpf_testmod.ko...
Successfully loaded bpf_testmod.ko.
test_module_attach:PASS:skel_open 0 nsec
test_module_attach:PASS:set_attach_target 0 nsec
test_module_attach:PASS:set_attach_target_explicit 0 nsec
test_module_attach:PASS:skel_load 0 nsec
libbpf: prog 'handle_fentry': failed to attach: -ENOTSUPP
libbpf: prog 'handle_fentry': failed to auto-attach: -ENOTSUPP
test_module_attach:FAIL:skel_attach skeleton attach failed: -524
Summary: 0/0 PASSED, 0 SKIPPED, 1 FAILED
Successfully unloaded bpf_testmod.ko.
[1]: https://lore.kernel.org/loongarch/CAK3+h2wDmpC-hP4u4pJY8T-yfKyk4yRzpu2LMO+C13FMT58oqQ@mail.gmail.com/
[2]: https://lore.kernel.org/loongarch/CAK3+h2wYcpc+OwdLDUBvg2rF9rvvyc5amfHT-KcFaK93uoELPg@mail.gmail.com/
Cc: stable@vger.kernel.org
Fixes: f9b6b41f0c ("LoongArch: BPF: Add basic bpf trampoline support")
Acked-by: Hengqi Chen <hengqi.chen@gmail.com>
Signed-off-by: Vincent Li <vincent.mc.li@gmail.com>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 316e361b5d upstream.
The "groups" property can hold multiple entries (e.g.
toshiba/tmpv7708-rm-mbrc.dts file), so allow that by dropping incorrect
type (pinmux-node.yaml schema already defines that as string-array) and
adding constraints for items. This fixes dtbs_check warnings like:
toshiba/tmpv7708-rm-mbrc.dtb: pinctrl@24190000 (toshiba,tmpv7708-pinctrl):
pwm-pins:groups: ['pwm0_gpio16_grp', 'pwm1_gpio17_grp', 'pwm2_gpio18_grp', 'pwm3_gpio19_grp'] is too long
Fixes: 1825c1fe00 ("pinctrl: Add DT bindings for Toshiba Visconti TMPV7700 SoC")
Cc: stable@vger.kernel.org
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Acked-by: Conor Dooley <conor.dooley@microchip.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ebd729fef3 upstream.
Fix a regression that has caused accesses to the PCI MMIO window to
complete unclaimed in non-EVA configurations with the SOC-it family of
system controllers, preventing PCI devices from working that use MMIO.
In the non-EVA case PHYS_OFFSET is set to 0, meaning that PCI_BAR0 is
set with an empty mask (and PCI_HEAD4 matches addresses starting from 0
accordingly). Consequently all addresses are matched for incoming DMA
accesses from PCI. This seems to confuse the system controller's logic
and outgoing bus cycles targeting the PCI MMIO window seem not to make
it to the intended devices.
This happens as well when a wider mask is used with PCI_BAR0, such as
0x80000000 or 0xe0000000, that makes addresses match that overlap with
the PCI MMIO window, which starts at 0x10000000 in our configuration.
Set the mask in PCI_BAR0 to 0xf0000000 for non-EVA then, covering the
non-EVA maximum 256 MiB of RAM, which is what YAMON does and which used
to work correctly up to the offending commit. Set PCI_P2SCMSKL to match
PCI_BAR0 as required by the system controller's specification, and match
PCI_P2SCMAPL to PCI_HEAD4 for identity mapping.
Verified with:
Core board type/revision = 0x0d (Core74K) / 0x01
System controller/revision = MIPS SOC-it 101 OCP / 1.3 SDR-FW-4:1
Processor Company ID/options = 0x01 (MIPS Technologies, Inc.) / 0x1c
Processor ID/revision = 0x97 (MIPS 74Kf) / 0x4c
for non-EVA and with:
Core board type/revision = 0x0c (CoreFPGA-5) / 0x00
System controller/revision = MIPS ROC-it2 / 0.0 FW-1:1 (CLK_unknown) GIC
Processor Company ID/options = 0x01 (MIPS Technologies, Inc.) / 0x00
Processor ID/revision = 0xa0 (MIPS interAptiv UP) / 0x20
for EVA/non-EVA, fixing:
defxx 0000:00:12.0: assign IRQ: got 10
defxx: v1.12 2021/03/10 Lawrence V. Stefani and others
0000:00:12.0: Could not read adapter factory MAC address!
vs:
defxx 0000:00:12.0: assign IRQ: got 10
defxx: v1.12 2021/03/10 Lawrence V. Stefani and others
0000:00:12.0: DEFPA at MMIO addr = 0x10142000, IRQ = 10, Hardware addr = 00-00-f8-xx-xx-xx
0000:00:12.0: registered as fddi0
for non-EVA and causing no change for EVA.
Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk>
Fixes: 422dd25664 ("MIPS: Malta: Allow PCI devices DMA to lower 2GB physical")
Cc: stable@vger.kernel.org # v4.9+
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e6965188f8 upstream.
If the allocation of tl_hba->sh fails in tcm_loop_driver_probe() and we
attempt to dereference it in tcm_loop_tpg_address_show() we will get a
segfault, see below for an example. So, check tl_hba->sh before
dereferencing it.
Unable to allocate struct scsi_host
BUG: kernel NULL pointer dereference, address: 0000000000000194
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 0 P4D 0
Oops: 0000 [#1] PREEMPT SMP NOPTI
CPU: 1 PID: 8356 Comm: tokio-runtime-w Not tainted 6.6.104.2-4.azl3 #1
Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS Hyper-V UEFI Release v4.1 09/28/2024
RIP: 0010:tcm_loop_tpg_address_show+0x2e/0x50 [tcm_loop]
...
Call Trace:
<TASK>
configfs_read_iter+0x12d/0x1d0 [configfs]
vfs_read+0x1b5/0x300
ksys_read+0x6f/0xf0
...
Cc: stable@vger.kernel.org
Fixes: 2628b352c3 ("tcm_loop: Show address of tpg in configfs")
Signed-off-by: Hamza Mahfooz <hamzamahfooz@linux.microsoft.com>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Reviewed-by: Allen Pais <apais@linux.microsoft.com>
Link: https://patch.msgid.link/1762370746-6304-1-git-send-email-hamzamahfooz@linux.microsoft.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7b6216baae upstream.
A crash was observed when the sched_ext selftests runner was
terminated with Ctrl+\ while test 15 was running:
NIP [c00000000028fa58] scx_enable.constprop.0+0x358/0x12b0
LR [c00000000028fa2c] scx_enable.constprop.0+0x32c/0x12b0
Call Trace:
scx_enable.constprop.0+0x32c/0x12b0 (unreliable)
bpf_struct_ops_link_create+0x18c/0x22c
__sys_bpf+0x23f8/0x3044
sys_bpf+0x2c/0x6c
system_call_exception+0x124/0x320
system_call_vectored_common+0x15c/0x2ec
kthread_run_worker() returns an ERR_PTR() on failure rather than NULL,
but the current code in scx_alloc_and_add_sched() only checks for a NULL
helper. Incase of failure on SIGQUIT, the error is not handled in
scx_alloc_and_add_sched() and scx_enable() ends up dereferencing an
error pointer.
Error handling is fixed in scx_alloc_and_add_sched() to propagate
PTR_ERR() into ret, so that scx_enable() jumps to the existing error
path, avoiding random dereference on failure.
Fixes: bff3b5aec1 ("sched_ext: Move disable machinery into scx_sched")
Cc: stable@vger.kernel.org # v6.16+
Reported-and-tested-by: Samir Mulani <samir@linux.ibm.com>
Signed-off-by: Saket Kumar Bhaskar <skb99@linux.ibm.com>
Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com>
Reviewed-by: Andrea Righi <arighi@nvidia.com>
Reviewed-by: Vishal Chourasia <vishalc@linux.ibm.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f384497a76 upstream.
Runtime PM should only be enabled in device_resume_early() if it has
been disabled for the given device by device_suspend_late(). Otherwise,
it may cause runtime PM callbacks to run prematurely in some cases
which leads to further functional issues.
Make two changes to address this problem.
First, reorder device_suspend_late() to only disable runtime PM for a
device when it is going to look for the device's callback or if the
device is a "syscore" one. In all of the other cases, disabling runtime
PM for the device is not in fact necessary. However, if the device's
callback returns an error and the power.is_late_suspended flag is not
going to be set, enable runtime PM so it only remains disabled when
power.is_late_suspended is set.
Second, make device_resume_early() only enable runtime PM for the
devices with the power.is_late_suspended flag set.
Fixes: 443046d1ad ("PM: sleep: Make suspend of devices more asynchronous")
Reported-by: Rose Wu <ya-jou.wu@mediatek.com>
Closes: https://lore.kernel.org/linux-pm/70b25dca6f8c2756d78f076f4a7dee7edaaffc33.camel@mediatek.com/
Cc: 6.16+ <stable@vger.kernel.org> # 6.16+
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://patch.msgid.link/12784270.O9o76ZdvQC@rafael.j.wysocki
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ea3442efab upstream.
Now target is removed from nvme_fc_ctrl_free() which is the ctrl->ref
release handler. And even admin queue is unquiesced there, this way
is definitely wrong because the ctr->ref is grabbed when submitting
command.
And Marco observed that nvme_fc_ctrl_free() can be called from request
completion code path, and trigger kernel warning since request completes
from softirq context.
Fix the issue by moveing target removal into nvme_fc_delete_ctrl(),
which is also aligned with nvme-tcp and nvme-rdma.
Patch originally proposed by Ming Lei, then modified to move the tagset
removal down to after nvme_fc_delete_association() after further testing.
Cc: Marco Patalano <mpatalan@redhat.com>
Cc: Ewan Milne <emilne@redhat.com>
Cc: James Smart <james.smart@broadcom.com>
Cc: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Cc: stable@vger.kernel.org
Tested-by: Marco Patalano <mpatalan@redhat.com>
Reviewed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Ewan D. Milne <emilne@redhat.com>
Signed-off-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 69aeb50731 upstream.
In the pegasus_notetaker driver, the pegasus_probe() function allocates
the URB transfer buffer using the wMaxPacketSize value from
the endpoint descriptor. An attacker can use a malicious USB descriptor
to force the allocation of a very small buffer.
Subsequently, if the device sends an interrupt packet with a specific
pattern (e.g., where the first byte is 0x80 or 0x42),
the pegasus_parse_packet() function parses the packet without checking
the allocated buffer size. This leads to an out-of-bounds memory access.
Fixes: 1afca2b66a ("Input: add Pegasus Notetaker tablet driver")
Signed-off-by: Seungjin Bae <eeodqql09@gmail.com>
Link: https://lore.kernel.org/r/20251007214131.3737115-2-eeodqql09@gmail.com
Cc: stable@vger.kernel.org
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e08969c4d6 upstream.
If cros_ec_keyb_register_matrix() isn't called (due to
`buttons_switches_only`) in cros_ec_keyb_probe(), `ckdev->idev` remains
NULL. An invalid memory access is observed in cros_ec_keyb_process()
when receiving an EC_MKBP_EVENT_KEY_MATRIX event in cros_ec_keyb_work()
in such case.
Unable to handle kernel read from unreadable memory at virtual address 0000000000000028
...
x3 : 0000000000000000 x2 : 0000000000000000
x1 : 0000000000000000 x0 : 0000000000000000
Call trace:
input_event
cros_ec_keyb_work
blocking_notifier_call_chain
ec_irq_thread
It's still unknown about why the kernel receives such malformed event,
in any cases, the kernel shouldn't access `ckdev->idev` and friends if
the driver doesn't intend to initialize them.
Signed-off-by: Tzung-Bi Shih <tzungbi@kernel.org>
Link: https://patch.msgid.link/20251104070310.3212712-1-tzungbi@kernel.org
Cc: stable@vger.kernel.org
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 660b299bed upstream.
Commit b6bcbce335 ("soc/tegra: pmc: Ensure power-domains are in a
known state") was introduced so that all power domains get initialized
to a known working state when booting and it does this by shutting them
down (including asserting resets and disabling clocks) before registering
each power domain with the genpd framework, leaving it to each driver to
later on power its needed domains.
This caused the Google Pixel C to hang when booting due to a workaround
in the DSI driver introduced in commit b22fd0b963 ("drm/tegra: dsi:
Clear enable register if powered by bootloader") meant to handle the case
where the bootloader enabled the DSI hardware module. The workaround relies
on reading a hardware register to determine the current status and after
b6bcbce335 that now happens in a powered down state thus leading to
the boot hang.
Fix this by reverting b22fd0b963 since currently we are guaranteed
that the hardware will be fully reset by the time we start enabling the
DSI module.
Fixes: b6bcbce335 ("soc/tegra: pmc: Ensure power-domains are in a known state")
Cc: stable@vger.kernel.org
Signed-off-by: Diogo Ivo <diogo.ivo@tecnico.ulisboa.pt>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Link: https://patch.msgid.link/20251103-diogo-smaug_ec_typec-v1-1-be656ccda391@tecnico.ulisboa.pt
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3ceb6ac211 upstream.
Correct RGMII delay application logic in lan937x_set_tune_adj().
The function was missing `data16 &= ~PORT_TUNE_ADJ` before setting the
new delay value. This caused the new value to be bitwise-OR'd with the
existing PORT_TUNE_ADJ field instead of replacing it.
For example, when setting the RGMII 2 TX delay on port 4, the
intended TUNE_ADJUST value of 0 (RGMII_2_TX_DELAY_2NS) was
incorrectly OR'd with the default 0x1B (from register value 0xDA3),
leaving the delay at the wrong setting.
This patch adds the missing mask to clear the field, ensuring the
correct delay value is written. Physical measurements on the RGMII TX
lines confirm the fix, showing the delay changing from ~1ns (before
change) to ~2ns.
While testing on i.MX 8MP showed this was within the platform's timing
tolerance, it did not match the intended hardware-characterized value.
Fixes: b19ac41faa ("net: dsa: microchip: apply rgmii tx and rx delay in phylink mac config")
Cc: stable@vger.kernel.org
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Link: https://patch.msgid.link/20251114090951.4057261-1-o.rempel@pengutronix.de
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 46447367a5 upstream.
If timestamp retriving needs to be retried and the local list of
SKB's already has entries, then it's spliced back into the socket
queue. However, the arguments for the splice helper are transposed,
causing exactly the wrong direction of splicing into the on-stack
list. Fix that up.
Cc: stable@vger.kernel.org
Reported-by: Google Big Sleep <big-sleep-vuln-reports+bigsleep-462435176@google.com>
Fixes: 9e4ed359b8 ("io_uring/netcmd: add tx timestamping cmd support")
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e837b9091b upstream.
Scanning can be offloaded to the firmware. To that end, the driver
prepares a list of channels to scan, including periodic visits back to
the operating channel, and sends the list to the firmware.
When the channel list is too long to fit in a single H2C message, the
driver splits the list, sends the first part, and tells the firmware to
scan. When the scan is complete, the driver sends the next part of the
list and tells the firmware to scan.
When the last channel that fit in the H2C message is the operating
channel something seems to go wrong in the firmware. It will
acknowledge receiving the list of channels but apparently it will not
do anything more. The AP can't be pinged anymore. The driver still
receives beacons, though.
One way to avoid this is to split the list of channels before the
operating channel.
Affected devices:
* RTL8851BU with firmware 0.29.41.3
* RTL8832BU with firmware 0.29.29.8
* RTL8852BE with firmware 0.29.29.8
The commit 57a5fbe39a ("wifi: rtw89: refactor flow that hw scan handles channel list")
is found by git blame, but it is actually to refine the scan flow, but not
a culprit, so skip Fixes tag.
Reported-by: Bitterblue Smith <rtl8821cerfe2@gmail.com>
Closes: https://lore.kernel.org/linux-wireless/0abbda91-c5c2-4007-84c8-215679e652e1@gmail.com/
Cc: stable@vger.kernel.org # 6.16+
Signed-off-by: Bitterblue Smith <rtl8821cerfe2@gmail.com>
Acked-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Link: https://patch.msgid.link/c1e61744-8db4-4646-867f-241b47d30386@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a9d1f38df7 upstream.
Replace close_cached_dir() calls under cfid_list_lock with a new
close_cached_dir_locked() variant that uses kref_put() instead of
kref_put_lock() to avoid recursive locking when dropping references.
While the existing code works if the refcount >= 2 invariant holds,
this area has proven error-prone. Make deadlocks impossible and WARN
on invariant violations.
Cc: stable@vger.kernel.org
Reviewed-by: David Howells <dhowells@redhat.com>
Signed-off-by: Henrique Carvalho <henrique.carvalho@suse.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 75f72fe289 upstream.
Before Linux had cred structures, the SELinux task_security_struct was
per-task and although the structure was switched to being per-cred
long ago, the name was never updated. This change renames it to
cred_security_struct to avoid confusion and pave the way for the
introduction of an actual per-task security structure for SELinux. No
functional change.
Cc: stable@vger.kernel.org
Signed-off-by: Stephen Smalley <stephen.smalley.work@gmail.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b118906833 upstream.
Commit cf3fc03762 ("ata: libata-scsi: Fix ata_to_sense_error() status
handling") fixed ata_to_sense_error() to properly generate sense key
ABORTED COMMAND (without any additional sense code), instead of the
previous bogus sense key ILLEGAL REQUEST with the additional sense code
UNALIGNED WRITE COMMAND, for a failed command.
However, this broke suspend for Security locked drives (drives that have
Security enabled, and have not been Security unlocked by boot firmware).
The reason for this is that the SCSI disk driver, for the Synchronize
Cache command only, treats any sense data with sense key ILLEGAL REQUEST
as a successful command (regardless of ASC / ASCQ).
After commit cf3fc03762 ("ata: libata-scsi: Fix ata_to_sense_error()
status handling") the code that treats any sense data with sense key
ILLEGAL REQUEST as a successful command is no longer applicable, so the
command fails, which causes the system suspend to be aborted:
sd 1:0:0:0: PM: dpm_run_callback(): scsi_bus_suspend returns -5
sd 1:0:0:0: PM: failed to suspend async: error -5
PM: Some devices failed to suspend, or early wake event detected
To make suspend work once again, for a Security locked device only,
return sense data LOGICAL UNIT ACCESS NOT AUTHORIZED, the actual sense
data which a real SCSI device would have returned if locked.
The SCSI disk driver treats this sense data as a successful command.
Cc: stable@vger.kernel.org
Reported-by: Ilia Baryshnikov <qwelias@gmail.com>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=220704
Fixes: cf3fc03762 ("ata: libata-scsi: Fix ata_to_sense_error() status handling")
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Signed-off-by: Niklas Cassel <cassel@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d2932a59c2 upstream.
ACPI 6.6 specification for EINJV2 appends an extra structure to
the end of the existing struct set_error_type_with_address.
Several issues showed up in testing.
1) Initialization was broken by an earlier fix [1] since is_v2 is only
set while performing an injection, not during initialization.
2) A buggy BIOS provided invalid "revision" and "length" for the
extension structure. Add several sanity checks.
3) When injecting legacy error types on an EINJV2 capable system,
don't copy the component arrays.
Fixes: 6c70585149 ("ACPI: APEI: EINJ: Check if user asked for EINJV2 injection") # [1]
Fixes: b47610296d ("ACPI: APEI: EINJ: Enable EINJv2 error injections")
Signed-off-by: Tony Luck <tony.luck@intel.com>
[ rjw: Changelog edits ]
Cc: 6.17+ <stable@vger.kernel.org> # 6.17+
Link: https://patch.msgid.link/20251119012712.178715-1-tony.luck@intel.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c77b3b79a9 upstream.
The sockmap feature allows bpf syscall from userspace, or based
on bpf sockops, replacing the sk_prot of sockets during protocol stack
processing with sockmap's custom read/write interfaces.
'''
tcp_rcv_state_process()
syn_recv_sock()/subflow_syn_recv_sock()
tcp_init_transfer(BPF_SOCK_OPS_PASSIVE_ESTABLISHED_CB)
bpf_skops_established <== sockops
bpf_sock_map_update(sk) <== call bpf helper
tcp_bpf_update_proto() <== update sk_prot
'''
When the server has MPTCP enabled but the client sends a TCP SYN
without MPTCP, subflow_syn_recv_sock() performs a fallback on the
subflow, replacing the subflow sk's sk_prot with the native sk_prot.
'''
subflow_syn_recv_sock()
subflow_ulp_fallback()
subflow_drop_ctx()
mptcp_subflow_ops_undo_override()
'''
Then, this subflow can be normally used by sockmap, which replaces the
native sk_prot with sockmap's custom sk_prot. The issue occurs when the
user executes accept::mptcp_stream_accept::mptcp_fallback_tcp_ops().
Here, it uses sk->sk_prot to compare with the native sk_prot, but this
is incorrect when sockmap is used, as we may incorrectly set
sk->sk_socket->ops.
This fix uses the more generic sk_family for the comparison instead.
Additionally, this also prevents a WARNING from occurring:
result from ./scripts/decode_stacktrace.sh:
------------[ cut here ]------------
WARNING: CPU: 0 PID: 337 at net/mptcp/protocol.c:68 mptcp_stream_accept \
(net/mptcp/protocol.c:4005)
Modules linked in:
...
PKRU: 55555554
Call Trace:
<TASK>
do_accept (net/socket.c:1989)
__sys_accept4 (net/socket.c:2028 net/socket.c:2057)
__x64_sys_accept (net/socket.c:2067)
x64_sys_call (arch/x86/entry/syscall_64.c:41)
do_syscall_64 (arch/x86/entry/syscall_64.c:63 arch/x86/entry/syscall_64.c:94)
entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130)
RIP: 0033:0x7f87ac92b83d
---[ end trace 0000000000000000 ]---
Fixes: 0b4f33def7 ("mptcp: fix tcp fallback crash")
Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Cc: <stable@vger.kernel.org>
Link: https://patch.msgid.link/20251111060307.194196-3-jiayuan.chen@linux.dev
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 31475b8811 upstream.
When a zero ASCE is passed to the __ptep_rdp() inline assembly, the
generated instruction should have the R3 field of the instruction set to
zero. However the inline assembly is written incorrectly: for such cases a
zero is loaded into a register allocated by the compiler and this register
is then used by the instruction.
This means that selected TLB entries may not be flushed since the specified
ASCE does not match the one which was used when the selected TLB entries
were created.
Fix this by removing the asce and opt parameters of __ptep_rdp(), since
all callers always pass zero, and use a hard-coded register zero for
the R3 field.
Fixes: 0807b85652 ("s390/mm: add support for RDP (Reset DAT-Protection)")
Cc: stable@vger.kernel.org
Reviewed-by: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit fbade4bd08 upstream.
The sockmap feature allows bpf syscall from userspace, or based on bpf
sockops, replacing the sk_prot of sockets during protocol stack processing
with sockmap's custom read/write interfaces.
'''
tcp_rcv_state_process()
subflow_syn_recv_sock()
tcp_init_transfer(BPF_SOCK_OPS_PASSIVE_ESTABLISHED_CB)
bpf_skops_established <== sockops
bpf_sock_map_update(sk) <== call bpf helper
tcp_bpf_update_proto() <== update sk_prot
'''
Consider two scenarios:
1. When the server has MPTCP enabled and the client also requests MPTCP,
the sk passed to the BPF program is a subflow sk. Since subflows only
handle partial data, replacing their sk_prot is meaningless and will
cause traffic disruption.
2. When the server has MPTCP enabled but the client sends a TCP SYN
without MPTCP, subflow_syn_recv_sock() performs a fallback on the
subflow, replacing the subflow sk's sk_prot with the native sk_prot.
'''
subflow_ulp_fallback()
subflow_drop_ctx()
mptcp_subflow_ops_undo_override()
'''
Subsequently, accept::mptcp_stream_accept::mptcp_fallback_tcp_ops()
converts the subflow to plain TCP.
For the first case, we should prevent it from being combined with sockmap
by setting sk_prot->psock_update_sk_prot to NULL, which will be blocked by
sockmap's own flow.
For the second case, since subflow_syn_recv_sock() has already restored
sk_prot to native tcp_prot/tcpv6_prot, no further action is needed.
Fixes: cec37a6e41 ("mptcp: Handle MP_CAPABLE options for outgoing connections")
Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Cc: <stable@vger.kernel.org>
Link: https://patch.msgid.link/20251111060307.194196-2-jiayuan.chen@linux.dev
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3cd1548a27 upstream.
In systemd we're trying to switch the internal credentials setup logic
to new mount API [1], and I noticed fsconfig(FSCONFIG_CMD_RECONFIGURE)
consistently fails on tmpfs with noswap option. This can be trivially
reproduced with the following:
```
int fs_fd = fsopen("tmpfs", 0);
fsconfig(fs_fd, FSCONFIG_SET_FLAG, "noswap", NULL, 0);
fsconfig(fs_fd, FSCONFIG_CMD_CREATE, NULL, NULL, 0);
fsmount(fs_fd, 0, 0);
fsconfig(fs_fd, FSCONFIG_CMD_RECONFIGURE, NULL, NULL, 0); <------ EINVAL
```
After some digging the culprit is shmem_reconfigure() rejecting
!(ctx->seen & SHMEM_SEEN_NOSWAP) && sbinfo->noswap, which is bogus
as ctx->seen serves as a mask for whether certain options are touched
at all. On top of that, noswap option doesn't use fsparam_flag_no,
hence it's not really possible to "reenable" swap to begin with.
Drop the check and redundant SHMEM_SEEN_NOSWAP flag.
[1] https://github.com/systemd/systemd/pull/39637
Fixes: 2c6efe9cf2 ("shmem: add support to ignore swap")
Signed-off-by: Mike Yuan <me@yhndnzj.com>
Link: https://patch.msgid.link/20251108190930.440685-1-me@yhndnzj.com
Cc: Luis Chamberlain <mcgrof@kernel.org>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: stable@vger.kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e4185bed73 upstream.
The "req.start" and "req.len" variables are u64 values that come from the
user at the start of the function. We mask away the high 32 bits of
"req.len" so that's capped at U32_MAX but the "req.start" variable can go
up to U64_MAX which means that the addition can still integer overflow.
Use check_add_overflow() to fix this bug.
Fixes: 095bb6e44e ("mtdchar: add MEMREAD ioctl")
Fixes: 6420ac0af9 ("mtdchar: prevent unbounded allocation in MEMWRITE ioctl")
Cc: stable@vger.kernel.org
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0778ac7df5 upstream.
In statmount_string(), most flags assign an output offset pointer (offp)
which is later updated with the string offset. However, the
STATMOUNT_MNT_UIDMAP and STATMOUNT_MNT_GIDMAP cases directly set the
struct fields instead of using offp. This leaves offp uninitialized,
leading to a possible uninitialized dereference when *offp is updated.
Fix it by assigning offp for UIDMAP and GIDMAP as well, keeping the code
path consistent.
Fixes: 37c4a9590e ("statmount: allow to retrieve idmappings")
Fixes: e52e97f09f ("statmount: let unset strings be empty")
Cc: stable@vger.kernel.org
Signed-off-by: Zhen Ni <zhen.ni@easystack.cn>
Link: https://patch.msgid.link/20251013114151.664341-1-zhen.ni@easystack.cn
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5c56bf214a upstream.
The DMA device pointer `dma_dev` was being dereferenced before ensuring
that `cdns_ctrl->dmac` is properly initialized.
Move the assignment of `dma_dev` after successfully acquiring the DMA
channel to ensure the pointer is valid before use.
Fixes: d76d22b509 ("mtd: rawnand: cadence: use dma_map_resource for sdma address")
Cc: stable@vger.kernel.org
Signed-off-by: Niravkumar L Rabara <niravkumarlaxmidas.rabara@altera.com>
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3fa05f96fc upstream.
Don't update the LBR MSR intercept bitmaps if they're already up-to-date,
as unconditionally updating the intercepts forces KVM to recalculate the
MSR bitmaps for vmcb02 on every nested VMRUN. The redundant updates are
functionally okay; however, they neuter an optimization in Hyper-V
nested virtualization enlightenments and this manifests as a self-test
failure.
In particular, Hyper-V lets L1 mark "nested enlightenments" as clean, i.e.
tell KVM that no changes were made to the MSR bitmap since the last VMRUN.
The hyperv_svm_test KVM selftest intentionally changes the MSR bitmap
"without telling KVM about it" to verify that KVM honors the clean hint,
correctly fails because KVM notices the changed bitmap anyway:
==== Test Assertion Failure ====
x86/hyperv_svm_test.c:120: vmcb->control.exit_code == 0x081
pid=193558 tid=193558 errno=4 - Interrupted system call
1 0x0000000000411361: assert_on_unhandled_exception at processor.c:659
2 0x0000000000406186: _vcpu_run at kvm_util.c:1699
3 (inlined by) vcpu_run at kvm_util.c:1710
4 0x0000000000401f2a: main at hyperv_svm_test.c:175
5 0x000000000041d0d3: __libc_start_call_main at libc-start.o:?
6 0x000000000041f27c: __libc_start_main_impl at ??:?
7 0x00000000004021a0: _start at ??:?
vmcb->control.exit_code == SVM_EXIT_VMMCALL
Do *not* fix this by skipping svm_hv_vmcb_dirty_nested_enlightenments()
when svm_set_intercept_for_msr() performs a no-op change. changes to
the L0 MSR interception bitmap are only triggered by full CPUID updates
and MSR filter updates, both of which should be rare. Changing
svm_set_intercept_for_msr() risks hiding unintended pessimizations
like this one, and is actually more complex than this change.
Fixes: fbe5e5f030 ("KVM: nSVM: Always recalculate LBR MSR intercepts in svm_update_lbrv()")
Cc: stable@vger.kernel.org
Signed-off-by: Yosry Ahmed <yosry.ahmed@linux.dev>
Link: https://patch.msgid.link/20251112013017.1836863-1-yosry.ahmed@linux.dev
[Rewritten commit message based on mailing list discussion. - Paolo]
Reviewed-by: Sean Christopherson <seanjc@google.com>
Tested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit baa18d577c upstream.
We've had reports from the field that some RK3588 Tiger have random
issues with eMMC errors.
Applying commit a28352cf2d ("mmc: sdhci-of-dwcmshc: Change
DLL_STRBIN_TAPNUM_DEFAULT to 0x4") didn't help and seemed to have made
things worse for our board.
Our HW department checked the eMMC lines and reported that they are too
long and don't look great so signal integrity is probably not the best.
Note that not all Tigers with the same eMMC chip have errors, so the
suspicion is that we're really on the edge in terms of signal integrity
and only a handful devices are failing. Additionally, we have RK3588
Jaguars with the same eMMC chip but the layout is different and we also
haven't received reports about those so far.
Lowering the max-frequency to 150MHz from 200MHz instead of simply
disabling HS400 was briefly tested and seem to work as well. We've
disabled HS400 downstream and haven't received reports since so we'll go
with that instead of lowering the max-frequency.
Signed-off-by: Quentin Schulz <quentin.schulz@cherry.de>
Fixes: 6173ef24b3 ("arm64: dts: rockchip: add RK3588-Q7 (Tiger) SoM")
Cc: stable@vger.kernel.org
Link: https://patch.msgid.link/20251112-tiger-hs200-v1-1-b50adac107c0@cherry.de
[added Fixes tag and stable-cc from 2nd mail]
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 08d70143e3 upstream.
In commit 296602b8e5 ("arm64: dts: rockchip: Move RK3399 OPPs to dtsi
files for SoC variants"), everything shared between variants of RK3399
was put into rk3399-base.dtsi and the rest in variant-specific DTSI,
such as rk3399-t, rk3399-op1, rk3399, etc.
Therefore, the variant-specific DTSI should include rk3399-base.dtsi and
not another variant's DTSI.
rk3399-op1 wrongly includes rk3399 (a variant) DTSI instead of
rk3399-base DTSI, let's fix this oversight by including the intended
DTSI.
Fortunately, this had no impact on the resulting DTB since all nodes
were named the same and all node properties were overridden in
rk3399-op1.dtsi. This was checked by doing a checksum of rk3399-op1 DTBs
before and after this commit.
No intended change in behavior.
Fixes: 296602b8e5 ("arm64: dts: rockchip: Move RK3399 OPPs to dtsi files for SoC variants")
Cc: stable@vger.kernel.org
Signed-off-by: Quentin Schulz <quentin.schulz@cherry.de>
Reviewed-by: Dragan Simic <dsimic@manjaro.org>
Link: https://patch.msgid.link/20251029-rk3399-op1-include-v1-1-2472ee60e7f8@cherry.de
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 997c06330f upstream.
As per the i.MX8MP TRM, section 14.2 "AUDIO_BLK_CTRL", table 14.2.3.1.1
"memory map", the definition of the EARC control register shows that the
EARC controller software reset is controlled via bit 0, while the EARC PHY
software reset is controlled via bit 1.
This means that the current definitions of IMX8MP_AUDIOMIX_EARC_RESET_MASK
and IMX8MP_AUDIOMIX_EARC_PHY_RESET_MASK are wrong since their values would
imply that the EARC controller software reset is controlled via bit 1 and
the EARC PHY software reset is controlled via bit 2. Fix them.
Fixes: a83bc87cd3 ("reset: imx8mp-audiomix: Prepare the code for more reset bits")
Cc: stable@vger.kernel.org
Reviewed-by: Shengjiu Wang <shengjiu.wang@gmail.com>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Reviewed-by: Daniel Baluta <daniel.baluta@nxp.com>
Signed-off-by: Laurentiu Mihalcea <laurentiu.mihalcea@nxp.com>
Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit beab067dbc upstream.
Based on available evidence, the USB ID 4c4a:4155 used by multiple
devices has been attributed to Jieli. The commit 1a8953f4f7
("HID: Add IGNORE quirk for SMARTLINKTECHNOLOGY") affected touchscreen
functionality. Added checks for manufacturer and serial number to
maintain microphone compatibility, enabling both devices to function
properly.
[jkosina@suse.com: edit shortlog]
Fixes: 1a8953f4f7 ("HID: Add IGNORE quirk for SMARTLINKTECHNOLOGY")
Cc: stable@vger.kernel.org
Tested-by: staffan.melin@oscillator.se
Reviewed-by: Terry Junge <linuxhid@cosmicgizmosystems.com>
Signed-off-by: Zhang Heng <zhangheng@kylinos.cn>
Signed-off-by: Jiri Kosina <jkosina@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 20739af073 upstream.
There is a race condition between timer_shutdown_sync() and timer
expiration that can lead to hitting a WARN_ON in expire_timers().
The issue occurs when timer_shutdown_sync() clears the timer function
to NULL while the timer is still running on another CPU. The race
scenario looks like this:
CPU0 CPU1
<SOFTIRQ>
lock_timer_base()
expire_timers()
base->running_timer = timer;
unlock_timer_base()
[call_timer_fn enter]
mod_timer()
...
timer_shutdown_sync()
lock_timer_base()
// For now, will not detach the timer but only clear its function to NULL
if (base->running_timer != timer)
ret = detach_if_pending(timer, base, true);
if (shutdown)
timer->function = NULL;
unlock_timer_base()
[call_timer_fn exit]
lock_timer_base()
base->running_timer = NULL;
unlock_timer_base()
...
// Now timer is pending while its function set to NULL.
// next timer trigger
<SOFTIRQ>
expire_timers()
WARN_ON_ONCE(!fn) // hit
...
lock_timer_base()
// Now timer will detach
if (base->running_timer != timer)
ret = detach_if_pending(timer, base, true);
if (shutdown)
timer->function = NULL;
unlock_timer_base()
The problem is that timer_shutdown_sync() clears the timer function
regardless of whether the timer is currently running. This can leave a
pending timer with a NULL function pointer, which triggers the
WARN_ON_ONCE(!fn) check in expire_timers().
Fix this by only clearing the timer function when actually detaching the
timer. If the timer is running, leave the function pointer intact, which is
safe because the timer will be properly detached when it finishes running.
Fixes: 0cc04e8045 ("timers: Add shutdown mechanism to the internal functions")
Signed-off-by: Yipeng Zou <zouyipeng@huawei.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: https://patch.msgid.link/20251122093942.301559-1-zouyipeng@huawei.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit bf91f4bc9c upstream.
The blamed commit introduced the function lanphy_modify_page_reg which
as name suggests it, it modifies the registers. In the same commit we
have started to use this function inside the drivers. The problem is
that in the function lan8814_config_init we passed the wrong page number
when disabling the aneg towards host side. We passed extended page number
4(LAN8814_PAGE_COMMON_REGS) instead of extended page
5(LAN8814_PAGE_PORT_REGS)
Fixes: a0de636ed7 ("net: phy: micrel: Introduce lanphy_modify_page_reg")
Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250925064702.3906950-1-horatiu.vultur@microchip.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3f978e3f15 upstream.
In hfcsusb_probe(), the memory allocated for ctrl_urb gets leaked when
setup_instance() fails with an error code. Fix that by freeing the urb
before freeing the hw structure. Also change the error paths to use the
goto ladder style.
Compile tested only. Issue found using a prototype static analysis tool.
Fixes: 69f52adb2d ("mISDN: Add HFC USB driver")
Signed-off-by: Abdun Nihaal <nihaal@cse.iitm.ac.in>
Link: https://patch.msgid.link/20251030042524.194812-1-nihaal@cse.iitm.ac.in
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 9d7dfb95da ]
Add VMX exit handlers for SEAMCALL and TDCALL to inject a #UD if a non-TD
guest attempts to execute SEAMCALL or TDCALL. Neither SEAMCALL nor TDCALL
is gated by any software enablement other than VMXON, and so will generate
a VM-Exit instead of e.g. a native #UD when executed from the guest kernel.
Note! No unprivileged DoS of the L1 kernel is possible as TDCALL and
SEAMCALL #GP at CPL > 0, and the CPL check is performed prior to the VMX
non-root (VM-Exit) check, i.e. userspace can't crash the VM. And for a
nested guest, KVM forwards unknown exits to L1, i.e. an L2 kernel can
crash itself, but not L1.
Note #2! The Intel® Trust Domain CPU Architectural Extensions spec's
pseudocode shows the CPL > 0 check for SEAMCALL coming _after_ the VM-Exit,
but that appears to be a documentation bug (likely because the CPL > 0
check was incorrectly bundled with other lower-priority #GP checks).
Testing on SPR and EMR shows that the CPL > 0 check is performed before
the VMX non-root check, i.e. SEAMCALL #GPs when executed in usermode.
Note #3! The aforementioned Trust Domain spec uses confusing pseudocode
that says that SEAMCALL will #UD if executed "inSEAM", but "inSEAM"
specifically means in SEAM Root Mode, i.e. in the TDX-Module. The long-
form description explicitly states that SEAMCALL generates an exit when
executed in "SEAM VMX non-root operation". But that's a moot point as the
TDX-Module injects #UD if the guest attempts to execute SEAMCALL, as
documented in the "Unconditionally Blocked Instructions" section of the
TDX-Module base specification.
Cc: stable@vger.kernel.org
Cc: Kai Huang <kai.huang@intel.com>
Cc: Xiaoyao Li <xiaoyao.li@intel.com>
Cc: Rick Edgecombe <rick.p.edgecombe@intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Binbin Wu <binbin.wu@linux.intel.com>
Reviewed-by: Kai Huang <kai.huang@intel.com>
Reviewed-by: Binbin Wu <binbin.wu@linux.intel.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Link: https://lore.kernel.org/r/20251016182148.69085-2-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 885df2d210 ]
Add support for the immediate forms of RDMSR and WRMSRNS (currently
Intel-only). The immediate variants are only valid in 64-bit mode, and
use a single general purpose register for the data (the register is also
encoded in the instruction, i.e. not implicit like regular RDMSR/WRMSR).
The immediate variants are primarily motivated by performance, not code
size: by having the MSR index in an immediate, it is available *much*
earlier in the CPU pipeline, which allows hardware much more leeway about
how a particular MSR is handled.
Intel VMX support for the immediate forms of MSR accesses communicates
exit information to the host as follows:
1) The immediate form of RDMSR uses VM-Exit Reason 84.
2) The immediate form of WRMSRNS uses VM-Exit Reason 85.
3) For both VM-Exit reasons 84 and 85, the Exit Qualification field is
set to the MSR index that triggered the VM-Exit.
4) Bits 3 ~ 6 of the VM-Exit Instruction Information field are set to
the register encoding used by the immediate form of the instruction,
i.e. the destination register for RDMSR, and the source for WRMSRNS.
5) The VM-Exit Instruction Length field records the size of the
immediate form of the MSR instruction.
To deal with userspace RDMSR exits, stash the destination register in a
new kvm_vcpu_arch field, similar to cui_linear_rip, pio, etc.
Alternatively, the register could be saved in kvm_run.msr or re-retrieved
from the VMCS, but the former would require sanitizing the value to ensure
userspace doesn't clobber the value to an out-of-bounds index, and the
latter would require a new one-off kvm_x86_ops hook.
Don't bother adding support for the instructions in KVM's emulator, as the
only way for RDMSR/WRMSR to be encountered is if KVM is emulating large
swaths of code due to invalid guest state, and a vCPU cannot have invalid
guest state while in 64-bit mode.
Signed-off-by: Xin Li (Intel) <xin@zytor.com>
[sean: minor tweaks, massage and expand changelog]
Link: https://lore.kernel.org/r/20250805202224.1475590-5-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Stable-dep-of: 9d7dfb95da ("KVM: VMX: Inject #UD if guest tries to execute SEAMCALL or TDCALL")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit ec400f6c2f ]
Rename "ecx" variables in {RD,WR}MSR and RDPMC helpers to "msr" and "pmc"
respectively, in anticipation of adding support for the immediate variants
of RDMSR and WRMSRNS, and to better document what the variables hold
(versus where the data originated).
No functional change intended.
Link: https://lore.kernel.org/r/20250805202224.1475590-3-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Stable-dep-of: 9d7dfb95da ("KVM: VMX: Inject #UD if guest tries to execute SEAMCALL or TDCALL")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 249d96b492 ]
Since snd_soc_suspend() is invoked through snd_soc_pm_ops->suspend(),
and snd_soc_pm_ops is associated with the soc_driver (defined in
sound/soc/soc-core.c), and there is no parent-child relationship between
the soc_driver and the DA7213 codec driver, the power management subsystem
does not enforce a specific suspend/resume order between the DA7213 driver
and the soc_driver.
Because of this, the different codec component functionalities, called from
snd_soc_resume() to reconfigure various functions, can race with the
DA7213 struct dev_pm_ops::resume function, leading to misapplied
configuration. This occasionally results in clipped sound.
Fix this by dropping the struct dev_pm_ops::{suspend, resume} and use
instead struct snd_soc_component_driver::{suspend, resume}. This ensures
the proper configuration sequence is handled by the ASoC subsystem.
Cc: stable@vger.kernel.org
Fixes: 431e040065 ("ASoC: da7213: Add suspend to RAM support")
Signed-off-by: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com>
Link: https://patch.msgid.link/20251104114914.2060603-1-claudiu.beznea.uj@bp.renesas.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7d9f7d390f upstream.
Support for parsing PC source info in stacktraces (e.g. '(P)') was added
in commit 2bff77c665 ("scripts/decode_stacktrace.sh: fix decoding of
lines with an additional info"). However, this logic was placed after the
build ID processing. This incorrect order fails to parse lines containing
both elements, e.g.:
drm_gem_mmap_obj+0x114/0x200 [drm 03d0564e0529947d67bb2008c3548be77279fd27] (P)
This patch fixes the problem by extracting the PC source info first and
then processing the module build ID. With this change, the line above is
now properly parsed as such:
drm_gem_mmap_obj (./include/linux/mmap_lock.h:212 ./include/linux/mm.h:811 drivers/gpu/drm/drm_gem.c:1177) drm (P)
While here, also add a brief explanation the build ID section.
Link: https://lkml.kernel.org/r/20251030010347.2731925-1-cmllamas@google.com
Fixes: 2bff77c665 ("scripts/decode_stacktrace.sh: fix decoding of lines with an additional info")
Signed-off-by: Carlos Llamas <cmllamas@google.com>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Reviewed-by: Luca Ceresoli <luca.ceresoli@bootlin.com>
Cc: Breno Leitao <leitao@debian.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Marc Rutland <mark.rutland@arm.com>
Cc: Mark Brown <broonie@kernel.org>
Cc: Matthieu Baerts <matttbe@kernel.org>
Cc: Miroslav Benes <mbenes@suse.cz>
Cc: Puranjay Mohan <puranjay@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4a2fc4897b upstream.
With lines having a symbol to decode, the script was only trying to
preserve the alignment for the timestamps, but not the rest, nor when the
caller was set (CONFIG_PRINTK_CALLER=y).
With this sample ...
[ 52.080924] Call Trace:
[ 52.080926] <TASK>
[ 52.080931] dump_stack_lvl+0x6f/0xb0
... the script was producing the following output:
[ 52.080924] Call Trace:
[ 52.080926] <TASK>
[ 52.080931] dump_stack_lvl (arch/x86/include/asm/irqflags.h:19)
(dump_stack_lvl is no longer aligned with <TASK>: one missing space)
With this other sample ...
[ 52.080924][ T48] Call Trace:
[ 52.080926][ T48] <TASK>
[ 52.080931][ T48] dump_stack_lvl+0x6f/0xb0
... the script was producing the following output:
[ 52.080924][ T48] Call Trace:
[ 52.080926][ T48] <TASK>
[ 52.080931][ T48] dump_stack_lvl (arch/x86/include/asm/irqflags.h:19)
(the misalignment is clearer here)
That's because the script had a workaround for CONFIG_PRINTK_TIME=y only,
see the previous comment called "Format timestamps with tabs".
To always preserve spaces, they need to be recorded along the words. That
is what is now done with the new 'spaces' array.
Some notes:
- 'extglob' is needed only for this operation, and that's why it is set
in a dedicated subshell.
- 'read' is used with '-r' not to treat a <backslash> character in any
special way, e.g. when followed by a space.
- When a word is removed from the 'words' array, the corresponding space
needs to be removed from the 'spaces' array as well.
With the last sample, we now have:
[ 52.080924][ T48] Call Trace:
[ 52.080926][ T48] <TASK>
[ 52.080931][ T48] dump_stack_lvl (arch/x86/include/asm/irqflags.h:19)
(the alignment is preserved)
Link: https://lkml.kernel.org/r/20250908-decode_strace_indent-v1-2-28e5e4758080@kernel.org
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Tested-by: Carlos Llamas <cmllamas@google.com>
Cc: Breno Leitao <leitao@debian.org>
Cc: Elliot Berman <quic_eberman@quicinc.com>
Cc: Luca Ceresoli <luca.ceresoli@bootlin.com>
Cc: Stephen Boyd <swboyd@chromium.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 74207de2ba upstream.
Patch series "Fix SIGBUS semantics with large folios", v3.
Accessing memory within a VMA, but beyond i_size rounded up to the next
page size, is supposed to generate SIGBUS.
Darrick reported[1] an xfstests regression in v6.18-rc1. generic/749
failed due to missing SIGBUS. This was caused by my recent changes that
try to fault in the whole folio where possible:
19773df031 ("mm/fault: try to map the entire file folio in finish_fault()")
357b92761d ("mm/filemap: map entire large folio faultaround")
These changes did not consider i_size when setting up PTEs, leading to
xfstest breakage.
However, the problem has been present in the kernel for a long time -
since huge tmpfs was introduced in 2016. The kernel happily maps
PMD-sized folios as PMD without checking i_size. And huge=always tmpfs
allocates PMD-size folios on any writes.
I considered this corner case when I implemented a large tmpfs, and my
conclusion was that no one in their right mind should rely on receiving a
SIGBUS signal when accessing beyond i_size. I cannot imagine how it could
be useful for the workload.
But apparently filesystem folks care a lot about preserving strict SIGBUS
semantics.
Generic/749 was introduced last year with reference to POSIX, but no real
workloads were mentioned. It also acknowledged the tmpfs deviation from
the test case.
POSIX indeed says[3]:
References within the address range starting at pa and
continuing for len bytes to whole pages following the end of an
object shall result in delivery of a SIGBUS signal.
The patchset fixes the regression introduced by recent changes as well as
more subtle SIGBUS breakage due to split failure on truncation.
This patch (of 2):
Accesses within VMA, but beyond i_size rounded up to PAGE_SIZE are
supposed to generate SIGBUS.
Recent changes attempted to fault in full folio where possible. They did
not respect i_size, which led to populating PTEs beyond i_size and
breaking SIGBUS semantics.
Darrick reported generic/749 breakage because of this.
However, the problem existed before the recent changes. With huge=always
tmpfs, any write to a file leads to PMD-size allocation. Following the
fault-in of the folio will install PMD mapping regardless of i_size.
Fix filemap_map_pages() and finish_fault() to not install:
- PTEs beyond i_size;
- PMD mappings across i_size;
Make an exception for shmem/tmpfs that for long time intentionally
mapped with PMDs across i_size.
Link: https://lkml.kernel.org/r/20251027115636.82382-1-kirill@shutemov.name
Link: https://lkml.kernel.org/r/20251027115636.82382-2-kirill@shutemov.name
Signed-off-by: Kiryl Shutsemau <kas@kernel.org>
Fixes: 6795801366 ("xfs: Support large folios")
Reported-by: "Darrick J. Wong" <djwong@kernel.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Dave Chinner <david@fromorbit.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Rik van Riel <riel@surriel.com>
Cc: Shakeel Butt <shakeel.butt@linux.dev>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Kiryl Shutsemau <kas@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 852b644acb upstream.
The 'run_tests' function is executed in the background, but killing its
associated PID would not kill the children tasks running in the
background.
To properly kill all background tasks, 'kill -- -PID' could be used, but
this requires kill from procps-ng. Instead, all children tasks are
listed using 'ps', and 'kill' is called with all PIDs of this group.
Fixes: 31ee4ad86a ("selftests: mptcp: join: stop transfer when check is done (part 1)")
Cc: stable@vger.kernel.org
Fixes: 04b57c9e09 ("selftests: mptcp: join: stop transfer when check is done (part 2)")
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20251110-net-mptcp-sft-join-unstable-v1-6-a4332c714e10@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 290493078b upstream.
In rare cases, when the test environment is very slow, some userspace
tests can fail because some expected events have not been seen.
Because the tests are expecting a long on-going connection, and they are
not waiting for the end of the transfer, it is fine to make the
connection longer. This connection will be killed at the end, after the
verifications, so making it longer doesn't change anything, apart from
avoid it to end before the end of the verifications
To play it safe, all userspace tests not waiting for the end of the
transfer are now sharing a longer file (128KB) at slow speed.
Fixes: 4369c198e5 ("selftests: mptcp: test userspace pm out of transfer")
Cc: stable@vger.kernel.org
Fixes: b2e2248f36 ("selftests: mptcp: userspace pm create id 0 subflow")
Fixes: e3b47e460b ("selftests: mptcp: userspace pm remove initial subflow")
Fixes: b9fb176081 ("selftests: mptcp: userspace pm send RM_ADDR for ID 0")
Reviewed-by: Geliang Tang <geliang@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20251110-net-mptcp-sft-join-unstable-v1-4-a4332c714e10@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ee79980f7a upstream.
MPTCP Join "fastclose server" selftest is sometimes failing because the
client output file doesn't have the expected size, e.g. 296B instead of
1024B.
When looking at a packet trace when this happens, the server sent the
expected 1024B in two parts -- 100B, then 924B -- then the MP_FASTCLOSE.
It is then strange to see the client only receiving 296B, which would
mean it only got a part of the second packet. The problem is then not on
the networking side, but rather on the data reception side.
When mptcp_connect is launched with '-f -1', it means the connection
might stop before having sent everything, because a reset has been
received. When this happens, the program was directly stopped. But it is
also possible there are still some data to read, simply because the
previous 'read' step was done with a buffer smaller than the pending
data, see do_rnd_read(). In this case, it is important to read what's
left in the kernel buffers before stopping without error like before.
SIGPIPE is now ignored, not to quit the app before having read
everything.
Fixes: 6bf41020b7 ("selftests: mptcp: update and extend fastclose test-cases")
Cc: stable@vger.kernel.org
Reviewed-by: Geliang Tang <geliang@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20251110-net-mptcp-sft-join-unstable-v1-5-a4332c714e10@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6457595db9 upstream.
In rare cases, when the test environment is very slow, some userspace
tests can fail because some expected events have not been seen.
Because the tests are expecting a long on-going connection, and they are
not waiting for the end of the transfer, it is fine to make the
connection longer. This connection will be killed at the end, after the
verifications, so making it longer doesn't change anything, apart from
avoid it to end before the end of the verifications
To play it safe, all endpoints tests not waiting for the end of the
transfer are now sharing a longer file (128KB) at slow speed.
Fixes: 69c6ce7b6e ("selftests: mptcp: add implicit endpoint test case")
Cc: stable@vger.kernel.org
Fixes: e274f71540 ("selftests: mptcp: add subflow limits test-cases")
Fixes: b5e2fb832f ("selftests: mptcp: add explicit test case for remove/readd")
Fixes: e06959e9ee ("selftests: mptcp: join: test for flush/re-add endpoints")
Reviewed-by: Geliang Tang <geliang@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20251110-net-mptcp-sft-join-unstable-v1-3-a4332c714e10@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit aea73bae66 upstream.
Some of these 'remove' tests rarely fail because a subflow has been
reset instead of cleanly removed. This can happen when one extra subflow
which has never carried data is being closed (FIN) on one side, while
the other is sending data for the first time.
To avoid such subflows to be used right at the end, the backup flag has
been added. With that, data will be only carried on the initial subflow.
Fixes: d2c4333a80 ("selftests: mptcp: add testcases for removing addrs")
Cc: stable@vger.kernel.org
Reviewed-by: Geliang Tang <geliang@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20251110-net-mptcp-sft-join-unstable-v1-2-a4332c714e10@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 63c643aa7b upstream.
The "fallback due to TCP OoO" was never printed because the stat_ooo_now
variable was checked twice: once in the parent if-statement, and one in
the child one. The second condition was then always true then, and the
'else' branch was never taken.
The idea is that when there are more ACK + MP_CAPABLE than expected, the
test either fails if there was no out of order packets, or a notice is
printed.
Fixes: 69ca3d29a7 ("mptcp: update selftest for fallback due to OoO")
Cc: stable@vger.kernel.org
Reviewed-by: Geliang Tang <geliang@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20251110-net-mptcp-sft-join-unstable-v1-1-a4332c714e10@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit fccac54b0d upstream.
Limit the workaround for the lack of the proper splash-screen handover
handling to the legacy ARM 32bit systems and replace forcing a sync_state
by explicite power domain shutdown. This approach lets compiler to
optimize it out on newer ARM 64bit systems.
Suggested-by: Ulf Hansson <ulf.hansson@linaro.org>
Fixes: 0745658aeb ("pmdomain: samsung: Fix splash-screen handover by enforcing a sync_state")
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Cc: stable@vger.kernel.org
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit bbde14682e upstream.
of_get_child_by_name() returns a node pointer with refcount incremented, we
should use of_node_put() on it when not needed anymore. Add the missing
of_node_put() to avoid refcount leak.
Fixes: 721cabf6c6 ("soc: imx: move PGC handling to a new GPC driver")
Cc: stable@vger.kernel.org
Signed-off-by: Miaoqian Lin <linmq006@gmail.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7458f72cc2 upstream.
If of_genpd_add_provider_onecell() fails during probe, the previously
created generic power domains are not removed, leading to a memory leak
and potential kernel crash later in genpd_debug_add().
Add proper error handling to unwind the initialized domains before
returning from probe to ensure all resources are correctly released on
failure.
Example crash trace observed without this fix:
| Unable to handle kernel paging request at virtual address fffffffffffffc70
| CPU: 1 UID: 0 PID: 1 Comm: swapper/0 Not tainted 6.18.0-rc1 #405 PREEMPT
| Hardware name: ARM LTD ARM Juno Development Platform/ARM Juno Development Platform
| pstate: 00000005 (nzcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
| pc : genpd_debug_add+0x2c/0x160
| lr : genpd_debug_init+0x74/0x98
| Call trace:
| genpd_debug_add+0x2c/0x160 (P)
| genpd_debug_init+0x74/0x98
| do_one_initcall+0xd0/0x2d8
| do_initcall_level+0xa0/0x140
| do_initcalls+0x60/0xa8
| do_basic_setup+0x28/0x40
| kernel_init_freeable+0xe8/0x170
| kernel_init+0x2c/0x140
| ret_from_fork+0x10/0x20
Fixes: 898216c97e ("firmware: arm_scmi: add device power domain support using genpd")
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Reviewed-by: Peng Fan <peng.fan@nxp.com>
Cc: stable@vger.kernel.org
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 994dec1099 upstream.
First, we can't assume pipe == crtc index. If a pipe is fused off in
between, it no longer holds. intel_crtc_for_pipe() is the only proper
way to get from a pipe to the corresponding crtc.
Second, drivers aren't supposed to access or index drm->vblank[]
directly. There's drm_crtc_vblank_crtc() for this.
Use both functions to fix the pipe to vblank conversion.
Fixes: f02658c46c ("drm/i915/psr: Add mechanism to notify PSR of pipe enable/disable")
Cc: Jouni Högander <jouni.hogander@intel.com>
Cc: stable@vger.kernel.org # v6.16+
Reviewed-by: Jouni Högander <jouni.hogander@intel.com>
Link: https://patch.msgid.link/20251106200000.1455164-1-jani.nikula@intel.com
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
(cherry picked from commit 2750f6765d6974f7e163c5d540a96c8703f6d8dd)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 22a36e660d upstream.
Certain multi-GPU configurations (especially GFX12) may hit
data corruption when a DCC-compressed VRAM surface is shared across GPUs
using peer-to-peer (P2P) DMA transfers.
Such surfaces rely on device-local metadata and cannot be safely accessed
through a remote GPU’s page tables. Attempting to import a DCC-enabled
surface through P2P leads to incorrect rendering or GPU faults.
This change disables P2P for DCC-enabled VRAM buffers that are contiguous
and allocated on GFX12+ hardware. In these cases, the importer falls back
to the standard system-memory path, avoiding invalid access to compressed
surfaces.
Future work could consider optional migration (VRAM→System→VRAM) if a
performance regression is observed when `attach->peer2peer = false`.
Tested on:
- Dual RX 9700 XT (Navi4x) setup
- GNOME and Wayland compositor scenarios
- Confirmed no corruption after disabling P2P under these conditions
v2: Remove check TTM_PL_VRAM & TTM_PL_FLAG_CONTIGUOUS.
v3: simplify for upsteam and fix ip version check (Alex)
Suggested-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Vitaly Prosyak <vitaly.prosyak@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 9dff2bb709e6fbd97e263fd12bf12802d2b5a0cf)
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6623c5f9fd upstream.
Fix a potential deadlock caused by inconsistent spinlock usage
between interrupt and process contexts in the userq fence driver.
The issue occurs when amdgpu_userq_fence_driver_process() is called
from both:
- Interrupt context: gfx_v11_0_eop_irq() -> amdgpu_userq_fence_driver_process()
- Process context: amdgpu_eviction_fence_suspend_worker() ->
amdgpu_userq_fence_driver_force_completion() -> amdgpu_userq_fence_driver_process()
In interrupt context, the spinlock was acquired without disabling
interrupts, leaving it in {IN-HARDIRQ-W} state. When the same lock
is acquired in process context, the kernel detects inconsistent
locking since the process context acquisition would enable interrupts
while holding a lock previously acquired in interrupt context.
Kernel log shows:
[ 4039.310790] inconsistent {IN-HARDIRQ-W} -> {HARDIRQ-ON-W} usage.
[ 4039.310804] kworker/7:2/409 [HC0[0]:SC0[0]:HE1:SE1] takes:
[ 4039.310818] ffff9284e1bed000 (&fence_drv->fence_list_lock){?...}-{3:3},
[ 4039.310993] {IN-HARDIRQ-W} state was registered at:
[ 4039.311004] lock_acquire+0xc6/0x300
[ 4039.311018] _raw_spin_lock+0x39/0x80
[ 4039.311031] amdgpu_userq_fence_driver_process.part.0+0x30/0x180 [amdgpu]
[ 4039.311146] amdgpu_userq_fence_driver_process+0x17/0x30 [amdgpu]
[ 4039.311257] gfx_v11_0_eop_irq+0x132/0x170 [amdgpu]
Fix by using spin_lock_irqsave()/spin_unlock_irqrestore() to properly
manage interrupt state regardless of calling context.
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit ded3ad780cf97a04927773c4600823b84f7f3cc2)
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d15deafab5 upstream.
Over allocation of save area is not fatal, only under allocation is.
ROCm has various components that independently claim authority over save
area size.
Unless KFD decides to claim single authority, relax size checks.
Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Reviewed-by: Philip Yang <philip.yang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 15bd4958fe38e763bc17b607ba55155254a01f55)
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c367af440e upstream.
data_reloc_print_warning_inode() calls btrfs_get_fs_root() to obtain
local_root, but fails to release its reference when paths_from_inode()
returns an error. This causes a potential memory leak.
Add a missing btrfs_put_root() call in the error path to properly
decrease the reference count of local_root.
Fixes: b9a9a85059 ("btrfs: output affected files when relocation fails")
CC: stable@vger.kernel.org # 6.6+
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Zilin Guan <zilin@seu.edu.cn>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit bfe3d755ef upstream.
When logging that a new name exists, we skip updating the inode's
last_log_commit field to prevent a later explicit fsync against the inode
from doing nothing (as updating last_log_commit makes btrfs_inode_in_log()
return true). We are detecting, at btrfs_log_inode(), that logging a new
name is happening by checking the logging mode is not LOG_INODE_EXISTS,
but that is not enough because we may log parent directories when logging
a new name of a file in LOG_INODE_ALL mode - we need to check that the
logging_new_name field of the log context too.
An example scenario where this results in an explicit fsync against a
directory not persisting changes to the directory is the following:
$ mkfs.btrfs -f /dev/sdc
$ mount /dev/sdc /mnt
$ touch /mnt/foo
$ sync
$ mkdir /mnt/dir
# Write some data to our file and fsync it.
$ xfs_io -c "pwrite -S 0xab 0 64K" -c "fsync" /mnt/foo
# Add a new link to our file. Since the file was logged before, we
# update it in the log tree by calling btrfs_log_new_name().
$ ln /mnt/foo /mnt/dir/bar
# fsync the root directory - we expect it to persist the dentry for
# the new directory "dir".
$ xfs_io -c "fsync" /mnt
<power fail>
After mounting the fs the entry for directory "dir" does not exists,
despite the explicit fsync on the root directory.
Here's why this happens:
1) When we fsync the file we log the inode, so that it's present in the
log tree;
2) When adding the new link we enter btrfs_log_new_name(), and since the
inode is in the log tree we proceed to updating the inode in the log
tree;
3) We first set the inode's last_unlink_trans to the current transaction
(early in btrfs_log_new_name());
4) We then eventually enter btrfs_log_inode_parent(), and after logging
the file's inode, we call btrfs_log_all_parents() because the inode's
last_unlink_trans matches the current transaction's ID (updated in the
previous step);
5) So btrfs_log_all_parents() logs the root directory by calling
btrfs_log_inode() for the root's inode with a log mode of LOG_INODE_ALL
so that new dentries are logged;
6) At btrfs_log_inode(), because the log mode is LOG_INODE_ALL, we
update root inode's last_log_commit to the last transaction that
changed the inode (->last_sub_trans field of the inode), which
corresponds to the current transaction's ID;
7) Then later when user space explicitly calls fsync against the root
directory, we enter btrfs_sync_file(), which calls skip_inode_logging()
and that returns true, since its call to btrfs_inode_in_log() returns
true and there are no ordered extents (it's a directory, never has
ordered extents). This results in btrfs_sync_file() returning without
syncing the log or committing the current transaction, so all the
updates we did when logging the new name, including logging the root
directory, are not persisted.
So fix this by but updating the inode's last_log_commit if we are sure
we are not logging a new name (if ctx->logging_new_name is false).
A test case for fstests will follow soon.
Reported-by: Vyacheslav Kovalevsky <slava.kovalevskiy.2014@gmail.com>
Link: https://lore.kernel.org/linux-btrfs/03c5d7ec-5b3d-49d1-95bc-8970a7f82d87@gmail.com/
Fixes: 130341be7f ("btrfs: always update the logged transaction when logging new names")
CC: stable@vger.kernel.org # 6.1+
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5fea61aa1c upstream.
scrub_raid56_parity_stripe() allocates a bio with bio_alloc(), but
fails to release it on some error paths, leading to a potential
memory leak.
Add the missing bio_put() calls to properly drop the bio reference
in those error cases.
Fixes: 1009254bf2 ("btrfs: scrub: use scrub_stripe to implement RAID56 P/Q scrub")
CC: stable@vger.kernel.org # 6.6+
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Zilin Guan <zilin@seu.edu.cn>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6a1ab50135 upstream.
The stripe offset calculation in the zoned code for raid0 and raid10
wrongly uses map->stripe_size to calculate it. In fact, map->stripe_size is
the size of the device extent composing the block group, which always is
the zone_size on the zoned setup.
Fix it by using BTRFS_STRIPE_LEN and BTRFS_STRIPE_LEN_SHIFT. Also, optimize
the calculation a bit by doing the common calculation only once.
Fixes: c0d90a79e8 ("btrfs: zoned: fix alloc_offset calculation for partly conventional block groups")
CC: stable@vger.kernel.org # 6.17+
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 94f54924b9 upstream.
When a block group contains both conventional zone and sequential zone, the
capacity of the block group is wrongly set to the block group's full
length. The capacity should be calculated in btrfs_load_block_group_* using
the last allocation offset.
Fixes: 568220fa96 ("btrfs: zoned: support RAID0/1/10 on top of raid stripe tree")
CC: stable@vger.kernel.org # v6.12+
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 281326be67 upstream.
The current single-bit error injection mechanism flips bits directly in ECC RAM
by performing write and read operations. When the ECC RAM is actively used by
the Ethernet or USB controller, this approach sometimes trigger a false
double-bit error.
Switch both Ethernet and USB EDAC devices to use the INTTEST register
(altr_edac_a10_device_inject_fops) for single-bit error injection, similar to
the existing double-bit error injection method.
Fixes: 064acbd4f4 ("EDAC, altera: Add Stratix10 peripheral support")
Signed-off-by: Niravkumar L Rabara <niravkumarlaxmidas.rabara@altera.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Acked-by: Dinh Nguyen <dinguyen@kernel.org>
Cc: stable@vger.kernel.org
Link: https://patch.msgid.link/20251111081333.1279635-1-niravkumarlaxmidas.rabara@altera.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4e67526840 upstream.
Now we use virtual addresses to fill CSR_MERRENTRY/CSR_TLBRENTRY, but
hardware hope physical addresses. Now it works well because the high
bits are ignored above PA_BITS (48 bits), but explicitly use physical
addresses can avoid potential bugs. So fix it.
Cc: stable@vger.kernel.org
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ce5ad03e45 upstream.
Now there 5 places which calculate max_pfn & max_low_pfn:
1. in fdt_setup() for FDT systems;
2. in memblock_init() for ACPI systems;
3. in init_numa_memory() for NUMA systems;
4. in arch_mem_init() to recalculate for "mem=" cmdline;
5. in paging_init() to recalculate for NUMA systems.
Since memblock_init() is called both for ACPI and FDT systems, move the
calculation out of the for_each_efi_memory_desc() loop can eliminate the
first case. The last case is very questionable (may be derived from the
MIPS/Loongson code) and breaks the "mem=" cmdline, so should be removed.
And then the NUMA version of paging_init() can be also eliminated.
After consolidation there are 3 places of calculation:
1. in memblock_init() for both ACPI and FDT systems;
2. in init_numa_memory() to recalculate for NUMA systems;
3. in arch_mem_init() to recalculate for the "mem=" cmdline.
For all cases the calculation is:
max_pfn = PFN_DOWN(memblock_end_of_DRAM());
max_low_pfn = min(PFN_DOWN(HIGHMEM_START), max_pfn);
Cc: stable@vger.kernel.org
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 56b3c85e15 upstream.
When livepatch is attached to the same function as bpf trampoline with
a fexit program, bpf trampoline code calls register_ftrace_direct()
twice. The first time will fail with -EAGAIN, and the second time it
will succeed. This requires register_ftrace_direct() to unregister
the address on the first attempt. Otherwise, the bpf trampoline cannot
attach. Here is an easy way to reproduce this issue:
insmod samples/livepatch/livepatch-sample.ko
bpftrace -e 'fexit:cmdline_proc_show {}'
ERROR: Unable to attach probe: fexit:vmlinux:cmdline_proc_show...
Fix this by cleaning up the hash when register_ftrace_function_nolock hits
errors.
Also, move the code that resets ops->func and ops->trampoline to the error
path of register_ftrace_direct(); and add a helper function reset_direct()
in register_ftrace_direct() and unregister_ftrace_direct().
Fixes: d05cb47066 ("ftrace: Fix modification of direct_function hash while in use")
Cc: stable@vger.kernel.org # v6.6+
Reported-by: Andrey Grodzovsky <andrey.grodzovsky@crowdstrike.com>
Closes: https://lore.kernel.org/live-patching/c5058315a39d4615b333e485893345be@crowdstrike.com/
Cc: Steven Rostedt (Google) <rostedt@goodmis.org>
Cc: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Acked-and-tested-by: Andrey Grodzovsky <andrey.grodzovsky@crowdstrike.com>
Signed-off-by: Song Liu <song@kernel.org>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20251027175023.1521602-2-song@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit fdf302e6be upstream.
Starting with Rust 1.91.0 (released 2025-10-30), in upstream commit
ab91a63d403b ("Ignore intrinsic calls in cross-crate-inlining cost model")
[1][2], `bindings.o` stops containing DWARF debug information because the
`Default` implementations contained `write_bytes()` calls which are now
ignored in that cost model (note that `CLIPPY=1` does not reproduce it).
This means `gendwarfksyms` complains:
RUSTC L rust/bindings.o
error: gendwarfksyms: process_module: dwarf_get_units failed: no debugging information?
There are several alternatives that would work here: conditionally
skipping in the cases needed (but that is subtle and brittle), forcing
DWARF generation with e.g. a dummy `static` (ugly and we may need to
do it in several crates), skipping the call to the tool in the Kbuild
command when there are no exports (fine) or teaching the tool to do so
itself (simple and clean).
Thus do the last one: don't attempt to process files if we have no symbol
versions to calculate.
[ I used the commit log of my patch linked below since it explained the
root issue and expanded it a bit more to summarize the alternatives.
- Miguel ]
Cc: stable@vger.kernel.org # Needed in 6.17.y.
Reported-by: Haiyue Wang <haiyuewa@163.com>
Closes: https://lore.kernel.org/rust-for-linux/b8c1c73d-bf8b-4bf2-beb1-84ffdcd60547@163.com/
Suggested-by: Miguel Ojeda <ojeda@kernel.org>
Link: https://lore.kernel.org/rust-for-linux/CANiq72nKC5r24VHAp9oUPR1HVPqT+=0ab9N0w6GqTF-kJOeiSw@mail.gmail.com/
Link: ab91a63d40 [1]
Link: https://github.com/rust-lang/rust/pull/145910 [2]
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Tested-by: Haiyue Wang <haiyuewa@163.com>
Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Link: https://patch.msgid.link/20251110131913.1789896-1-ojeda@kernel.org
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3cd2018e15 upstream.
Since commit d24cfee7f6 ("spi: Fix acpi deferred irq probe"), the
acpi_dev_gpio_irq_get() call gets delayed till spi_probe() is called
on the SPI device.
If there is no driver for the SPI device then the move to spi_probe()
results in acpi_dev_gpio_irq_get() never getting called. This may
cause problems by leaving the GPIO pin floating because this call is
responsible for setting up the GPIO pin direction and/or bias according
to the values from the ACPI tables.
Re-add the removed acpi_dev_gpio_irq_get() in acpi_register_spi_device()
to ensure the GPIO pin is always correctly setup, while keeping the
acpi_dev_gpio_irq_get() call added to spi_probe() to deal with
-EPROBE_DEFER returns caused by the GPIO controller not having a driver
yet.
Link: https://bbs.archlinux.org/viewtopic.php?id=302348
Fixes: d24cfee7f6 ("spi: Fix acpi deferred irq probe")
Cc: stable@vger.kernel.org
Signed-off-by: Hans de Goede <hansg@kernel.org>
Link: https://patch.msgid.link/20251102190921.30068-1-hansg@kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 79280191c2 upstream.
cifs_pick_channel iterates candidate channels using cur. The
reconnect-state test mistakenly used a different variable.
This checked the wrong slot and would cause us to skip a healthy channel
and to dispatch on one that needs reconnect, occasionally failing
operations when a channel was down.
Fix by replacing for the correct variable.
Fixes: fc43a8ac39 ("cifs: cifs_pick_channel should try selecting active channels")
Cc: stable@vger.kernel.org
Reviewed-by: Shyam Prasad N <sprasad@microsoft.com>
Signed-off-by: Henrique Carvalho <henrique.carvalho@suse.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 59b0afd01b upstream.
The qm_get_qos_value() function calls bus_find_device_by_name() which
increases the device reference count, but fails to call put_device()
to balance the reference count and lead to a device reference leak.
Add put_device() calls in both the error path and success path to
properly balance the reference count.
Found via static analysis.
Fixes: 22d7a6c39c ("crypto: hisilicon/qm - add pci bdf number check")
Cc: stable@vger.kernel.org
Signed-off-by: Miaoqian Lin <linmq006@gmail.com>
Reviewed-by: Longfang Liu <liulongfang@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 00fbff75c5 upstream.
When crashkernel is configured with a high reservation, shrinking its
value below the low crashkernel reservation causes two issues:
1. Invalid crashkernel resource objects
2. Kernel crash if crashkernel shrinking is done twice
For example, with crashkernel=200M,high, the kernel reserves 200MB of high
memory and some default low memory (say 256MB). The reservation appears
as:
cat /proc/iomem | grep -i crash
af000000-beffffff : Crash kernel
433000000-43f7fffff : Crash kernel
If crashkernel is then shrunk to 50MB (echo 52428800 >
/sys/kernel/kexec_crash_size), /proc/iomem still shows 256MB reserved:
af000000-beffffff : Crash kernel
Instead, it should show 50MB:
af000000-b21fffff : Crash kernel
Further shrinking crashkernel to 40MB causes a kernel crash with the
following trace (x86):
BUG: kernel NULL pointer dereference, address: 0000000000000038
PGD 0 P4D 0
Oops: 0000 [#1] PREEMPT SMP NOPTI
<snip...>
Call Trace: <TASK>
? __die_body.cold+0x19/0x27
? page_fault_oops+0x15a/0x2f0
? search_module_extables+0x19/0x60
? search_bpf_extables+0x5f/0x80
? exc_page_fault+0x7e/0x180
? asm_exc_page_fault+0x26/0x30
? __release_resource+0xd/0xb0
release_resource+0x26/0x40
__crash_shrink_memory+0xe5/0x110
crash_shrink_memory+0x12a/0x190
kexec_crash_size_store+0x41/0x80
kernfs_fop_write_iter+0x141/0x1f0
vfs_write+0x294/0x460
ksys_write+0x6d/0xf0
<snip...>
This happens because __crash_shrink_memory()/kernel/crash_core.c
incorrectly updates the crashk_res resource object even when
crashk_low_res should be updated.
Fix this by ensuring the correct crashkernel resource object is updated
when shrinking crashkernel memory.
Link: https://lkml.kernel.org/r/20251101193741.289252-1-sourabhjain@linux.ibm.com
Fixes: 16c6006af4 ("kexec: enable kexec_crash_size to support two crash kernel regions")
Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com>
Acked-by: Baoquan He <bhe@redhat.com>
Cc: Zhen Lei <thunder.leizhen@huawei.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e8c73eb7db upstream.
The user calls fsconfig twice, but when the program exits, free() only
frees ctx->source for the second fsconfig, not the first.
Regarding fc->source, there is no code in the fs context related to its
memory reclamation.
To fix this memory leak, release the source memory corresponding to ctx
or fc before each parsing.
syzbot reported:
BUG: memory leak
unreferenced object 0xffff888128afa360 (size 96):
backtrace (crc 79c9c7ba):
kstrdup+0x3c/0x80 mm/util.c:84
smb3_fs_context_parse_param+0x229b/0x36c0 fs/smb/client/fs_context.c:1444
BUG: memory leak
unreferenced object 0xffff888112c7d900 (size 96):
backtrace (crc 79c9c7ba):
smb3_fs_context_fullpath+0x70/0x1b0 fs/smb/client/fs_context.c:629
smb3_fs_context_parse_param+0x2266/0x36c0 fs/smb/client/fs_context.c:1438
Reported-by: syzbot+72afd4c236e6bc3f4bac@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=72afd4c236e6bc3f4bac
Cc: stable@vger.kernel.org
Reviewed-by: Paulo Alcantara (Red Hat) <pc@manguebit.org>
Signed-off-by: Edward Adam Davis <eadavis@qq.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1a58d865f4 upstream.
The bus_find_device_by_name() function returns a device pointer with an
incremented reference count, but the original code was missing put_device()
calls in some return paths, leading to reference count leaks.
Fix this by ensuring put_device() is called before function exit after
bus_find_device_by_name() succeeds
This follows the same pattern used elsewhere in the kernel where
bus_find_device_by_name() is properly paired with put_device().
Found via static analysis and code review.
Fixes: 4f8ef33dd4 ("ASoC: soc_sdw_utils: skip the endpoint that doesn't present")
Cc: stable@vger.kernel.org
Signed-off-by: Miaoqian Lin <linmq006@gmail.com>
Link: https://patch.msgid.link/20251029071804.8425-1-linmq006@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 05a1fc5efd upstream.
The PCM stream data in USB-audio driver is transferred over USB URB
packet buffers, and each packet size is determined dynamically. The
packet sizes are limited by some factors such as wMaxPacketSize USB
descriptor. OTOH, in the current code, the actually used packet sizes
are determined only by the rate and the PPS, which may be bigger than
the size limit above. This results in a buffer overflow, as reported
by syzbot.
Basically when the limit is smaller than the calculated packet size,
it implies that something is wrong, most likely a weird USB
descriptor. So the best option would be just to return an error at
the parameter setup time before doing any further operations.
This patch introduces such a sanity check, and returns -EINVAL when
the packet size is greater than maxpacksize. The comparison with
ep->packsize[1] alone should suffice since it's always equal or
greater than ep->packsize[0].
Reported-by: syzbot+bfd77469c8966de076f7@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=bfd77469c8966de076f7
Link: https://lore.kernel.org/690b6b46.050a0220.3d0d33.0054.GAE@google.com
Cc: Lizhi Xu <lizhi.xu@windriver.com>
Cc: <stable@vger.kernel.org>
Link: https://patch.msgid.link/20251109091211.12739-1-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 82420bd4e1 upstream.
After restructuring and splitting the HDMI codec driver code, each
HDMI codec driver contains the own build_controls and build_pcms ops.
A copy-n-paste error put the wrong entries for nvhdmi-mcp driver; both
build_controls and build_pcms are swapped. Unfortunately both
callbacks have the very same form, and the compiler didn't complain
it, either. This resulted in a NULL dereference because the PCM
instance hasn't been initialized at calling the build_controls
callback.
Fix it by passing the proper entries.
Fixes: ad781b550f ("ALSA: hda/hdmi: Rewrite to new probe method")
Cc: <stable@vger.kernel.org>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=220743
Link: https://patch.msgid.link/20251106104647.25805-1-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1c2a936edd upstream.
Since commit 78524b05f1 ("mm, swap: avoid redundant swap device
pinning"), the common helper for allocating and preparing a folio in the
swap cache layer no longer tries to get a swap device reference
internally, because all callers of __read_swap_cache_async are already
holding a swap entry reference. The repeated swap device pinning isn't
needed on the same swap device.
Caller of VMA readahead is also holding a reference to the target entry's
swap device, but VMA readahead walks the page table, so it might encounter
swap entries from other devices, and call __read_swap_cache_async on
another device without holding a reference to it.
So it is possible to cause a UAF when swapoff of device A raced with
swapin on device B, and VMA readahead tries to read swap entries from
device A. It's not easy to trigger, but in theory, it could cause real
issues.
Make VMA readahead try to get the device reference first if the swap
device is a different one from the target entry.
Link: https://lkml.kernel.org/r/20251111-swap-fix-vma-uaf-v1-1-41c660e58562@tencent.com
Fixes: 78524b05f1 ("mm, swap: avoid redundant swap device pinning")
Suggested-by: Huang Ying <ying.huang@linux.alibaba.com>
Signed-off-by: Kairui Song <kasong@tencent.com>
Acked-by: Chris Li <chrisl@kernel.org>
Cc: Baoquan He <bhe@redhat.com>
Cc: Barry Song <baohua@kernel.org>
Cc: Kemeng Shi <shikemeng@huaweicloud.com>
Cc: Nhat Pham <nphamcs@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2f6ce7e714 upstream.
Patch series "mm/damon: fixes for the jiffies-related issues", v2.
On 32-bit systems, the kernel initializes jiffies to "-5 minutes" to make
jiffies wrap bugs appear earlier. However, this may cause the
time_before() series of functions to return unexpected values, resulting
in DAMON not functioning as intended. Meanwhile, similar issues exist in
some specific user operation scenarios.
This patchset addresses these issues. The first patch is about the
DAMON_STAT module, and the second patch is about the core layer's sysfs.
This patch (of 2):
In DAMON_STAT's damon_stat_damon_call_fn(), time_before_eq() is used to
avoid unnecessarily frequent stat update.
On 32-bit systems, the kernel initializes jiffies to "-5 minutes" to make
jiffies wrap bugs appear earlier. However, this causes time_before_eq()
in DAMON_STAT to unexpectedly return true during the first 5 minutes after
boot on 32-bit systems (see [1] for more explanation, which fixes another
jiffies-related issue before). As a result, DAMON_STAT does not update
any monitoring results during that period, which becomes more confusing
when DAMON_STAT_ENABLED_DEFAULT is enabled.
There is also an issue unrelated to the system's word size[2]: if the user
stops DAMON_STAT just after last_refresh_jiffies is updated and restarts
it after 5 seconds or a longer delay, last_refresh_jiffies will retain an
older value, causing time_before_eq() to return false and the update to
happen earlier than expected.
Fix these issues by making last_refresh_jiffies a global variable and
initializing it each time DAMON_STAT is started.
Link: https://lkml.kernel.org/r/20251030020746.967174-2-yanquanmin1@huawei.com
Link: https://lkml.kernel.org/r/20250822025057.1740854-1-ekffu200098@gmail.com [1]
Link: https://lore.kernel.org/all/20251028143250.50144-1-sj@kernel.org/ [2]
Fixes: fabdd1e911 ("mm/damon/stat: calculate and expose estimated memory bandwidth")
Signed-off-by: Quanmin Yan <yanquanmin1@huawei.com>
Suggested-by: SeongJae Park <sj@kernel.org>
Reviewed-by: SeongJae Park <sj@kernel.org>
Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: ze zuo <zuoze1@huawei.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0d6c356dd6 upstream.
When emitting the order of the allocation for a hash table,
alloc_large_system_hash() unconditionally subtracts PAGE_SHIFT from log
base 2 of the allocation size. This is not correct if the allocation size
is smaller than a page, and yields a negative value for the order as seen
below:
TCP established hash table entries: 32 (order: -4, 256 bytes, linear) TCP
bind hash table entries: 32 (order: -2, 1024 bytes, linear)
Use get_order() to compute the order when emitting the hash table
information to correctly handle cases where the allocation size is smaller
than a page:
TCP established hash table entries: 32 (order: 0, 256 bytes, linear) TCP
bind hash table entries: 32 (order: 0, 1024 bytes, linear)
Link: https://lkml.kernel.org/r/20251028191020.413002-1-isaacmanjarres@google.com
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Isaac J. Manjarres <isaacmanjarres@google.com>
Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Reviewed-by: David Hildenbrand <david@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 895b4c0c79 upstream.
Pde is erased from subdir rbtree through rb_erase(), but not set the node
to EMPTY, which may result in uaf access. We should use RB_CLEAR_NODE()
set the erased node to EMPTY, then pde_subdir_next() will return NULL to
avoid uaf access.
We found an uaf issue while using stress-ng testing, need to run testcase
getdent and tun in the same time. The steps of the issue is as follows:
1) use getdent to traverse dir /proc/pid/net/dev_snmp6/, and current
pde is tun3;
2) in the [time windows] unregister netdevice tun3 and tun2, and erase
them from rbtree. erase tun3 first, and then erase tun2. the
pde(tun2) will be released to slab;
3) continue to getdent process, then pde_subdir_next() will return
pde(tun2) which is released, it will case uaf access.
CPU 0 | CPU 1
-------------------------------------------------------------------------
traverse dir /proc/pid/net/dev_snmp6/ | unregister_netdevice(tun->dev) //tun3 tun2
sys_getdents64() |
iterate_dir() |
proc_readdir() |
proc_readdir_de() | snmp6_unregister_dev()
pde_get(de); | proc_remove()
read_unlock(&proc_subdir_lock); | remove_proc_subtree()
| write_lock(&proc_subdir_lock);
[time window] | rb_erase(&root->subdir_node, &parent->subdir);
| write_unlock(&proc_subdir_lock);
read_lock(&proc_subdir_lock); |
next = pde_subdir_next(de); |
pde_put(de); |
de = next; //UAF |
rbtree of dev_snmp6
|
pde(tun3)
/ \
NULL pde(tun2)
Link: https://lkml.kernel.org/r/20251025024233.158363-1-albin_yang@163.com
Signed-off-by: Wei Yang <albinwyang@tencent.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Christian Brauner <brauner@kernel.org>
Cc: wangzijie <wangzijie1@honor.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a9da90e618 upstream.
While connecting, the MAC address can already no longer be
changed. The change is already rejected if netif_carrier_ok(),
but of course that's not true yet while connecting. Check for
auth_data or assoc_data, so the MAC address cannot be changed.
Also more comprehensively check that there are no stations on
the interface being changed - if any peer station is added it
will know about our address already, so we cannot change it.
Cc: stable@vger.kernel.org
Fixes: 3c06e91b40 ("wifi: mac80211: Support POWERED_ADDR_CHANGE feature")
Link: https://patch.msgid.link/20251105154119.f9f6c1df81bb.I9bb3760ede650fb96588be0d09a5a7bdec21b217@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit dd4adb986a upstream.
The tracing selftest "event-filter-function.tc" was failing because it
first runs the "sample_events" function that triggers the kmem_cache_free
event and it looks at what function was used during a call to "ls".
But the first time it calls this, it could trigger events that are used to
pull pages into the page cache.
The rest of the test uses the function it finds during that call to see if
it will be called in subsequent "sample_events" calls. But if there's no
need to pull pages into the page cache, it will not trigger that function
and the test will fail.
Call the "sample_events" twice to trigger all the page cache work before
it calls it to find a function to use in subsequent checks.
Cc: stable@vger.kernel.org
Fixes: eb50d0f250 ("selftests/ftrace: Choose target function for filter test from samples")
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6f86d0534f upstream.
When a page fault occurs in a secret memory file created with
`memfd_secret(2)`, the kernel will allocate a new folio for it, mark the
underlying page as not-present in the direct map, and add it to the file
mapping.
If two tasks cause a fault in the same page concurrently, both could end
up allocating a folio and removing the page from the direct map, but only
one would succeed in adding the folio to the file mapping. The task that
failed undoes the effects of its attempt by (a) freeing the folio again
and (b) putting the page back into the direct map. However, by doing
these two operations in this order, the page becomes available to the
allocator again before it is placed back in the direct mapping.
If another task attempts to allocate the page between (a) and (b), and the
kernel tries to access it via the direct map, it would result in a
supervisor not-present page fault.
Fix the ordering to restore the direct map before the folio is freed.
Link: https://lkml.kernel.org/r/20251031120955.92116-1-lance.yang@linux.dev
Fixes: 1507f51255 ("mm: introduce memfd_secret system call to create "secret" memory areas")
Signed-off-by: Lance Yang <lance.yang@linux.dev>
Reported-by: Google Big Sleep <big-sleep-vuln-reports@google.com>
Closes: https://lore.kernel.org/linux-mm/CAEXGt5QeDpiHTu3K9tvjUTPqo+d-=wuCNYPa+6sWKrdQJ-ATdg@mail.gmail.com/
Acked-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 49c8d2c1f9 upstream.
commit efa95b01da ("netpoll: fix use after free") incorrectly
ignored the refcount and prematurely set dev->npinfo to NULL during
netpoll cleanup, leading to improper behavior and memory leaks.
Scenario causing lack of proper cleanup:
1) A netpoll is associated with a NIC (e.g., eth0) and netdev->npinfo is
allocated, and refcnt = 1
- Keep in mind that npinfo is shared among all netpoll instances. In
this case, there is just one.
2) Another netpoll is also associated with the same NIC and
npinfo->refcnt += 1.
- Now dev->npinfo->refcnt = 2;
- There is just one npinfo associated to the netdev.
3) When the first netpolls goes to clean up:
- The first cleanup succeeds and clears np->dev->npinfo, ignoring
refcnt.
- It basically calls `RCU_INIT_POINTER(np->dev->npinfo, NULL);`
- Set dev->npinfo = NULL, without proper cleanup
- No ->ndo_netpoll_cleanup() is either called
4) Now the second target tries to clean up
- The second cleanup fails because np->dev->npinfo is already NULL.
* In this case, ops->ndo_netpoll_cleanup() was never called, and
the skb pool is not cleaned as well (for the second netpoll
instance)
- This leaks npinfo and skbpool skbs, which is clearly reported by
kmemleak.
Revert commit efa95b01da ("netpoll: fix use after free") and adds
clarifying comments emphasizing that npinfo cleanup should only happen
once the refcount reaches zero, ensuring stable and correct netpoll
behavior.
Cc: <stable@vger.kernel.org> # 3.17.x
Cc: Jay Vosburgh <jv@jvosburgh.net>
Fixes: efa95b01da ("netpoll: fix use after free")
Signed-off-by: Breno Leitao <leitao@debian.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20251107-netconsole_torture-v10-1-749227b55f63@debian.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9a6b60cb14 upstream.
Because kthread_stop did not stop sc_task properly and returned -EINTR,
the sc_timer was not properly closed, ultimately causing the problem [1]
reported by syzbot when freeing sci due to the sc_timer not being closed.
Because the thread sc_task main function nilfs_segctor_thread() returns 0
when it succeeds, when the return value of kthread_stop() is not 0 in
nilfs_segctor_destroy(), we believe that it has not properly closed
sc_timer.
We use timer_shutdown_sync() to sync wait for sc_timer to shutdown, and
set the value of sc_task to NULL under the protection of lock
sc_state_lock, so as to avoid the issue caused by sc_timer not being
properly shutdowned.
[1]
ODEBUG: free active (active state 0) object: 00000000dacb411a object type: timer_list hint: nilfs_construction_timeout
Call trace:
nilfs_segctor_destroy fs/nilfs2/segment.c:2811 [inline]
nilfs_detach_log_writer+0x668/0x8cc fs/nilfs2/segment.c:2877
nilfs_put_super+0x4c/0x12c fs/nilfs2/super.c:509
Link: https://lkml.kernel.org/r/20251029225226.16044-1-konishi.ryusuke@gmail.com
Fixes: 3f66cc261c ("nilfs2: use kthread_create and kthread_stop for the log writer thread")
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com>
Reported-by: syzbot+24d8b70f039151f65590@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=24d8b70f039151f65590
Tested-by: syzbot+24d8b70f039151f65590@syzkaller.appspotmail.com
Signed-off-by: Edward Adam Davis <eadavis@qq.com>
Cc: <stable@vger.kernel.org> [6.12+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9fd7bb5083 upstream.
In DAMON's damon_sysfs_repeat_call_fn(), time_before() is used to compare
the current jiffies with next_update_jiffies to determine whether to
update the sysfs files at this moment.
On 32-bit systems, the kernel initializes jiffies to "-5 minutes" to make
jiffies wrap bugs appear earlier. However, this causes time_before() in
damon_sysfs_repeat_call_fn() to unexpectedly return true during the first
5 minutes after boot on 32-bit systems (see [1] for more explanation,
which fixes another jiffies-related issue before). As a result, DAMON
does not update sysfs files during that period.
There is also an issue unrelated to the system's word size[2]: if the
user stops DAMON just after next_update_jiffies is updated and restarts
it after 'refresh_ms' or a longer delay, next_update_jiffies will retain
an older value, causing time_before() to return false and the update to
happen earlier than expected.
Fix these issues by making next_update_jiffies a global variable and
initializing it each time DAMON is started.
Link: https://lkml.kernel.org/r/20251030020746.967174-3-yanquanmin1@huawei.com
Link: https://lkml.kernel.org/r/20250822025057.1740854-1-ekffu200098@gmail.com [1]
Link: https://lore.kernel.org/all/20251029013038.66625-1-sj@kernel.org/ [2]
Fixes: d809a7c64b ("mm/damon/sysfs: implement refresh_ms file internal work")
Suggested-by: SeongJae Park <sj@kernel.org>
Reviewed-by: SeongJae Park <sj@kernel.org>
Signed-off-by: Quanmin Yan <yanquanmin1@huawei.com>
Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: ze zuo <zuoze1@huawei.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ac1499fcd4 upstream.
The sit driver's packet transmission path calls: sit_tunnel_xmit() ->
update_or_create_fnhe(), which lead to fnhe_remove_oldest() being called
to delete entries exceeding FNHE_RECLAIM_DEPTH+random.
The race window is between fnhe_remove_oldest() selecting fnheX for
deletion and the subsequent kfree_rcu(). During this time, the
concurrent path's __mkroute_output() -> find_exception() can fetch the
soon-to-be-deleted fnheX, and rt_bind_exception() then binds it with a
new dst using a dst_hold(). When the original fnheX is freed via RCU,
the dst reference remains permanently leaked.
CPU 0 CPU 1
__mkroute_output()
find_exception() [fnheX]
update_or_create_fnhe()
fnhe_remove_oldest() [fnheX]
rt_bind_exception() [bind dst]
RCU callback [fnheX freed, dst leak]
This issue manifests as a device reference count leak and a warning in
dmesg when unregistering the net device:
unregister_netdevice: waiting for sitX to become free. Usage count = N
Ido Schimmel provided the simple test validation method [1].
The fix clears 'oldest->fnhe_daddr' before calling fnhe_flush_routes().
Since rt_bind_exception() checks this field, setting it to zero prevents
the stale fnhe from being reused and bound to a new dst just before it
is freed.
[1]
ip netns add ns1
ip -n ns1 link set dev lo up
ip -n ns1 address add 192.0.2.1/32 dev lo
ip -n ns1 link add name dummy1 up type dummy
ip -n ns1 route add 192.0.2.2/32 dev dummy1
ip -n ns1 link add name gretap1 up arp off type gretap \
local 192.0.2.1 remote 192.0.2.2
ip -n ns1 route add 198.51.0.0/16 dev gretap1
taskset -c 0 ip netns exec ns1 mausezahn gretap1 \
-A 198.51.100.1 -B 198.51.0.0/16 -t udp -p 1000 -c 0 -q &
taskset -c 2 ip netns exec ns1 mausezahn gretap1 \
-A 198.51.100.1 -B 198.51.0.0/16 -t udp -p 1000 -c 0 -q &
sleep 10
ip netns pids ns1 | xargs kill
ip netns del ns1
Cc: stable@vger.kernel.org
Fixes: 67d6d681e1 ("ipv4: make exception cache less predictible")
Signed-off-by: Chuang Wang <nashuiliang@gmail.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20251111064328.24440-1-nashuiliang@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a073d637c8 upstream.
Now if the PTE/PMD is dirty with _PAGE_DIRTY but without _PAGE_MODIFIED,
after {pte,pmd}_modify() we lose _PAGE_DIRTY, then {pte,pmd}_dirty()
return false and lead to data loss. This can happen in certain scenarios
such as HW PTW doesn't set _PAGE_MODIFIED automatically, so here we need
_PAGE_MODIFIED to record the dirty status (_PAGE_DIRTY).
The new modification involves checking whether the original PTE/PMD has
the _PAGE_DIRTY flag. If it exists, the _PAGE_MODIFIED bit is also set,
ensuring that the {pte,pmd}_dirty() interface can always return accurate
information.
Cc: stable@vger.kernel.org
Co-developed-by: Liupu Wang <wangliupu@loongson.cn>
Signed-off-by: Liupu Wang <wangliupu@loongson.cn>
Signed-off-by: Tianyang Zhang <zhangtianyang@loongson.cn>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit eeeeaafa62 upstream.
CSR.FWPC and CSR.MWPC are 32bit registers, so use csr_read32() rather
than csr_read64() to read the values of FWPC/MWPC.
Cc: stable@vger.kernel.org
Fixes: edffa33c7b ("LoongArch: Add hardware breakpoints/watchpoints support")
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 43a9e6a10b upstream.
1. Use phys_addr_t instead of u64, which can work for both 32/64 bits.
2. Check whether the input physical address is above TO_PHYS_MASK (and
return NULL if yes) for the DMW version.
Note: In theory early_ioremap() also need the TO_PHYS_MASK checking, but
the UEFI BIOS pass some DMW virtual addresses.
Cc: stable@vger.kernel.org
Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 91a5409002 upstream.
maple_tree tracepoints contain pointers to function names. Such a pointer
is saved when a tracepoint logs an event. There's no guarantee that it's
still valid when the event is parsed later and the pointer is dereferenced.
The kernel warns about these unsafe pointers.
event 'ma_read' has unsafe pointer field 'fn'
WARNING: kernel/trace/trace.c:3779 at ignore_event+0x1da/0x1e4
Mark the function names as tracepoint_string() to fix the events.
One case that doesn't work without my patch would be trace-cmd record
to save the binary ringbuffer and trace-cmd report to parse it in
userspace. The address of __func__ can't be dereferenced from
userspace but tracepoint_string will add an entry to
/sys/kernel/tracing/printk_formats
Link: https://lkml.kernel.org/r/20251030155537.87972-1-martin@kaiser.cx
Fixes: 54a611b605 ("Maple Tree: add new data structure")
Signed-off-by: Martin Kaiser <martin@kaiser.cx>
Acked-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4da4e4bde1 upstream.
The `len` member of the sk_buff is an unsigned int. This is cast to
`ssize_t` (a signed type) for the first sk_buff in the comparison,
but not the second sk_buff. On 32-bit systems, this can result in
an integer underflow for certain values because unsigned arithmetic
is being used.
This appears to be an oversight: if the intention was to use unsigned
arithmetic, then the first cast would have been omitted. The change
ensures both len values are cast to `ssize_t`.
The underflow causes an issue with ktls when multiple TLS PDUs are
included in a single TCP segment. The mainline kernel does not use
strparser for ktls anymore, but this is still useful for other
features that still use strparser, and for backporting.
Signed-off-by: Nate Karstens <nate.karstens@garmin.com>
Cc: stable@vger.kernel.org
Fixes: 43a0c6751a ("strparser: Stream parser for messages")
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Sabrina Dubroca <sd@queasysnail.net>
Link: https://patch.msgid.link/20251106222835.1871628-1-nate.karstens@garmin.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f5548c318d upstream.
Currently, scan_get_next_rmap_item() walks every page address in a VMA to
locate mergeable pages. This becomes highly inefficient when scanning
large virtual memory areas that contain mostly unmapped regions, causing
ksmd to use large amount of cpu without deduplicating much pages.
This patch replaces the per-address lookup with a range walk using
walk_page_range(). The range walker allows KSM to skip over entire
unmapped holes in a VMA, avoiding unnecessary lookups. This problem was
previously discussed in [1].
Consider the following test program which creates a 32 TiB mapping in the
virtual address space but only populates a single page:
#include <unistd.h>
#include <stdio.h>
#include <sys/mman.h>
/* 32 TiB */
const size_t size = 32ul * 1024 * 1024 * 1024 * 1024;
int main() {
char *area = mmap(NULL, size, PROT_READ | PROT_WRITE,
MAP_NORESERVE | MAP_PRIVATE | MAP_ANON, -1, 0);
if (area == MAP_FAILED) {
perror("mmap() failed\n");
return -1;
}
/* Populate a single page such that we get an anon_vma. */
*area = 0;
/* Enable KSM. */
madvise(area, size, MADV_MERGEABLE);
pause();
return 0;
}
$ ./ksm-sparse &
$ echo 1 > /sys/kernel/mm/ksm/run
Without this patch ksmd uses 100% of the cpu for a long time (more then 1
hour in my test machine) scanning all the 32 TiB virtual address space
that contain only one mapped page. This makes ksmd essentially deadlocked
not able to deduplicate anything of value. With this patch ksmd walks
only the one mapped page and skips the rest of the 32 TiB virtual address
space, making the scan fast using little cpu.
Link: https://lkml.kernel.org/r/20251023035841.41406-1-pedrodemargomes@gmail.com
Link: https://lkml.kernel.org/r/20251022153059.22763-1-pedrodemargomes@gmail.com
Link: https://lore.kernel.org/linux-mm/423de7a3-1c62-4e72-8e79-19a6413e420c@redhat.com/ [1]
Fixes: 31dbd01f31 ("ksm: Kernel SamePage Merging")
Signed-off-by: Pedro Demarchi Gomes <pedrodemargomes@gmail.com>
Co-developed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Reported-by: craftfever <craftfever@airmail.cc>
Closes: https://lkml.kernel.org/r/020cf8de6e773bb78ba7614ef250129f11a63781@murena.io
Suggested-by: David Hildenbrand <david@redhat.com>
Acked-by: David Hildenbrand <david@redhat.com>
Cc: Chengming Zhou <chengming.zhou@linux.dev>
Cc: xu xin <xu.xin16@zte.com.cn>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 98a5fd31cb upstream.
When the per-IP connection limit is exceeded in ksmbd_kthread_fn(),
the code sets ret = -EAGAIN and continues the accept loop without
closing the just-accepted socket. That leaks one socket per rejected
attempt from a single IP and enables a trivial remote DoS.
Release client_sk before continuing.
This bug was found with ZeroPath.
Cc: stable@vger.kernel.org
Signed-off-by: Joshua Rogers <linux@joshua.hu>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8a7348a9ed upstream.
nfsd exports a "pseudo root filesystem" which is used by NFSv4 to find
the various exported filesystems using LOOKUP requests from a known root
filehandle. NFSv3 uses the MOUNT protocol to find those exported
filesystems and so is not given access to the pseudo root filesystem.
If a v3 (or v2) client uses a filehandle from that filesystem,
nfsd_set_fh_dentry() will report an error, but still stores the export
in "struct svc_fh" even though it also drops the reference (exp_put()).
This means that when fh_put() is called an extra reference will be dropped
which can lead to use-after-free and possible denial of service.
Normal NFS usage will not provide a pseudo-root filehandle to a v3
client. This bug can only be triggered by the client synthesising an
incorrect filehandle.
To fix this we move the assignments to the svc_fh later, after all
possible error cases have been detected.
Reported-and-tested-by: tianshuo han <hantianshuo233@gmail.com>
Fixes: ef7f6c4904 ("nfsd: move V4ROOT version check to nfsd_set_fh_dentry()")
Signed-off-by: NeilBrown <neil@brown.name>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Cc: stable@vger.kernel.org
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8a4821412c upstream.
The current scheme for handling LBRV when nested is used is very
complicated, especially when L1 does not enable LBRV (i.e. does not set
LBR_CTL_ENABLE_MASK).
To avoid copying LBRs between VMCB01 and VMCB02 on every nested
transition, the current implementation switches between using VMCB01 or
VMCB02 as the source of truth for the LBRs while L2 is running. If L2
enables LBR, VMCB02 is used as the source of truth. When L2 disables
LBR, the LBRs are copied to VMCB01 and VMCB01 is used as the source of
truth. This introduces significant complexity, and incorrect behavior in
some cases.
For example, on a nested #VMEXIT, the LBRs are only copied from VMCB02
to VMCB01 if LBRV is enabled in VMCB01. This is because L2's writes to
MSR_IA32_DEBUGCTLMSR to enable LBR are intercepted and propagated to
VMCB01 instead of VMCB02. However, LBRV is only enabled in VMCB02 when
L2 is running.
This means that if L2 enables LBR and exits to L1, the LBRs will not be
propagated from VMCB02 to VMCB01, because LBRV is disabled in VMCB01.
There is no meaningful difference in CPUID rate in L2 when copying LBRs
on every nested transition vs. the current approach, so do the simple
and correct thing and always copy LBRs between VMCB01 and VMCB02 on
nested transitions (when LBRV is disabled by L1). Drop the conditional
LBRs copying in __svm_{enable/disable}_lbrv() as it is now unnecessary.
VMCB02 becomes the only source of truth for LBRs when L2 is running,
regardless of LBRV being enabled by L1, drop svm_get_lbr_vmcb() and use
svm->vmcb directly in its place.
Fixes: 1d5a1b5860 ("KVM: x86: nSVM: correctly virtualize LBR msrs when L2 is running")
Cc: stable@vger.kernel.org
Signed-off-by: Yosry Ahmed <yosry.ahmed@linux.dev>
Link: https://patch.msgid.link/20251108004524.1600006-4-yosry.ahmed@linux.dev
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit fbe5e5f030 upstream.
svm_update_lbrv() is called when MSR_IA32_DEBUGCTLMSR is updated, and on
nested transitions where LBRV is used. It checks whether LBRV enablement
needs to be changed in the current VMCB, and if it does, it also
recalculate intercepts to LBR MSRs.
However, there are cases where intercepts need to be updated even when
LBRV enablement doesn't. Example scenario:
- L1 has MSR_IA32_DEBUGCTLMSR cleared.
- L1 runs L2 without LBR_CTL_ENABLE (no LBRV).
- L2 sets DEBUGCTLMSR_LBR in MSR_IA32_DEBUGCTLMSR, svm_update_lbrv()
sets LBR_CTL_ENABLE in VMCB02 and disables intercepts to LBR MSRs.
- L2 exits to L1, svm_update_lbrv() is not called on this transition.
- L1 clears MSR_IA32_DEBUGCTLMSR, svm_update_lbrv() finds that
LBR_CTL_ENABLE is already cleared in VMCB01 and does nothing.
- Intercepts remain disabled, L1 reads to LBR MSRs read the host MSRs.
Fix it by always recalculating intercepts in svm_update_lbrv().
Fixes: 1d5a1b5860 ("KVM: x86: nSVM: correctly virtualize LBR msrs when L2 is running")
Cc: stable@vger.kernel.org
Signed-off-by: Yosry Ahmed <yosry.ahmed@linux.dev>
Link: https://patch.msgid.link/20251108004524.1600006-3-yosry.ahmed@linux.dev
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit dc55b3c3f6 upstream.
The APM lists the DbgCtlMsr field as being tracked by the VMCB_LBR clean
bit. Always clear the bit when MSR_IA32_DEBUGCTLMSR is updated.
The history is complicated, it was correctly cleared for L1 before
commit 1d5a1b5860 ("KVM: x86: nSVM: correctly virtualize LBR msrs when
L2 is running"). At that point svm_set_msr() started to rely on
svm_update_lbrv() to clear the bit, but when nested virtualization
is enabled the latter does not always clear it even if MSR_IA32_DEBUGCTLMSR
changed. Go back to clearing it directly in svm_set_msr().
Fixes: 1d5a1b5860 ("KVM: x86: nSVM: correctly virtualize LBR msrs when L2 is running")
Reported-by: Matteo Rizzo <matteorizzo@google.com>
Reported-by: evn@google.com
Co-developed-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Yosry Ahmed <yosry.ahmed@linux.dev>
Link: https://patch.msgid.link/20251108004524.1600006-2-yosry.ahmed@linux.dev
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3f9eacf4f0 upstream.
32bit ID registers aren't getting much love these days, and are
often missed in updates. One of these updates broke restoring
a GICv2 guest on a GICv3 machine.
Instead of performing a piecemeal fix, just bite the bullet
and make all 32bit ID regs fully writable. KVM itself never
relies on them for anything, and if the VMM wants to mess up
the guest, so be it.
Fixes: 5cb57a1aff ("KVM: arm64: Zero ID_AA64PFR0_EL1.GIC when no GICv3 is presented to the guest")
Reported-by: Peter Maydell <peter.maydell@linaro.org>
Cc: stable@vger.kernel.org
Reviewed-by: Oliver Upton <oupton@kernel.org>
Link: https://patch.msgid.link/20251030122707.2033690-2-maz@kernel.org
Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ae431059e7 upstream.
When unbinding a memslot from a guest_memfd instance, remove the bindings
even if the guest_memfd file is dying, i.e. even if its file refcount has
gone to zero. If the memslot is freed before the file is fully released,
nullifying the memslot side of the binding in kvm_gmem_release() will
write to freed memory, as detected by syzbot+KASAN:
==================================================================
BUG: KASAN: slab-use-after-free in kvm_gmem_release+0x176/0x440 virt/kvm/guest_memfd.c:353
Write of size 8 at addr ffff88807befa508 by task syz.0.17/6022
CPU: 0 UID: 0 PID: 6022 Comm: syz.0.17 Not tainted syzkaller #0 PREEMPT(full)
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/02/2025
Call Trace:
<TASK>
dump_stack_lvl+0x189/0x250 lib/dump_stack.c:120
print_address_description mm/kasan/report.c:378 [inline]
print_report+0xca/0x240 mm/kasan/report.c:482
kasan_report+0x118/0x150 mm/kasan/report.c:595
kvm_gmem_release+0x176/0x440 virt/kvm/guest_memfd.c:353
__fput+0x44c/0xa70 fs/file_table.c:468
task_work_run+0x1d4/0x260 kernel/task_work.c:227
resume_user_mode_work include/linux/resume_user_mode.h:50 [inline]
exit_to_user_mode_loop+0xe9/0x130 kernel/entry/common.c:43
exit_to_user_mode_prepare include/linux/irq-entry-common.h:225 [inline]
syscall_exit_to_user_mode_work include/linux/entry-common.h:175 [inline]
syscall_exit_to_user_mode include/linux/entry-common.h:210 [inline]
do_syscall_64+0x2bd/0xfa0 arch/x86/entry/syscall_64.c:100
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7fbeeff8efc9
</TASK>
Allocated by task 6023:
kasan_save_stack mm/kasan/common.c:56 [inline]
kasan_save_track+0x3e/0x80 mm/kasan/common.c:77
poison_kmalloc_redzone mm/kasan/common.c:397 [inline]
__kasan_kmalloc+0x93/0xb0 mm/kasan/common.c:414
kasan_kmalloc include/linux/kasan.h:262 [inline]
__kmalloc_cache_noprof+0x3e2/0x700 mm/slub.c:5758
kmalloc_noprof include/linux/slab.h:957 [inline]
kzalloc_noprof include/linux/slab.h:1094 [inline]
kvm_set_memory_region+0x747/0xb90 virt/kvm/kvm_main.c:2104
kvm_vm_ioctl_set_memory_region+0x6f/0xd0 virt/kvm/kvm_main.c:2154
kvm_vm_ioctl+0x957/0xc60 virt/kvm/kvm_main.c:5201
vfs_ioctl fs/ioctl.c:51 [inline]
__do_sys_ioctl fs/ioctl.c:597 [inline]
__se_sys_ioctl+0xfc/0x170 fs/ioctl.c:583
do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
do_syscall_64+0xfa/0xfa0 arch/x86/entry/syscall_64.c:94
entry_SYSCALL_64_after_hwframe+0x77/0x7f
Freed by task 6023:
kasan_save_stack mm/kasan/common.c:56 [inline]
kasan_save_track+0x3e/0x80 mm/kasan/common.c:77
kasan_save_free_info+0x46/0x50 mm/kasan/generic.c:584
poison_slab_object mm/kasan/common.c:252 [inline]
__kasan_slab_free+0x5c/0x80 mm/kasan/common.c:284
kasan_slab_free include/linux/kasan.h:234 [inline]
slab_free_hook mm/slub.c:2533 [inline]
slab_free mm/slub.c:6622 [inline]
kfree+0x19a/0x6d0 mm/slub.c:6829
kvm_set_memory_region+0x9c4/0xb90 virt/kvm/kvm_main.c:2130
kvm_vm_ioctl_set_memory_region+0x6f/0xd0 virt/kvm/kvm_main.c:2154
kvm_vm_ioctl+0x957/0xc60 virt/kvm/kvm_main.c:5201
vfs_ioctl fs/ioctl.c:51 [inline]
__do_sys_ioctl fs/ioctl.c:597 [inline]
__se_sys_ioctl+0xfc/0x170 fs/ioctl.c:583
do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
do_syscall_64+0xfa/0xfa0 arch/x86/entry/syscall_64.c:94
entry_SYSCALL_64_after_hwframe+0x77/0x7f
Deliberately don't acquire filemap invalid lock when the file is dying as
the lifecycle of f_mapping is outside the purview of KVM. Dereferencing
the mapping is *probably* fine, but there's no need to invalidate anything
as memslot deletion is responsible for zapping SPTEs, and the only code
that can access the dying file is kvm_gmem_release(), whose core code is
mutually exclusive with unbinding.
Note, the mutual exclusivity is also what makes it safe to access the
bindings on a dying gmem instance. Unbinding either runs with slots_lock
held, or after the last reference to the owning "struct kvm" is put, and
kvm_gmem_release() nullifies the slot pointer under slots_lock, and puts
its reference to the VM after that is done.
Reported-by: syzbot+2479e53d0db9b32ae2aa@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/68fa7a22.a70a0220.3bf6c6.008b.GAE@google.com
Tested-by: syzbot+2479e53d0db9b32ae2aa@syzkaller.appspotmail.com
Fixes: a7800aa80e ("KVM: Add KVM_CREATE_GUEST_MEMFD ioctl() for guest-specific backing memory")
Cc: stable@vger.kernel.org
Cc: Hillf Danton <hdanton@sina.com>
Reviewed-By: Vishal Annapurve <vannapurve@google.com>
Link: https://patch.msgid.link/20251104011205.3853541-1-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 237e74bfa2 upstream.
VM fails to boot with 256 vCPUs, the detailed command is
qemu-system-loongarch64 -smp 256
and there is an error reported as follows:
KVM_LOONGARCH_EXTIOI_INIT_NUM_CPU failed: Invalid argument
There is typo issue in function kvm_eiointc_ctrl_access() when set
max supported vCPUs.
Cc: stable@vger.kernel.org
Fixes: 47256c4c8b ("LoongArch: KVM: Avoid copy_*_user() with lock hold in kvm_eiointc_ctrl_access()")
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d3c9515e4f upstream.
When timer is fired in oneshot mode, CSR.TVAL will stop with value -1
rather than 0. However when the register CSR.TVAL is restored, it will
continue to count down rather than stop there.
Now the method is to write 0 to CSR.TVAL, wait to count down for 1 cycle
at least, which is 10ns with a timer freq 100MHz, and then retore timer
interrupt status. Here add 2 cycles delay to assure that timer interrupt
is injected.
With this patch, timer selftest case passes to run always.
Cc: stable@vger.kernel.org
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5001bcf86e upstream.
On LoongArch system, guest PMU hardware is shared by guest and host but
PMU interrupt is separated. PMU is pass-through to VM, and there is PMU
context switch when exit to host and return to guest.
There is optimiation to check whether PMU is enabled by guest. If not,
it is not necessary to return to guest. However, if it is enabled, PMU
context for guest need switch on. Now KVM_REQ_PMU notification is set
on vCPU context switch, but it is missing if there is no vCPU context
switch while PMU is used by guest VM, so fix it.
Cc: <stable@vger.kernel.org>
Fixes: f4e40ea9f7 ("LoongArch: KVM: Add PMU support for guest")
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit a78eb69d60 ]
In uclogic_params_ugee_v2_init_event_hooks(), the memory allocated for
event_hook is not freed in the next error path. Fix that by freeing it.
Fixes: a251d6576d ("HID: uclogic: Handle wireless device reconnection")
Signed-off-by: Abdun Nihaal <nihaal@cse.iitm.ac.in>
Signed-off-by: Jiri Kosina <jkosina@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 53f731f5bb ]
Use a scope-based cleanup helper for the buffer allocated with kmalloc()
in ntrig_report_version() to simplify the cleanup logic and prevent
memory leaks (specifically the !hid_is_usb()-case one).
[jkosina@suse.com: elaborate on the actual existing leak]
Fixes: 185c926283 ("HID: hid-ntrig: fix unable to handle page fault in ntrig_report_version()")
Signed-off-by: Masami Ichikawa <masami256@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 6504297872 ]
The VBUS supply regulator is currently assigned to the PHY node.
This causes the VBUS to be always on, even when the controller
needs to be switched to peripheral mode.
Fix the OTG role switching by adding a connector node and moving
the VBUS supply regulator to that node. This way the VBUS gets
correctly switched according to the current role.
Fixes: 946ab10e3f ("arm64: dts: Add support for Kontron OSM-S i.MX8MP SoM and BL carrier board")
Signed-off-by: Frieder Schrempf <frieder.schrempf@kontron.de>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit ec4daace64 ]
The gpio0_mipi_csi DT nodes are enabled by default, but they are
dependent on the irqsteer_csi nodes, which are not enabled. This causes
the gpio0_mipi_csi GPIOs to be probe deferred. Since these GPIOs can be
used independently of the CSI controller, enable irqsteer_csi by default
too to prevent them from being deferred and to ensure they work out of
the box.
Fixes: 2217f82437 ("arm64: dts: imx8: add capture controller for i.MX8's img subsystem")
Signed-off-by: João Paulo Gonçalves <joao.goncalves@toradex.com>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f31e261712 ]
Rename the 'ssi2' and 'aud3' nodes to 'mux-ssi2' and 'mux-aud3' in the
audmux configuration of imx51-zii-rdu1.dts to comply with the naming
convention in imx-audmux.yaml.
This fixes the following dt-schema warning:
imx51-zii-rdu1.dtb: audmux@83fd0000 (fsl,imx51-audmux): 'aud3', 'ssi2'
do not match any of the regexes: '^mux-[0-9a-z]*$', '^pinctrl-[0-9]+$'
Fixes: ceef0396f3 ("ARM: dts: imx: add ZII RDU1 board")
Signed-off-by: Jihed Chaibi <jihed.chaibi.dev@gmail.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 62bf7708fe ]
The 'report-rate-hz' property for the edt-ft5x06 driver was added and
handled in the Linux kernel by me with patches [1] and [2] for this
specific board.
The v1 upstream version, which was the one applied to the customer's
kernel, used the 'report-rate' property, which was written directly to
the controller register. During review, the 'hz' suffix was added,
changing its handling so that writing the value directly to the register
was no longer possible for the M06 controller.
Once the patches were accepted in mainline, I did not reapply them to
the customer's kernel, and when upstreaming the DTS for this board, I
forgot to correct the 'report-rate-hz' property value.
The property must be set to 60 because this board uses the M06 controller,
which expects the report rate in units of 10 Hz, meaning the actual value
written to the register is 6.
[1] 625f829586 ("dt-bindings: input: touchscreen: edt-ft5x06: add report-rate-hz")
[2] 5bcee83a40 ("Input: edt-ft5x06 - set report rate by dts property")
Fixes: ffea3cac94 ("ARM: dts: imx6ul: support Engicam MicroGEA RMM board")
Co-developed-by: Michael Trimarchi <michael@amarulasolutions.com>
Signed-off-by: Michael Trimarchi <michael@amarulasolutions.com>
Signed-off-by: Dario Binacchi <dario.binacchi@amarulasolutions.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b3fd04e23f ]
Unify the naming of the existing GPU OPP table nodes found in the RK3588
and RK3588J SoC dtsi files with the other SoC's GPU OPP nodes, following
the more "modern" node naming scheme.
Fixes: a7b2070505 ("arm64: dts: rockchip: Split GPU OPPs of RK3588 and RK3588j")
Signed-off-by: Dragan Simic <dsimic@manjaro.org>
[opp-table also is way too generic on systems with like 4-5 opp-tables]
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 3c723f4497 ]
Couple of independent fixes:
1. Wire in SIGSEGV handler that terminates the test with a failure code.
2. Use "--lock-cgroup" instead of "-g"; "-g" was proposed but never
merged. See commit 4d1792d0a2 ("perf lock contention: Add
--lock-cgroup option")
3. Call cleanup() on every normal exit so trap_cleanup() doesn't mistake
it for an unexpected signal and emit a false-negative "Unexpected
signal in main" message.
Before patch:
# ./perf test -vv "lock contention"
85: kernel lock contention analysis test:
--- start ---
test child forked, pid 610711
Testing perf lock record and perf lock contention
Testing perf lock contention --use-bpf
Testing perf lock record and perf lock contention at the same time
Testing perf lock contention --threads
Testing perf lock contention --lock-addr
Testing perf lock contention --lock-cgroup
Unexpected signal in test_aggr_cgroup
---- end(0) ----
85: kernel lock contention analysis test : Ok
After patch:
# ./perf test -vv "lock contention"
85: kernel lock contention analysis test:
--- start ---
test child forked, pid 602637
Testing perf lock record and perf lock contention
Testing perf lock contention --use-bpf
Testing perf lock record and perf lock contention at the same time
Testing perf lock contention --threads
Testing perf lock contention --lock-addr
Testing perf lock contention --lock-cgroup
Testing perf lock contention --type-filter (w/ spinlock)
Testing perf lock contention --lock-filter (w/ tasklist_lock)
Testing perf lock contention --callstack-filter (w/ unix_stream)
[Skip] Could not find 'unix_stream'
Testing perf lock contention --callstack-filter with task aggregation
[Skip] Could not find 'unix_stream'
Testing perf lock contention --cgroup-filter
Testing perf lock contention CSV output
---- end(0) ----
85: kernel lock contention analysis test : Ok
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ananth Narayan <ananth.narayan@amd.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Santosh Shukla <santosh.shukla@amd.com>
Cc: Tycho Andersen <tycho@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit a09e5967ad ]
This is one more remnant of the BUILD_NONDISTRO series to make building
with binutils-devel opt-in due to license incompatibility.
In this case just the references at link time were still in place, which
make building the test-all.bin file fail, which wasn't detected before
probably because the last test was done with binutils-devel available,
doh.
Now:
$ rpm -q binutils-devel
package binutils-devel is not installed
$ file /tmp/build/perf-tools/feature/test-all.bin
/tmp/build/perf-tools/feature/test-all.bin: ELF 64-bit LSB executable, x86-64, version 1 (SYSV),
dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2,
BuildID[sha1]=4b5388a346b51f1b993f0b0dbd49f4570769b03c, for GNU/Linux 3.2.0, not stripped
$
Fixes: 970ae86307 ("perf build: The bfd features are opt-in, stop testing for them by default")
Reviewed-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 85c894a80a ]
With commit f0d0f978f3 ("perf header: Don't write empty BPF/BTF
info"), the write_bpf_( prog_info() | btf() ) functions exit without
writing anything if env->bpf_prog.(infos| btfs)_cnt is zero.
process_bpf_( prog_info() | btf() ), however, still expect a "count"
value to exist in the data file. If btf information is empty, for
example, process_bpf_btf will read garbage or some other data as the
number of btf nodes in the data file. As a result, the data file will
not be processed correctly.
Instead, write the count to the data file and exit if it is zero.
Fixes: f0d0f978f3 ("perf header: Don't write empty BPF/BTF info")
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Thomas Falcon <thomas.falcon@intel.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 78f0e33cd6 ]
grab_requested_mnt_ns was changed to return error codes on failure, but
its callers were not updated to check for error pointers, still checking
only for a NULL return value.
This commit updates the callers to use IS_ERR() or IS_ERR_OR_NULL() and
PTR_ERR() to correctly check for and propagate errors.
This also makes sure that the logic actually works and mount namespace
file descriptors can be used to refere to mounts.
Christian Brauner <brauner@kernel.org> says:
Rework the patch to be more ergonomic and in line with our overall error
handling patterns.
Fixes: 7b9d14af87 ("fs: allow mount namespace fd")
Cc: Christian Brauner <brauner@kernel.org>
Signed-off-by: Andrei Vagin <avagin@google.com>
Link: https://patch.msgid.link/20251111062815.2546189-1-avagin@google.com
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 90f601b497 ]
bm_register_write() opens an executable file using open_exec(), which
internally calls do_open_execat() and denies write access on the file to
avoid modification while it is being executed.
However, when an error occurs, bm_register_write() closes the file using
filp_close() directly. This does not restore the write permission, which
may cause subsequent write operations on the same file to fail.
Fix this by calling exe_file_allow_write_access() before filp_close() to
restore the write permission properly.
Fixes: e7850f4d84 ("binfmt_misc: fix possible deadlock in bm_register_write")
Signed-off-by: Zilin Guan <zilin@seu.edu.cn>
Link: https://patch.msgid.link/20251105022923.1813587-1-zilin@seu.edu.cn
Signed-off-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 97315e7c90 ]
This was supposed to pass "onenand" instead of "&onenand" with the
ampersand. Passing a random stack address which will be gone when the
function ends makes no sense. However the good thing is that the pointer
is never used, so this doesn't cause a problem at run time.
Fixes: e23abf4b77 ("mtd: OneNAND: S5PC110: Implement DMA interrupt method")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 330e2c5148 ]
When a process tries to access an entry in /afs, normally what happens is
that an automount dentry is created by ->lookup() and then triggered, which
jumps through the ->d_automount() op. Currently, afs_dynroot_lookup() does
not do cell DNS lookup, leaving that to afs_d_automount() to perform -
however, it is possible to use access() or stat() on the automount point,
which will always return successfully, have briefly created an afs_cell
record if one did not already exist.
This means that something like:
test -d "/afs/.west" && echo Directory exists
will print "Directory exists" even though no such cell is configured. This
breaks the "west" python module available on PIP as it expects this access
to fail.
Now, it could be possible to make afs_dynroot_lookup() perform the DNS[*]
lookup, but that would make "ls --color /afs" do this for each cell in /afs
that is listed but not yet probed. kafs-client, probably wrongly, preloads
the entire cell database and all the known cells are then listed in /afs -
and doing ls /afs would be very, very slow, especially if any cell supplied
addresses but was wholly inaccessible.
[*] When I say "DNS", actually read getaddrinfo(), which could use any one
of a host of mechanisms. Could also use static configuration.
To fix this, make the following changes:
(1) Create an enum to specify the origination point of a call to
afs_lookup_cell() and pass this value into that function in place of
the "excl" parameter (which can be derived from it). There are six
points of origination:
- Cell preload through /proc/net/afs/cells
- Root cell config through /proc/net/afs/rootcell
- Lookup in dynamic root
- Automount trigger
- Direct mount with mount() syscall
- Alias check where YFS tells us the cell name is different
(2) Add an extra state into the afs_cell state machine to indicate a cell
that's been initialised, but not yet looked up. This is separate from
one that can be considered active and has been looked up at least
once.
(3) Make afs_lookup_cell() vary its behaviour more, depending on where it
was called from:
If called from preload or root cell config, DNS lookup will not happen
until we definitely want to use the cell (dynroot mount, automount,
direct mount or alias check). The cell will appear in /afs but stat()
won't trigger DNS lookup.
If the cell already exists, dynroot will not wait for the DNS lookup
to complete. If the cell did not already exist, dynroot will wait.
If called from automount, direct mount or alias check, it will wait
for the DNS lookup to complete.
(4) Make afs_lookup_cell() return an error if lookup failed in one way or
another. We try to return -ENOENT if the DNS says the cell does not
exist and -EDESTADDRREQ if we couldn't access the DNS.
Reported-by: Markus Suvanto <markus.suvanto@gmail.com>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=220685
Signed-off-by: David Howells <dhowells@redhat.com>
Link: https://patch.msgid.link/1784747.1761158912@warthog.procyon.org.uk
Fixes: 1d0b929fc0 ("afs: Change dynroot to create contents on demand")
Tested-by: Markus Suvanto <markus.suvanto@gmail.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit 44e8241c51 upstream.
On big endian arm kernels, the arm optimized Curve25519 code produces
incorrect outputs and fails the Curve25519 test. This has been true
ever since this code was added.
It seems that hardly anyone (or even no one?) actually uses big endian
arm kernels. But as long as they're ostensibly supported, we should
disable this code on them so that it's not accidentally used.
Note: for future-proofing, use !CPU_BIG_ENDIAN instead of
CPU_LITTLE_ENDIAN. Both of these are arch-specific options that could
get removed in the future if big endian support gets dropped.
Fixes: d8f1308a02 ("crypto: arm/curve25519 - wire up NEON implementation")
Cc: stable@vger.kernel.org
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251104054906.716914-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 14473a1f88 ]
The irq_domain_free_irqs() helper requires that the irq_domain_ops->free
callback is implemented. Otherwise, the kernel reports the warning message
"NULL pointer, cannot free irq" when irq_dispose_mapping() is invoked to
release the per-HART local interrupts.
Set irq_domain_ops->free to irq_domain_free_irqs_top() to cure that.
Fixes: 832f15f426 ("RISC-V: Treat IPIs as normal Linux IRQs")
Signed-off-by: Nick Hu <nick.hu@sifive.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://patch.msgid.link/20251114-rv-intc-fix-v1-1-a3edd1c1a868@sifive.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b0c8e6d3d8 ]
The usage pattern for widen_imprecise_scalars() looks as follows:
prev_st = find_prev_entry(env, ...);
queued_st = push_stack(...);
widen_imprecise_scalars(env, prev_st, queued_st);
Where prev_st is an ancestor of the queued_st in the explored states
tree. This ancestor is not guaranteed to have same allocated stack
depth as queued_st. E.g. in the following case:
def main():
for i in 1..2:
foo(i) // same callsite, differnt param
def foo(i):
if i == 1:
use 128 bytes of stack
iterator based loop
Here, for a second 'foo' call prev_st->allocated_stack is 128,
while queued_st->allocated_stack is much smaller.
widen_imprecise_scalars() needs to take this into account and avoid
accessing bpf_verifier_state->frame[*]->stack out of bounds.
Fixes: 2793a8b015 ("bpf: exact states comparison for iterator convergence checks")
Reported-by: Emil Tsalapatis <emil@etsalapatis.com>
Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20251114025730.772723-1-eddyz87@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 4ef9274362 ]
syzbot found that cls_bpf_classify() is able to change
tc_skb_cb(skb)->drop_reason triggering a warning in sk_skb_reason_drop().
WARNING: CPU: 0 PID: 5965 at net/core/skbuff.c:1192 __sk_skb_reason_drop net/core/skbuff.c:1189 [inline]
WARNING: CPU: 0 PID: 5965 at net/core/skbuff.c:1192 sk_skb_reason_drop+0x76/0x170 net/core/skbuff.c:1214
struct tc_skb_cb has been added in commit ec624fe740 ("net/sched:
Extend qdisc control block with tc control block"), which added a wrong
interaction with db58ba4592 ("bpf: wire in data and data_end for
cls_act_bpf").
drop_reason was added later.
Add bpf_prog_run_data_pointers() helper to save/restore the net_sched
storage colliding with BPF data_meta/data_end.
Fixes: ec624fe740 ("net/sched: Extend qdisc control block with tc control block")
Reported-by: syzbot <syzkaller@googlegroups.com>
Closes: https://lore.kernel.org/netdev/6913437c.a70a0220.22f260.013b.GAE@google.com/
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Reviewed-by: Victor Nogueira <victor@mojatatu.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Link: https://patch.msgid.link/20251112125516.1563021-1-edumazet@google.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 0a4a18e888 ]
The MODULE_PARM_DESC string for the "active" parameter is missing a
space and has an extraneous trailing ']' character. Correct these.
Before patch:
$ modinfo -p ./drm_client_lib.ko
active:Choose which drm client to start, default isfbdev] (string)
After patch:
$ modinfo -p ./drm_client_lib.ko
active:Choose which drm client to start, default is fbdev (string)
Fixes: f7b42442c4 ("drm/log: Introduce a new boot logger to draw the kmsg on the screen")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Jocelyn Falempe <jfalempe@redhat.com>
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patch.msgid.link/20251112010920.2355712-1-rdunlap@infradead.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 214291cbaa ]
The following lockdep splat was observed while kernel auto-online a CXL
memory region:
======================================================
WARNING: possible circular locking dependency detected
6.17.0djtest+ #53 Tainted: G W
------------------------------------------------------
systemd-udevd/3334 is trying to acquire lock:
ffffffff90346188 (hmem_resource_lock){+.+.}-{4:4}, at: hmem_register_resource+0x31/0x50
but task is already holding lock:
ffffffff90338890 ((node_chain).rwsem){++++}-{4:4}, at: blocking_notifier_call_chain+0x2e/0x70
which lock already depends on the new lock.
[..]
Chain exists of:
hmem_resource_lock --> mem_hotplug_lock --> (node_chain).rwsem
Possible unsafe locking scenario:
CPU0 CPU1
---- ----
rlock((node_chain).rwsem);
lock(mem_hotplug_lock);
lock((node_chain).rwsem);
lock(hmem_resource_lock);
The lock ordering can cause potential deadlock. There are instances
where hmem_resource_lock is taken after (node_chain).rwsem, and vice
versa.
Split out the target update section of hmat_register_target() so that
hmat_callback() only envokes that section instead of attempt to register
hmem devices that it does not need to.
[ dj: Fix up comment to be closer to 80cols. (Jonathan) ]
Fixes: cf8741ac57 ("ACPI: NUMA: HMAT: Register "soft reserved" memory as an "hmem" device")
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Tested-by: Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com>
Reviewed-by: Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Link: https://patch.msgid.link/20251105235115.85062-3-dave.jiang@intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 7132f7e025 ]
When the BO pointer provided to amdgpu_bo_create_kernel() points to
non-NULL, amdgpu_bo_create_kernel() takes it as a hint to pin that address
rather than allocate a new BO.
This functionality is never desired for allocating ISP buffers. A new BO
should always be created when isp_kernel_buffer_alloc() is called, per the
description for isp_kernel_buffer_alloc().
Ensure this by zeroing *bo right before the amdgpu_bo_create_kernel() call.
Fixes: 55d42f6169 ("drm/amd/amdgpu: Add helper functions for isp buffers")
Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org>
Reviewed-by: Pratap Nirujogi <pratap.nirujogi@amd.com>
Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 73c8c29baac7f0c7e703d92eba009008cbb5228e)
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 632108ec07 ]
In snd_usb_create_streams(), for UAC version 3 devices, the Interface
Association Descriptor (IAD) is retrieved via usb_ifnum_to_if(). If this
call fails, a fallback routine attempts to obtain the IAD from the next
interface and sets a BADD profile. However, snd_usb_mixer_controls_badd()
assumes that the IAD retrieved from usb_ifnum_to_if() is always valid,
without performing a NULL check. This can lead to a NULL pointer
dereference when usb_ifnum_to_if() fails to find the interface descriptor.
This patch adds a NULL pointer check after calling usb_ifnum_to_if() in
snd_usb_mixer_controls_badd() to prevent the dereference.
This issue was discovered by syzkaller, which triggered the bug by sending
a crafted USB device descriptor.
Fixes: 17156f23e9 ("ALSA: usb: add UAC3 BADD profiles support")
Signed-off-by: Haein Lee <lhi0729@kaist.ac.kr>
Link: https://patch.msgid.link/vwhzmoba9j2f.vwhzmob9u9e2.g6@dooray.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b623390045 ]
The utimes01 and utime06 tests fail when delegated timestamps are
enabled, specifically in subtests that modify the atime and mtime
fields using the 'nobody' user ID.
The problem can be reproduced as follow:
# echo "/media *(rw,no_root_squash,sync)" >> /etc/exports
# export -ra
# mount -o rw,nfsvers=4.2 127.0.0.1:/media /tmpdir
# cd /opt/ltp
# ./runltp -d /tmpdir -s utimes01
# ./runltp -d /tmpdir -s utime06
This issue occurs because nfs_setattr does not verify the inode's
UID against the caller's fsuid when delegated timestamps are
permitted for the inode.
This patch adds the UID check and if it does not match then the
request is sent to the server for permission checking.
Fixes: e12912d941 ("NFSv4: Add support for delegated atime and mtime attributes")
Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 1f214e9c3a ]
The Smatch static checker noted that in _nfs4_proc_lookupp(), the flag
RPC_TASK_TIMEOUT is being passed as an argument to nfs4_init_sequence(),
which is clearly incorrect.
Since LOOKUPP is an idempotent operation, nfs4_init_sequence() should
not ask the server to cache the result. The RPC_TASK_TIMEOUT flag needs
to be passed down to the RPC layer.
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Reported-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com>
Fixes: 76998ebb91 ("NFSv4: Observe the NFS_MOUNT_SOFTREVAL flag in _nfs4_proc_lookupp")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit aae9db5739 ]
1) finish_no_open() takes ERR_PTR() as dentry now.
2) caller of ->atomic_open() will call d_lookup_done() itself, no
need to do it here.
Reviewed-by: NeilBrown <neil@brown.name>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Stable-dep-of: 85d2c2392a ("NFSv2/v3: Fix error handling in nfs_atomic_open_v23()")
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit fb2cba0854 ]
If the TLS security policy is of type RPC_XPRTSEC_TLS_X509, then the
cert_serial and privkey_serial fields need to match as well since they
define the client's identity, as presented to the server.
Fixes: 90c9550a8d ("NFS: support the kernel keyring for TLS")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 8ab523ce78 ]
The default setting for the transport security policy must be
RPC_XPRTSEC_NONE, when using a TCP or RDMA connection without TLS.
Conversely, when using TLS, the security policy needs to be set.
Fixes: 6c0a8c5fcf ("NFS: Have struct nfs_client carry a TLS policy field")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 7aca00d950 ]
Don't try to add an RDMA transport to a client that is already marked as
being a TCP/TLS transport.
Fixes: 04a1526366 ("pnfs/flexfiles: connect to NFSv3 DS using TLS if MDS connection uses TLS")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 6b6eddc63c ]
The probe function enables regulators at the beginning
but fails to disable them in its error handling path.
If any operation after enabling the regulators fails,
the probe will exit with an error, leaving the regulators
permanently enabled, which could lead to a resource leak.
Add a proper error handling path to call regulator_bulk_disable()
before returning an error.
Fixes: 9a397f4736 ("ASoC: cs4271: add regulator consumer support")
Signed-off-by: Haotian Zhang <vulab@iscas.ac.cn>
Reviewed-by: Charles Keepax <ckeepax@opensource.cirrus.com>
Link: https://patch.msgid.link/20251105062246.1955-1-vulab@iscas.ac.cn
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 636f4618b1 ]
In the commit referenced by the Fixes tag,
devm_gpiod_get_optional() was replaced by manual
GPIO management, relying on the regulator core to release the
GPIO descriptor. However, this approach does not account for the
error path: when regulator registration fails, the core never
takes over the GPIO, resulting in a resource leak.
Add gpiod_put() before returning on regulator registration failure.
Fixes: 5e6f3ae5c1 ("regulator: fixed: Let core handle GPIO descriptor")
Signed-off-by: Haotian Zhang <vulab@iscas.ac.cn>
Link: https://patch.msgid.link/20251028172828.625-1-vulab@iscas.ac.cn
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 2d0e88f3fd ]
io_buffer_register_bvec() currently uses blk_rq_nr_phys_segments() as
the number of bvecs in the request. However, bvecs may be split into
multiple segments depending on the queue limits. Thus, the number of
segments may overestimate the number of bvecs. For ublk devices, the
only current users of io_buffer_register_bvec(), virt_boundary_mask,
seg_boundary_mask, max_segments, and max_segment_size can all be set
arbitrarily by the ublk server process.
Set imu->nr_bvecs based on the number of bvecs the rq_for_each_bvec()
loop actually yields. However, continue using blk_rq_nr_phys_segments()
as an upper bound on the number of bvecs when allocating imu to avoid
needing to iterate the bvecs a second time.
Link: https://lore.kernel.org/io-uring/20251111191530.1268875-1-csander@purestorage.com/
Signed-off-by: Caleb Sander Mateos <csander@purestorage.com>
Fixes: 27cb27b6d5 ("io_uring: add support for kernel registered bvecs")
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 90918e3b64 ]
Sequence adjustment may be required for FTP traffic with PASV/EPSV modes.
due to need to re-write packet payload (IP, port) on the ftp control
connection. This can require changes to the TCP length and expected
seq / ack_seq.
The easiest way to reproduce this issue is with PASV mode.
Example ruleset:
table inet ftp_nat {
ct helper ftp_helper {
type "ftp" protocol tcp
l3proto inet
}
chain prerouting {
type filter hook prerouting priority 0; policy accept;
tcp dport 21 ct state new ct helper set "ftp_helper"
}
}
table ip nat {
chain prerouting {
type nat hook prerouting priority -100; policy accept;
tcp dport 21 dnat ip prefix to ip daddr map {
192.168.100.1 : 192.168.13.2/32 }
}
chain postrouting {
type nat hook postrouting priority 100 ; policy accept;
tcp sport 21 snat ip prefix to ip saddr map {
192.168.13.2 : 192.168.100.1/32 }
}
}
Note that the ftp helper gets assigned *after* the dnat setup.
The inverse (nat after helper assign) is handled by an existing
check in nf_nat_setup_info() and will not show the problem.
Topoloy:
+-------------------+ +----------------------------------+
| FTP: 192.168.13.2 | <-> | NAT: 192.168.13.3, 192.168.100.1 |
+-------------------+ +----------------------------------+
|
+-----------------------+
| Client: 192.168.100.2 |
+-----------------------+
ftp nat changes do not work as expected in this case:
Connected to 192.168.100.1.
[..]
ftp> epsv
EPSV/EPRT on IPv4 off.
ftp> ls
227 Entering passive mode (192,168,100,1,209,129).
421 Service not available, remote server has closed connection.
Kernel logs:
Missing nfct_seqadj_ext_add() setup call
WARNING: CPU: 1 PID: 0 at net/netfilter/nf_conntrack_seqadj.c:41
[..]
__nf_nat_mangle_tcp_packet+0x100/0x160 [nf_nat]
nf_nat_ftp+0x142/0x280 [nf_nat_ftp]
help+0x4d1/0x880 [nf_conntrack_ftp]
nf_confirm+0x122/0x2e0 [nf_conntrack]
nf_hook_slow+0x3c/0xb0
..
Fix this by adding the required extension when a conntrack helper is assigned
to a connection that has a nat binding.
Fixes: 1a64edf54f ("netfilter: nft_ct: add helper set support")
Signed-off-by: Andrii Melnychenko <a.melnychenko@vyos.io>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit e060088db0 ]
l2cap_chan_put() is exported, so export also l2cap_chan_hold() for
modules.
l2cap_chan_hold() has use case in net/bluetooth/6lowpan.c
Signed-off-by: Pauli Virtanen <pav@iki.fi>
Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 4b747cc628 ]
Commit ac4e04d9e3 ("cpufreq: intel_pstate: Unchecked MSR aceess in
legacy mode") introduced a check for feature X86_FEATURE_IDA to verify
turbo mode support. Although this is the correct way to check for turbo
mode support, it causes issues on some platforms that disable turbo
during OS boot, but enable it later [1]. Before adding this feature
check, users were able to get turbo mode frequencies by writing 0 to
/sys/devices/system/cpu/intel_pstate/no_turbo post-boot.
To restore the old behavior on the affected systems while still
addressing the unchecked MSR issue on some Skylake-X systems, check
X86_FEATURE_IDA only immediately before updates of MSR_IA32_PERF_CTL
that may involve setting the Turbo Engage Bit (bit 32).
Fixes: ac4e04d9e3 ("cpufreq: intel_pstate: Unchecked MSR aceess in legacy mode")
Reported-by: Aaron Rainbolt <arainbolt@kfocus.org>
Closes: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2122531 [1]
Tested-by: Aaron Rainbolt <arainbolt@kfocus.org>
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
[ rjw: Subject adjustment, changelog edits ]
Link: https://patch.msgid.link/20251111010840.141490-1-srinivas.pandruvada@linux.intel.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 0fce758706 ]
per_cpu(cpc_desc_ptr, cpu) object is initialized for only the online
CPU via acpi_soft_cpu_online() --> __acpi_processor_start() -->
acpi_cppc_processor_probe().
However the function cppc_perf_ctrs_in_pcc() checks if the CPPC
perf-ctrs are in a PCC region for all the present CPUs, which breaks
when the kernel is booted with "nosmt=force".
Hence, limit the check only to the online CPUs.
Fixes: ae2df912d1 ("ACPI: CPPC: Disable FIE if registers in PCC regions")
Reviewed-by: "Mario Limonciello (AMD) (kernel.org)" <superm1@kernel.org>
Signed-off-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Link: https://patch.msgid.link/20251107074145.2340-5-gautham.shenoy@amd.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 8821c8e80a ]
per_cpu(cpc_desc_ptr, cpu) object is initialized for only the online
CPUs via acpi_soft_cpu_online() --> __acpi_processor_start() -->
acpi_cppc_processor_probe().
However the function cppc_allow_fast_switch() checks for the validity
of the _CPC object for all the present CPUs. This breaks when the
kernel is booted with "nosmt=force".
Check fast_switch capability only on online CPUs
Fixes: 15eece6c5b ("ACPI: CPPC: Fix NULL pointer dereference when nosmp is used")
Reviewed-by: "Mario Limonciello (AMD) (kernel.org)" <superm1@kernel.org>
Signed-off-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Link: https://patch.msgid.link/20251107074145.2340-4-gautham.shenoy@amd.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b2c26c82f7 ]
For HSRv0, the path_id has the following meaning:
- 0000: PRP supervision frame
- 0001-1001: HSR ring identifier
- 1010-1011: Frames from PRP network (A/B, with RedBoxes)
- 1111: HSR supervision frame
Follow the IEC 62439-3:2010 standard more closely by setting the right
path_id for HSRv0 supervision frames (actually, it is correctly set when
the frame is constructed, but hsr_set_path_id() overwrites it) and set a
fixed HSR ring identifier of 1. The ring identifier seems to be generally
unused and we ignore it anyways on reception, but some fixed identifier is
definitely better than using one identifier in one direction and a wrong
identifier in the other.
This was also the behavior before commit f266a683a4 ("net/hsr: Better
frame dispatch") which introduced the alternating path_id. This was later
moved to hsr_set_path_id() in commit 451d8123f8 ("net: prp: add packet
handling support").
The IEC 62439-3:2010 also contains 6 unused bytes after the MacAddressA in
the HSRv0 supervision frames. Adjust a TODO comment accordingly.
Fixes: f266a683a4 ("net/hsr: Better frame dispatch")
Fixes: 451d8123f8 ("net: prp: add packet handling support")
Signed-off-by: Felix Maurer <fmaurer@redhat.com>
Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Link: https://patch.msgid.link/ea0d5133cd593856b2fa673d6e2067bf1d4d1794.1762876095.git.fmaurer@redhat.com
Tested-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 0eff2eaa53 ]
The purpose of commit 703eec1b24 ("virtio_net: fixing XDP for fully
checksummed packets handling") is to record the flags in advance, as
their value may be overwritten in the XDP case. However, the flags
recorded under big mode are incorrect, because in big mode, the passed
buf does not point to the rx buffer, but rather to the page of the
submitted buffer. This commit fixes this issue.
For the small mode, the commit c11a49d58a ("virtio_net: Fix mismatched
buf address when unmapping for small packets") fixed it.
Tested-by: Alyssa Ross <hi@alyssa.is>
Fixes: 703eec1b24 ("virtio_net: fixing XDP for fully checksummed packets handling")
Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Link: https://patch.msgid.link/20251111090828.23186-1-xuanzhuo@linux.alibaba.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 1a222625b4 ]
One of the factors of a link's grade is the channel load, which is
calculated from the AP's bss load element.
The current code takes this element from the beacon for an active link,
and from bss->ies for an inactive link.
bss->ies is set to either the beacon's ies or to the probe response
ones, with preference to the probe response (meaning that if there was
even one probe response, the ies of it will be stored in bss->ies and
won't be overiden by the beacon ies).
The probe response can be very old, i.e. from the connection time,
where a beacon is updated before each link selection (which is
triggered only after a passive scan).
In such case, the bss load element in the probe response will not
include the channel load caused by the STA, where the beacon will.
This will cause the inactive link to always have a lower channel
load, and therefore an higher grade than the active link's one.
This causes repeated link switches, causing the throughput to drop.
Fix this by always taking the ies from the beacon, as those are for
sure new.
Fixes: d1e879ec60 ("wifi: iwlwifi: add iwlmld sub-driver")
Reviewed-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://patch.msgid.link/20251110145652.b493dbb1853a.I058ba7309c84159f640cc9682d1bda56dd56a536@changeid
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 0345552a65 ]
After commit 100dfa74cad9 ("inet: dev_queue_xmit() llist adoption")
I started seeing many qdisc requeues on IDPF under high TX workload.
$ tc -s qd sh dev eth1 handle 1: ; sleep 1; tc -s qd sh dev eth1 handle 1:
qdisc mq 1: root
Sent 43534617319319 bytes 268186451819 pkt (dropped 0, overlimits 0 requeues 3532840114)
backlog 1056Kb 6675p requeues 3532840114
qdisc mq 1: root
Sent 43554665866695 bytes 268309964788 pkt (dropped 0, overlimits 0 requeues 3537737653)
backlog 781164b 4822p requeues 3537737653
This is caused by try_bulk_dequeue_skb() being only limited by BQL budget.
perf record -C120-239 -e qdisc:qdisc_dequeue sleep 1 ; perf script
...
netperf 75332 [146] 2711.138269: qdisc:qdisc_dequeue: dequeue ifindex=5 qdisc handle=0x80150000 parent=0x10013 txq_state=0x0 packets=1292 skbaddr=0xff378005a1e9f200
netperf 75332 [146] 2711.138953: qdisc:qdisc_dequeue: dequeue ifindex=5 qdisc handle=0x80150000 parent=0x10013 txq_state=0x0 packets=1213 skbaddr=0xff378004d607a500
netperf 75330 [144] 2711.139631: qdisc:qdisc_dequeue: dequeue ifindex=5 qdisc handle=0x80150000 parent=0x10013 txq_state=0x0 packets=1233 skbaddr=0xff3780046be20100
netperf 75333 [147] 2711.140356: qdisc:qdisc_dequeue: dequeue ifindex=5 qdisc handle=0x80150000 parent=0x10013 txq_state=0x0 packets=1093 skbaddr=0xff37800514845b00
netperf 75337 [151] 2711.141037: qdisc:qdisc_dequeue: dequeue ifindex=5 qdisc handle=0x80150000 parent=0x10013 txq_state=0x0 packets=1353 skbaddr=0xff37800460753300
netperf 75337 [151] 2711.141877: qdisc:qdisc_dequeue: dequeue ifindex=5 qdisc handle=0x80150000 parent=0x10013 txq_state=0x0 packets=1367 skbaddr=0xff378004e72c7b00
netperf 75330 [144] 2711.142643: qdisc:qdisc_dequeue: dequeue ifindex=5 qdisc handle=0x80150000 parent=0x10013 txq_state=0x0 packets=1202 skbaddr=0xff3780045bd60000
...
This is bad because :
1) Large batches hold one victim cpu for a very long time.
2) Driver often hit their own TX ring limit (all slots are used).
3) We call dev_requeue_skb()
4) Requeues are using a FIFO (q->gso_skb), breaking qdisc ability to
implement FQ or priority scheduling.
5) dequeue_skb() gets packets from q->gso_skb one skb at a time
with no xmit_more support. This is causing many spinlock games
between the qdisc and the device driver.
Requeues were supposed to be very rare, lets keep them this way.
Limit batch sizes to /proc/sys/net/core/dev_weight (default 64) as
__qdisc_run() was designed to use.
Fixes: 5772e9a346 ("qdisc: bulk dequeue support for qdiscs with TCQ_F_ONETXQUEUE")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
Acked-by: Jesper Dangaard Brouer <hawk@kernel.org>
Link: https://patch.msgid.link/20251109161215.2574081-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit e5eba42f01 ]
Currently, CQs without a completion function are assigned the
mlx5_add_cq_to_tasklet function by default. This is problematic since
only user CQs created through the mlx5_ib driver are intended to use
this function.
Additionally, all CQs that will use doorbells instead of polling for
completions must call mlx5_cq_arm. However, the default CQ creation flow
leaves a valid value in the CQ's arm_db field, allowing FW to send
interrupts to polling-only CQs in certain corner cases.
These two factors would allow a polling-only kernel CQ to be triggered
by an EQ interrupt and call a completion function intended only for user
CQs, causing a null pointer exception.
Some areas in the driver have prevented this issue with one-off fixes
but did not address the root cause.
This patch fixes the described issue by adding defaults to the create CQ
flow. It adds a default dummy completion function to protect against
null pointer exceptions, and it sets an invalid command sequence number
by default in kernel CQs to prevent the FW from sending an interrupt to
the CQ until it is armed. User CQs are responsible for their own
initialization values.
Callers of mlx5_core_create_cq are responsible for changing the
completion function and arming the CQ per their needs.
Fixes: cdd04f4d4d ("net/mlx5: Add support to create SQ and CQ for ASO")
Signed-off-by: Akiva Goldberger <agoldberger@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Acked-by: Leon Romanovsky <leon@kernel.org>
Link: https://patch.msgid.link/1762681743-1084694-1-git-send-email-tariqt@nvidia.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit a315b723e8 ]
Completion queues (CQs) in mlx5 use the same global doorbell, which may
become contended when accessed concurrently from many cores.
This patch prepares the CQ management code for supporting different
doorbells per CQ. This will be used in downstream patches to allow
separate doorbells to be used by channels CQs.
The main change is moving the 'uar' pointer from struct mlx5_core_cq to
struct mlx5e_cq, as the uar page to be used is better off stored
directly there. Other users of mlx5_core_cq also store the UAR to be
used separately and therefore the pointer being removed is dead weight
for them. As evidence, in this patch there are two users which set the
mcq.uar pointer but didn't use it, Software Steering and old Innova CQ
creation code. Instead, they rang the doorbell directly from another
pointer.
The 'uar' pointer added to struct mlx5e_cq remains in a hot cacheline
(as before), because it may get accessed for each packet.
Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com>
Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Stable-dep-of: e5eba42f01 ("mlx5: Fix default values in create CQ")
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit aa4595d0ad ]
The global doorbell is used for more than just Ethernet resources, so
move it out of mlx5e_hw_objs into a common place (mlx5_priv), to avoid
non-Ethernet modules (e.g. HWS, ASO) depending on Ethernet structs.
Use this opportunity to consolidate it with the 'uar' pointer already
there, which was used as an RX doorbell. Underneath the 'uar' pointer is
identical to 'bfreg->up', so store a single resource and use that
instead.
For CQ doorbells, care is taken to always use bfreg->up->index instead
of bfreg->index, which may refer to a subsequent UAR page from the same
ALLOC_UAR batch on some NICs.
This paves the way for cleanly supporting multiple doorbells in the
Ethernet driver.
Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com>
Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Stable-dep-of: e5eba42f01 ("mlx5: Fix default values in create CQ")
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit a7bf4d5063 ]
The previous calculation used roundup() which caused an overflow for
rates between 25.5Gbps and 26Gbps.
For example, a rate of 25.6Gbps would result in using 100Mbps units with
value of 256, which would overflow the 8 bits field.
Simplify the upper_limit_mbps calculation by removing the
unnecessary roundup, and adjust the comparison to use <= to correctly
handle the boundary condition.
Fixes: d8880795da ("net/mlx5e: Implement DCBNL IEEE max rate")
Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Nimrod Oren <noren@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/1762681073-1084058-4-git-send-email-tariqt@nvidia.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 485e0626e5 ]
This handles PA Sync Lost event which previously was assumed to be
handled with BIG Sync Lost but their lifetime are not the same thus why
there are 2 different events to inform when each sync is lost.
Fixes: b2a5f2e1c1 ("Bluetooth: hci_event: Add support for handling LE BIG Sync Lost event")
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 60e6489f8e ]
Quang Le reported that the AF_UNIX GC could garbage-collect a
receive queue of an alive in-flight socket, with a nice repro.
The repro consists of three stages.
1)
1-a. Create a single cyclic reference with many sockets
1-b. close() all sockets
1-c. Trigger GC
2)
2-a. Pass sk-A to an embryo sk-B
2-b. Pass sk-X to sk-X
2-c. Trigger GC
3)
3-a. accept() the embryo sk-B
3-b. Pass sk-B to sk-C
3-c. close() the in-flight sk-A
3-d. Trigger GC
As of 2-c, sk-A and sk-X are linked to unix_unvisited_vertices,
and unix_walk_scc() groups them into two different SCCs:
unix_sk(sk-A)->vertex->scc_index = 2 (UNIX_VERTEX_INDEX_START)
unix_sk(sk-X)->vertex->scc_index = 3
Once GC completes, unix_graph_grouped is set to true.
Also, unix_graph_maybe_cyclic is set to true due to sk-X's
cyclic self-reference, which makes close() trigger GC.
At 3-b, unix_add_edge() allocates unix_sk(sk-B)->vertex and
links it to unix_unvisited_vertices.
unix_update_graph() is called at 3-a. and 3-b., but neither
unix_graph_grouped nor unix_graph_maybe_cyclic is changed
because both sk-B's listener and sk-C are not in-flight.
3-c decrements sk-A's file refcnt to 1.
Since unix_graph_grouped is true at 3-d, unix_walk_scc_fast()
is finally called and iterates 3 sockets sk-A, sk-B, and sk-X:
sk-A -> sk-B (-> sk-C)
sk-X -> sk-X
This is totally fine. All of them are not yet close()d and
should be grouped into different SCCs.
However, unix_vertex_dead() misjudges that sk-A and sk-B are
in the same SCC and sk-A is dead.
unix_sk(sk-A)->scc_index == unix_sk(sk-B)->scc_index <-- Wrong!
&&
sk-A's file refcnt == unix_sk(sk-A)->vertex->out_degree
^-- 1 in-flight count for sk-B
-> sk-A is dead !?
The problem is that unix_add_edge() does not initialise scc_index.
Stage 1) is used for heap spraying, making a newly allocated
vertex have vertex->scc_index == 2 (UNIX_VERTEX_INDEX_START)
set by unix_walk_scc() at 1-c.
Let's track the max SCC index from the previous unix_walk_scc()
call and assign the max + 1 to a new vertex's scc_index.
This way, we can continue to avoid Tarjan's algorithm while
preventing misjudgments.
Fixes: ad081928a8 ("af_unix: Avoid Tarjan's algorithm if unnecessary.")
Reported-by: Quang Le <quanglex97@gmail.com>
Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Link: https://patch.msgid.link/20251109025233.3659187-1-kuniyu@google.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit d4b00d132d ]
The am65_cpsw_iet_verify_wait() function attempts verification 20 times,
toggling the AM65_CPSW_PN_IET_MAC_LINKFAIL bit in each iteration. When
the LINKFAIL bit transitions from 1 to 0, the MAC merge layer initiates
the verification process and waits for the timeout configured in
MAC_VERIFY_CNT before automatically retransmitting. The MAC_VERIFY_CNT
register is configured according to the user-defined verify/response
timeout in am65_cpsw_iet_set_verify_timeout_count(). As per IEEE 802.3
Clause 99, the hardware performs this automatic retry up to 3 times.
Current implementation toggles LINKFAIL after the user-configured
verify/response timeout in each iteration, forcing the hardware to
restart verification instead of respecting the MAC_VERIFY_CNT timeout.
This bypasses the hardware's automatic retry mechanism.
Fix this by moving the LINKFAIL bit toggle outside the retry loop and
reducing the retry count from 20 to 3. The software now only monitors
the status register while the hardware autonomously handles the 3
verification attempts at proper MAC_VERIFY_CNT intervals.
Fixes: 49a2eb9068 ("net: ethernet: ti: am65-cpsw-qos: Add Frame Preemption MAC Merge support")
Signed-off-by: Aksh Garg <a-garg7@ti.com>
Link: https://patch.msgid.link/20251106092305.1437347-3-a-garg7@ti.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 49b3916465 ]
The CPSW module uses the MAC_VERIFY_CNT bit field in the
CPSW_PN_IET_VERIFY_REG_k register to set the verify/response timeout
count. This register specifies the number of clock cycles to wait before
resending a verify packet if the verification fails.
The verify/response timeout count, as being set by the function
am65_cpsw_iet_set_verify_timeout_count() is hardcoded for 125MHz
clock frequency, which varies based on PHY mode and link speed.
The respective clock frequencies are as follows:
- RGMII mode:
* 1000 Mbps: 125 MHz
* 100 Mbps: 25 MHz
* 10 Mbps: 2.5 MHz
- QSGMII/SGMII mode: 125 MHz (all speeds)
Fix this by adding logic to calculate the correct timeout counts
based on the actual PHY interface mode and link speed.
Fixes: 49a2eb9068 ("net: ethernet: ti: am65-cpsw-qos: Add Frame Preemption MAC Merge support")
Signed-off-by: Aksh Garg <a-garg7@ti.com>
Link: https://patch.msgid.link/20251106092305.1437347-2-a-garg7@ti.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 3072f00bba ]
In tls_handshake_accept(), a netlink message is allocated using
genlmsg_new(). In the error handling path, genlmsg_cancel() is called
to cancel the message construction, but the message itself is not freed.
This leads to a memory leak.
Fix this by calling nlmsg_free() in the error path after genlmsg_cancel()
to release the allocated memory.
Fixes: 2fd5532044 ("net/handshake: Add a kernel API for requesting a TLSv1.3 handshake")
Signed-off-by: Zilin Guan <zilin@seu.edu.cn>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Link: https://patch.msgid.link/20251106144511.3859535-1-zilin@seu.edu.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit ec33f2e5a2 ]
The current CLC proposal message construction uses a mix of
`ini->smc_type_v1/v2` and `pclc_base->hdr.typev1/v2` to decide whether
to include optional extensions (IPv6 prefix extension for v1, and v2
extension). This leads to a critical inconsistency: when
`smc_clc_prfx_set()` fails - for example, in IPv6-only environments with
only link-local addresses, or when the local IP address and the outgoing
interface’s network address are not in the same subnet.
As a result, the proposal message is assembled using the stale
`ini->smc_type_v1` value—causing the IPv6 prefix extension to be
included even though the header indicates v1 is not supported.
The peer then receives a malformed CLC proposal where the header type
does not match the payload, and immediately resets the connection.
The fix ensures consistency between the CLC header flags and the actual
payload by synchronizing `ini->smc_type_v1` with `pclc_base->hdr.typev1`
when prefix setup fails.
Fixes: 8c3dca341a ("net/smc: build and send V2 CLC proposal")
Signed-off-by: D. Wythe <alibuda@linux.alibaba.com>
Reviewed-by: Alexandra Winter <wintera@linux.ibm.com>
Link: https://patch.msgid.link/20251107024029.88753-1-alibuda@linux.alibaba.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 762e7e174d ]
Broadcom switches locally terminate link local traffic and do not
forward it, so we should not mark it as offloaded.
In some situations we still want/need to flood this traffic, e.g. if STP
is disabled, or it is explicitly enabled via the group_fwd_mask. But if
the skb is marked as offloaded, the kernel will assume this was already
done in hardware, and the packets never reach other bridge ports.
So ensure that link local traffic is never marked as offloaded, so that
the kernel can forward/flood these packets in software if needed.
Since the local termination in not configurable, check the destination
MAC, and never mark packets as offloaded if it is a link local ether
address.
While modern switches set the tag reason code to BRCM_EG_RC_PROT_TERM
for trapped link local traffic, they also set it for link local traffic
that is flooded (01:80:c2:00:00:10 to 01:80:c2:00:00:2f), so we cannot
use it and need to look at the destination address for them as well.
Fixes: 964dbf186e ("net: dsa: tag_brcm: add support for legacy tags")
Fixes: 0e62f543be ("net: dsa: Fix duplicate frames flooded by learning")
Signed-off-by: Jonas Gorski <jonas.gorski@gmail.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://patch.msgid.link/20251109134635.243951-1-jonas.gorski@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 1534ff7775 ]
syzbot reported a possible shift-out-of-bounds [1]
Blamed commit added rto_alpha_max and rto_beta_max set to 1000.
It is unclear if some sctp users are setting very large rto_alpha
and/or rto_beta.
In order to prevent user regression, perform the test at run time.
Also add READ_ONCE() annotations as sysctl values can change under us.
[1]
UBSAN: shift-out-of-bounds in net/sctp/transport.c:509:41
shift exponent 64 is too large for 32-bit type 'unsigned int'
CPU: 0 UID: 0 PID: 16704 Comm: syz.2.2320 Not tainted syzkaller #0 PREEMPT(full)
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/02/2025
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:94 [inline]
dump_stack_lvl+0x16c/0x1f0 lib/dump_stack.c:120
ubsan_epilogue lib/ubsan.c:233 [inline]
__ubsan_handle_shift_out_of_bounds+0x27f/0x420 lib/ubsan.c:494
sctp_transport_update_rto.cold+0x1c/0x34b net/sctp/transport.c:509
sctp_check_transmitted+0x11c4/0x1c30 net/sctp/outqueue.c:1502
sctp_outq_sack+0x4ef/0x1b20 net/sctp/outqueue.c:1338
sctp_cmd_process_sack net/sctp/sm_sideeffect.c:840 [inline]
sctp_cmd_interpreter net/sctp/sm_sideeffect.c:1372 [inline]
Fixes: b58537a1f5 ("net: sctp: fix permissions for rto_alpha and rto_beta knobs")
Reported-by: syzbot+f8c46c8b2b7f6e076e99@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/netdev/690c81ae.050a0220.3d0d33.014e.GAE@google.com/T/#u
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Xin Long <lucien.xin@gmail.com>
Link: https://patch.msgid.link/20251106111054.3288127-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 41bf23338a ]
Contrary to what was stated on d36349ea73 ("Bluetooth: hci_conn:
Fix running bis_cleanup for hci_conn->type PA_LINK") the PA_LINK does
in fact needs to run bis_cleanup in order to terminate the PA Sync,
since that is bond to the listening socket which is the entity that
controls the lifetime of PA Sync, so if it is closed/released the PA
Sync shall be terminated, terminating the PA Sync shall not result in
the BIG Sync being terminated since once the later is established it
doesn't depend on the former anymore.
If the use user wants to reconnect/rebind a number of BIS(s) it shall
keep the socket open until it no longer needs the PA Sync, which means
it retains full control of the lifetime of both PA and BIG Syncs.
Fixes: d36349ea73 ("Bluetooth: hci_conn: Fix running bis_cleanup for hci_conn->type PA_LINK")
Fixes: a7bcffc673 ("Bluetooth: Add PA_LINK to distinguish BIG sync and PA sync connections")
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 98454bc812 ]
disconnect_all_peers() calls sleeping function (l2cap_chan_close) under
spinlock. Holding the lock doesn't actually do any good -- we work on a
local copy of the list, and the lock doesn't protect against peer->chan
having already been freed.
Fix by taking refcounts of peer->chan instead. Clean up the code and
old comments a bit.
Take devices_lock instead of RCU, because the kfree_rcu();
l2cap_chan_put(); construct in chan_close_cb() does not guarantee
peer->chan is necessarily valid in RCU.
Also take l2cap_chan_lock() which is required for l2cap_chan_close().
Log: (bluez 6lowpan-tester Client Connect - Disable)
------
BUG: sleeping function called from invalid context at kernel/locking/mutex.c:575
...
<TASK>
...
l2cap_send_disconn_req (net/bluetooth/l2cap_core.c:938 net/bluetooth/l2cap_core.c:1495)
...
? __pfx_l2cap_chan_close (net/bluetooth/l2cap_core.c:809)
do_enable_set (net/bluetooth/6lowpan.c:1048 net/bluetooth/6lowpan.c:1068)
------
Fixes: 9030582963 ("Bluetooth: 6lowpan: Converting rwlocks to use RCU")
Signed-off-by: Pauli Virtanen <pav@iki.fi>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b454505bf5 ]
Bluetooth 6lowpan.c confuses BDADDR_LE and ADDR_LE_DEV address types,
e.g. debugfs "connect" command takes the former, and "disconnect" and
"connect" to already connected device take the latter. This is due to
using same value both for l2cap_chan_connect and hci_conn_hash_lookup_le
which take different dst_type values.
Fix address type passed to hci_conn_hash_lookup_le().
Retain the debugfs API difference between "connect" and "disconnect"
commands since it's been like this since 2015 and nobody apparently
complained.
Fixes: f5ad4ffceb ("Bluetooth: 6lowpan: Use hci_conn_hash_lookup_le() when possible")
Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de>
Signed-off-by: Pauli Virtanen <pav@iki.fi>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 3b78f50918 ]
Bluetooth 6lowpan.c netdev has header_ops, so it must set link-local
header for RX skb, otherwise things crash, eg. with AF_PACKET SOCK_RAW
Add missing skb_reset_mac_header() for uncompressed ipv6 RX path.
For the compressed one, it is done in lowpan_header_decompress().
Log: (BlueZ 6lowpan-tester Client Recv Raw - Success)
------
kernel BUG at net/core/skbuff.c:212!
Call Trace:
<IRQ>
...
packet_rcv (net/packet/af_packet.c:2152)
...
<TASK>
__local_bh_enable_ip (kernel/softirq.c:407)
netif_rx (net/core/dev.c:5648)
chan_recv_cb (net/bluetooth/6lowpan.c:294 net/bluetooth/6lowpan.c:359)
------
Fixes: 18722c2470 ("Bluetooth: Enable 6LoWPAN support for BT LE devices")
Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de>
Signed-off-by: Pauli Virtanen <pav@iki.fi>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 55fb52ffdd ]
mesh_send_done timer is not canceled when hdev is removed, which causes
crash if the timer triggers after hdev is gone.
Cancel the timer when MGMT removes the hdev, like other MGMT timers.
Should fix the BUG: sporadically seen by BlueZ test bot
(in "Mesh - Send cancel - 1" test).
Log:
------
BUG: KASAN: slab-use-after-free in run_timer_softirq+0x76b/0x7d0
...
Freed by task 36:
kasan_save_stack+0x24/0x50
kasan_save_track+0x14/0x30
__kasan_save_free_info+0x3a/0x60
__kasan_slab_free+0x43/0x70
kfree+0x103/0x500
device_release+0x9a/0x210
kobject_put+0x100/0x1e0
vhci_release+0x18b/0x240
------
Fixes: b338d91703 ("Bluetooth: Implement support for Mesh")
Link: https://lore.kernel.org/linux-bluetooth/67364c09.0c0a0220.113cba.39ff@mx.google.com/
Signed-off-by: Pauli Virtanen <pav@iki.fi>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit fe4b3a34e9 ]
It's used to work around an objtool issue since commit abb2a55722
("LoongArch: Add cflag -fno-isolate-erroneous-paths-dereference"), but
it's then passed to bindgen and cause an error because Clang does not
have this option.
Fixes: abb2a55722 ("LoongArch: Add cflag -fno-isolate-erroneous-paths-dereference")
Acked-by: Miguel Ojeda <ojeda@kernel.org>
Tested-by: Mingcong Bai <jeffbai@aosc.io>
Signed-off-by: Xi Ruoyao <xry111@xry111.site>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit a0de636ed7 ]
As the name suggests this function modifies the register in an
extended page. It has the same parameters as phy_modify_mmd.
This function was introduce because there are many places in the
code where the registers was read then the value was modified and
written back. So replace all this code with this function to make
it clear.
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Link: https://patch.msgid.link/20250818075121.1298170-3-horatiu.vultur@microchip.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Stable-dep-of: 96a9178a29 ("net: phy: micrel: lan8814 fix reset of the QSGMII interface")
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 57531b3416 ]
It seems that most of the tests prepare the interfaces once before the test
run (setup_prepare()), rely on setup_wait() to wait for link and only then
run the test(s).
local_termination brings the physical interfaces down and up during test
run but never wait for them to come up. If the auto-negotiation takes
some seconds, first test packets are being lost, which leads to
false-negative test results.
Use setup_wait() in run_test() to make sure auto-negotiation has been
completed after all simple_if_init() calls on physical interfaces and test
packets will not be lost because of the race against link establishment.
Fixes: 90b9566aa5 ("selftests: forwarding: add a test for local_termination.sh")
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Alexander Sverdlin <alexander.sverdlin@siemens.com>
Link: https://patch.msgid.link/20251106161213.459501-1-alexander.sverdlin@siemens.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 9065b96875 ]
When reporting tx completion using ieee80211_tx_status_xxx() family of
functions, the status part of the struct ieee80211_tx_info nested in the
skb is used to report things like transmit rates & retry count to mac80211
On the TX data path, this is correctly memset to 0 before calling
ieee80211_tx_status_ext(), but on the tx mgmt path this was not done.
This leads to mac80211 treating garbage values as valid transmit counters
(like tx retries for example) and accounting them as real statistics that
makes their way to userland via station dump.
The same issue was resolved in ath12k by commit 9903c0986f ("wifi:
ath12k: Add memset and update default rate value in wmi tx completion")
Tested-on: QCN9074 PCI WLAN.HK.2.9.0.1-01977-QCAHKSWPL_SILICONZ-1
Fixes: d5c65159f2 ("ath11k: driver for Qualcomm IEEE 802.11ax devices")
Signed-off-by: Nicolas Escande <nico.escande@gmail.com>
Reviewed-by: Vasanthakumar Thiagarajan <vasanthakumar.thiagarajan@oss.qualcomm.com>
Reviewed-by: Baochen Qiang <baochen.qiang@oss.qualcomm.com>
Link: https://patch.msgid.link/20251104083957.717825-1-nico.escande@gmail.com
Signed-off-by: Jeff Johnson <jeff.johnson@oss.qualcomm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit ed80cc4667 ]
The Logitech G502 Hero Wireless's high resolution scrolling resets after
being unplugged without notifying the driver, causing extremely slow
scrolling.
The only indication of this is a battery update packet, so add a quirk to
detect when the device is unplugged and re-enable the scrolling.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=218037
Signed-off-by: Stuart Hayhurst <stuart.a.hayhurst@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 437c23357d ]
There might be many reasons why a user is resizing a ring, e.g. moving
to huge pages or for some memory compaction using IORING_SETUP_NO_MMAP.
Don't bypass resizing, the user will definitely be surprised seeing 0
while the rings weren't actually moved to a new place.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 82ebecdc74 ]
We found an infinite loop bug in the exFAT file system that can lead to a
Denial-of-Service (DoS) condition. When a dentry in an exFAT filesystem is
malformed, the following system calls — SYS_openat, SYS_ftruncate, and
SYS_pwrite64 — can cause the kernel to hang.
Root cause analysis shows that the size validation code in exfat_find()
does not check whether dentry.stream.valid_size is negative. As a result,
the system calls mentioned above can succeed and eventually trigger the DoS
issue.
This patch adds a check for negative dentry.stream.valid_size to prevent
this vulnerability.
Co-developed-by: Seunghun Han <kkamagui@gmail.com>
Signed-off-by: Seunghun Han <kkamagui@gmail.com>
Co-developed-by: Jihoon Kwon <jimmyxyz010315@gmail.com>
Signed-off-by: Jihoon Kwon <jimmyxyz010315@gmail.com>
Signed-off-by: Jaehun Gou <p22gone@gmail.com>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 1141ed5234 ]
This patch adds ALWAYS_POLL quirk for the VRS R295 steering wheel joystick.
This device reboots itself every 8-10 seconds if it is not polled.
Signed-off-by: Oleg Makarenko <oleg@makarenk.ooo>
Signed-off-by: Jiri Kosina <jkosina@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 9ff022f382 ]
I noticed xfstests generic/193 and generic/355 started failing against
knfsd after commit e7a8ebc305 ("NFSD: Offer write delegation for OPEN
with OPEN4_SHARE_ACCESS_WRITE").
I ran those same tests against ONTAP (which has had write delegation
support for a lot longer than knfsd) and they fail there too... so
while it's a new failure against knfsd, it isn't an entirely new
failure.
Add the NFS_INO_REVAL_FORCED flag so that the presence of a delegation
doesn't keep the inode from being revalidated to fetch the updated mode.
Signed-off-by: Scott Mayhew <smayhew@redhat.com>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b73bc6a51f ]
Some third-party controllers, such as the PB Tails CHOC, won't always
respond quickly on startup. Since this packet is needed for probe, and only
once during probe, let's just wait an extra second, which makes connecting
consistent.
Signed-off-by: Vicki Pfau <vi@endrift.com>
Signed-off-by: Jiri Kosina <jkosina@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 0be4253bf8 ]
The Cooler Master Mice Dongle includes a vendor defined HID interface
alongside its mouse interface. Not polling it will cause the mouse to
stop responding to polls on any interface once woken up again after
going into power saving mode.
Add the HID_QUIRK_ALWAYS_POLL quirk alongside the Cooler Master VID and
the Dongle's PID.
Signed-off-by: Tristan Lobb <tristan.lobb@it-lobb.de>
Signed-off-by: Jiri Kosina <jkosina@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 7a84394f02 ]
The setting of delay_retrans is applied to synchronous RPC operations
because the retransmit count is stored in same struct nfs4_exception
that is passed each time an error is checked. However, for asynchronous
operations (READ, WRITE, LOCKU, CLOSE, DELEGRETURN), a new struct
nfs4_exception is made on the stack each time the task callback is
invoked. This means that the retransmit count is always zero and thus
delay_retrans never takes effect.
Apply delay_retrans to these operations by tracking and updating their
retransmit count.
Change-Id: Ieb33e046c2b277cb979caa3faca7f52faf0568c9
Signed-off-by: Joshua Watt <jpewhacker@gmail.com>
Reviewed-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 9bb3baa9d1 ]
Since the last renewal time was initialized to 0 and jiffies start
counting at -5 minutes, any clients connected in the first 5 minutes
after a reboot would have their renewal timer set to a very long
interval. If the connection was idle, this would result in the client
state timing out on the server and the next call to the server would
return NFS4ERR_BADSESSION.
Fix this by initializing the last renewal time to the current jiffies
instead of 0.
Signed-off-by: Joshua Watt <jpewhacker@gmail.com>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 883f309add ]
Previously, APU platforms (and other scenarios with uninitialized VRAM managers)
triggered a NULL pointer dereference in `ttm_resource_manager_usage()`. The root
cause is not that the `struct ttm_resource_manager *man` pointer itself is NULL,
but that `man->bdev` (the backing device pointer within the manager) remains
uninitialized (NULL) on APUs—since APUs lack dedicated VRAM and do not fully
set up VRAM manager structures. When `ttm_resource_manager_usage()` attempts to
acquire `man->bdev->lru_lock`, it dereferences the NULL `man->bdev`, leading to
a kernel OOPS.
1. **amdgpu_cs.c**: Extend the existing bandwidth control check in
`amdgpu_cs_get_threshold_for_moves()` to include a check for
`ttm_resource_manager_used()`. If the manager is not used (uninitialized
`bdev`), return 0 for migration thresholds immediately—skipping VRAM-specific
logic that would trigger the NULL dereference.
2. **amdgpu_kms.c**: Update the `AMDGPU_INFO_VRAM_USAGE` ioctl and memory info
reporting to use a conditional: if the manager is used, return the real VRAM
usage; otherwise, return 0. This avoids accessing `man->bdev` when it is
NULL.
3. **amdgpu_virt.c**: Modify the vf2pf (virtual function to physical function)
data write path. Use `ttm_resource_manager_used()` to check validity: if the
manager is usable, calculate `fb_usage` from VRAM usage; otherwise, set
`fb_usage` to 0 (APUs have no discrete framebuffer to report).
This approach is more robust than APU-specific checks because it:
- Works for all scenarios where the VRAM manager is uninitialized (not just APUs),
- Aligns with TTM's design by using its native helper function,
- Preserves correct behavior for discrete GPUs (which have fully initialized
`man->bdev` and pass the `ttm_resource_manager_used()` check).
v4: use ttm_resource_manager_used(&adev->mman.vram_mgr.manager) instead of checking the adev->gmc.is_app_apu flag (Christian)
Reviewed-by: Christian König <christian.koenig@amd.com>
Suggested-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit ee70bacef1 ]
The interrupt handler offloads the microphone detection logic to
nau8821_jdet_work(), which implies a sleep operation. However, before
being able to process any subsequent hotplug event, the interrupt
handler needs to wait for any prior scheduled work to complete.
Move the sleep out of jdet_work by converting it to a delayed work.
This eliminates the undesired blocking in the interrupt handler when
attempting to cancel a recently scheduled work item and should help
reducing transient input reports that might confuse user-space.
Signed-off-by: Cristian Ciocaltea <cristian.ciocaltea@collabora.com>
Link: https://patch.msgid.link/20251003-nau8821-jdet-fixes-v1-5-f7b0e2543f09@collabora.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit d90ad28e8a ]
These syscalls call to vfs_fileattr_get/set functions which return
ENOIOCTLCMD if filesystem doesn't support setting file attribute on an
inode. For syscalls EOPNOTSUPP would be more appropriate return error.
Signed-off-by: Andrey Albershteyn <aalbersh@kernel.org>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 69a8b62a7a ]
Similar to the ARM64 commit 3505f30fb6a9s ("ARM64 / ACPI: If we chose
to boot from acpi then disable FDT"), let's not do DT hardware probing
if ACPI is enabled in early boot. This avoids errors caused by
repeated driver probing.
Signed-off-by: Han Gao <rabenda.cn@gmail.com>
Link: https://lore.kernel.org/r/20250910112401.552987-1-rabenda.cn@gmail.com
[pjw@kernel.org: cleaned up patch description and subject]
Signed-off-by: Paul Walmsley <pjw@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit ae9e9f3d67 ]
openSBI v1.7 adds harts checks for ipi operations. Especially it
adds comparison between hmask passed as an argument from linux
and mask of online harts (from openSBI side). If they don't
fit each other the error occurs.
When cpu is offline, cpu_online_mask is explicitly cleared in
__cpu_disable. However, there is no explicit clearing of
mm_cpumask. mm_cpumask is used for rfence operations that
call openSBI RFENCE extension which uses ipi to remote harts.
If hart is offline there may be error if mask of linux is not
as mask of online harts in openSBI.
this patch adds explicit clearing of mm_cpumask for offline hart.
Signed-off-by: Danil Skrebenkov <danil.skrebenkov@cloudbear.ru>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Link: https://lore.kernel.org/r/20250919132849.31676-1-danil.skrebenkov@cloudbear.ru
[pjw@kernel.org: rewrote subject line for clarity]
Signed-off-by: Paul Walmsley <pjw@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 9818af18db ]
Per Nathan, clang catches unused "static inline" functions in C files
since commit 6863f5643d ("kbuild: allow Clang to find unused static
inline functions for W=1 build").
Linus said:
> So I entirely ignore W=1 issues, because I think so many of the extra
> warnings are bogus.
>
> But if this one in particular is causing more problems than most -
> some teams do seem to use W=1 as part of their test builds - it's fine
> to send me a patch that just moves bad warnings to W=2.
>
> And if anybody uses W=2 for their test builds, that's THEIR problem..
Here is the change to bump the warning from W=1 to W=2.
Fixes: 6863f5643d ("kbuild: allow Clang to find unused static inline functions for W=1 build")
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://patch.msgid.link/20251106105000.2103276-1-andriy.shevchenko@linux.intel.com
[nathan: Adjust comment as well]
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 0ec364c0c9 ]
Since commit a166563e7e ("arm64: mm: support large block mapping when
rodata=full"), __change_memory_common has more chance to fail due to
memory allocation failure when splitting page table. So check the return
value of set_memory_rox(), then bail out if it fails otherwise we may have
RW memory mapping for kprobes insn page.
Fixes: 195a1b7d83 ("arm64: kprobes: call set_memory_rox() for kprobe page")
Reviewed-by: Ryan Roberts <ryan.roberts@arm.com>
Reviewed-by: Dev Jain <dev.jain@arm.com>
Signed-off-by: Yang Shi <yang@os.amperecomputing.com>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 7bdd91abf0 ]
Enabling ASPM causes randoms hangs on Tahiti and Oland on Zen4.
It's unclear if this is a platform-specific or GPU-specific issue.
Disable ASPM on SI for the time being.
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 5c05bcf6ae ]
On various SI GPUs, a flickering can be observed near the bottom
edge of the screen when using a single 4K 60Hz monitor over DP.
Disabling MCLK switching works around this problem.
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 9d73b107a6 ]
This commit is necessary for DC to function well with chips
that use the legacy power management code, ie. SI and KV.
Communicate display information from DC to the legacy PM code.
Currently DC uses pm_display_cfg to communicate power management
requirements from the display code to the DPM code.
However, the legacy (non-DC) code path used different fields
and therefore could not take into account anything from DC.
Change the legacy display code to fill the same pm_display_cfg
struct as DC and use the same in the legacy DPM code.
To ease review and reduce churn, this commit does not yet
delete the now unneeded code, that is done in the next commit.
v2:
Rebase.
Fix single_display in amdgpu_dpm_pick_power_state.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b515dcb0dc ]
This commit adds the pixel_clock field to the display config
struct so that power management (DPM) can use it.
We currently don't have a proper bandwidth calculation on old
GPUs with DCE 6-10 because dce_calcs only supports DCE 11+.
So the power management (DPM) on these GPUs may need to make
ad-hoc decisions for display based on the pixel clock.
Also rename sym_clock to pixel_clock in dm_pp_single_disp_config
to avoid confusion with other code where the sym_clock refers to
the DisplayPort symbol clock.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b09cb2996c ]
commit c760bcda83 ("drm/amd: Check whether secure display TA loaded
successfully") attempted to fix extra messages, but failed to port the
cleanup that was in commit 5c6d52ff4b ("drm/amd: Don't try to enable
secure display TA multiple times") to prevent multiple tries.
Add that to the failure handling path even on a quick failure.
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4679
Fixes: c760bcda83 ("drm/amd: Check whether secure display TA loaded successfully")
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 4104c0a454f6a4d1e0d14895d03c0e7bdd0c8240)
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 4cb5ac2626 ]
Shrikanth noted that the per-cpu reference counter was still some 10%
slower than the old immutable option (which removes the reference
counting entirely).
Further optimize the per-cpu reference counter by:
- switching from RCU to preempt;
- using __this_cpu_*() since we now have preempt disabled;
- switching from smp_load_acquire() to READ_ONCE().
This is all safe because disabling preemption inhibits the RCU grace
period exactly like rcu_read_lock().
Having preemption disabled allows using __this_cpu_*() provided the
only access to the variable is in task context -- which is the case
here.
Furthermore, since we know changing fph->state to FR_ATOMIC demands a
full RCU grace period we can rely on the implied smp_mb() from that to
replace the acquire barrier().
This is very similar to the percpu_down_read_internal() fast-path.
The reason this is significant for PowerPC is that it uses the generic
this_cpu_*() implementation which relies on local_irq_disable() (the
x86 implementation relies on it being a single memop instruction to be
IRQ-safe). Switching to preempt_disable() and __this_cpu*() avoids
this IRQ state swizzling. Also, PowerPC needs LWSYNC for the ACQUIRE
barrier, not having to use explicit barriers safes a bunch.
Combined this reduces the performance gap by half, down to some 5%.
Fixes: 760e6f7bef ("futex: Remove support for IMMUTABLE")
Reported-by: Shrikanth Hegde <sshegde@linux.ibm.com>
Tested-by: Shrikanth Hegde <sshegde@linux.ibm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Link: https://patch.msgid.link/20251106092929.GR4067720@noisy.programming.kicks-ass.net
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b540de9e3b ]
Fix refcount leak in `smb2_set_path_attr` when path conversion fails.
Function `cifs_get_writable_path` returns `cfile` with its reference
counter `cfile->count` increased on success. Function `smb2_compound_op`
would decrease the reference counter for `cfile`, as stated in its
comment. By calling `smb2_rename_path`, the reference counter of `cfile`
would leak if `cifs_convert_path_to_utf16` fails in `smb2_set_path_attr`.
Fixes: 8de9e86c67 ("cifs: create a helper to find a writeable handle by path name")
Acked-by: Henrique Carvalho <henrique.carvalho@suse.com>
Signed-off-by: Shuhao Fu <sfual@cse.ust.hk>
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 3362692fea ]
commit 978fa2f6d0 ("drm/amd/display: Use scaling for non-native
resolutions on eDP") started using the GPU scaler hardware to scale
when a non-native resolution was picked on eDP. This scaling was done
to fill the screen instead of maintain aspect ratio.
The idea was supposed to be that if a different scaling behavior is
preferred then the compositor would request it. The not following
aspect ratio behavior however isn't desirable, so adjust it to follow
aspect ratio and still try to fill screen.
Note: This will lead to black bars in some cases for non-native
resolutions. Compositors can request the previous behavior if desired.
Fixes: 978fa2f6d0 ("drm/amd/display: Use scaling for non-native resolutions on eDP")
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4538
Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 825df7ff4bb1a383ad4827545e09aec60d230770)
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 90b75e12a6 ]
These were not set so soft recovery was inadvertantly
disabled.
Fixes: 6ac55eab4f ("drm/amdgpu: move reset support type checks into the caller")
Reviewed-by: Jesse Zhang <Jesse.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 1972763505d728c604b537180727ec8132e619df)
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 7d44ad6b43 ]
When tick values are large, the multiplication by NSEC_PER_SEC is larger
than 64 bits and results in bad conversions.
The issue is seen in PMU busyness counters that look like they have
wrapped around due to bad conversion. i915 PMU implementation returns
monotonically increasing counters. If a count is lesser than previous
one, it will only return the larger value until the smaller value
catches up. The user will see this as zero delta between two
measurements even though the engines are busy.
Fix it by using mul_u64_u32_div()
Fixes: 77cdd054dd ("drm/i915/pmu: Connect engine busyness stats from GuC to pmu")
Closes: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/14955
Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Link: https://lore.kernel.org/r/20251016000350.1152382-2-umesh.nerlige.ramappa@intel.com
(cherry picked from commit 2ada9cb1df3f5405a01d013b708b1b0914efccfe)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
[Rodrigo: Added the Fixes tag while cherry-picking to fixes]
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit 11ae737efe upstream.
Fix a verification failure. filter_udphdr() calls bpf_xdp_pull_data(),
which will invalidate all pkt pointers. Therefore, all ctx->data loaded
before filter_udphdr() cannot be used. Reload it to prevent verification
errors.
The error may not appear on some compiler versions if they decide to
load ctx->data after filter_udphdr() when it is first used.
Fixes: efec2e55bd ("selftests: drv-net: Pull data before parsing headers")
Signed-off-by: Amery Hung <ameryhung@gmail.com>
Acked-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://patch.msgid.link/20250925161452.1290694-1-ameryhung@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit fdc93beead upstream.
[Why & How]
This fixes the black screen issue on certain APUs with HDMI,
accompanied by the following messages:
amdgpu 0000:c4:00.0: amdgpu: [drm] Failed to setup vendor info
frame on connector DP-1: -22
amdgpu 0000:c4:00.0: [drm] Cannot find any crtc or sizes [drm]
Cannot find any crtc or sizes
Fixes: 489f0f600c ("drm/amd/display: Fix DVI-D/HDMI adapters")
Suggested-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Ray Wu <ray.wu@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 678c901443a6d2e909e3b51331a20f9d8f84ce82)
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 118800b079 upstream.
Reject modes with a pixel clock higher than the maximum display
clock. Use 400 MHz as a fallback value when the maximum display
clock is not known. Pixel clocks that are higher than the display
clock just won't work and are not supported.
With the addition of the YUV422 fallback, DC can now accidentally
select a mode requiring higher pixel clock than actually supported
when the DP version supports the required bandwidth but the clock
is otherwise too high for the display engine. DCE 6-10 don't
support these modes but they don't have a bandwidth calculation
to reject them properly.
Fixes: db291ed173 ("drm/amd/display: Add fallback path for YCBCR422")
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 38ab33dbea upstream.
Align the function headers for `amdgpu_max_hdmi_pixel_clock` and
`amdgpu_connector_dvi_mode_valid` with the function implementations so
they match the expected kdoc style.
Fixes the below:
drivers/gpu/drm/amd/amdgpu/amdgpu_connectors.c:1199: warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst
* Returns the maximum supported HDMI (TMDS) pixel clock in KHz.
drivers/gpu/drm/amd/amdgpu/amdgpu_connectors.c:1212: warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst
* Validates the given display mode on DVI and HDMI connectors.
Fixes: 585b2f685c ("drm/amdgpu: Respect max pixel clock for HDMI and DVI-D (v2)")
Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a26a6c93ed upstream.
After commit d50f210913 ("kbuild: align modinfo section for Secureboot
Authenticode EDK2 compat"), running modules_install with certain
versions of kmod (such as 29.1 in Ubuntu Jammy) in certain
configurations may fail with:
depmod: ERROR: kmod_builtin_iter_next: unexpected string without modname prefix
The additional padding bytes to ensure .modinfo is aligned within
vmlinux.unstripped are unexpected by kmod, as this section has always
just been null-terminated strings.
Strip the trailing padding bytes from modules.builtin.modinfo after it
has been extracted from vmlinux.unstripped to restore the format that
kmod expects while keeping .modinfo aligned within vmlinux.unstripped to
avoid regressing the Authenticode calculation fix for EDK2.
Cc: stable@vger.kernel.org
Fixes: d50f210913 ("kbuild: align modinfo section for Secureboot Authenticode EDK2 compat")
Reported-by: Omar Sandoval <osandov@fb.com>
Reported-by: Samir M <samir@linux.ibm.com>
Reported-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
Closes: https://lore.kernel.org/7fef7507-ad64-4e51-9bb8-c9fb6532e51e@linux.ibm.com/
Tested-by: Omar Sandoval <osandov@fb.com>
Tested-by: Samir M <samir@linux.ibm.com>
Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
Reviewed-by: Nicolas Schier <nsc@kernel.org>
Link: https://patch.msgid.link/20251105-kbuild-fix-builtin-modinfo-for-kmod-v1-1-b419d8ad4606@kernel.org
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
[nathan: Apply to scripts/Makefile.vmlinux_o, location of
modules.builtin.modinfo rule prior to 39cfd5b121]
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 543d350040 upstream.
Commit 4d330fe541 ("ACPI: SPCR: Support Precise Baud Rate field")
added support to use the precise baud rate available since SPCR 1.09
(revision 4) but failed to check the version of the table provided by
the firmware.
Accessing an older version of SPCR table causes accesses beyond the
end of the table and can lead to garbage data to be used for the baud
rate.
Check the version of the firmware provided SPCR to ensure that the
precise baudrate is vaild before using it.
Fixes: 4d330fe541 ("ACPI: SPCR: Support Precise Baud Rate field")
Signed-off-by: Punit Agrawal <punit.agrawal@oss.qualcomm.com>
Link: https://patch.msgid.link/20251024123125.1081612-1-punit.agrawal@oss.qualcomm.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7ddb711b6e upstream.
Optimize the time consumption of profile switching, init_profile saves
the common settings of different profiles, such as the dsp coefficients,
etc, which can greatly reduce the profile switching time comsumption and
remove the repetitive settings.
Fixes: e83dcd139e ("ASoC: tas2781: Add keyword "init" in profile section")
Signed-off-by: Shenghao Ding <shenghao-ding@ti.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a2b32bc1d9 upstream.
After DME Link Startup, the error return value is set to the MIPI UniPro
GenericErrorCode which can be 0 (SUCCESS) or 1 (FAILURE). Upon failure
during driver probe, the error code 1 is propagated back to the driver
probe function which must return a negative value to indicate an error,
but 1 is not negative, so the probe is considered to be successful even
though it failed. Subsequently, removing the driver results in an oops
because it is not in a valid state.
This happens because none of the callers of ufshcd_init() expect a
non-negative error code.
Fix the return value and documentation to match actual usage.
Fixes: 69f5eb78d4 ("scsi: ufs: core: Move the ufshcd_device_init(hba, true) call")
Cc: stable@vger.kernel.org
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Link: https://patch.msgid.link/20251024085918.31825-5-adrian.hunter@intel.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d34caa89a1 upstream.
ufshcd_link_startup() has a facility (link_startup_again) to issue
DME_LINKSTARTUP a 2nd time even though the 1st time was successful.
Some older hardware benefits from that, however the behaviour is
non-standard, and has been found to cause link startup to be unreliable
for some Intel Alder Lake based host controllers.
Add UFSHCD_QUIRK_PERFORM_LINK_STARTUP_ONCE to suppress
link_startup_again, in preparation for setting the quirk for affected
controllers.
Fixes: 7dc9fb47bc ("scsi: ufs: ufs-pci: Add support for Intel ADL")
Cc: stable@vger.kernel.org
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Link: https://patch.msgid.link/20251024085918.31825-3-adrian.hunter@intel.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit bb44826c3b upstream.
Intel platforms with UFS, can support Suspend-to-Idle (S0ix) and
Suspend-to-RAM (S3). For S0ix the link state should be HIBERNATE. For
S3, state is lost, so the link state must be OFF. Driver policy,
expressed by spm_lvl, can be 3 (link HIBERNATE, device SLEEP) for S0ix
but must be changed to 5 (link OFF, device POWEROFF) for S3.
Fix support for S0ix/S3 by switching spm_lvl as needed. During suspend
->prepare(), if the suspend target state is not Suspend-to-Idle, ensure
the spm_lvl is at least 5 to ensure that resume will be possible from
deep sleep states. During suspend ->complete(), restore the spm_lvl to
its original value that is suitable for S0ix.
This fix is first needed in Intel Alder Lake based controllers.
Fixes: 7dc9fb47bc ("scsi: ufs: ufs-pci: Add support for Intel ADL")
Cc: stable@vger.kernel.org
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Link: https://patch.msgid.link/20251024085918.31825-2-adrian.hunter@intel.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c3838262b8 upstream.
Changing alignment of header would mean it's no longer safe to cast a
2 byte aligned pointer between formats. Use two 16 bit fields to make
it 2 byte aligned as previously.
This fixes the performance regression since
commit ("virtio_net: enable gso over UDP tunnel support.") as it uses
virtio_net_hdr_v1_hash_tunnel which embeds
virtio_net_hdr_v1_hash. Pktgen in guest + XDP_DROP on TAP + vhost_net
shows the TX PPS is recovered from 2.4Mpps to 4.45Mpps.
Fixes: 56a06bd40f ("virtio_net: enable gso over UDP tunnel support.")
Cc: stable@vger.kernel.org
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Link: https://patch.msgid.link/20251031060551.126-1-jasowang@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0c71670396 upstream.
Since commit 4959aebba8 ("virtio-net: use mtu size as buffer length
for big packets"), when guest gso is off, the allocated size for big
packets is not MAX_SKB_FRAGS * PAGE_SIZE anymore but depends on
negotiated MTU. The number of allocated frags for big packets is stored
in vi->big_packets_num_skbfrags.
Because the host announced buffer length can be malicious (e.g. the host
vhost_net driver's get_rx_bufs is modified to announce incorrect
length), we need a check in virtio_net receive path. Currently, the
check is not adapted to the new change which can lead to NULL page
pointer dereference in the below while loop when receiving length that
is larger than the allocated one.
This commit fixes the received length check corresponding to the new
change.
Fixes: 4959aebba8 ("virtio-net: use mtu size as buffer length for big packets")
Cc: stable@vger.kernel.org
Signed-off-by: Bui Quang Minh <minhquangbui99@gmail.com>
Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Link: https://patch.msgid.link/20251030144438.7582-1-minhquangbui99@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 664ce10246 upstream.
8 and 16 bit formats use a different layout on
GB20x than they did on prior chips. Add the
corresponding DRM format modifiers to the list of
modifiers supported by the display engine on such
chips, and filter the supported modifiers for each
format based on its bytes per pixel in
nv50_plane_format_mod_supported().
Note this logic will need to be updated when GB10
support is added, since it is a GB20x chip that
uses the pre-GB20x sector layout for all formats.
Fixes: 6cc6e08d45 ("drm/nouveau/kms: add support for GB20x")
Signed-off-by: James Jones <jajones@nvidia.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Cc: stable@vger.kernel.org
Link: https://patch.msgid.link/20251030181153.1208-3-jajones@nvidia.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1cf52a0d4b upstream.
The layout of bits within the individual tiles
(referred to as sectors in the
DRM_FORMAT_MOD_NVIDIA_BLOCK_LINEAR_2D() macro)
changed for 8 and 16-bit surfaces starting in
Blackwell 2 GPUs (With the exception of GB10).
To denote the difference, extend the sector field
in the parametric format modifier definition used
to generate modifier values for NVIDIA hardware.
Without this change, it would be impossible to
differentiate the two layouts based on modifiers,
and as a result software could attempt to share
surfaces directly between pre-GB20x and GB20x
cards, resulting in corruption when the surface
was accessed on one of the GPUs after being
populated with content by the other.
Of note: This change causes the
DRM_FORMAT_MOD_NVIDIA_BLOCK_LINEAR_2D() macro to
evaluate its "s" parameter twice, with the side
effects that entails. I surveyed all usage of the
modifier in the kernel and Mesa code, and that
does not appear to be problematic in any current
usage, but I thought it was worth calling out.
Fixes: 6cc6e08d45 ("drm/nouveau/kms: add support for GB20x")
Signed-off-by: James Jones <jajones@nvidia.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Cc: stable@vger.kernel.org
Link: https://patch.msgid.link/20251030181153.1208-2-jajones@nvidia.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8d7bba1e83 upstream.
I think there are several things wrong with this function:
A) xfs_bmapi_write can return a much larger unwritten mapping than what
the caller asked for. We convert part of that range to written, but
return the entire written mapping to iomap even though that's
inaccurate.
B) The arguments to xfs_reflink_convert_cow_locked are wrong -- an
unwritten mapping could be *smaller* than the write range (or even
the hole range). In this case, we convert too much file range to
written state because we then return a smaller mapping to iomap.
C) It doesn't handle delalloc mappings. This I covered in the patch
that I already sent to the list.
D) Reassigning count_fsb to handle the hole means that if the second
cmap lookup attempt succeeds (due to racing with someone else) we
trim the mapping more than is strictly necessary. The changing
meaning of count_fsb makes this harder to notice.
E) The tracepoint is kinda wrong because @length is mutated. That makes
it harder to chase the data flows through this function because you
can't just grep on the pos/bytecount strings.
F) We don't actually check that the br_state = XFS_EXT_NORM assignment
is accurate, i.e that the cow fork actually contains a written
mapping for the range we're interested in
G) Somewhat inadequate documentation of why we need to xfs_trim_extent
so aggressively in this function.
H) Not sure why xfs_iomap_end_fsb is used here, the vfs already clamped
the write range to s_maxbytes.
Fix these issues, and then the atomic writes regressions in generic/760,
generic/617, generic/091, generic/263, and generic/521 all go away for
me.
Cc: stable@vger.kernel.org # v6.16
Fixes: bd1d2c21d5 ("xfs: add xfs_atomic_write_cow_iomap_begin()")
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: John Garry <john.g.garry@oracle.com>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8d54eacd82 upstream.
With the 20 Oct 2025 release of fstests, generic/521 fails for me on
regular (aka non-block-atomic-writes) storage:
QA output created by 521
dowrite: write: Input/output error
LOG DUMP (8553 total operations):
1( 1 mod 256): SKIPPED (no operation)
2( 2 mod 256): WRITE 0x7e000 thru 0x8dfff (0x10000 bytes) HOLE
3( 3 mod 256): READ 0x69000 thru 0x79fff (0x11000 bytes)
4( 4 mod 256): FALLOC 0x53c38 thru 0x5e853 (0xac1b bytes) INTERIOR
5( 5 mod 256): COPY 0x55000 thru 0x59fff (0x5000 bytes) to 0x25000 thru 0x29fff
6( 6 mod 256): WRITE 0x74000 thru 0x88fff (0x15000 bytes)
7( 7 mod 256): ZERO 0xedb1 thru 0x11693 (0x28e3 bytes)
with a warning in dmesg from iomap about XFS trying to give it a
delalloc mapping for a directio write. Fix the software atomic write
iomap_begin code to convert the reservation into a written mapping.
This doesn't fix the data corruption problems reported by generic/760,
but it's a start.
Cc: stable@vger.kernel.org # v6.16
Fixes: bd1d2c21d5 ("xfs: add xfs_atomic_write_cow_iomap_begin()")
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: John Garry <john.g.garry@oracle.com>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0a4b61d9c2 upstream.
Recent AMD node rework removed the "search and count" method of caching AMD
root devices. This depended on the value from a Data Fabric register that was
expected to hold the PCI bus of one of the root devices attached to that
fabric.
However, this expectation is incorrect. The register, when read from PCI
config space, returns the bitwise-OR of the buses of all attached root
devices.
This behavior is benign on AMD reference design boards, since the bus numbers
are aligned. This results in a bitwise-OR value matching one of the buses. For
example, 0x00 | 0x40 | 0xA0 | 0xE0 = 0xE0.
This behavior breaks on boards where the bus numbers are not exactly aligned.
For example, 0x00 | 0x07 | 0xE0 | 0x15 = 0x1F.
The examples above are for AMD node 0. The first root device on other nodes
will not be 0x00. The first root device for other nodes will depend on the
total number of root devices, the system topology, and the specific PCI bus
number assignment.
For example, a system with 2 AMD nodes could have this:
Node 0 : 0x00 0x07 0x0e 0x15
Node 1 : 0x1c 0x23 0x2a 0x31
The bus numbering style in the reference boards is not a requirement. The
numbering found in other boards is not incorrect. Therefore, the root device
caching method needs to be adjusted.
Go back to the "search and count" method used before the recent rework.
Search for root devices using PCI class code rather than fixed PCI IDs.
This keeps the goal of the rework (remove dependency on PCI IDs) while being
able to support various board designs.
Merge helper functions to reduce code duplication.
[ bp: Reflow comment. ]
Fixes: 40a5f6ffdf ("x86/amd_nb: Simplify root device search")
Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Cc: stable@vger.kernel.org
Link: https://patch.msgid.link/all/20251028-fix-amd-root-v2-1-843e38f8be2c@amd.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 734e99623c upstream.
find_or_create_cached_dir() could grab a new reference after kref_put()
had seen the refcount drop to zero but before cfid_list_lock is acquired
in smb2_close_cached_fid(), leading to use-after-free.
Switch to kref_put_lock() so cfid_release() is called with
cfid_list_lock held, closing that gap.
Fixes: ebe98f1447 ("cifs: enable caching of directories for which a lease is held")
Cc: stable@vger.kernel.org
Reported-by: Jay Shin <jaeshin@redhat.com>
Reviewed-by: Paulo Alcantara (Red Hat) <pc@manguebit.org>
Signed-off-by: Henrique Carvalho <henrique.carvalho@suse.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4012abe8a7 upstream.
SMB2_change_notify called smb2_validate_iov() but ignored the return
code, then kmemdup()ed using server provided OutputBufferOffset/Length.
Check the return of smb2_validate_iov() and bail out on error.
Discovered with help from the ZeroPath security tooling.
Signed-off-by: Joshua Rogers <linux@joshua.hu>
Reviewed-by: Paulo Alcantara (Red Hat) <pc@manguebit.org>
Cc: stable@vger.kernel.org
Fixes: e3e9463414 ("smb3: improve SMB3 change notification support")
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit fd9f30d103 upstream.
Guenter Roeck reported this kernel crash on his emulated B160L machine:
Starting network: udhcpc: started, v1.36.1
Backtrace:
[<104320d4>] unwind_once+0x1c/0x5c
[<10434a00>] walk_stackframe.isra.0+0x74/0xb8
[<10434a6c>] arch_stack_walk+0x28/0x38
[<104e5efc>] stack_trace_save+0x48/0x5c
[<105d1bdc>] set_track_prepare+0x44/0x6c
[<105d9c80>] ___slab_alloc+0xfc4/0x1024
[<105d9d38>] __slab_alloc.isra.0+0x58/0x90
[<105dc80c>] kmem_cache_alloc_noprof+0x2ac/0x4a0
[<105b8e54>] __anon_vma_prepare+0x60/0x280
[<105a823c>] __vmf_anon_prepare+0x68/0x94
[<105a8b34>] do_wp_page+0x8cc/0xf10
[<105aad88>] handle_mm_fault+0x6c0/0xf08
[<10425568>] do_page_fault+0x110/0x440
[<10427938>] handle_interruption+0x184/0x748
[<11178398>] schedule+0x4c/0x190
BUG: spinlock recursion on CPU#0, ifconfig/2420
lock: terminate_lock.2+0x0/0x1c, .magic: dead4ead, .owner: ifconfig/2420, .owner_cpu: 0
While creating the stack trace, the unwinder uses the stack pointer to guess
the previous frame to read the previous stack pointer from memory. The crash
happens, because the unwinder tries to read from unaligned memory and as such
triggers the unalignment trap handler which then leads to the spinlock
recursion and finally to a deadlock.
Fix it by checking the alignment before accessing the memory.
Reported-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Helge Deller <deller@gmx.de>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Cc: stable@vger.kernel.org # v6.12+
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c42458fcf5 upstream.
The current code directly overwrites the scratch pointer with the
return value of kvrealloc(). If kvrealloc() fails and returns NULL,
the original buffer becomes unreachable, causing a memory leak.
Fix this by using a temporary variable to store kvrealloc()'s return
value and only update the scratch pointer on success.
Found via static anlaysis and this is similar to commit 42378a9ca5
("bpf, verifier: Fix memory leak in array reallocation for stack state")
Fixes: be17c0df67 ("riscv: module: Optimize PLT/GOT entry counting")
Cc: stable@vger.kernel.org
Signed-off-by: Miaoqian Lin <linmq006@gmail.com>
Link: https://lore.kernel.org/r/20251026091912.39727-1-linmq006@gmail.com
Signed-off-by: Paul Walmsley <pjw@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8d59fba493 upstream.
In the parse_adv_monitor_pattern() function, the value of
the 'length' variable is currently limited to HCI_MAX_EXT_AD_LENGTH(251).
The size of the 'value' array in the mgmt_adv_pattern structure is 31.
If the value of 'pattern[i].length' is set in the user space
and exceeds 31, the 'patterns[i].value' array can be accessed
out of bound when copied.
Increasing the size of the 'value' array in
the 'mgmt_adv_pattern' structure will break the userspace.
Considering this, and to avoid OOB access revert the limits for 'offset'
and 'length' back to the value of HCI_MAX_AD_LENGTH.
Found by InfoTeCS on behalf of Linux Verification Center
(linuxtesting.org) with SVACE.
Fixes: db08722fc7 ("Bluetooth: hci_core: Fix missing instances using HCI_MAX_AD_LENGTH")
Cc: stable@vger.kernel.org
Signed-off-by: Ilia Gavrilov <Ilia.Gavrilov@infotecs.ru>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 487df8b698 upstream.
The Mesa issue referenced below pointed out a possible deadlock:
[ 1231.611031] Possible interrupt unsafe locking scenario:
[ 1231.611033] CPU0 CPU1
[ 1231.611034] ---- ----
[ 1231.611035] lock(&xa->xa_lock#17);
[ 1231.611038] local_irq_disable();
[ 1231.611039] lock(&fence->lock);
[ 1231.611041] lock(&xa->xa_lock#17);
[ 1231.611044] <Interrupt>
[ 1231.611045] lock(&fence->lock);
[ 1231.611047]
*** DEADLOCK ***
In this example, CPU0 would be any function accessing job->dependencies
through the xa_* functions that don't disable interrupts (eg:
drm_sched_job_add_dependency(), drm_sched_entity_kill_jobs_cb()).
CPU1 is executing drm_sched_entity_kill_jobs_cb() as a fence signalling
callback so in an interrupt context. It will deadlock when trying to
grab the xa_lock which is already held by CPU0.
Replacing all xa_* usage by their xa_*_irq counterparts would fix
this issue, but Christian pointed out another issue: dma_fence_signal
takes fence.lock and so does dma_fence_add_callback.
dma_fence_signal() // locks f1.lock
-> drm_sched_entity_kill_jobs_cb()
-> foreach dependencies
-> dma_fence_add_callback() // locks f2.lock
This will deadlock if f1 and f2 share the same spinlock.
To fix both issues, the code iterating on dependencies and re-arming them
is moved out to drm_sched_entity_kill_jobs_work().
Cc: stable@vger.kernel.org # v6.2+
Fixes: 2fdb8a8f07 ("drm/scheduler: rework entity flush, kill and fini")
Link: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13908
Reported-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
Suggested-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
[phasta: commit message nits]
Signed-off-by: Philipp Stanner <phasta@kernel.org>
Link: https://patch.msgid.link/20251104095358.15092-1-pierre-eric.pelloux-prayer@amd.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7ceba45a66 upstream.
The normal timer mechanism assume that timeout further in the future
need a lower accuracy. As an example, the granularity for a timer
scheduled 4096 ms in the future on a 1000 Hz system is already 512 ms.
This granularity is perfectly sufficient for e.g. timeouts, but there
are other types of events that will happen at a future point in time and
require a higher accuracy.
Add a new wiphy_hrtimer_work type that uses an hrtimer internally. The
API is almost identical to the existing wiphy_delayed_work and it can be
used as a drop-in replacement after minor adjustments. The work will be
scheduled relative to the current time with a slack of 1 millisecond.
CC: stable@vger.kernel.org # 6.4+
Signed-off-by: Benjamin Berg <benjamin.berg@intel.com>
Reviewed-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://patch.msgid.link/20251028125710.7f13a2adc5eb.I01b5af0363869864b0580d9c2a1770bafab69566@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3c6a743c69 upstream.
[Why]
drm_dp_mst_topology_queue_probe() is used under the assumption that
mst is already initialized. If we connect system with SST first
then switch to the mst branch during suspend, we will fail probing
topology by calling the wrong API since the mst manager is yet to
be initialized.
[How]
At dm_resume(), once it's detected as mst branc connected, check if
the mst is initialized already. If not, call
dm_helpers_dp_mst_start_top_mgr() instead to initialize mst
V2: Adjust the commit msg a bit
Fixes: bc068194f5 ("drm/amd/display: Don't write DP_MSTM_CTRL after LT")
Cc: Fangzhi Zuo <jerry.zuo@amd.com>
Cc: Mario Limonciello <mario.limonciello@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Tom Chung <chiahsuan.chung@amd.com>
Signed-off-by: Wayne Lin <Wayne.Lin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 62320fb8d91a0bddc44a228203cfa9bfbb5395bd)
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 80f0d631dc ]
The function create_field_var() allocates memory for 'val' through
create_hist_field() inside parse_atom(), and for 'var' through
create_var(), which in turn allocates var->type and var->var.name
internally. Simply calling kfree() to release these structures will
result in memory leaks.
Use destroy_hist_field() to properly free 'val', and explicitly release
the memory of var->type and var->var.name before freeing 'var' itself.
Link: https://patch.msgid.link/20251106120132.3639920-1-zilin@seu.edu.cn
Fixes: 02205a6752 ("tracing: Add support for 'field variables'")
Signed-off-by: Zilin Guan <zilin@seu.edu.cn>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 3534e03e0e ]
Sometimes VMs will have some intermittent dmesg warnings that are
unrelated to vsock. Change the dmesg parsing to filter on strings
containing 'vsock' to avoid false positive failures that are unrelated
to vsock. The downside is that it is possible for some vsock related
warnings to not contain the substring 'vsock', so those will be missed.
Fixes: a4a65c6fe0 ("selftests/vsock: add initial vmtest.sh for vsock")
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Bobby Eshleman <bobbyeshleman@meta.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Link: https://patch.msgid.link/20251105-vsock-vmtest-dmesg-fix-v2-1-1a042a14892c@meta.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 8dca36978a ]
syzbot reported[1] a use-after-free when deleting an expired fdb. It is
due to a race condition between learning still happening and a port being
deleted, after all its fdbs have been flushed. The port's state has been
toggled to disabled so no learning should happen at that time, but if we
have MST enabled, it will bypass the port's state, that together with VLAN
filtering disabled can lead to fdb learning at a time when it shouldn't
happen while the port is being deleted. VLAN filtering must be disabled
because we flush the port VLANs when it's being deleted which will stop
learning. This fix adds a check for the port's vlan group which is
initialized to NULL when the port is getting deleted, that avoids the port
state bypass. When MST is enabled there would be a minimal new overhead
in the fast-path because the port's vlan group pointer is cache-hot.
[1] https://syzkaller.appspot.com/bug?extid=dd280197f0f7ab3917be
Fixes: ec7328b591 ("net: bridge: mst: Multiple Spanning Tree (MST) mode")
Reported-by: syzbot+dd280197f0f7ab3917be@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/netdev/69088ffa.050a0220.29fc44.003d.GAE@google.com/
Signed-off-by: Nikolay Aleksandrov <razor@blackwall.org>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Link: https://patch.msgid.link/20251105111919.1499702-2-razor@blackwall.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 0216721ce7 ]
The following warning was seen when we try to connect using ssh to the device.
BUG: sleeping function called from invalid context at kernel/locking/mutex.c:575
in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 104, name: dropbear
preempt_count: 1, expected: 0
INFO: lockdep is turned off.
CPU: 0 UID: 0 PID: 104 Comm: dropbear Tainted: G W 6.18.0-rc2-00399-g6f1ab1b109b9-dirty #530 NONE
Tainted: [W]=WARN
Hardware name: Generic DT based system
Call trace:
unwind_backtrace from show_stack+0x10/0x14
show_stack from dump_stack_lvl+0x7c/0xac
dump_stack_lvl from __might_resched+0x16c/0x2b0
__might_resched from __mutex_lock+0x64/0xd34
__mutex_lock from mutex_lock_nested+0x1c/0x24
mutex_lock_nested from lan966x_stats_get+0x5c/0x558
lan966x_stats_get from dev_get_stats+0x40/0x43c
dev_get_stats from dev_seq_printf_stats+0x3c/0x184
dev_seq_printf_stats from dev_seq_show+0x10/0x30
dev_seq_show from seq_read_iter+0x350/0x4ec
seq_read_iter from seq_read+0xfc/0x194
seq_read from proc_reg_read+0xac/0x100
proc_reg_read from vfs_read+0xb0/0x2b0
vfs_read from ksys_read+0x6c/0xec
ksys_read from ret_fast_syscall+0x0/0x1c
Exception stack(0xf0b11fa8 to 0xf0b11ff0)
1fa0: 00000001 00001000 00000008 be9048d8 00001000 00000001
1fc0: 00000001 00001000 00000008 00000003 be905920 0000001e 00000000 00000001
1fe0: 0005404c be9048c0 00018684 b6ec2cd8
It seems that we are using a mutex in a atomic context which is wrong.
Change the mutex with a spinlock.
Fixes: 12c2d0a5b8 ("net: lan966x: add ethtool configuration and statistics")
Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://patch.msgid.link/20251105074955.1766792-1-horatiu.vultur@microchip.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 96baf482ca ]
KSZ9477/KSZ9897 and LAN937X families of switches use a reserved multicast
address table for some specific forwarding with some multicast addresses,
like the one used in STP. The hardware assumes the host port is the last
port in KSZ9897 family and port 5 in LAN937X family. Most of the time
this assumption is correct but not in other cases like KSZ9477.
Originally the function just setups the first entry, but the others still
need update, especially for one common multicast address that is used by
PTP operation.
LAN937x also uses different register bits when accessing the reserved
table.
Fixes: 457c182af5 ("net: dsa: microchip: generic access to ksz9477 static and reserved table")
Signed-off-by: Tristram Ha <tristram.ha@microchip.com>
Tested-by: Łukasz Majewski <lukma@nabladev.com>
Link: https://patch.msgid.link/20251105033741.6455-1-Tristram.Ha@microchip.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 4d6ec3a793 ]
The driver calls mfd_add_devices() but fails to call mfd_remove_devices()
in error paths after successful MFD device registration and in the remove
function. This leads to resource leaks where MFD child devices are not
properly unregistered.
Replace mfd_add_devices with devm_mfd_add_devices to automatically
manage the device resources.
Fixes: c96e976d9a ("net: wan: framer: Add support for the Lantiq PEF2256 framer")
Suggested-by: Herve Codina <herve.codina@bootlin.com>
Signed-off-by: Haotian Zhang <vulab@iscas.ac.cn>
Acked-by: Herve Codina <herve.codina@bootlin.com>
Link: https://patch.msgid.link/20251105034716.662-1-vulab@iscas.ac.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit d8a7ed9586 ]
The MLX5E_SHAMPO_WQ_HEADER_PER_PAGE and
MLX5E_SHAMPO_LOG_MAX_HEADER_ENTRY_SIZE macros are used directly in
several places under the assumption that there will always be more
headers per WQE than headers per page. However, this assumption doesn't
hold for 64K page sizes and higher MTUs (> 4K). This can be first
observed during header page allocation: ksm_entries will become 0 during
alignment to MLX5E_SHAMPO_WQ_HEADER_PER_PAGE.
This patch introduces 2 additional members to the mlx5e_shampo_hd struct
which are meant to be used instead of the macrose mentioned above.
When the number of headers per WQE goes below
MLX5E_SHAMPO_WQ_HEADER_PER_PAGE, clamp the number of headers per
page and expand the header size accordingly so that the headers
for one WQE cover a full page.
All the formulas are adapted to use these two new members.
Fixes: 945ca432bf ("net/mlx5e: SHAMPO, Drop info array")
Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/1762238915-1027590-4-git-send-email-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit bacd8d8018 ]
mlx5e_hw_gro_skb_has_enough_space() uses a formula to check if there is
enough space in the skb frags to store more data. This formula is
incorrect for 64K page sizes and it triggers early GRO session
termination because the first fragment will blow up beyond
GRO_LEGACY_MAX_SIZE.
This patch adds a special case for page sizes >= GRO_LEGACY_MAX_SIZE
(64K) which uses the skb->len instead. Within this context,
the check is safe from fragment overflow because the hardware
will continuously fill the data up to the reservation size of 64K
and the driver will coalesce all data from the same page to the same
fragment. This means that the data will span one fragment or at most
two for such a large page size.
It is expected that the if statement will be optimized out as the
check is done with constants.
Fixes: 92552d3abd ("net/mlx5e: HW_GRO cqe handler implementation")
Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/1762238915-1027590-3-git-send-email-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 665a7e13c2 ]
HW-GRO is broken on mlx5 for 64K page sizes. The patch in the fixes tag
didn't take into account larger page sizes when doing an align down
of max_ksm_entries. For 64K page size, max_ksm_entries is 0 which will skip
mapping header pages via WQE UMR. This breaks header-data split
and will result in the following syndrome:
mlx5_core 0000:00:08.0 eth2: Error cqe on cqn 0x4c9, ci 0x0, qn 0x1133, opcode 0xe, syndrome 0x4, vendor syndrome 0x32
00000000: 00 00 00 00 04 4a 00 00 00 00 00 00 20 00 93 32
00000010: 55 00 00 00 fb cc 00 00 00 00 00 00 07 18 00 00
00000020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 4a
00000030: 00 00 3b c7 93 01 32 04 00 00 00 00 00 00 bf e0
mlx5_core 0000:00:08.0 eth2: ERR CQE on RQ: 0x1133
Furthermore, the function that fills in WQE UMRs for the headers
(mlx5e_build_shampo_hd_umr()) only supports mapping page sizes that
fit in a single UMR WQE.
This patch goes back to the old non-aligned max_ksm_entries value and it
changes mlx5e_build_shampo_hd_umr() to support mapping a large page over
multiple UMR WQEs.
This means that mlx5e_build_shampo_hd_umr() can now leave a page only
partially mapped. The caller, mlx5e_alloc_rx_hd_mpwqe(), ensures that
there are enough UMR WQEs to cover complete pages by working on
ksm_entries that are multiples of MLX5E_SHAMPO_WQ_HEADER_PER_PAGE.
Fixes: 8a0ee54027 ("net/mlx5e: SHAMPO, Simplify UMR allocation for headers")
Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/1762238915-1027590-2-git-send-email-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit ae4789affd ]
The ICSSG driver does the initial FDB configuration which
includes setting the control registers. Other run time
management like learning is managed by the PRU's. The default
FDB hash size used by the firmware is 512 slots, which is
currently missing in the current driver. Update the driver
FDB config to include FDB hash size as well.
Please refer trm [1] 6.4.14.12.17 section on how the FDB config
register gets configured. From the table 6-1404, there is a reset
field for FDB_HAS_SIZE which is 4, meaning 1024 slots. Currently
the driver is not updating this reset value from 4(1024 slots) to
3(512 slots). This patch fixes this by updating the reset value
to 512 slots.
[1]: https://www.ti.com/lit/pdf/spruim2
Fixes: abd5576b9c ("net: ti: icssg-prueth: Add support for ICSSG switch firmware")
Signed-off-by: Meghana Malladi <m-malladi@ti.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20251104104415.3110537-1-m-malladi@ti.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 1fd5367391 ]
->nr_pages is int, it needs type extension before calculating the region
size.
Fixes: a90558b36c ("io_uring/memmap: helper for pinning region pages")
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
[axboe: style fixup]
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit c74619e760 ]
hwsim radios marked destroy_on_close are removed when the Netlink socket
that created them is closed. As the portid is not unique across network
namespaces, closing a socket in one namespace may remove radios in another
if it has the destroy_on_close flag set.
Instead of matching the network namespace, match the netgroup of the radio
to limit radio removal to those that have been created by the closing
Netlink socket. The netgroup of a radio identifies the network namespace
it was created in, and matching on it removes a destroy_on_close radio
even if it has been moved to another namespace.
Fixes: 100cb9ff40 ("mac80211_hwsim: Allow managing radios from non-initial namespaces")
Signed-off-by: Martin Willi <martin@strongswan.org>
Link: https://patch.msgid.link/20251103082436.30483-1-martin@strongswan.org
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 327c20c21d ]
Fix a AA deadlock in refill_skbs() where memory allocation while holding
skb_pool->lock can trigger a recursive lock acquisition attempt.
The deadlock scenario occurs when the system is under severe memory
pressure:
1. refill_skbs() acquires skb_pool->lock (spinlock)
2. alloc_skb() is called while holding the lock
3. Memory allocator fails and calls slab_out_of_memory()
4. This triggers printk() for the OOM warning
5. The console output path calls netpoll_send_udp()
6. netpoll_send_udp() attempts to acquire the same skb_pool->lock
7. Deadlock: the lock is already held by the same CPU
Call stack:
refill_skbs()
spin_lock_irqsave(&skb_pool->lock) <- lock acquired
__alloc_skb()
kmem_cache_alloc_node_noprof()
slab_out_of_memory()
printk()
console_flush_all()
netpoll_send_udp()
skb_dequeue()
spin_lock_irqsave(&skb_pool->lock) <- deadlock attempt
This bug was exposed by commit 248f6571fd ("netpoll: Optimize skb
refilling on critical path") which removed refill_skbs() from the
critical path (where nested printk was being deferred), letting nested
printk being called from inside refill_skbs()
Refactor refill_skbs() to never allocate memory while holding
the spinlock.
Another possible solution to fix this problem is protecting the
refill_skbs() from nested printks, basically calling
printk_deferred_{enter,exit}() in refill_skbs(), then, any nested
pr_warn() would be deferred.
I prefer this approach, given I _think_ it might be a good idea to move
the alloc_skb() from GFP_ATOMIC to GFP_KERNEL in the future, so, having
the alloc_skb() outside of the lock will be necessary step.
There is a possible TOCTOU issue when checking for the pool length, and
queueing the new allocated skb, but, this is not an issue, given that
an extra SKB in the pool is harmless and it will be eventually used.
Signed-off-by: Breno Leitao <leitao@debian.org>
Fixes: 248f6571fd ("netpoll: Optimize skb refilling on critical path")
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20251103-fix_netpoll_aa-v4-1-4cfecdf6da7c@debian.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 28d9a84ef0 ]
While populating firmware host logging segments for the coredump, it is
possible for the FW command that flushes the segment to fail. When that
happens, the existing code will not update the max entry and entry size
in the segment header and this causes software that decodes the coredump
to skip the segment.
The segment most likely has already collected some DMA data, so always
update these 2 segment fields in the header to allow the decoder to
decode any data in the segment.
Fixes: 3c2179e663 ("bnxt_en: Add FW trace coredump segments to the coredump")
Reviewed-by: Shruti Parab <shruti.parab@broadcom.com>
Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Link: https://patch.msgid.link/20251104005700.542174-5-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit e120f46768 ]
Raw IP packets have no MAC header, leaving skb->mac_header uninitialized.
This can trigger kernel panics on ARM64 when xfrm or other subsystems
access the offset due to strict alignment checks.
Initialize the MAC header to prevent such crashes.
This can trigger kernel panics on ARM when running IPsec over the
qmimux0 interface.
Example trace:
Internal error: Oops: 000000009600004f [#1] SMP
CPU: 0 UID: 0 PID: 0 Comm: swapper/0 Not tainted 6.12.34-gbe78e49cb433 #1
Hardware name: LS1028A RDB Board (DT)
pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : xfrm_input+0xde8/0x1318
lr : xfrm_input+0x61c/0x1318
sp : ffff800080003b20
Call trace:
xfrm_input+0xde8/0x1318
xfrm6_rcv+0x38/0x44
xfrm6_esp_rcv+0x48/0xa8
ip6_protocol_deliver_rcu+0x94/0x4b0
ip6_input_finish+0x44/0x70
ip6_input+0x44/0xc0
ipv6_rcv+0x6c/0x114
__netif_receive_skb_one_core+0x5c/0x8c
__netif_receive_skb+0x18/0x60
process_backlog+0x78/0x17c
__napi_poll+0x38/0x180
net_rx_action+0x168/0x2f0
Fixes: c6adf77953 ("net: usb: qmi_wwan: add qmap mux protocol support")
Signed-off-by: Qendrim Maxhuni <qendrim.maxhuni@garderos.com>
Link: https://patch.msgid.link/20251029075744.105113-1-qendrim.maxhuni@garderos.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 2e25935ed2 ]
The devm_kcalloc() function never return error pointers, it returns NULL
on failure. Also delete the netdev_err() printk. These allocation
functions already have debug output built-in some the extra error message
is not required.
Fixes: efabce2901 ("octeontx2-pf: AF_XDP zero copy receive support")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Link: https://patch.msgid.link/aQYKkrGA12REb2sj@stanley.mountain
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit de0337d641 ]
The TSO path called ionic_tx_map_skb() before preparing the TCP pseudo
checksum (ionic_tx_tcp_[inner_]pseudo_csum()), which may perform
skb_cow_head() and might modifies bytes in the linear header area.
Mapping first and then mutating the header risks:
- Using a stale DMA address if skb_cow_head() relocates the head, and/or
- Device reading stale header bytes on weakly-ordered systems
(CPU writes after mapping are not guaranteed visible without an
explicit dma_sync_single_for_device()).
Reorder the TX path to perform all header mutations (including
skb_cow_head()) *before* DMA mapping. Mapping is now done only after the
skb layout and header contents are final. This removes the need for any
post-mapping dma_sync and prevents on-wire corruption observed under
VLAN+TSO load after repeated runs.
This change is purely an ordering fix; no functional behavior change
otherwise.
Fixes: 0f3154e6bc ("ionic: Add Tx and Rx handling")
Signed-off-by: Mohammad Heib <mheib@redhat.com>
Reviewed-by: Brett Creeley <brett.creeley@amd.com>
Link: https://patch.msgid.link/20251031155203.203031-2-mheib@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit d261f5b09c ]
The TX path currently writes descriptors and then immediately writes to
the MMIO doorbell register to notify the NIC. On weakly ordered
architectures, descriptor writes may still be pending in CPU or DMA
write buffers when the doorbell is issued, leading to the device
fetching stale or incomplete descriptors.
Add a dma_wmb() in ionic_txq_post() to ensure all descriptor writes are
visible to the device before the doorbell MMIO write.
Fixes: 0f3154e6bc ("ionic: Add Tx and Rx handling")
Signed-off-by: Mohammad Heib <mheib@redhat.com>
Link: https://patch.msgid.link/20251031155203.203031-1-mheib@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit e57723fe53 ]
When iterating over the ARL table we stop at max ARL entries / 2, but
this is only valid if the chip actually returns 2 results at once. For
chips with only one result register we will stop before reaching the end
of the table if it is more than half full.
Fix this by only dividing the maximum results by two if we have a chip
with more than one result register (i.e. those with 4 ARL bins).
Fixes: cd169d799b ("net: dsa: b53: Bound check ARL searches")
Signed-off-by: Jonas Gorski <jonas.gorski@gmail.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://patch.msgid.link/20251102100758.28352-4-jonas.gorski@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 0be04b5fa6 ]
The switch clears the ARL_SRCH_STDN bit when the search is done, i.e. it
finished traversing the ARL table.
This means that there will be no valid result, so we should not attempt
to read and process any further entries.
We only ever check the validity of the entries for 4 ARL bin chips, and
only after having passed the first entry to the b53_fdb_copy().
This means that we always pass an invalid entry at the end to the
b53_fdb_copy(). b53_fdb_copy() does check the validity though before
passing on the entry, so it never gets passed on.
On < 4 ARL bin chips, we will even continue reading invalid entries
until we reach the result limit.
Fixes: 1da6df85c6 ("net: dsa: b53: Implement ARL add/del/dump operations")
Signed-off-by: Jonas Gorski <jonas.gorski@gmail.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://patch.msgid.link/20251102100758.28352-3-jonas.gorski@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit c264294624 ]
In the New Control register bit 1 is either reserved, or has a different
function:
Out of Range Error Discard
When enabled, the ingress port discards any frames
if the Length field is between 1500 and 1536
(excluding 1500 and 1536) and with good CRC.
The actual bit for enabling IP multicast is bit 0, which was only
explicitly enabled for BCM5325 so far.
For older switch chips, this bit defaults to 0, so we want to enable it
as well, while newer switch chips default to 1, and their documentation
says "It is illegal to set this bit to zero."
So drop the wrong B53_IPMC_FWD_EN define, enable the IP multicast bit
also for other switch chips. While at it, rename it to (B53_)IP_MC as
that is how it is called in Broadcom code.
Fixes: 63cc54a6f0 ("net: dsa: b53: Fix egress flooding settings")
Signed-off-by: Jonas Gorski <jonas.gorski@gmail.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://patch.msgid.link/20251102100758.28352-2-jonas.gorski@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 3e4ebdc160 ]
BCM63XX's switch does not support MDIO scanning of external phys, so its
MACs needs to be manually configured for autonegotiated link speeds.
So b53_force_port_config() and b53_force_link() accordingly also when
mode is MLO_AN_PHY for those ports.
Fixes lower speeds than 1000/full on rgmii ports 4 - 7.
This aligns the behaviour with the old bcm63xx_enetsw driver for those
ports.
Fixes: 967dd82ffc ("net: dsa: b53: Add support for Broadcom RoboSwitch")
Signed-off-by: Jonas Gorski <jonas.gorski@gmail.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://patch.msgid.link/20251101132807.50419-3-jonas.gorski@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b6a8a5477f ]
There is no guarantee that the port state override registers have their
default values, as not all switches support being reset via register or
have a reset GPIO.
So when forcing port config, we need to make sure to clear all fields,
which we currently do not do for the speed and flow control
configuration. This can cause flow control stay enabled, or in the case
of speed becoming an illegal value, e.g. configured for 1G (0x2), then
setting 100M (0x1), results in 0x3 which is invalid.
For PORT_OVERRIDE_SPEED_2000M we need to make sure to only clear it on
supported chips, as the bit can have different meanings on other chips,
e.g. for BCM5389 this controls scanning PHYs for link/speed
configuration.
Fixes: 5e004460f8 ("net: dsa: b53: Add helper to set link parameters")
Signed-off-by: Jonas Gorski <jonas.gorski@gmail.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://patch.msgid.link/20251101132807.50419-2-jonas.gorski@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b2b526c2cf ]
The call to device_node_to_regmap() in airoha_mdio_probe() can return
an ERR_PTR() if regmap initialization fails. Currently, the driver
stores the pointer without validation, which could lead to a crash
if it is later dereferenced.
Add an IS_ERR() check and return the corresponding error code to make
the probe path more robust.
Fixes: 67e3ba9783 ("net: mdio: Add MDIO bus controller for Airoha AN7583")
Signed-off-by: Alok Tiwari <alok.a.tiwari@oracle.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/20251031161607.58581-1-alok.a.tiwari@oracle.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit d7d2fcf7ae ]
There is a race between operations that iterate over the userdata
cg_children list and concurrent add/remove of userdata items through
configfs. The update_userdata() function iterates over the
nt->userdata_group.cg_children list, and count_extradata_entries() also
iterates over this same list to count nodes.
Quoting from Documentation/filesystems/configfs.rst:
> A subsystem can navigate the cg_children list and the ci_parent pointer
> to see the tree created by the subsystem. This can race with configfs'
> management of the hierarchy, so configfs uses the subsystem mutex to
> protect modifications. Whenever a subsystem wants to navigate the
> hierarchy, it must do so under the protection of the subsystem
> mutex.
Without proper locking, if a userdata item is added or removed
concurrently while these functions are iterating, the list can be
accessed in an inconsistent state. For example, the list_for_each() loop
can reach a node that is being removed from the list by list_del_init()
which sets the nodes' .next pointer to point to itself, so the loop will
never end (or reach the WARN_ON_ONCE in update_userdata() ).
Fix this by holding the configfs subsystem mutex (su_mutex) during all
operations that iterate over cg_children.
This includes:
- userdatum_value_store() which calls update_userdata() to iterate over
cg_children
- All sysdata_*_enabled_store() functions which call
count_extradata_entries() to iterate over cg_children
The su_mutex must be acquired before dynamic_netconsole_mutex to avoid
potential lock ordering issues, as configfs operations may already hold
su_mutex when calling into our code.
Fixes: df03f830d0 ("net: netconsole: cache userdata formatted string in netconsole_target")
Signed-off-by: Gustavo Luiz Duarte <gustavold@gmail.com>
Link: https://patch.msgid.link/20251029-netconsole-fix-warn-v1-1-0d0dd4622f48@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit c211f5d7cb ]
After registering a VLAN device and setting its feature flags, we need to
synchronize the VLAN features with the lower device. For example, the VLAN
device does not have the NETIF_F_LRO flag, it should be synchronized with
the lower device based on the NETIF_F_UPPER_DISABLES definition.
As the dev->vlan_features has changed, we need to call
netdev_update_features(). The caller must run after netdev_upper_dev_link()
links the lower devices, so this patch adds the netdev_update_features()
call in register_vlan_dev().
Fixes: fd867d51f8 ("net/core: generic support for disabling netdev features down stack")
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Link: https://patch.msgid.link/20251030073539.133779-1-liuhangbin@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit d01f8136d4 ]
The script "ethtool-common.sh" is not installed in INSTALL_PATH, and
triggers some errors when I try to run the test
'drivers/net/netdevsim/ethtool-coalesce.sh':
TAP version 13
1..1
# timeout set to 600
# selftests: drivers/net/netdevsim: ethtool-coalesce.sh
# ./ethtool-coalesce.sh: line 4: ethtool-common.sh: No such file or directory
# ./ethtool-coalesce.sh: line 25: make_netdev: command not found
# ethtool: bad command line argument(s)
# ./ethtool-coalesce.sh: line 124: check: command not found
# ./ethtool-coalesce.sh: line 126: [: -eq: unary operator expected
# FAILED /0 checks
not ok 1 selftests: drivers/net/netdevsim: ethtool-coalesce.sh # exit=1
Install this file to avoid this error. After this patch:
TAP version 13
1..1
# timeout set to 600
# selftests: drivers/net/netdevsim: ethtool-coalesce.sh
# PASSED all 22 checks
ok 1 selftests: drivers/net/netdevsim: ethtool-coalesce.sh
Fixes: fbb8531e58 ("selftests: extract common functions in ethtool-common.sh")
Signed-off-by: Wang Liang <wangliang74@huawei.com>
Link: https://patch.msgid.link/20251030040340.3258110-1-wangliang74@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f8e8486702 ]
The GRO self-test, gro.c, currently constructs IPv6 packets containing a
Hop-by-Hop Options header (IPPROTO_HOPOPTS) to ensure the GRO path
correctly handles IPv6 extension headers.
However, network elements may be configured to drop packets with the
Hop-by-Hop Options header (HBH). This causes the self-test to fail
in environments where such network elements are present.
To improve the robustness and reliability of this test in diverse
network environments, switch from using IPPROTO_HOPOPTS to
IPPROTO_DSTOPTS (Destination Options).
The Destination Options header is less likely to be dropped by
intermediate routers and still serves the core purpose of the test:
validating GRO's handling of an IPv6 extension header. This change
ensures the test can execute successfully without being incorrectly
failed by network policies outside the kernel's control.
Fixes: 7d1575014a ("selftests/net: GRO coalesce test")
Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Anubhav Singh <anubhavsinggh@google.com>
Link: https://patch.msgid.link/20251030060436.1556664-1-anubhavsinggh@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 02d064de05 ]
Due to the gro_sender sending data packets and FIN packets
in very quick succession, these are received almost simultaneously
by the gro_receiver. FIN packets are sometimes processed before the
data packets leading to intermittent (~1/100) test failures.
This change adds a delay of 100ms before sending FIN packets
in gro:tcp test to avoid the out-of-order delivery. The same
mitigation already exists for the gro:ip test.
Fixes: 7d1575014a ("selftests/net: GRO coalesce test")
Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Anubhav Singh <anubhavsinggh@google.com>
Link: https://patch.msgid.link/20251030062818.1562228-1-anubhavsinggh@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 3d18a84edd ]
The internal switch on BCM63XX SoCs will unconditionally add 802.1Q VLAN
tags on egress to CPU when 802.1Q mode is enabled. We do this
unconditionally since commit ed409f3bba ("net: dsa: b53: Configure
VLANs while not filtering").
This is fine for VLAN aware bridges, but for standalone ports and vlan
unaware bridges this means all packets are tagged with the default VID,
which is 0.
While the kernel will treat that like untagged, this can break userspace
applications processing raw packets, expecting untagged traffic, like
STP daemons.
This also breaks several bridge tests, where the tcpdump output then
does not match the expected output anymore.
Since 0 isn't a valid VID, just strip out the VLAN tag if we encounter
it, unless the priority field is set, since that would be a valid tag
again.
Fixes: 964dbf186e ("net: dsa: tag_brcm: add support for legacy tags")
Signed-off-by: Jonas Gorski <jonas.gorski@gmail.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Link: https://patch.msgid.link/20251027194621.133301-1-jonas.gorski@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 1c21cf89a6 ]
The memory allocated for ptr using kvmalloc() is not freed on the last
error path. Fix that by freeing it on that error path.
Fixes: 9a24ce5e29 ("Bluetooth: btrtl: Firmware format v2 support")
Signed-off-by: Abdun Nihaal <nihaal@cse.iitm.ac.in>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f838d624fd ]
Patch "Make HID attributes visible" is needed for older kernel versions
(e.g. 6.12) where ufs_get_device_desc() is called from ufshcd_probe_hba().
In these older kernel versions ufshcd_get_device_desc() may be called
after the sysfs attributes have been added. In the upstream kernel however
ufshcd_get_device_desc() is called before ufs_sysfs_add_nodes(). See also
the ufshcd_device_params_init() call from ufshcd_init(). Hence, calling
sysfs_update_group() is not necessary.
See also commit 69f5eb78d4 ("scsi: ufs: core: Move the
ufshcd_device_init(hba, true) call") in kernel v6.13.
This patch fixes the following kernel warning:
sysfs: cannot create duplicate filename '/devices/platform/3c2d0000.ufs/hid'
Workqueue: async async_run_entry_fn
Call trace:
dump_backtrace+0xfc/0x17c
show_stack+0x18/0x28
dump_stack_lvl+0x40/0x104
dump_stack+0x18/0x3c
sysfs_warn_dup+0x6c/0xc8
internal_create_group+0x1c8/0x504
sysfs_create_groups+0x38/0x9c
ufs_sysfs_add_nodes+0x20/0x58
ufshcd_init+0x1114/0x134c
ufshcd_pltfrm_init+0x728/0x7d8
ufs_google_probe+0x30/0x84
platform_probe+0xa0/0xe0
really_probe+0x114/0x454
__driver_probe_device+0xa4/0x160
driver_probe_device+0x44/0x23c
__device_attach_driver+0x15c/0x1f4
bus_for_each_drv+0x10c/0x168
__device_attach_async_helper+0x80/0xf8
async_run_entry_fn+0x4c/0x17c
process_one_work+0x26c/0x65c
worker_thread+0x33c/0x498
kthread+0x110/0x134
ret_from_fork+0x10/0x20
ufshcd 3c2d0000.ufs: ufs_sysfs_add_nodes: sysfs groups creation failed (err = -17)
Cc: Daniel Lee <chullee@google.com>
Cc: Peter Wang <peter.wang@mediatek.com>
Cc: Bjorn Andersson <andersson@kernel.org>
Cc: Neil Armstrong <neil.armstrong@linaro.org>
Fixes: bb7663dec6 ("scsi: ufs: sysfs: Make HID attributes visible")
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Fixes: bb7663dec6 ("scsi: ufs: sysfs: Make HID attributes visible")
Acked-by: Neil Armstrong <neil.armstrong@linaro.org>
Reviewed-by: Peter Wang <peter.wang@mediatek.com>
Reviewed-by: Bjorn Andersson <andersson@kernel.org>
Link: https://patch.msgid.link/20251028222433.1108299-1-bvanassche@acm.org
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit a74f038fa5 ]
The pt_dump_seq_puts() macro incorrectly uses seq_printf() instead of
seq_puts(). This is both a performance issue and conceptually wrong,
as the macro name suggests plain string output (puts) but the
implementation uses formatted output (printf).
The macro is used in ptdump.c:301 to output a newline character. Using
seq_printf() adds unnecessary overhead for format string parsing when
outputting this constant string.
This bug was introduced in commit 59c4da8640 ("riscv: Add support to
dump the kernel page tables") in 2020, which copied the implementation
pattern from other architectures that had the same bug.
Fixes: 59c4da8640 ("riscv: Add support to dump the kernel page tables")
Signed-off-by: Josephine Pfeiffer <hi@josie.lol>
Link: https://lore.kernel.org/r/20251018170451.3355496-1-hi@josie.lol
Signed-off-by: Paul Walmsley <pjw@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 060ea84a48 ]
Unwinding the stack of a task other than current, KASAN would report
"BUG: KASAN: out-of-bounds in walk_stackframe+0x41c/0x460"
There is a same issue on x86 and has been resolved by the commit
84936118bd ("x86/unwind: Disable KASAN checks for non-current tasks")
The solution could be applied to RISC-V too.
This patch also can solve the issue:
https://seclists.org/oss-sec/2025/q4/23
Fixes: 5d8544e2d0 ("RISC-V: Generic library routines and assembly")
Co-developed-by: Jiakai Xu <xujiakai2025@iscas.ac.cn>
Signed-off-by: Jiakai Xu <xujiakai2025@iscas.ac.cn>
Signed-off-by: Chunyan Zhang <zhangchunyan@iscas.ac.cn>
Link: https://lore.kernel.org/r/20251022072608.743484-1-zhangchunyan@iscas.ac.cn
[pjw@kernel.org: clean up checkpatch issues]
Signed-off-by: Paul Walmsley <pjw@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit c74dc8ab47 ]
ufs_sysfs_add_nodes() is called concurrently with ufs_get_device_desc().
This may cause the following code to be called before
ufs_sysfs_add_nodes():
sysfs_update_group(&hba->dev->kobj, &ufs_sysfs_hid_group);
If this happens, ufs_sysfs_add_nodes() triggers a kernel warning and
fails. Fix this by calling ufs_sysfs_add_nodes() before SCSI LUNs are
scanned since the sysfs_update_group() call happens from the context of
thread that executes ufshcd_async_scan(). This patch fixes the following
kernel warning:
sysfs: cannot create duplicate filename '/devices/platform/3c2d0000.ufs/hid'
Workqueue: async async_run_entry_fn
Call trace:
dump_backtrace+0xfc/0x17c
show_stack+0x18/0x28
dump_stack_lvl+0x40/0x104
dump_stack+0x18/0x3c
sysfs_warn_dup+0x6c/0xc8
internal_create_group+0x1c8/0x504
sysfs_create_groups+0x38/0x9c
ufs_sysfs_add_nodes+0x20/0x58
ufshcd_init+0x1114/0x134c
ufshcd_pltfrm_init+0x728/0x7d8
ufs_google_probe+0x30/0x84
platform_probe+0xa0/0xe0
really_probe+0x114/0x454
__driver_probe_device+0xa4/0x160
driver_probe_device+0x44/0x23c
__device_attach_driver+0x15c/0x1f4
bus_for_each_drv+0x10c/0x168
__device_attach_async_helper+0x80/0xf8
async_run_entry_fn+0x4c/0x17c
process_one_work+0x26c/0x65c
worker_thread+0x33c/0x498
kthread+0x110/0x134
ret_from_fork+0x10/0x20
ufshcd 3c2d0000.ufs: ufs_sysfs_add_nodes: sysfs groups creation failed (err = -17)
Cc: Daniel Lee <chullee@google.com>
Fixes: bb7663dec6 ("scsi: ufs: sysfs: Make HID attributes visible")
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Link: https://patch.msgid.link/20251014200118.3390839-2-bvanassche@acm.org
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit 10d9dda426 upstream.
Since __tracepoint_user_init() calls tracepoint_user_register() without
initializing tuser->tpoint with given tracpoint, it does not register
tracepoint stub function as callback correctly, and tprobe does not work.
Initializing tuser->tpoint correctly before tracepoint_user_register()
so that it sets up tracepoint callback.
I confirmed below example works fine again.
echo "t sched_switch preempt prev_pid=prev->pid next_pid=next->pid" > /sys/kernel/tracing/dynamic_events
echo 1 > /sys/kernel/tracing/events/tracepoints/sched_switch/enable
cat /sys/kernel/tracing/trace_pipe
Link: https://lore.kernel.org/all/176244793514.155515.6466348656998627773.stgit@devnote2/
Fixes: 2867495dea ("tracing: tprobe-events: Register tracepoint when enable tprobe event")
Reported-by: Beau Belgrave <beaub@linux.microsoft.com>
Cc: stable@vger.kernel.org
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Tested-by: Beau Belgrave <beaub@linux.microsoft.com>
Reviewed-by: Beau Belgrave <beaub@linux.microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2618849f31 upstream.
[BUG]
During development of a minor feature (make sure all btrfs_bio::end_io()
is called in task context), I noticed a crash in generic/388, where
metadata writes triggered new works after btrfs_stop_all_workers().
It turns out that it can even happen without any code modification, just
using RAID5 for metadata and the same workload from generic/388 is going
to trigger the use-after-free.
[CAUSE]
If btrfs hits an error, the fs is marked as error, no new
transaction is allowed thus metadata is in a frozen state.
But there are some metadata modifications before that error, and they are
still in the btree inode page cache.
Since there will be no real transaction commit, all those dirty folios
are just kept as is in the page cache, and they can not be invalidated
by invalidate_inode_pages2() call inside close_ctree(), because they are
dirty.
And finally after btrfs_stop_all_workers(), we call iput() on btree
inode, which triggers writeback of those dirty metadata.
And if the fs is using RAID56 metadata, this will trigger RMW and queue
new works into rmw_workers, which is already stopped, causing warning
from queue_work() and use-after-free.
[FIX]
Add a special handling for write_one_eb(), that if the fs is already in
an error state, immediately mark the bbio as failure, instead of really
submitting them.
Then during close_ctree(), iput() will just discard all those dirty
tree blocks without really writing them back, thus no more new jobs for
already stopped-and-freed workqueues.
The extra discard in write_one_eb() also acts as an extra safenet.
E.g. the transaction abort is triggered by some extent/free space
tree corruptions, and since extent/free space tree is already corrupted
some tree blocks may be allocated where they shouldn't be (overwriting
existing tree blocks). In that case writing them back will further
corrupting the fs.
CC: stable@vger.kernel.org # 6.6+
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 16c43a56b7 upstream.
Even if normally `build_error` isn't a kernel object, it should still
be treated as such so that we pass the same flags. Similarly, `rustdoc`
targets are never kernel objects, but we need to treat them as such.
Otherwise, starting with Rust 1.91.0 (released 2025-10-30), `rustc`
will complain about missing sanitizer flags since `-Zsanitizer` is a
target modifier too [1]:
error: mixing `-Zsanitizer` will cause an ABI mismatch in crate `build_error`
--> rust/build_error.rs:3:1
|
3 | //! Build-time error.
| ^
|
= help: the `-Zsanitizer` flag modifies the ABI so Rust crates compiled with different values of this flag cannot be used together safely
= note: unset `-Zsanitizer` in this crate is incompatible with `-Zsanitizer=kernel-address` in dependency `core`
= help: set `-Zsanitizer=kernel-address` in this crate or unset `-Zsanitizer` in `core`
= help: if you are sure this will not cause problems, you may use `-Cunsafe-allow-abi-mismatch=sanitizer` to silence this error
Thus explicitly mark them as kernel objects.
Cc: stable@vger.kernel.org # Needed in 6.12.y and later (Rust is pinned in older LTSs).
Link: https://github.com/rust-lang/rust/pull/138736 [1]
Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Tested-by: Justin M. Forbes <jforbes@fedoraproject.org>
Link: https://patch.msgid.link/20251102212853.1505384-1-ojeda@kernel.org
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit fad472efab upstream.
The `rustdoc` modifiers bug [1] was fixed in Rust 1.90.0 [2], for which
we added a workaround in commit abbf9a4494 ("rust: workaround `rustdoc`
target modifiers bug").
However, `rustdoc`'s doctest generation still has a similar issue [3],
being fixed at [4], which does not affect us because we apply the
workaround to both, and now, starting with Rust 1.91.0 (released
2025-10-30), `-Zsanitizer` is a target modifier too [5], which means we
fail with:
RUSTDOC TK rust/kernel/lib.rs
error: mixing `-Zsanitizer` will cause an ABI mismatch in crate `kernel`
--> rust/kernel/lib.rs:3:1
|
3 | //! The `kernel` crate.
| ^
|
= help: the `-Zsanitizer` flag modifies the ABI so Rust crates compiled with different values of this flag cannot be used together safely
= note: unset `-Zsanitizer` in this crate is incompatible with `-Zsanitizer=kernel-address` in dependency `core`
= help: set `-Zsanitizer=kernel-address` in this crate or unset `-Zsanitizer` in `core`
= help: if you are sure this will not cause problems, you may use `-Cunsafe-allow-abi-mismatch=sanitizer` to silence this error
A simple way around is to add the sanitizer to the list in the existing
workaround (especially if we had not started to pass the sanitizer
flags in the previous commit, since in that case that would not be
necessary). However, that still applies the workaround in more cases
than necessary.
Instead, only modify the doctests flags to ignore the check for
sanitizers, so that it is more local (and thus the compiler keeps checking
it for us in the normal `rustdoc` calls). Since the previous commit
already treated the `rustdoc` calls as kernel objects, this should allow
us in the future to easily remove this workaround when the time comes.
By the way, the `-Cunsafe-allow-abi-mismatch` flag overwrites previous
ones rather than appending, so it needs to be all done in the same flag.
Moreover, unknown modifiers are rejected, and thus we have to gate based
on the version too.
Finally, `-Zsanitizer-cfi-normalize-integers` is not affected (in Rust
1.91.0), so it is not needed in the workaround for the moment.
Cc: stable@vger.kernel.org # Needed in 6.12.y and later (Rust is pinned in older LTSs).
Link: https://github.com/rust-lang/rust/issues/144521 [1]
Link: https://github.com/rust-lang/rust/pull/144523 [2]
Link: https://github.com/rust-lang/rust/issues/146465 [3]
Link: https://github.com/rust-lang/rust/pull/148068 [4]
Link: https://github.com/rust-lang/rust/pull/138736 [5]
Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Tested-by: Justin M. Forbes <jforbes@fedoraproject.org>
Link: https://patch.msgid.link/20251102212853.1505384-2-ojeda@kernel.org
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ff4d2ef387 upstream.
The future move of pin-init to `syn` uncovers the following private
intra-doc link:
error: public documentation for `Devres` links to private item `Self::inner`
--> rust/kernel/devres.rs:106:7
|
106 | /// [`Self::inner`] is guaranteed to be initialized and is always accessed read-only.
| ^^^^^^^^^^^ this item is private
|
= note: this link will resolve properly if you pass `--document-private-items`
= note: `-D rustdoc::private-intra-doc-links` implied by `-D warnings`
= help: to override `-D warnings` add `#[allow(rustdoc::private_intra_doc_links)]`
Currently, when rendered, the link points to "nowhere" (an inexistent
anchor for a "method").
Thus fix it.
Cc: stable@vger.kernel.org
Fixes: f5d3ef25d2 ("rust: devres: get rid of Devres' inner Arc")
Acked-by: Danilo Krummrich <dakr@kernel.org>
Link: https://patch.msgid.link/20251029071406.324511-1-ojeda@kernel.org
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 09b1704f5b upstream.
The future move of pin-init to `syn` uncovers the following broken
intra-doc link:
error: unresolved link to `crate::pin_init`
--> rust/kernel/sync/condvar.rs:39:40
|
39 | /// instances is with the [`pin_init`](crate::pin_init!) and [`new_condvar`] macros.
| ^^^^^^^^^^^^^^^^ no item named `pin_init` in module `kernel`
|
= note: `-D rustdoc::broken-intra-doc-links` implied by `-D warnings`
= help: to override `-D warnings` add `#[allow(rustdoc::broken_intra_doc_links)]`
Currently, when rendered, the link points to a literal `crate::pin_init!`
URL.
Thus fix it.
Cc: stable@vger.kernel.org
Fixes: 129e97be8e ("rust: pin-init: fix documentation links")
Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Link: https://patch.msgid.link/20251029073344.349341-1-ojeda@kernel.org
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 284922f4c5 ]
The runtime-const infrastructure was never designed to handle the
modular case, because the constant fixup is only done at boot time for
core kernel code.
But by the time I used it for the x86-64 user space limit handling in
commit 86e6b1547b ("x86: fix user address masking non-canonical
speculation issue"), I had completely repressed that fact.
And it all happens to work because the only code that currently actually
gets inlined by modules is for the access_ok() limit check, where the
default constant value works even when not fixed up. Because at least I
had intentionally made it be something that is in the non-canonical
address space region.
But it's technically very wrong, and it does mean that at least in
theory, the use of 'access_ok()' + '__get_user()' can trigger the same
speculation issue with non-canonical addresses that the original commit
was all about.
The pattern is unusual enough that this probably doesn't matter in
practice, but very wrong is still very wrong. Also, let's fix it before
the nice optimized scoped user accessor helpers that Thomas Gleixner is
working on cause this pseudo-constant to then be more widely used.
This all came up due to an unrelated discussion with Mateusz Guzik about
using the runtime const infrastructure for names_cachep accesses too.
There the modular case was much more obviously broken, and Mateusz noted
it in his 'v2' of the patch series.
That then made me notice how broken 'access_ok()' had been in modules
all along. Mea culpa, mea maxima culpa.
Fix it by simply not using the runtime-const code in modules, and just
using the USER_PTR_MAX variable value instead. This is not
performance-critical like the core user accessor functions (get_user()
and friends) are.
Also make sure this doesn't get forgotten the next time somebody wants
to do runtime constant optimizations by having the x86 runtime-const.h
header file error out if included by modules.
Fixes: 86e6b1547b ("x86: fix user address masking non-canonical speculation issue")
Acked-by: Borislav Petkov <bp@alien8.de>
Acked-by: Sean Christopherson <seanjc@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Triggered-by: Mateusz Guzik <mjguzik@gmail.com>
Link: https://lore.kernel.org/all/20251030105242.801528-1-mjguzik@gmail.com/
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 22c73d52a6 ]
The mds auth caps check should also validate the
fsname along with the associated caps. Not doing
so would result in applying the mds auth caps of
one fs on to the other fs in a multifs ceph cluster.
The bug causes multiple issues w.r.t user
authentication, following is one such example.
Steps to Reproduce (on vstart cluster):
1. Create two file systems in a cluster, say 'fsname1' and 'fsname2'
2. Authorize read only permission to the user 'client.usr' on fs 'fsname1'
$ceph fs authorize fsname1 client.usr / r
3. Authorize read and write permission to the same user 'client.usr' on fs 'fsname2'
$ceph fs authorize fsname2 client.usr / rw
4. Update the keyring
$ceph auth get client.usr >> ./keyring
With above permssions for the user 'client.usr', following is the
expectation.
a. The 'client.usr' should be able to only read the contents
and not allowed to create or delete files on file system 'fsname1'.
b. The 'client.usr' should be able to read/write on file system 'fsname2'.
But, with this bug, the 'client.usr' is allowed to read/write on file
system 'fsname1'. See below.
5. Mount the file system 'fsname1' with the user 'client.usr'
$sudo bin/mount.ceph usr@.fsname1=/ /kmnt_fsname1_usr/
6. Try creating a file on file system 'fsname1' with user 'client.usr'. This
should fail but passes with this bug.
$touch /kmnt_fsname1_usr/file1
7. Mount the file system 'fsname1' with the user 'client.admin' and create a
file.
$sudo bin/mount.ceph admin@.fsname1=/ /kmnt_fsname1_admin
$echo "data" > /kmnt_fsname1_admin/admin_file1
8. Try removing an existing file on file system 'fsname1' with the user
'client.usr'. This shoudn't succeed but succeeds with the bug.
$rm -f /kmnt_fsname1_usr/admin_file1
For more information, please take a look at the corresponding mds/fuse patch
and tests added by looking into the tracker mentioned below.
v2: Fix a possible null dereference in doutc
v3: Don't store fsname from mdsmap, validate against
ceph_mount_options's fsname and use it
v4: Code refactor, better warning message and
fix possible compiler warning
[ Slava.Dubeyko: "fsname check failed" -> "fsname mismatch" ]
Link: https://tracker.ceph.com/issues/72167
Signed-off-by: Kotresh HR <khiremat@redhat.com>
Reviewed-by: Viacheslav Dubeyko <Slava.Dubeyko@ibm.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 53db6f25ee ]
The wake_up_bit() is called in ceph_async_unlink_cb(),
wake_async_create_waiters(), and ceph_finish_async_create().
It makes sense to switch on clear_bit() function, because
it makes the code much cleaner and easier to understand.
More important rework is the adding of smp_mb__after_atomic()
memory barrier after the bit modification and before
wake_up_bit() call. It can prevent potential race condition
of accessing the modified bit in other threads. Luckily,
clear_and_wake_up_bit() already implements the required
functionality pattern:
static inline void clear_and_wake_up_bit(int bit, unsigned long *word)
{
clear_bit_unlock(bit, word);
/* See wake_up_bit() for which memory barrier you need to use. */
smp_mb__after_atomic();
wake_up_bit(word, bit);
}
Signed-off-by: Viacheslav Dubeyko <Slava.Dubeyko@ibm.com>
Reviewed-by: Alex Markuze <amarkuze@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 5824ccba9a ]
The Coverity Scan service has detected potential
race condition in ceph_ioctl_lazyio() [1].
The CID 1591046 contains explanation: "Check of thread-shared
field evades lock acquisition (LOCK_EVASION). Thread1 sets
fmode to a new value. Now the two threads have an inconsistent
view of fmode and updates to fields correlated with fmode
may be lost. The data guarded by this critical section may
be read while in an inconsistent state or modified by multiple
racing threads. In ceph_ioctl_lazyio: Checking the value of
a thread-shared field outside of a locked region to determine
if a locked operation involving that thread shared field
has completed. (CWE-543)".
The patch places fi->fmode field access under ci->i_ceph_lock
protection. Also, it introduces the is_file_already_lazy
variable that is set under the lock and it is checked later
out of scope of critical section.
[1] https://scan5.scan.coverity.com/#/project-view/64304/10063?selectedIssue=1591046
Signed-off-by: Viacheslav Dubeyko <Slava.Dubeyko@ibm.com>
Reviewed-by: Alex Markuze <amarkuze@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b7ed1e29cf ]
The Coverity Scan service has detected the calling of
wait_for_completion_killable() without checking the return
value in ceph_lock_wait_for_completion() [1]. The CID 1636232
defect contains explanation: "If the function returns an error
value, the error value may be mistaken for a normal value.
In ceph_lock_wait_for_completion(): Value returned from
a function is not checked for errors before being used. (CWE-252)".
The patch adds the checking of wait_for_completion_killable()
return value and return the error code from
ceph_lock_wait_for_completion().
[1] https://scan5.scan.coverity.com/#/project-view/64304/10063?selectedIssue=1636232
Signed-off-by: Viacheslav Dubeyko <Slava.Dubeyko@ibm.com>
Reviewed-by: Alex Markuze <amarkuze@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 2e97663760 ]
If reinitialization of one of the GPUs fails after reset, it logs
failure on all subsequent GPUs eventhough they have resumed
successfully.
A sample log where only device at 0000:95:00.0 had a failure -
amdgpu 0000:15:00.0: amdgpu: GPU reset(19) succeeded!
amdgpu 0000:65:00.0: amdgpu: GPU reset(19) succeeded!
amdgpu 0000:75:00.0: amdgpu: GPU reset(19) succeeded!
amdgpu 0000:85:00.0: amdgpu: GPU reset(19) succeeded!
amdgpu 0000:95:00.0: amdgpu: GPU reset(19) failed
amdgpu 0000:e5:00.0: amdgpu: GPU reset(19) failed
amdgpu 0000:f5:00.0: amdgpu: GPU reset(19) failed
amdgpu 0000:05:00.0: amdgpu: GPU reset(19) failed
amdgpu 0000:15:00.0: amdgpu: GPU reset end with ret = -5
To avoid confusion, report the error for each device
separately and return the first error as the overall result.
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Asad Kamal <asad.kamal@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 7574f30337 ]
If mmap write lock is taken while draining retry fault, mmap write lock
is not released because svm_range_restore_pages calls mmap_read_unlock
then returns. This causes deadlock and system hangs later because mmap
read or write lock cannot be taken.
Downgrade mmap write lock to read lock if draining retry fault fix this
bug.
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 66128f4287 ]
On m68k, check_sizetypes in headers_check reports:
./usr/include/asm/bootinfo-amiga.h:17: found __[us]{8,16,32,64} type without #include <linux/types.h>
This header file does not use any of the Linux-specific integer types,
but merely refers to them from comments, so this is a false positive.
As of commit c3a9d74ee4 ("kbuild: uapi: upgrade check_sizetypes()
warning to error"), this check was promoted to an error, breaking m68k
all{mod,yes}config builds.
Fix this by stripping simple comments before looking for Linux-specific
integer types.
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Reviewed-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Link: https://patch.msgid.link/949f096337e28d50510e970ae3ba3ec9c1342ec0.1759753998.git.geert@linux-m68k.org
[nathan: Adjust comment and remove unnecessary escaping from slashes in
regex]
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit e22f4d1321 ]
During kexec reboots, RTC alarms that are fired during the kernel
transition experience delayed execution. The new kernel would eventually
honor these alarms, but the interrupt handlers would only execute after
the driver probe is completed rather than at the intended alarm time.
This is because pending alarm interrupt status from the previous kernel
is not properly cleared during driver initialization, causing timing
discrepancies in alarm delivery.
To ensure precise alarm timing across kexec transitions, enhance the
probe function to:
1. Clear any pending alarm interrupt status from previous boot.
2. Detect existing valid alarms and preserve their state.
3. Re-enable alarm interrupts for future alarms.
Signed-off-by: Harini T <harini.t@amd.com>
Link: https://lore.kernel.org/r/20250730142110.2354507-1-harini.t@amd.com
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 328b80b29a ]
The ASUS ROG Zephyrus Duo 15 SE (GX551QS) with ALC 289 codec requires specific
pin configuration for proper volume control. Without this quirk, volume
adjustments produce a muffled sound effect as only certain channels attenuate,
leaving bass frequency at full volume.
Testing with hdajackretask confirms these pin tweaks fix the issue:
- Pin 0x17: Internal Speaker (LFE)
- Pin 0x1e: Internal Speaker
Signed-off-by: Adam Holliday <dochollidayxx@gmail.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 3637d34b35 ]
Add bounds checking to prevent writes past framebuffer boundaries when
rendering text near screen edges. Return early if the Y position is off-screen
and clip image height to screen boundary. Break from the rendering loop if the
X position is off-screen. When clipping image width to fit the screen, update
the character count to match the clipped width to prevent buffer size
mismatches.
Without the character count update, bit_putcs_aligned and bit_putcs_unaligned
receive mismatched parameters where the buffer is allocated for the clipped
width but cnt reflects the original larger count, causing out-of-bounds writes.
Reported-by: syzbot+48b0652a95834717f190@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=48b0652a95834717f190
Suggested-by: Helge Deller <deller@gmx.de>
Tested-by: syzbot+48b0652a95834717f190@syzkaller.appspotmail.com
Signed-off-by: Albin Babu Varghese <albinbabuvarghese20@gmail.com>
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 1375152bb0 ]
Instead of preserving mode, timestamp, and owner, for the object files
during installation, just preserve the mode and timestamp.
When installing as root, the installed files should be owned by root.
When installing as user, --preserve=ownership doesn't work anyway. This
makes --preserve=ownership rather pointless.
Signed-off-by: Emil Dahl Juhl <juhl.emildahl@gmail.com>
Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit db740f5689 ]
The atomic instructions sc.q, llacq.{w/d}, screl.{w/d} were newly added
in the LoongArch Reference Manual v1.10, it is necessary to handle them
in insns_not_supported() to avoid putting a breakpoint in the middle of
a ll/sc atomic sequence, otherwise it will loop forever for kprobes and
uprobes.
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit aad1d99bea ]
It could be triggered on 32 bit big endian machines at 32 bpp in the
pattern realignment. In this case just return early as the result is
an identity.
Signed-off-by: Zsolt Kajtar <soci@c64.rulez.org>
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 558ae45798 ]
When a UTP error occurs in isolation, UFS is not currently recoverable.
This is because the UTP error is not considered fatal in the error
handling code, leading to either an I/O timeout or an OCS error.
Add the UTP error flag to INT_FATAL_ERRORS so the controller will be
reset in this situation.
sd 0:0:0:0: [sda] tag#38 UNKNOWN(0x2003) Result: hostbyte=0x07
driverbyte=DRIVER_OK cmd_age=0s
sd 0:0:0:0: [sda] tag#38 CDB: opcode=0x28 28 00 00 51 24 e2 00 00 08 00
I/O error, dev sda, sector 42542864 op 0x0:(READ) flags 0x80700 phys_seg
8 prio class 2
OCS error from controller = 9 for tag 39
pa_err[1] = 0x80000010 at 2667224756 us
pa_err: total cnt=2
dl_err[0] = 0x80000002 at 2667148060 us
dl_err[1] = 0x80002000 at 2667282844 us
No record of nl_err
No record of tl_err
No record of dme_err
No record of auto_hibern8_err
fatal_err[0] = 0x804 at 2667282836 us
---------------------------------------------------
REGISTER
---------------------------------------------------
NAME OFFSET VALUE
STD HCI SFR 0xfffffff0 0x0
AHIT 0x18 0x814
INTERRUPT STATUS 0x20 0x1000
INTERRUPT ENABLE 0x24 0x70ef5
[mkp: commit desc]
Signed-off-by: Hoyoung Seo <hy50.seo@samsung.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Message-Id: <20250930061428.617955-1-hy50.seo@samsung.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit ba60189291 ]
During initialization, the EDVD_COREx_VOLT_FREQ registers for some cores
are still at reset values and not reflecting the actual frequency. This
causes get calls to fail. Set all cores to their respective max
frequency during probe to initialize the registers to working values.
Suggested-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Aaron Kling <webgeek1234@gmail.com>
Reviewed-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 5ad865862a ]
The NTB epf host driver assumes the BAR number associated with a memory
window is just incremented from the BAR number associated with MW1. This
seems to have been enough so far but this is not really how the endpoint
side work and the two could easily become mis-aligned.
ntb_epf_mw_to_bar() even assumes that the BAR number is the memory window
index + 2, which means the function only returns a proper result if BAR_2
is associated with MW1.
Instead, fully describe and allow arbitrary NTB BAR mapping.
Signed-off-by: Jerome Brunet <jbrunet@baylibre.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 7c2e86f7b5 ]
The output clock register offset used in clk_wzrd_register_output_clocks
was incorrectly referencing 0x3C instead of 0x38, which caused
misconfiguration of output dividers on Versal platforms.
Correcting the off-by-one error ensures proper configuration of output
clocks.
Signed-off-by: Shubhrajyoti Datta <shubhrajyoti.datta@amd.com>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 18db1ff2de ]
For some of the SCMI based platforms, the oem extended config may be
supported, but not for duty cycle purpose. Skip the duty cycle ops if
err return when trying to get duty cycle info.
Signed-off-by: Jacky Bai <ping.bai@nxp.com>
Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 1e0d75258b ]
As described in AM335x Errata Advisory 1.0.42, WKUP_DEBUGSS_CLKCTRL
can't be disabled - the clock module will just be stuck in transitioning
state forever, resulting in the following warning message after the wait
loop times out:
l3-aon-clkctrl:0000:0: failed to disable
Just add the clock to enable_init_clks, so no attempt is made to disable
it.
Signed-off-by: Matthias Schiffer <matthias.schiffer@tq-group.com>
Signed-off-by: Alexander Stein <alexander.stein@ew.tq-group.com>
Acked-by: Kevin Hilman <khilman@baylibre.com>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit af98caeaa7 ]
This register is important for sequencing the commands to PLLs, so
actually write the update bits with regmap_write_bits() instead of
relying on a read/modify/write regmap command that could skip the actual
hardware write if the value is identical to the one read.
It's changed when modification is needed to the PLL, when
read-only operation is done, we could keep the call to
regmap_update_bits().
Add a comment to the sam9x60_div_pll_set_div() function that uses this
PLL_UPDT register so that it's used consistently, according to the
product's datasheet.
Signed-off-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Tested-by: Ryan Wanner <ryan.wanner@microchip.com> # on sama7d65 and sam9x75
Link: https://lore.kernel.org/r/20250827150811.82496-1-nicolas.ferre@microchip.com
[claudiu.beznea: fix "Alignment should match open parenthesis"
checkpatch.pl check]
Signed-off-by: Claudiu Beznea <claudiu.beznea@tuxon.dev>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit e0237f5635 ]
A potential divider for the master clock is div/3. The register
configuration for div/3 is MASTER_PRES_MAX. The current bit shifting
method does not work for this case. Checking for MASTER_PRES_MAX will
ensure the correct decimal value is stored in the system.
Signed-off-by: Ryan Wanner <Ryan.Wanner@microchip.com>
Signed-off-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit bfa2bddf6f ]
Add the ACR register to all PLL settings and provide the correct
ACR value for each PLL used in different SoCs.
Suggested-by: Mihai Sain <mihai.sain@microchip.com>
Signed-off-by: Cristian Birsan <cristian.birsan@microchip.com>
[nicolas.ferre@microchip.com: add sama7d65 and review commit message]
Signed-off-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit a6f1a4f059 ]
PCF2127 can generate interrupt every full second or minute configured
from control and status register 1, bits MI (1) and SI (0).
On interrupt control register 2 bit MSF (7) is set and must be cleared
to continue normal operation.
While the driver never enables this interrupt on its own, users or
firmware may do so - e.g. as an easy way to test the interrupt.
Add preprocessor definition for MSF bit and include it in the irq
bitmask to ensure minute and second interrupts are cleared when fired.
This fixes an issue where the rtc enters a test mode and becomes
unresponsive after a second interrupt has fired and is not cleared in
time. In this state register writes to control registers have no
effect and the interrupt line is kept asserted [1]:
[1] userspace commands to put rtc into unresponsive state:
$ i2cget -f -y 2 0x51 0x00
0x04
$ i2cset -f -y 2 0x51 0x00 0x05 # set bit 0 SI
$ i2cget -f -y 2 0x51 0x00
0x84 # bit 8 EXT_TEST set
$ i2cset -f -y 2 0x51 0x00 0x05 # try overwrite control register
$ i2cget -f -y 2 0x51 0x00
0x84 # no change
Signed-off-by: Josua Mayer <josua@solid-run.com>
Reviewed-by: Bruno Thomsen <bruno.thomsen@gmail.com>
Link: https://lore.kernel.org/r/20250825-rtc-irq-v1-1-0133319406a7@solid-run.com
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 7aa8781f37 ]
The A523's RTC block is backward compatible with the R329's, but it also
has a calibration function for its internal oscillator, which would
allow it to provide a clock rate closer to the desired 32.768 KHz. This
is useful on the Radxa Cubie A5E, which does not have an external 32.768
KHz crystal.
Add new compatible-specific data for it.
Acked-by: Jernej Skrabec <jernej.skrabec@gmail.com>
Link: https://patch.msgid.link/20250909170947.2221611-1-wens@kernel.org
Signed-off-by: Chen-Yu Tsai <wens@csie.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 725e9d8186 ]
Add the missing option name in the help message. Additionally,
switch to __uml_help(), because this is a global option rather
than a per-channel option.
Signed-off-by: Tiwei Bie <tiwei.btw@antgroup.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 4cd661c248 ]
This field is unused, but the correct structure size is needed
when computing the amount of space for the output argument to
reside, so that it does not cross a page boundary.
Signed-off-by: Nuno Das Neves <nunodasneves@linux.microsoft.com>
Reviewed-by: Michael Kelley <mhklinux@outlook.com>
Signed-off-by: Wei Liu <wei.liu@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 47691ced15 ]
The HV_ACCESS_TSC_INVARIANT bit is always zero when Linux runs as the
root partition. The root partition will see directly what the hardware
provides.
The old logic in ms_hyperv_init_platform caused the native TSC clock
source to be incorrectly marked as unstable on x86. Fix it.
Skip the unnecessary checks in code for the root partition. Add one
extra comment in code to clarify the behavior.
Reviewed-by: Nuno Das Neves <nunodasneves@linux.microsoft.com>
Signed-off-by: Wei Liu <wei.liu@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 32058c38d3 ]
The function call new_inode() is a primitive for allocating an inode in memory,
rather than planning disk space for it. Therefore, -ENOMEM should be returned
as the error code rather than -ENOSPC.
To be specific, new_inode()'s call path looks like this:
new_inode
new_inode_pseudo
alloc_inode
ops->alloc_inode (hpfs_alloc_inode)
alloc_inode_sb
kmem_cache_alloc_lru
Therefore, the failure of new_inode() indicates a memory presure issue (-ENOMEM),
not a lack of disk space. However, the current implementation of
hpfs_mkdir/create/mknod/symlink incorrectly returns -ENOSPC when new_inode() fails.
This patch fix this by set err to -ENOMEM before the goto statement.
BTW, we also noticed that other nested calls within these four functions,
like hpfs_alloc_f/dnode and hpfs_add_dirent, might also fail due to memory presure.
But similarly, only -ENOSPC is returned. Addressing these will involve code
modifications in other functions, and we plan to submit dedicated patches for these
issues in the future. For this patch, we focus on new_inode().
Signed-off-by: Yikang Yue <yikangy2@illinois.edu>
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit c567bc5fc6 ]
The AXI crossbar of TH1520 has no proper timeout handling, which means
gating AXI clocks can easily lead to bus timeout and thus system hang.
Set all AXI clock gates to CLK_IS_CRITICAL. All these clock gates are
ungated by default on system reset.
In addition, convert all current CLK_IGNORE_UNUSED usage to
CLK_IS_CRITICAL to prevent unwanted clock gating.
Signed-off-by: Icenowy Zheng <uwu@icenowy.me>
Reviewed-by: Drew Fustini <fustini@kernel.org>
Signed-off-by: Drew Fustini <fustini@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 3b1a4a59a2 ]
In btrfs_fallocate(), when the allocated range overlaps with a prealloc
extent and the extent starts after i_size, the range doesn't get marked
dirty in file_extent_tree. This results in persisting an incorrect
disk_i_size for the inode when not using the no-holes feature.
This is reproducible since commit 41a2ee75aa ("btrfs: introduce
per-inode file extent tree"), then became hidden since commit 3d7db6e8bd
("btrfs: don't allocate file extent tree for non regular files") and then
visible again after commit 8679d2687c ("btrfs: initialize
inode::file_extent_tree after i_mode has been set"), which fixes the
previous commit.
The following reproducer triggers the problem:
$ cat test.sh
MNT=/mnt/test
DEV=/dev/vdb
mkdir -p $MNT
mkfs.btrfs -f -O ^no-holes $DEV
mount $DEV $MNT
touch $MNT/file1
fallocate -n -o 1M -l 2M $MNT/file1
umount $MNT
mount $DEV $MNT
len=$((1 * 1024 * 1024))
fallocate -o 1M -l $len $MNT/file1
du --bytes $MNT/file1
umount $MNT
mount $DEV $MNT
du --bytes $MNT/file1
umount $MNT
Running the reproducer gives the following result:
$ ./test.sh
(...)
2097152 /mnt/test/file1
1048576 /mnt/test/file1
The difference is exactly 1048576 as we assigned.
Fix by adding a call to btrfs_inode_set_file_extent_range() in
btrfs_fallocate_update_isize().
Fixes: 41a2ee75aa ("btrfs: introduce per-inode file extent tree")
Signed-off-by: austinchang <austinchang@synology.com>
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f260c6aff0 ]
When btrfs_add_qgroup_relation() is called with invalid qgroup levels
(src >= dst), the function returns -EINVAL directly without freeing the
preallocated qgroup_list structure passed by the caller. This causes a
memory leak because the caller unconditionally sets the pointer to NULL
after the call, preventing any cleanup.
The issue occurs because the level validation check happens before the
mutex is acquired and before any error handling path that would free
the prealloc pointer. On this early return, the cleanup code at the
'out' label (which includes kfree(prealloc)) is never reached.
In btrfs_ioctl_qgroup_assign(), the code pattern is:
prealloc = kzalloc(sizeof(*prealloc), GFP_KERNEL);
ret = btrfs_add_qgroup_relation(trans, sa->src, sa->dst, prealloc);
prealloc = NULL; // Always set to NULL regardless of return value
...
kfree(prealloc); // This becomes kfree(NULL), does nothing
When the level check fails, 'prealloc' is never freed by either the
callee or the caller, resulting in a 64-byte memory leak per failed
operation. This can be triggered repeatedly by an unprivileged user
with access to a writable btrfs mount, potentially exhausting kernel
memory.
Fix this by freeing prealloc before the early return, ensuring prealloc
is always freed on all error paths.
Fixes: 4addc1ffd6 ("btrfs: qgroup: preallocate memory before adding a relation")
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Shardul Bankar <shardulsb08@gmail.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit fe9622011f ]
When QP wraps around, WQE data from the previous use at the same
position still remains as driver does not clear it. The WQE field
layout differs across different opcodes, causing that the fields
that are not explicitly assigned for the current opcode retain
stale values, and are issued to HW by mistake. Such fields are as
follows:
* MSG_START_SGE_IDX field in ATOMIC WQE
* BLOCK_SIZE and ZBVA fields in FRMR WQE
* DirectWQE fields when DirectWQE not used
For ATOMIC WQE, always set the latest sge index in MSG_START_SGE_IDX
as required by HW.
For FRMR WQE and DirectWQE, clear only those unassigned fields
instead of the entire WQE to avoid performance penalty.
Fixes: 68a997c5d2 ("RDMA/hns: Add FRMR support for hip08")
Signed-off-by: Junxian Huang <huangjunxian6@hisilicon.com>
Link: https://patch.msgid.link/20251016114051.1963197-4-huangjunxian6@hisilicon.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit c4b67b514a ]
Currently driver enforces affinity between QP cache and send CQ
cache, which helps improve the performance of sending, but doesn't
set affinity with recv CQ cache, resulting in suboptimal performance
of receiving.
Use one CQ bank per context to ensure the affinity among QP, send CQ
and recv CQ. For kernel ULP, CQ bank is fixed to 0.
Fixes: 9e03dbea2b ("RDMA/hns: Fix CQ and QP cache affinity")
Signed-off-by: Chengchang Tang <tangchengchang@huawei.com>
Signed-off-by: Junxian Huang <huangjunxian6@hisilicon.com>
Link: https://patch.msgid.link/20251016114051.1963197-2-huangjunxian6@hisilicon.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit d8713158fa ]
In `UVERBS_METHOD_CQ_CREATE`, umem should be released if anything goes
wrong. Currently, if `create_cq_umem` fails, umem would not be
released or referenced, causing a possible leak.
In this patch, we release umem at `UVERBS_METHOD_CQ_CREATE`, the driver
should not release umem if it returns an error code.
Fixes: 1a40c362ae ("RDMA/uverbs: Add a common way to create CQ with umem")
Signed-off-by: Shuhao Fu <sfual@cse.ust.hk>
Link: https://patch.msgid.link/aOh1le4YqtYwj-hH@osx.local
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 5575b7646b ]
The driver maintains a CQ table that is used to ensure that a CQ is
still valid when processing CQ related AEs. When a CQ is destroyed,
the table entry is cleared, using irdma_cq.cq_num as the index. This
field was never being set, so it was just always clearing out entry
0.
Additionally, the cq_num field size was increased to accommodate HW
supporting more than 64K CQs.
Fixes: b48c24c2d7 ("RDMA/irdma: Implement device supported verb APIs")
Signed-off-by: Jacob Moroni <jmoroni@google.com>
Link: https://patch.msgid.link/20250923142439.943930-1-jmoroni@google.com
Acked-by: Tatyana Nikolova <tatyana.e.nikolova@intel.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 8d158f47f1 ]
In some cases, it is possible for pble_rsrc->next_fpm_addr to be
larger than u32, so remove the u32 cast to avoid unintentional
truncation.
This fixes the following error that can be observed when registering
massive memory regions:
[ 447.227494] (NULL ib_device): cqp opcode = 0x1f maj_err_code = 0xffff min_err_code = 0x800c
[ 447.227505] (NULL ib_device): [Update PE SDs Cmd Error][op_code=21] status=-5 waiting=1 completion_err=1 maj=0xffff min=0x800c
Fixes: e8c4dbc2fc ("RDMA/irdma: Add PBLE resource manager")
Signed-off-by: Jacob Moroni <jmoroni@google.com>
Link: https://patch.msgid.link/20250923190850.1022773-1-jmoroni@google.com
Acked-by: Tatyana Nikolova <tatyana.e.nikolova@intel.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 88de89f184 ]
The current error handling path in bnxt_re_destroy_gsi_sqp() could lead
to a resource leak. When bnxt_qplib_destroy_qp() fails, the function
jumps to the 'fail' label and returns immediately, skipping the call
to bnxt_qplib_free_qp_res().
Continue the resource teardown even if bnxt_qplib_destroy_qp() fails,
which aligns with the driver's general error handling strategy and
prevents the potential leak.
Fixes: 8dae419f9e ("RDMA/bnxt_re: Refactor queue pair creation code")
Signed-off-by: YanLong Dai <daiyanlong@kylinos.cn>
Link: https://patch.msgid.link/20250924061444.11288-1-daiyanlong@kylinos.cn
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit db291ed173 ]
[Why]
DP validation may fail with multiple displays and higher color depths.
The sink may support others though.
[How]
When DP bandwidth validation fails, progressively fallback through:
- YUV422 8bpc (bandwidth efficient)
- YUV422 6bpc (reduced color depth)
- YUV420 (last resort)
This resolves cases where displays would show no image due to insufficient
DP link bandwidth for the requested RGB mode.
Suggested-by: Mauri Carvalho <mcarvalho3@lenovo.com>
Reviewed-by: Wayne Lin <wayne.lin@amd.com>
Signed-off-by: Mario Limonciello <Mario.Limonciello@amd.com>
Signed-off-by: Ray Wu <ray.wu@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 88b4cbcf6b ]
Currently when both IMA and EVM are in fix mode, the IMA signature will
be reset to IMA hash if a program first stores IMA signature in
security.ima and then writes/removes some other security xattr for the
file.
For example, on Fedora, after booting the kernel with "ima_appraise=fix
evm=fix ima_policy=appraise_tcb" and installing rpm-plugin-ima,
installing/reinstalling a package will not make good reference IMA
signature generated. Instead IMA hash is generated,
# getfattr -m - -d -e hex /usr/bin/bash
# file: usr/bin/bash
security.ima=0x0404...
This happens because when setting security.selinux, the IMA_DIGSIG flag
that had been set early was cleared. As a result, IMA hash is generated
when the file is closed.
Similarly, IMA signature can be cleared on file close after removing
security xattr like security.evm or setting/removing ACL.
Prevent replacing the IMA file signature with a file hash, by preventing
the IMA_DIGSIG flag from being reset.
Here's a minimal C reproducer which sets security.selinux as the last
step which can also replaced by removing security.evm or setting ACL,
#include <stdio.h>
#include <sys/xattr.h>
#include <fcntl.h>
#include <unistd.h>
#include <string.h>
#include <stdlib.h>
int main() {
const char* file_path = "/usr/sbin/test_binary";
const char* hex_string = "030204d33204490066306402304";
int length = strlen(hex_string);
char* ima_attr_value;
int fd;
fd = open(file_path, O_WRONLY|O_CREAT|O_EXCL, 0644);
if (fd == -1) {
perror("Error opening file");
return 1;
}
ima_attr_value = (char*)malloc(length / 2 );
for (int i = 0, j = 0; i < length; i += 2, j++) {
sscanf(hex_string + i, "%2hhx", &ima_attr_value[j]);
}
if (fsetxattr(fd, "security.ima", ima_attr_value, length/2, 0) == -1) {
perror("Error setting extended attribute");
close(fd);
return 1;
}
const char* selinux_value= "system_u:object_r:bin_t:s0";
if (fsetxattr(fd, "security.selinux", selinux_value, strlen(selinux_value), 0) == -1) {
perror("Error setting extended attribute");
close(fd);
return 1;
}
close(fd);
return 0;
}
Signed-off-by: Coiby Xu <coxu@redhat.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 00be6f26a2 ]
When io_uring is used in the same task as CIFS, there might be
unnecessary reconnects, causing issues in user-space applications
like QEMU with a log like:
> CIFS: VFS: \\10.10.100.81 Error -512 sending data on socket to server
Certain io_uring completions might be added to task_work with
notify_method being TWA_SIGNAL and thus TIF_NOTIFY_SIGNAL is set for
the task.
In __smb_send_rqst(), signals are masked before calling
smb_send_kvec(), but the masking does not apply to TIF_NOTIFY_SIGNAL.
If sk_stream_wait_memory() is reached via sock_sendmsg() while
TIF_NOTIFY_SIGNAL is set, signal_pending(current) will evaluate to
true there, and -EINTR will be propagated all the way from
sk_stream_wait_memory() to sock_sendmsg() in smb_send_kvec().
Afterwards, __smb_send_rqst() will see that not everything was written
and reconnect.
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 5676398315 ]
open_cached_dir_by_dentry() was missing an update of
cfid->last_access_time to jiffies, similar to what open_cached_dir()
has.
Add it to the function.
Signed-off-by: Henrique Carvalho <henrique.carvalho@suse.com>
Reviewed-by: Enzo Matsumiya <ematsumiya@suse.de>
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 06fdc45f16 ]
When a PF enters switchdev mode, its netdevice becomes the uplink
representor but remains in its current network namespace. All other
representors (VFs, SFs) are created in the netns of the devlink
instance.
If the PF's netns has been moved and differs from the devlink's netns,
enabling switchdev mode would create a state where the OVS control
plane (ovs-vsctl) cannot manage the switch because the PF uplink
representor and the other representors are split across different
namespaces.
To prevent this inconsistent configuration, block the request to enter
switchdev mode if the PF netdevice's netns does not match the netns of
its devlink instance.
As part of this change, the PF's netns is first marked as immutable.
This prevents race conditions where the netns could be changed after
the check is performed but before the mode transition is complete, and
it aligns the PF's behavior with that of the final uplink representor.
Signed-off-by: Jianbo Liu <jianbol@nvidia.com>
Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/1759094723-843774-3-git-send-email-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 4099b98203 ]
A soft lockup was observed when loading amdgpu module.
If a module has a lot of tracable functions, multiple calls
to kallsyms_lookup can spend too much time in RCU critical
section and with disabled preemption, causing kernel panic.
This is the same issue that was fixed in
commit d0b24b4e91 ("ftrace: Prevent RCU stall on PREEMPT_VOLUNTARY
kernels") and commit 42ea22e754 ("ftrace: Add cond_resched() to
ftrace_graph_set_hash()").
Fix it the same way by adding cond_resched() in ftrace_module_enable.
Link: https://lore.kernel.org/aMQD9_lxYmphT-up@vova-pc
Signed-off-by: Vladimir Riabchun <ferr.lambarginio@gmail.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 025e880759 ]
Willy Tarreau <w@1wt.eu> forwarded me a message from
Disclosure <disclosure@aisle.com> with the following
warning:
> The helper `xattr_key()` uses the pointer variable in the loop condition
> rather than dereferencing it. As `key` is incremented, it remains non-NULL
> (until it runs into unmapped memory), so the loop does not terminate on
> valid C strings and will walk memory indefinitely, consuming CPU or hanging
> the thread.
I easily reproduced this with setfattr and getfattr, causing a kernel
oops, hung user processes and corrupted orangefs files. Disclosure
sent along a diff (not a patch) with a suggested fix, which I based
this patch on.
After xattr_key started working right, xfstest generic/069 exposed an
xattr related memory leak that lead to OOM. xattr_key returns
a hashed key. When adding xattrs to the orangefs xattr cache, orangefs
used hash_add, a kernel hashing macro. hash_add also hashes the key using
hash_log which resulted in additions to the xattr cache going to the wrong
hash bucket. generic/069 tortures a single file and orangefs does a
getattr for the xattr "security.capability" every time. Orangefs
negative caches on xattrs which includes a kmalloc. Since adds to the
xattr cache were going to the wrong bucket, every getattr for
"security.capability" resulted in another kmalloc, none of which were
ever freed.
I changed the two uses of hash_add to hlist_add_head instead
and the memory leak ceased and generic/069 quit throwing furniture.
Signed-off-by: Mike Marshall <hubcap@omnibond.com>
Reported-by: Stanislav Fort of Aisle Research <stanislav.fort@aisle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 79c1587b6c ]
syzbot created an exfat image with cluster bits not set for the allocation
bitmap. exfat-fs reads and uses the allocation bitmap without checking
this. The problem is that if the start cluster of the allocation bitmap
is 6, cluster 6 can be allocated when creating a directory with mkdir.
exfat zeros out this cluster in exfat_mkdir, which can delete existing
entries. This can reallocate the allocated entries. In addition,
the allocation bitmap is also zeroed out, so cluster 6 can be reallocated.
This patch adds exfat_test_bitmap_range to validate that clusters used for
the allocation bitmap are correctly marked as in-use.
Reported-by: syzbot+a725ab460fc1def9896f@syzkaller.appspotmail.com
Tested-by: syzbot+a725ab460fc1def9896f@syzkaller.appspotmail.com
Reviewed-by: Yuezhang Mo <Yuezhang.Mo@sony.com>
Reviewed-by: Sungjong Seo <sj1557.seo@samsung.com>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 6dfba10838 ]
For exFAT filesystems with 4MB read_ahead_size, removing the storage device
when the read operation is in progress, which cause the last read syscall
spent 150s [1]. The main reason is that exFAT generates excessive log
messages [2].
After applying this patch, approximately 300,000 lines of log messages
were suppressed, and the delay of the last read() syscall was reduced
to about 4 seconds.
[1]:
write(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 131072) = 131072 <0.000120>
read(4, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 131072) = 131072 <0.000032>
write(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 131072) = 131072 <0.000119>
read(4, 0x7fccf28ae000, 131072) = -1 EIO (Input/output error) <150.186215>
[2]:
[ 333.696603] exFAT-fs (vdb): error, failed to access to FAT (entry 0x0000d780, err:-5)
[ 333.697378] exFAT-fs (vdb): error, failed to access to FAT (entry 0x0000d780, err:-5)
[ 333.698156] exFAT-fs (vdb): error, failed to access to FAT (entry 0x0000d780, err:-5)
Signed-off-by: Chi Zhiling <chizhiling@kylinos.cn>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit ca94b2b036 ]
Currently, bcsp_recv() can be called even when the BCSP protocol has not
been registered. This leads to a NULL pointer dereference, as shown in
the following stack trace:
KASAN: null-ptr-deref in range [0x0000000000000108-0x000000000000010f]
RIP: 0010:bcsp_recv+0x13d/0x1740 drivers/bluetooth/hci_bcsp.c:590
Call Trace:
<TASK>
hci_uart_tty_receive+0x194/0x220 drivers/bluetooth/hci_ldisc.c:627
tiocsti+0x23c/0x2c0 drivers/tty/tty_io.c:2290
tty_ioctl+0x626/0xde0 drivers/tty/tty_io.c:2706
vfs_ioctl fs/ioctl.c:51 [inline]
__do_sys_ioctl fs/ioctl.c:907 [inline]
__se_sys_ioctl+0xfc/0x170 fs/ioctl.c:893
do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
do_syscall_64+0xfa/0x3b0 arch/x86/entry/syscall_64.c:94
entry_SYSCALL_64_after_hwframe+0x77/0x7f
To prevent this, ensure that the HCI_UART_REGISTERED flag is set before
processing received data. If the protocol is not registered, return
-EUNATCH.
Reported-by: syzbot+4ed6852d4da4606c93da@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=4ed6852d4da4606c93da
Tested-by: syzbot+4ed6852d4da4606c93da@syzkaller.appspotmail.com
Signed-off-by: Ivan Pravdin <ipravdin.official@gmail.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 339a87883a ]
This aligns the usage of socket sk_sndtimeo as conn_timeout when
initiating a connection and then use it when scheduling the
resulting HCI command, similar to what has been done in bf98feea5b
("Bluetooth: hci_conn: Always use sk_timeo as conn_timeout").
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit d79c7d01f1 ]
If the controller has no buffers left return -ENOBUFF to indicate that
iso_cnt might be out of sync.
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 3e94262921 ]
Implement hdev->wakeup() callback to support Wake On BT feature.
Test steps:
1. echo enabled > /sys/bus/pci/devices/0000:00:14.7/power/wakeup
2. connect bluetooth hid device
3. put the system to suspend - rtcwake -m mem -s 300
4. press any key on hid to wake up the system
Signed-off-by: Kiran K <kiran.k@intel.com>
Signed-off-by: Chandrashekar Devegowda <chandrashekar.devegowda@intel.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 70a5ce8bc9 ]
bp->dev->dev_addr is of type `unsigned char *`. Casting it to a u32
pointer and dereferencing implies dealing manually with endianness,
which is error-prone.
Replace by calls to get_unaligned_le32|le16() helpers.
This was found using sparse:
⟩ make C=2 drivers/net/ethernet/cadence/macb_main.o
warning: incorrect type in assignment (different base types)
expected unsigned int [usertype] bottom
got restricted __le32 [usertype]
warning: incorrect type in assignment (different base types)
expected unsigned short [usertype] top
got restricted __le16 [usertype]
...
Reviewed-by: Sean Anderson <sean.anderson@linux.dev>
Signed-off-by: Théo Lebrun <theo.lebrun@bootlin.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250923-macb-fixes-v6-5-772d655cdeb6@bootlin.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 81dcfdd21d ]
Fix to avoid cases where the `res` shell variable is
empty in script comparisons.
The comparison has been modified into string comparison to
handle other possible values the variable could assume.
The issue can be reproduced with the command:
make kselftest TARGETS=net
It solves the error:
./tfo_passive.sh: line 98: [: -eq: unary operator expected
Signed-off-by: Alessandro Zanni <alessandro.zanni87@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250925132832.9828-1-alessandro.zanni87@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 8b9f128947 ]
PCI devices prior to PCI 2.3 both use level interrupts and do not support
interrupt masking, leading to a failure when passed through to a KVM guest on
at least the ppc64 platform. This failure manifests as receiving and
acknowledging a single interrupt in the guest, while the device continues to
assert the level interrupt indicating a need for further servicing.
When lazy IRQ masking is used on DisINTx- (non-PCI 2.3) hardware, the following
sequence occurs:
* Level IRQ assertion on device
* IRQ marked disabled in kernel
* Host interrupt handler exits without clearing the interrupt on the device
* Eventfd is delivered to userspace
* Guest processes IRQ and clears device interrupt
* Device de-asserts INTx, then re-asserts INTx while the interrupt is masked
* Newly asserted interrupt acknowledged by kernel VMM without being handled
* Software mask removed by VFIO driver
* Device INTx still asserted, host controller does not see new edge after EOI
The behavior is now platform-dependent. Some platforms (amd64) will continue
to spew IRQs for as long as the INTX line remains asserted, therefore the IRQ
will be handled by the host as soon as the mask is dropped. Others (ppc64) will
only send the one request, and if it is not handled no further interrupts will
be sent. The former behavior theoretically leaves the system vulnerable to
interrupt storm, and the latter will result in the device stalling after
receiving exactly one interrupt in the guest.
Work around this by disabling lazy IRQ masking for DisINTx- INTx devices.
Signed-off-by: Timothy Pearson <tpearson@raptorengineering.com>
Link: https://lore.kernel.org/r/333803015.1744464.1758647073336.JavaMail.zimbra@raptorengineeringinc.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 733a763dd8 ]
The problem of having class-D initialization sequence in probe using
regmap_register_patch() is that it will do hardware register writes
immediately after being called as it bypasses regcache. Afterwards, in
aic3x_init() we also perform codec soft reset, rendering class-D init
sequence pointless. This issue is even more apparent when using reset
GPIO line, since in that case class-D amplifier initialization fails
with "Failed to init class D: -5" message as codec is already held in
reset state after requesting the reset GPIO and hence hardware I/O
fails with -EIO errno.
Thus move class-D amplifier initialization sequence from probe function
to aic3x_set_power() just before the usual regcache sync. Use bypassed
regmap_multi_reg_write_bypassed() function to make sure, class-D init
sequence is performed in proper order as described in the datasheet.
Signed-off-by: Primoz Fiser <primoz.fiser@norik.com>
Link: https://patch.msgid.link/20250925085929.2581749-1-primoz.fiser@norik.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 27fa1a8b28 ]
The mclk direction now needs to be specified in endpoint node with
"system-clock-direction-out" property. However some calls to the
set_sysclk callback, related to CPU DAI clock, result in unbalanced
calls to clock API.
The set_sysclk callback in STM32 SAI driver is intended only for mclk
management. So it is relevant to ensure that calls to set_sysclk are
related to mclk only.
Since the master clock is handled only at runtime, skip the calls to
set_sysclk in the initialization phase.
Signed-off-by: Olivier Moysan <olivier.moysan@foss.st.com>
Link: https://patch.msgid.link/20250916123118.84175-1-olivier.moysan@foss.st.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b8ae2640f9 ]
This commit fixes a potential race condition in the userqueue fence
signaling mechanism by replacing dma_fence_is_signaled_locked() with
dma_fence_is_signaled().
The issue occurred because:
1. dma_fence_is_signaled_locked() should only be used when holding
the fence's individual lock, not just the fence list lock
2. Using the locked variant without the proper fence lock could lead
to double-signaling scenarios:
- Hardware completion signals the fence
- Software path also tries to signal the same fence
By using dma_fence_is_signaled() instead, we properly handle the
locking hierarchy and avoid the race condition while still maintaining
the necessary synchronization through the fence_list_lock.
v2: drop the comment (Christian)
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 7469567d88 ]
Add a fallback mechanism to attempt pipe reset when KCQ reset
fails to recover the ring. After performing the KCQ reset and
queue remapping, test the ring functionality. If the ring test
fails, initiate a pipe reset as an additional recovery step.
v2: fix the typo (Lijo)
v3: try pipeline reset when kiq mapping fails (Lijo)
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 46e75c56df ]
The following code paths may result in high latency or even task hangs:
1. fastcommit io is throttled by wbt.
2. jbd2_fc_wait_bufs() might wait for a long time while
JBD2_FAST_COMMIT_ONGOING is set in journal->flags, and then
jbd2_journal_commit_transaction() waits for the
JBD2_FAST_COMMIT_ONGOING bit for a long time while holding the write
lock of j_state_lock.
3. start_this_handle() waits for read lock of j_state_lock which
results in high latency or task hang.
Given the fact that ext4_fc_commit() already modifies the current
process' IO priority to match that of the jbd2 thread, it should be
reasonable to match jbd2's IO submission flags as well.
Suggested-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Signed-off-by: Julian Sun <sunjunchao@bytedance.com>
Reviewed-by: Zhang Yi <yi.zhang@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Message-ID: <20250827121812.1477634-1-sunjunchao@bytedance.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 1534f72dc2 ]
The parent function ext4_xattr_inode_lookup_create already uses GFP_NOFS for memory alloction, so the function ext4_xattr_inode_cache_find should use same gfp_flag.
Signed-off-by: chuguangqing <chuguangqing@inspur.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 25226abc1a ]
MSIOF has TXRST/RXRST to reset FIFO, but it shouldn't be used during SYNC
signal was asserted, because it will be cause of HW issue.
When MSIOF is used as Sound driver, this driver is assuming it is used as
clock consumer mode (= Codec is clock provider). This means, it can't
control SYNC signal by itself.
We need to use SW reset (= reset_control_xxx()) instead of TXRST/RXRST.
Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Tested-by: Yusuke Goda <yusuke.goda.sx@renesas.com>
Link: https://patch.msgid.link/87cy7fyuug.wl-kuninori.morimoto.gx@renesas.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 513024d5a0 ]
When IOMMU is enabled, dma_alloc_coherent() with GFP_USER may return
addresses from the vmalloc range. If such an address is mapped without
VM_MIXEDMAP, vm_insert_page() will trigger a BUG_ON due to the
VM_PFNMAP restriction.
Fix this by checking for vmalloc addresses and setting VM_MIXEDMAP
in the VMA before mapping. This ensures safe mapping and avoids kernel
crashes. The memory is still driver-allocated and cannot be accessed
directly by userspace.
Signed-off-by: Moti Haimovski <moti.haimovski@intel.com>
Reviewed-by: Koby Elbaz <koby.elbaz@intel.com>
Signed-off-by: Koby Elbaz <koby.elbaz@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit a0d866bab1 ]
Dirty state can occur when the host VM undergoes a reset while the
device does not. In such a case, the driver must reset the device before
it can be used again. As part of this reset, the device capabilities
are zeroed. Therefore, the driver must read the Preboot status again to
learn the Preboot state, capabilities, and security configuration.
Signed-off-by: Konstantin Sinyuk <konstantin.sinyuk@intel.com>
Reviewed-by: Koby Elbaz <koby.elbaz@intel.com>
Signed-off-by: Koby Elbaz <koby.elbaz@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 9f5067531c ]
EFAULT is currently returned if less than requested user pages are
pinned. This value means a "bad address" which might be confusing to
the user, as the address of the given user memory is not necessarily
"bad".
Modify the return value to ENOMEM, as "out of memory" is more suitable
in this case.
Signed-off-by: Tomer Tayar <tomer.tayar@intel.com>
Reviewed-by: Koby Elbaz <koby.elbaz@intel.com>
Signed-off-by: Koby Elbaz <koby.elbaz@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b4fd8e56c9 ]
Change the BMON_CR register value back to its original state before
enabling, so that BMON does not continue to collect information
after being disabled.
Signed-off-by: Vered Yavniely <vered.yavniely@intel.com>
Reviewed-by: Koby Elbaz <koby.elbaz@intel.com>
Signed-off-by: Koby Elbaz <koby.elbaz@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 072fdd4b0b ]
The fc_ct_ms_fill() helper currently formats the OS name and version
into entry->value using "%s v%s". Since init_utsname()->sysname and
->release are unbounded strings, snprintf() may attempt to write more
than FC_FDMI_HBA_ATTR_OSNAMEVERSION_LEN bytes, triggering a
-Wformat-truncation warning with W=1.
In file included from drivers/scsi/libfc/fc_elsct.c:18:
drivers/scsi/libfc/fc_encode.h: In function ‘fc_ct_ms_fill.constprop’:
drivers/scsi/libfc/fc_encode.h:359:30: error: ‘%s’ directive output may
be truncated writing up to 64 bytes into a region of size between 62
and 126 [-Werror=format-truncation=]
359 | "%s v%s",
| ^~
360 | init_utsname()->sysname,
361 | init_utsname()->release);
| ~~~~~~~~~~~~~~~~~~~~~~~
drivers/scsi/libfc/fc_encode.h:357:17: note: ‘snprintf’ output between
3 and 131 bytes into a destination of size 128
357 | snprintf((char *)&entry->value,
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
358 | FC_FDMI_HBA_ATTR_OSNAMEVERSION_LEN,
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
359 | "%s v%s",
| ~~~~~~~~~
360 | init_utsname()->sysname,
| ~~~~~~~~~~~~~~~~~~~~~~~~
361 | init_utsname()->release);
| ~~~~~~~~~~~~~~~~~~~~~~~~
Fix this by using "%.62s v%.62s", which ensures sysname and release are
truncated to fit within the 128-byte field defined by
FC_FDMI_HBA_ATTR_OSNAMEVERSION_LEN.
[mkp: clarified commit description]
Signed-off-by: Alok Tiwari <alok.a.tiwari@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 2537577979 ]
Move the MCQ interrupt enable process to
ufshcd_mcq_make_queues_operational() to ensure that interrupts are set
correctly when making queues operational, similar to
ufshcd_make_hba_operational(). This change addresses the issue where
ufshcd_mcq_make_queues_operational() was not fully operational due to
missing interrupt enablement.
This change only affects host drivers that call
ufshcd_mcq_make_queues_operational(), i.e. ufs-mediatek.
Signed-off-by: Peter Wang <peter.wang@mediatek.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 42e2a9e11a ]
Once the last user of a clock has been removed, the clock should be
removed. So far orphaned clocks are cleaned up in dp83640_free_clocks()
only. Add the logic to remove orphaned clocks in dp83640_remove().
This allows to simplify the code, and use standard macro
module_phy_driver(). dp83640 was the last external user of
phy_driver_register(), so we can stop exporting this function afterwards.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Link: https://patch.msgid.link/6d4e80e7-c684-4d95-abbd-ea62b79a9a8a@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit cd9a9562b2 ]
Currently, after the bridge is created, the FDB does not hold an FDB entry
for the bridge MAC on VLAN 0:
# ip link add name br up type bridge
# ip -br link show dev br
br UNKNOWN 92:19:8c:4e:01:ed <BROADCAST,MULTICAST,UP,LOWER_UP>
# bridge fdb show | grep 92:19:8c:4e:01:ed
92:19:8c:4e:01:ed dev br vlan 1 master br permanent
Later when the bridge MAC is changed, or in fact when the address is given
during netdevice creation, the entry appears:
# ip link add name br up address 00:11:22:33:44:55 type bridge
# bridge fdb show | grep 00:11:22:33:44:55
00:11:22:33:44:55 dev br vlan 1 master br permanent
00:11:22:33:44:55 dev br master br permanent
However when the bridge address is set by the user to the current bridge
address before the first port is enslaved, none of the address handlers
gets invoked, because the address is not actually changed. The address is
however marked as NET_ADDR_SET. Then when a port is enslaved, the address
is not changed, because it is NET_ADDR_SET. Thus the VLAN 0 entry is not
added, and it has not been added previously either:
# ip link add name br up type bridge
# ip -br link show dev br
br UNKNOWN 7e:f0:a8:1a:be:c2 <BROADCAST,MULTICAST,UP,LOWER_UP>
# ip link set dev br addr 7e:f0:a8:1a:be:c2
# ip link add name v up type veth
# ip link set dev v master br
# ip -br link show dev br
br UNKNOWN 7e:f0:a8:1a:be:c2 <BROADCAST,MULTICAST,UP,LOWER_UP>
# bridge fdb | grep 7e:f0:a8:1a:be:c2
7e:f0:a8:1a:be:c2 dev br vlan 1 master br permanent
Then when the bridge MAC is used as DMAC, and br_handle_frame_finish()
looks up an FDB entry with VLAN=0, it doesn't find any, and floods the
traffic instead of passing it up.
Fix this by simply adding the VLAN 0 FDB entry for the bridge itself always
on netdevice creation. This also makes the behavior consistent with how
ports are treated: ports always have an FDB entry for each member VLAN as
well as VLAN 0.
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Link: https://patch.msgid.link/415202b2d1b9b0899479a502bbe2ba188678f192.1758550408.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit a890a2e339 ]
Theoretically it's an oopsable race, but I don't believe one can manage
to hit it on real hardware; might become doable on a KVM, but it still
won't be easy to attack.
Anyway, it's easy to deal with - since xdr_encode_hyper() is just a call of
put_unaligned_be64(), we can put that under ->d_lock and be done with that.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit bf75ad0968 ]
When client initialization goes through server trunking discovery, it
schedules the state manager and then sleeps waiting for nfs_client
initialization completion.
The state manager can fail during state recovery, and specifically in
lease establishment as nfs41_init_clientid() will bail out in case of
errors returned from nfs4_proc_create_session(), without ever marking
the client ready. The session creation can fail for a variety of reasons
e.g. during backchannel parameter negotiation, with status -EINVAL.
The error status will propagate all the way to the nfs4_state_manager
but the client status will not be marked, and thus the mount process
will remain blocked waiting.
Fix it by adding -EINVAL error handling to nfs4_state_manager().
Signed-off-by: Anthony Iliopoulos <ailiop@suse.com>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit be390f9524 ]
RFC7530 states that clients should be prepared for the return of
NFS4ERR_GRACE errors for non-reclaim lock and I/O requests.
Signed-off-by: Olga Kornievskaia <okorniev@redhat.com>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 51cb93aa0c ]
Don't update DC stream color components during atomic check. The driver
will continue validating the new CRTC color state but will not change DC
stream color components. The DC stream color state will only be
programmed at commit time in the `atomic_setup_commit` stage.
It fixes gamma LUT loss reported by KDE users when changing brightness
quickly or changing Display settings (such as overscan) with nightlight
on and HDR. As KWin can do a test commit with color settings different
from those that should be applied in a non-test-only commit, if the
driver changes DC stream color state in atomic check, this state can be
eventually HW programmed in commit tail, instead of the respective state
set by the non-blocking commit.
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4444
Reported-by: Xaver Hugl <xaver.hugl@gmail.com>
Signed-off-by: Melissa Wen <mwen@igalia.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f082daf08f ]
[Why]
Driver does not pick up and save vbios's clocks during init clocks,
the dispclk in clk_mgr will keep 0 until the first update clocks.
In some cases, OS changes the timing in the second set mode
(lower the pixel clock), causing the driver to lower the dispclk
in prepare bandwidth, which is illegal and causes grey screen.
[How]
1. Dump and save the vbios's clocks, and init the dispclk in
dcn314_init_clocks.
2. Fix the condition in dcn314_update_clocks, regarding a 0kHz value.
Reviewed-by: Charlene Liu <charlene.liu@amd.com>
Signed-off-by: Lo-an Chen <lo-an.chen@amd.com>
Signed-off-by: Ivan Lipski <ivan.lipski@amd.com>
Tested-by: Dan Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b65cf4baeb ]
[Why&How]
We need to inform DMUB whether fast sync in ultra sleep mode is supported,
so that it can disable desync error detection when the it is not enabled.
This helps prevent unexpected desync errors when transitioning out of
ultra sleep mode.
Add fast sync in ultra sleep mode field in replay copy setting command.
Reviewed-by: Robin Chen <robin.chen@amd.com>
Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Allen Li <wei-guang.li@amd.com>
Signed-off-by: Ivan Lipski <ivan.lipski@amd.com>
Tested-by: Dan Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit c8bedab2d9 ]
[WHY]
Ensure AVI infoframe updates from stream updates are applied to the active
stream so OS overrides are not lost.
[HOW]
Copy avi_infopacket to stream when valid flag is set.
Follow existing infopacket copy pattern and perform a basic validity check before assignment.
Reviewed-by: Aric Cyr <aric.cyr@amd.com>
Signed-off-by: Karthi Kandasamy <karthi.kandasamy@amd.com>
Signed-off-by: Ivan Lipski <ivan.lipski@amd.com>
Tested-by: Dan Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 54980f3c63 ]
[WHY&HOW]
dc_post_update_surfaces_to_stream needs to be called after a full update
completes in order to optimize clocks and watermarks for power. Add
missing calls before idle entry is requested to ensure optimal power.
Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Dillon Varone <Dillon.Varone@amd.com>
Signed-off-by: Ivan Lipski <ivan.lipski@amd.com>
Tested-by: Dan Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit ad14239227 ]
e8bd877fb7 ("ovl: fix possible double unlink") added a sanity
check of !d_unhashed(child) to try to verify that child dentry was not
unlinked while parent dir was unlocked.
This "was not unlink" check has a false positive result in the case of
casefolded parent dir, because in that case, ovl_create_temp() returns
an unhashed dentry after ovl_create_real() gets an unhashed dentry from
ovl_lookup_upper() and makes it positive.
To avoid returning unhashed dentry from ovl_create_temp(), let
ovl_create_real() lookup again after making the newdentry positive,
so it always returns a hashed positive dentry (or an error).
This fixes the error in ovl_parent_lock() in ovl_check_rename_whiteout()
after ovl_create_temp() and allows mount of overlayfs with casefolding
enabled layers.
Reported-by: André Almeida <andrealmeid@igalia.com>
Closes: https://lore.kernel.org/r/18704e8c-c734-43f3-bc7c-b8be345e1bf5@igalia.com/
Suggested-by: Neil Brown <neil@brown.name>
Reviewed-by: Neil Brown <neil@brown.name>
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit d57f4b8749 ]
Today, once an inet_bind_bucket enters a state where fastreuse >= 0 or
fastreuseport >= 0 after a socket is explicitly bound to a port, it remains
in that state until all sockets are removed and the bucket is destroyed.
In this state, the bucket is skipped during ephemeral port selection in
connect(). For applications using a reduced ephemeral port
range (IP_LOCAL_PORT_RANGE socket option), this can cause faster port
exhaustion since blocked buckets are excluded from reuse.
The reason the bucket state isn't updated on port release is unclear.
Possibly a performance trade-off to avoid scanning bucket owners, or just
an oversight.
Fix it by recalculating the bucket state when a socket releases a port. To
limit overhead, each inet_bind2_bucket stores its own (fastreuse,
fastreuseport) state. On port release, only the relevant port-addr bucket
is scanned, and the overall state is derived from these.
Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com>
Link: https://patch.msgid.link/20250917-update-bind-bucket-state-on-unhash-v5-1-57168b661b47@cloudflare.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit d1d6ad7f66 ]
Testing with a Presonus STUDIO 1824c together with
a Behringer ultragain digital ADAT device shows that
using all 3 altno settings works fine.
When selecting sample rate, the driver sets the interface
to the correct altno setting and the correct number of
channels is set.
Selecting the correct altno setting via Ardour, Reaper or
whatever other way to set the sample rate is more convenient
than re-loading the driver module with device_setup to
set altno.
Signed-off-by: Roy Vegard Ovesen <roy.vegard.ovesen@gmail.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit a0b977a3d1 ]
At reset, the KSZ8463 uses a strap-based configuration to set SPI as
bus interface. SPI is the only bus supported by the driver. If the
required pull-ups/pull-downs are missing (by mistake or by design to
save power) the pins may float and the configuration can go wrong
preventing any communication with the switch.
Introduce a ksz8463_configure_straps_spi() function called during the
device reset. It relies on the 'straps-rxd-gpios' OF property and the
'reset' pinmux configuration to enforce SPI as bus interface.
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Bastien Curutchet (Schneider Electric) <bastien.curutchet@bootlin.com>
Link: https://patch.msgid.link/20250918-ksz-strap-pins-v3-3-16662e881728@bootlin.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 50d51cef55 ]
Quoted from musl wiki:
GNU getopt permutes argv to pull options to the front, ahead of
non-option arguments. musl and the POSIX standard getopt stop
processing options at the first non-option argument with no
permutation.
Thus these scripts stop working on musl since non-option arguments for
tools using getopt() (in this case, (ar)ping) do not always come last.
Fix it by reordering arguments.
Signed-off-by: David Yang <mmyangfl@gmail.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Link: https://patch.msgid.link/20250919053538.1106753-1-mmyangfl@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 299fad4133 ]
When a device is surprise-removed (e.g., due to a dock unplug), the PCI
core unconfigures all downstream devices and sets their error state to
pci_channel_io_perm_failure. This marks them as disconnected via
pci_dev_is_disconnected().
During device removal, the runtime PM framework may attempt to resume the
device to D0 via pm_runtime_get_sync(), which calls into pci_power_up().
Since the device is already disconnected, this resume attempt is
unnecessary and results in a predictable errors like this, typically when
undocking from a TBT3 or USB4 dock with PCIe tunneling:
pci 0000:01:00.0: Unable to change power state from D3cold to D0, device inaccessible
Avoid powering up disconnected devices by checking their status early in
pci_power_up() and returning -EIO.
Suggested-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
[bhelgaas: add typical message]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Lukas Wunner <lukas@wunner.de>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
Link: https://patch.msgid.link/20250909031916.4143121-1-superm1@kernel.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 64b9642fc2 ]
When disabling SR-IOV, clear the configuration of each VF
in the hardware. Do not exit the configuration clearing process
due to the failure of a single VF. Additionally, Clear the VF
configurations before decrementing the PM counter.
Signed-off-by: Weili Qian <qianweili@huawei.com>
Signed-off-by: Chenghai Huang <huangchenghai2@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 85acd1b26b ]
Before the device reset, although the driver has set the queue
status to intercept doorbells sent by the task process, the reset
thread is isolated from the user-mode task process, so the task process
may still send doorbells. Therefore, before the reset, the queue is
directly invalidated, and the device directly discards the doorbells
sent by the process.
Signed-off-by: Weili Qian <qianweili@huawei.com>
Signed-off-by: Chenghai Huang <huangchenghai2@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit d3ca2ef0c9 ]
Originally ptp_ocp driver was not strictly checking flags for external
timestamper and was always activating rising edge timestamping as it's
the only supported mode. Recent changes to ptp made it incompatible with
PTP_EXTTS_REQUEST2 ioctl. Adjust ptp_clock_info to provide supported
mode and be compatible with new infra.
While at here remove explicit check of periodic output flags from the
driver and provide supported flags for ptp core to check.
Signed-off-by: Vadim Fedorenko <vadim.fedorenko@linux.dev>
Link: https://patch.msgid.link/20250918131146.651468-1-vadim.fedorenko@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 16df67f218 ]
The two implementers of vfio_device_ops.device_feature,
vfio_cdx_ioctl_feature and vfio_pci_core_ioctl_feature, return
-ENOTTY in the fallthrough case when the feature is unsupported. For
consistency, the base case, vfio_ioctl_device_feature, should do the
same when device_feature == NULL, indicating an implementation has no
feature extensions.
Signed-off-by: Alex Mastro <amastro@fb.com>
Link: https://lore.kernel.org/r/20250908-vfio-enotty-v1-1-4428e1539e2e@fb.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 7205ef77df ]
Conventions for readsl() are the same as for readl() - any __iomem
pointer is acceptable, both const and volatile ones being OK. Same
for readsb() and readsw().
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Reviewed-by: Andreas Larsson <andreas@gaisler.com>
Signed-off-by: Andreas Larsson <andreas@gaisler.com> # Making sparc64 subject prefix
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit bf7154ffb1 ]
EEE speed down means speed down MAC MCU clock. It is not from spec.
It is kind of Realtek specific power saving feature. But enable it
may cause some issues, like packet drop or interrupt loss. Different
hardware may have different issues.
EEE speed down ratio (mac ocp 0xe056[7:4]) is used to set EEE speed
down rate. The larger this value is, the more power can save. But it
actually save less power then we expected. And, as mentioned above,
will impact compatibility. So set it to 1 (mac ocp 0xe056[7:4] = 0)
, which means not to speed down, to improve compatibility.
Signed-off-by: ChunHao Lin <hau@realtek.com>
Reviewed-by: Heiner Kallweit <hkallweit1@gmail.com>
Link: https://patch.msgid.link/20250918023425.3463-1-hau@realtek.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 99e9c5ffbb ]
Variable idx is set in the loop, but is never used resulting in dead
code. Building with GCC 16, which enables
-Werror=unused-but-set-parameter= by default results in build error.
This patch removes the idx parameter, since all the callers of the
fm10k_unbind_hw_stats_q as 0 as idx anyways.
Suggested-by: Vadim Fedorenko <vadim.fedorenko@linux.dev>
Signed-off-by: Brahmajit Das <listout@listout.xyz>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 4402e8f39d ]
Bit 66 in the page group response descriptor used to be the LPIG (Last
Page in Group), but it was marked as Reserved since Specification 4.0.
Remove programming on this bit to make it consistent with the latest
specification.
Existing hardware all treats bit 66 of the page group response descriptor
as "ignored", therefore this change doesn't break any existing hardware.
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20250901053943.1708490-1-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 75c02a0376 ]
snprintf() returns the number of bytes that would have been written, not
the number actually written. Using this for offset tracking can cause
buffer overruns if truncation occurs.
Replace snprintf() with scnprintf() to ensure the offset stays within
bounds.
Since scnprintf() never returns a negative value, and zero is not possible
in this context because 'bytes' starts at 0 and 'size - bytes' is
DEBUG_BUFFER_SIZE in the first call, which is large enough to hold the
string literals used, the return value is always positive. An integer
overflow is also completely out of reach here due to the small and fixed
buffer size. The error check in latency_show_one() is therefore
unnecessary. Remove it and make dmar_latency_snapshot() return void.
Signed-off-by: Seyediman Seyedarab <ImanDevel@gmail.com>
Link: https://lore.kernel.org/r/20250731225048.131364-1-ImanDevel@gmail.com
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 60f887b129 ]
When a PHY is halted (e.g. `ip link set dev lan2 down`), several
fields in struct phy_device may still reflect the last active
connection. This leads to ethtool showing stale values even though
the link is down.
Reset selected fields in _phy_state_machine() when transitioning
to PHY_HALTED and the link was previously up:
- speed/duplex -> UNKNOWN, but only in autoneg mode (in forced mode
these fields carry configuration, not status)
- master_slave_state -> UNKNOWN if previously supported
- mdix -> INVALID (state only, same meaning as "unknown")
- lp_advertising -> always cleared
The cleanup is skipped if the PHY is in PHY_ERROR state, so the
last values remain available for diagnostics.
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/20250917094751.2101285-1-o.rempel@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit cc9a8e238e ]
kcalloc() may fail. When WS is non-zero and allocation fails, ectx.ws
remains NULL while ectx.ws_size is set, leading to a potential NULL
pointer dereference in atom_get_src_int() when accessing WS entries.
Return -ENOMEM on allocation failure to avoid the NULL dereference.
Signed-off-by: Guangshuo Li <lgs201920130244@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 300b072df7 ]
The transaction manager initialization in txInit() was not properly
initializing TxBlock[0].waitor waitqueue, causing a crash when
txEnd(0) is called on read-only filesystems.
When a filesystem is mounted read-only, txBegin() returns tid=0 to
indicate no transaction. However, txEnd(0) still gets called and
tries to access TxBlock[0].waitor via tid_to_tblock(0), but this
waitqueue was never initialized because the initialization loop
started at index 1 instead of 0.
This causes a 'non-static key' lockdep warning and system crash:
INFO: trying to register non-static key in txEnd
Fix by ensuring all transaction blocks including TxBlock[0] have
their waitqueues properly initialized during txInit().
Reported-by: syzbot+c4f3462d8b2ad7977bea@syzkaller.appspotmail.com
Signed-off-by: Shaurya Rane <ssrane_b23@ee.vjti.ac.in>
Signed-off-by: Dave Kleikamp <dave.kleikamp@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 42f993d343 ]
Currently, all master upper netdevices (e.g., bond, VRF) are treated
equally.
When a VRF netdevice is used over an IPoIB netdevice, the expected
netdev resolution is on the lower IPoIB device which has the IP address
assigned to it and not the VRF device.
The rdma_cm module (CMA) tries to match incoming requests to a
particular netdevice. When successful, it also validates that the return
path points to the same device by performing a routing table lookup.
Currently, the former would resolve to the VRF netdevice, while the
latter to the correct lower IPoIB netdevice, leading to failure in
rdma_cm.
Improve this by ignoring the VRF master netdevice, if it exists, and
instead return the lower IPoIB device.
Signed-off-by: Vlad Dumitrescu <vdumitrescu@nvidia.com>
Reviewed-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Edward Srouji <edwards@nvidia.com>
Link: https://patch.msgid.link/20250916111103.84069-5-edwards@nvidia.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit bc2a5a12fa ]
Logically before a waiting side which has already timed out turns the
atomic status back to idle, a completing side could still pass atomic
condition and call complete. It will make the following H2C commands,
waiting C2H events, get a completion unexpectedly early. Hence, renew
a completion for each H2C command waiting a C2H event.
Signed-off-by: Zong-Zhe Yang <kevin_yang@realtek.com>
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Link: https://patch.msgid.link/20250915065343.39023-1-pkshih@realtek.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 23361bd549 ]
When we get wrong extent info data, and look up extent_node in rb tree,
it will cause infinite loop (CONFIG_F2FS_CHECK_FS=n). Avoiding this by
return NULL and print some kernel messages in that case.
Signed-off-by: wangzijie <wangzijie1@honor.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 41cf11946b ]
Allow autosuspend to be used by xhci plat device. For Qualcomm SoCs,
when in host mode, it is intended that the controller goes to suspend
state to save power and wait for interrupts from connected peripheral
to wake it up. This is particularly used in cases where a HID or Audio
device is connected. In such scenarios, the usb controller can enter
auto suspend and resume action after getting interrupts from the
connected device.
Signed-off-by: Krishna Kurapati <krishna.kurapati@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20250916120436.3617598-1-krishna.kurapati@oss.qualcomm.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 368ed48a5e ]
The usbmon binary interface currently truncates captures of large
transfers from higher-speed USB devices. Because a single event capture
is limited to one-fifth of the total buffer size, the current maximum
size of a captured URB is around 240 KiB. This is insufficient when
capturing traffic from modern devices that use transfers of several
hundred kilobytes or more, as truncated URBs can make it impossible for
user-space USB analysis tools like Wireshark to properly defragment and
reassemble higher-level protocol packets in the captured data.
The root cause of this issue is the 1200 KiB BUFF_MAX limit, which has
not been changed since the binary interface was introduced in 2006.
To resolve this issue, this patch increases BUFF_MAX to 64 MiB. The
original comment for BUFF_MAX based the limit's calculation on a
saturated 480 Mbit/s bus. Applying the same logic to a modern USB 3.2
Gen 2×2 20 Gbit/s bus (~2500 MB/s over a 20ms window) indicates the
buffer should be at least 50 MB. The new limit of 64 MiB covers that,
plus a little extra for any overhead.
With this change, both users and developers should now be able to debug
and reverse engineer modern USB devices even when running unmodified
distro kernels.
Please note that this change does not affect the default buffer size. A
larger buffer is only allocated when a user explicitly requests it via
the MON_IOCT_RING_SIZE ioctl, so the change to the maximum buffer size
should not unduly increase memory usage for users that don't
deliberately request a larger buffer.
Link: https://lore.kernel.org/CAO3ALPzdUkmMr0YMrODLeDSLZqNCkWcAP8NumuPHLjNJ8wC1kQ@mail.gmail.com
Signed-off-by: Forest Crossman <cyrozap@gmail.com>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Link: https://lore.kernel.org/r/CAO3ALPxU5RzcoueC454L=WZ1qGMfAcnxm+T+p+9D8O9mcrUbCQ@mail.gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 2bf81856a4 ]
There is a timing race condition when a PRLI may be sent on the wire
before PLOGI_ACC in Point to Point topology. Fix by deferring REG_RPI
mbox completion handling to after PLOGI_ACC's CQE completion. Because
the discovery state machine only sends PRLI after REG_RPI mbox
completion, PRLI is now guaranteed to be sent after PLOGI_ACC.
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Message-ID: <20250915180811.137530-8-justintee8345@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 5de09770b1 ]
To assist in debugging lpfc_xri_rebalancing driver parameter, a debugfs
entry is used. The debugfs file operations for xri rebalancing have
been previously implemented, but lack definition for its information
buffer size. Similar to other pre-existing debugfs entry buffers,
define LPFC_HDWQINFO_SIZE as 8192 bytes.
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Message-ID: <20250915180811.137530-9-justintee8345@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit a4809b98eb ]
In lpfc_cleanup, there is an extraneous nlp_put for NPIV ports on the
F_Port_Ctrl ndlp object. In cases when an ABTS is issued, the
outstanding kref is needed for when a second XRI_ABORTED CQE is
received. The final kref for the ndlp is designed to be decremented in
lpfc_sli4_els_xri_aborted instead. Also, add a new log message to allow
for future diagnostics when debugging related issues.
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Message-ID: <20250915180811.137530-5-justintee8345@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f408dde246 ]
If lpfc_reset_flush_io_context fails to execute, then the wrong return
status code may be passed back to upper layers when issuing a target
reset TMF command. Fix by checking the return status from
lpfc_reset_flush_io_context() first in order to properly return FAILED
or FAST_IO_FAIL.
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Message-ID: <20250915180811.137530-7-justintee8345@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b5bf6d681f ]
The kref for Fabric_DID ndlps is not decremented after repeated FDISC
failures and exhausting maximum allowed retries. This can leave the
ndlp lingering unnecessarily. Add a test and set bit operation for the
NLP_DROPPED flag. If not previously set, then a kref is decremented. The
ndlp is freed when the remaining reference for the completing ELS is
put.
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Message-ID: <20250915180811.137530-6-justintee8345@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 803dfd83df ]
lpfc_sli4_queue_setup() does not allocate memory and is used for
submitting CREATE_QUEUE mailbox commands. Thus, if such mailbox
commands fail we should clean up by also freeing the memory allocated
for the queues with lpfc_sli4_queue_destroy(). Change the intended
clean up label for the lpfc_sli4_queue_setup() error case to
out_destroy_queue.
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Message-ID: <20250915180811.137530-4-justintee8345@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit d7ddcf921e ]
Gang submission means that the kernel driver guarantees that multiple
submissions are executed on the HW at the same time on different engines.
Background is that those submissions then depend on each other and each
can't finish stand alone.
SRIOV now uses world switch to preempt submissions on the engines to allow
sharing the HW resources between multiple VFs.
The problem is now that the SRIOV world switch can't know about such inter
dependencies and will cause a timeout if it waits for a partially running
gang submission.
To conclude SRIOV and gang submissions are fundamentally incompatible at
the moment. For now just disable them.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 3b09b11805 ]
Due to multiple explosion issues in the early days of the Xe driver,
the GuC load was hacked to never return a failure. That prevented
kernel panics and such initially, but now all it achieves is creating
more confusing errors when the driver tries to submit commands to a
GuC it already knows is not there. So fix that up.
As a stop-gap and to help with debug of load failures due to invalid
GuC init params, a wedge call had been added to the inner GuC load
function. The reason being that it leaves the GuC log accessible via
debugfs. However, for an end user, simply aborting the module load is
much cleaner than wedging and trying to continue. The wedge blocks
user submissions but it seems that various bits of the driver itself
still try to submit to a dead GuC and lots of subsequent errors occur.
And with regards to developers debugging why their particular code
change is being rejected by the GuC, it is trivial to either add the
wedge back in and hack the return code to zero again or to just do a
GuC log dump to dmesg.
v2: Add support for error injection testing and drop the now redundant
wedge call.
CC: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Matt Atwood <matthew.s.atwood@intel.com>
Link: https://lore.kernel.org/r/20250909224132.536320-1-John.C.Harrison@Intel.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 1d99aa4ed7 ]
In nested arrays don't require that the intermediate attribute
type should be a valid attribute type, it might just be zero
or an incrementing index, it is often not even used.
See include/net/netlink.h about NLA_NESTED_ARRAY:
> The difference to NLA_NESTED is the structure:
> NLA_NESTED has the nested attributes directly inside
> while an array has the nested attributes at another
> level down and the attribute types directly in the
> nesting don't matter.
Example based on include/uapi/linux/wireguard.h:
> WGDEVICE_A_PEERS: NLA_NESTED
> 0: NLA_NESTED
> WGPEER_A_PUBLIC_KEY: NLA_EXACT_LEN, len WG_KEY_LEN
> [..]
> 0: NLA_NESTED
> ...
> ...
Previous the check required that the nested type was valid
in the parent attribute set, which in this case resolves to
WGDEVICE_A_UNSPEC, which is YNL_PT_REJECT, and it took the
early exit and returned YNL_PARSE_CB_ERROR.
This patch renames the old nl_attr_validate() to
__nl_attr_validate(), and creates a new inline function
nl_attr_validate() to mimic the old one.
The new __nl_attr_validate() takes the attribute type as an
argument, so we can use it to validate attributes of a
nested attribute, in the context of the parents attribute
type, which in the above case is generated as:
[WGDEVICE_A_PEERS] = {
.name = "peers",
.type = YNL_PT_NEST,
.nest = &wireguard_wgpeer_nest,
},
__nl_attr_validate() only checks if the attribute length
is plausible for a given attribute type, so the .nest in
the above example is not used.
As the new inline function needs to be defined after
ynl_attr_type(), then the definitions are moved down,
so we avoid a forward declaration of ynl_attr_type().
Some other examples are NL80211_BAND_ATTR_FREQS (nest) and
NL80211_ATTR_SUPPORTED_COMMANDS (u32) both in nl80211-user.c
$ make -C tools/net/ynl/generated nl80211-user.c
Signed-off-by: Asbjørn Sloth Tønnesen <ast@fiberby.net>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Link: https://patch.msgid.link/20250915144301.725949-7-ast@fiberby.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 7d62beb102 ]
Dell systems utilize an EC-based touchpad emulation when the ACPI
touchpad _DSM is not invoked. This emulation acts as a secondary
master on the I2C bus, designed for scenarios where the I2C touchpad
driver is absent, such as in BIOS menus. Typically, loading the
i2c-hid module triggers the _DSM at initialization, disabling the
EC-based emulation.
However, if the i2c-hid module is missing from the boot kernel
used for hibernation snapshot restoration, the _DSM remains
uncalled, resulting in dual masters on the I2C bus and
subsequent arbitration errors. This issue arises when i2c-hid
resides in the rootfs instead of the kernel or initramfs.
To address this, switch from using the SYSTEM_SLEEP_PM_OPS()
macro to dedicated callbacks, introducing a specific
callback for restoring the S4 image. This callback ensures
the _DSM is invoked.
Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>
Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit c1553fc105 ]
Currently, the UFS lane clocks remain enabled even after the link enters
the Hibern8 state and are only disabled during runtime/system
suspend.This patch modifies the behavior to disable the lane clocks
during ufs_qcom_setup_clocks(), which is invoked shortly after the link
enters Hibern8 via gate work.
While hibern8_notify() offers immediate control, toggling clocks on
every transition isn't ideal due to varied contexts like clock scaling.
Since setup_clocks() manages PHY/controller resources and is invoked
soon after Hibern8 entry, it serves as a central and stable point for
clock gating.
Signed-off-by: Palash Kambar <quic_pkambar@quicinc.com>
Reviewed-by: Manivannan Sadhasivam <mani@kernel.org>
Message-ID: <20250909055149.2068737-1-quic_pkambar@quicinc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit d2d3f529e7 ]
A lot of modern SoC have the ability to store MAC addresses in their
NVMEM. So extend the generic function device_get_mac_address() to
obtain the MAC address from an nvmem cell named 'mac-address' in
case there is no firmware node which contains the MAC address directly.
Signed-off-by: Stefan Wahren <wahrenst@gmx.net>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/20250912140332.35395-3-wahrenst@gmx.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f3b52167a0 ]
Driver authors often forget to add GFP_NOWARN for page allocation
from the datapath. This is annoying to users as OOMs are a fact
of life, and we pretty much expect network Rx to hit page allocation
failures during OOM. Make page pool add GFP_NOWARN for ATOMIC allocations
by default.
Reviewed-by: Mina Almasry <almasrymina@google.com>
Link: https://patch.msgid.link/20250912161703.361272-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit c97a7dccb3 ]
dml21_map_dc_state_into_dml_display_cfg calls (the call is usually
inlined by the compiler) populate_dml21_surface_config_from_plane_state
and populate_dml21_plane_config_from_plane_state which may use FPU. In
a x86-64 build:
$ objdump --disassemble=dml21_map_dc_state_into_dml_display_cfg \
> drivers/gpu/drm/amd/display/dc/dml2/dml21/dml21_translation_helper.o |
> grep %xmm -c
63
Thus it needs to be guarded with DC_FP_START. But we must note that the
current code quality of the in-kernel FPU use in AMD dml2 is very much
problematic: we are actually calling DC_FP_START in dml21_wrapper.c
here, and this translation unit is built with CC_FLAGS_FPU. Strictly
speaking this does not make any sense: with CC_FLAGS_FPU the compiler is
allowed to generate FPU uses anywhere in the translated code, perhaps
out of the DC_FP_START guard. This problematic pattern also occurs in
at least dml2_wrapper.c, dcn35_fpu.c, and dcn351_fpu.c. Thus we really
need a careful audit and refactor for the in-kernel FPU uses, and this
patch is simply whacking a mole. However per the reporter, whacking
this mole is enough to make a 9060XT "just work."
Reported-by: Asiacn <710187964@qq.com>
Closes: https://github.com/loongson-community/discussions/issues/102
Tested-by: Asiacn <710187964@qq.com>
Signed-off-by: Xi Ruoyao <xry111@xry111.site>
Reviewed-by: Huacai Chen <chenhuacai@loongson.cn>
Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 489f0f600c ]
When the EDID has the HDMI bit, we should simply select
the HDMI signal type even on DVI ports.
For reference see, the legacy amdgpu display code:
amdgpu_atombios_encoder_get_encoder_mode
which selects ATOM_ENCODER_MODE_HDMI for the same case.
This commit fixes DVI connectors to work with DVI-D/HDMI
adapters so that they can now produce output over these
connectors for HDMI monitors with higher bandwidth modes.
With this change, even HDMI audio works through DVI.
For testing, I used a CAA-DMDHFD3 DVI-D/HDMI adapter
with the following GPUs:
Tahiti (DCE 6) - DC can now output 4K 30 Hz over DVI
Polaris 10 (DCE 11.2) - DC can now output 4K 60 Hz over DVI
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 0449726b58 ]
DC can turn off the display clock when no displays are connected
or when all displays are off, for reference see:
- dce*_validate_bandwidth
DC also assumes that the DP clock is always on and never powers
it down, for reference see:
- dce110_clock_source_power_down
In case of DCE 6.0 and 6.4, PLL0 is the clock source for both
the engine clock and DP clock, for reference see:
- radeon_atom_pick_pll
- atombios_crtc_set_disp_eng_pll
Therefore, PLL0 should be always kept running on DCE 6.0 and 6.4.
This commit achieves that by ensuring that by setting the display
clock to the corresponding value in low power state instead of
zero.
This fixes a page flip timeout on SI with DC which happens when
all connected displays are blanked.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 02a6c2e4b2 ]
[why&how]
small error in order of operations in immediateflipbytes
calculation on dml ms side that can result in dml ms
and mp mismatch immediateflip support for a given pipe
and thus an invalid hw state, correct the order to align
with mp.
Reviewed-by: Leo Chen <leo.chen@amd.com>
Signed-off-by: Ausef Yousof <Ausef.Yousof@amd.com>
Signed-off-by: Ray Wu <ray.wu@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 5e76bc677c ]
[Why]
fill_stream_properties_from_drm_display_mode() will not configure pixel
encoding to YCBCR422 when the DRM color format supports YCBCR422 but not
YCBCR420 or YCBCR4444. Instead it will fallback to RGB.
[How]
Add support for YCBCR422 in pixel encoding mapping.
Suggested-by: Mauri Carvalho <mcarvalho3@lenovo.com>
Reviewed-by: Wayne Lin <wayne.lin@amd.com>
Signed-off-by: Mario Limonciello <Mario.Limonciello@amd.com>
Signed-off-by: Ray Wu <ray.wu@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 18e755155c ]
[Why]
New sequence from HW for reset and firmware reloading has been
provided that aims to stabilize the reload sequence in the case the
firmware is hung or has outstanding requests.
[How]
Update the sequence to remove the DMUIF reset and the redundant
writes in the release.
Reviewed-by: Sreeja Golui <sreeja.golui@amd.com>
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Ray Wu <ray.wu@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit c1456fadce ]
xgmi hive reference is taken on function entry, but not released
correctly for all paths. Use __free() to release reference properly.
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Ce Sun <cesun102@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit dea75df7af ]
Replace kmalloc_array() + copy_from_user() with memdup_array_user().
This shrinks the source code and improves separation between the kernel
and userspace slabs.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 8497324901 ]
If multiple instances of this driver are instantiated and try to send
concurrently then the single static buffer snd_serial_generic_tx_work()
will cause corruption in the data output.
Move the buffer into the per-instance driver data to avoid this.
Signed-off-by: John Keeping <jkeeping@inmusicbrands.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit cb6ebbdffe ]
Support writing MAC TXD for the AddBA Req. Without this commit, the
start sequence number in AddBA Req will be unexpected value for MT7996
and MT7992. This can result in certain stations (e.g., AX200) dropping
packets, leading to ping failures and degraded connectivity. Ensuring
the correct MAC TXD and TXP helps maintain reliable packet transmission
and prevents interoperability issues with affected stations.
Signed-off-by: Howard Hsu <howard-yh.hsu@mediatek.com>
Link: https://patch.msgid.link/20250909-mt7996-addba-txd-fix-v1-1-feec16f0c6f0@kernel.org
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 25ef5b5d02 ]
Enable 160MHz beamformee support on mt7922 by updating HE capability
element configuration. Previously, only 160MHz channel width was set,
but beamformee for 160MHz was not properly advertised. This patch
adds BEAMFORMEE_MAX_STS_ABOVE_80MHZ_4 capability to allow devices
to utilize 160MHz BW for beamforming.
Tested by connecting to 160MHz-bandwidth beamforming AP and verified
HE capability.
Signed-off-by: Quan Zhou <quan.zhou@mediatek.com>
Link: https://patch.msgid.link/ae637afaffed387018fdc43709470ef65898ff0b.1756383627.git.quan.zhou@mediatek.com
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 66048f8b3c ]
During recent testing with the netem qdisc to inject delays into TCP
traffic, we observed that our CLS BPF program failed to function correctly
due to incorrect classid retrieval from task_get_classid(). The issue
manifests in the following call stack:
bpf_get_cgroup_classid+5
cls_bpf_classify+507
__tcf_classify+90
tcf_classify+217
__dev_queue_xmit+798
bond_dev_queue_xmit+43
__bond_start_xmit+211
bond_start_xmit+70
dev_hard_start_xmit+142
sch_direct_xmit+161
__qdisc_run+102 <<<<< Issue location
__dev_xmit_skb+1015
__dev_queue_xmit+637
neigh_hh_output+159
ip_finish_output2+461
__ip_finish_output+183
ip_finish_output+41
ip_output+120
ip_local_out+94
__ip_queue_xmit+394
ip_queue_xmit+21
__tcp_transmit_skb+2169
tcp_write_xmit+959
__tcp_push_pending_frames+55
tcp_push+264
tcp_sendmsg_locked+661
tcp_sendmsg+45
inet_sendmsg+67
sock_sendmsg+98
sock_write_iter+147
vfs_write+786
ksys_write+181
__x64_sys_write+25
do_syscall_64+56
entry_SYSCALL_64_after_hwframe+100
The problem occurs when multiple tasks share a single qdisc. In such cases,
__qdisc_run() may transmit skbs created by different tasks. Consequently,
task_get_classid() retrieves an incorrect classid since it references the
current task's context rather than the skb's originating task.
Given that dev_queue_xmit() always executes with bh disabled, we can use
softirq_count() instead to obtain the correct classid.
The simple steps to reproduce this issue:
1. Add network delay to the network interface:
such as: tc qdisc add dev bond0 root netem delay 1.5ms
2. Build two distinct net_cls cgroups, each with a network-intensive task
3. Initiate parallel TCP streams from both tasks to external servers.
Under this specific condition, the issue reliably occurs. The kernel
eventually dequeues an SKB that originated from Task-A while executing in
the context of Task-B.
It is worth noting that it will change the established behavior for a
slightly different scenario:
<sock S is created by task A>
<class ID for task A is changed>
<skb is created by sock S xmit and classified>
prior to this patch the skb will be classified with the 'new' task A
classid, now with the old/original one. The bpf_get_cgroup_classid_curr()
function is a more appropriate choice for this case.
Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Thomas Graf <tgraf@suug.ch>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Nikolay Aleksandrov <razor@blackwall.org>
Link: https://patch.msgid.link/20250902062933.30087-1-laoar.shao@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 9048beca9c ]
during entropy evaluation, if the generated samples fail
any statistical test, then, all of the bits will be discarded,
and a second set of samples will be generated and tested.
the entropy delay interval should be doubled before performing the
retry.
also, ctrlpriv->rng4_sh_init and inst_handles both reads RNG DRNG
status register, but only inst_handles is updated before every retry.
so only check inst_handles and removing ctrlpriv->rng4_sh_init
Signed-off-by: Gaurav Jain <gaurav.jain@nxp.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 2b0dc40ac6 ]
payload_size field of the request header is incorrectly calculated using
sizeof(req). Since 'req' is a pointer (struct hsti_request *), sizeof(req)
returns the size of the pointer itself (e.g., 8 bytes on a 64-bit system),
rather than the size of the structure it points to. This leads to an
incorrect payload size being sent to the Platform Security Processor (PSP),
potentially causing the HSTI query command to fail.
Fix this by using sizeof(*req) to correctly calculate the size of the
struct hsti_request.
Signed-off-by: Yunseong Kim <ysk@kzalloc.com>
Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org>> ---
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 09fefb24ed ]
dw_pcie_edma_irq_verify() is supposed to verify the eDMA IRQs in devicetree
by fetching them using either 'dma' or 'dmaX' IRQ names. Former is used
when the platform uses a single IRQ for all eDMA channels and latter is
used when the platform uses separate IRQ per channel. But currently,
dw_pcie_edma_irq_verify() bails out early if edma::nr_irqs is 1, i.e., when
a single IRQ is used. This gives an impression that the driver could work
with any single IRQ in devicetree, not necessarily with name 'dma'.
But dw_pcie_edma_irq_vector(), which actually requests the IRQ, does
require the single IRQ to be named as 'dma'. So this creates inconsistency
between dw_pcie_edma_irq_verify() and dw_pcie_edma_irq_vector().
Thus, to fix this inconsistency, make sure dw_pcie_edma_irq_verify() also
verifies the single IRQ name by removing the bail out code.
Signed-off-by: Niklas Cassel <cassel@kernel.org>
[mani: reworded subject and description]
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
[bhelgaas: fix typos]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://patch.msgid.link/20250908165914.547002-3-cassel@kernel.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 9a23ea1f75 ]
Using the number of bytes in the request as DMA timeout is really
inconsistent, as large requests could possibly set a timeout of
hundreds of seconds.
Remove the per-channel timeout field and use a single, static DMA
timeout of 3 seconds for all requests.
Signed-off-by: Ovidiu Panait <ovidiu.panait.oss@gmail.com>
Tested-by: Corentin LABBE <clabbe.montjoie@gmail.com>
Reviewed-by: Corentin LABBE <clabbe.montjoie@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit df3c6e0b6d ]
Fix the issue of max_timeout being calculated larger than actual value.
The calculation result of freq / (S3C2410_WTCON_PRESCALE_MAX + 1) /
S3C2410_WTCON_MAXDIV is smaller than the actual value because the remainder
is discarded during the calculation process. This leads to a larger
calculated value for max_timeout compared to the actual settable value.
To resolve this issue, the order of calculations in the computation process
has been adjusted.
Reviewed-by: Sam Protsenko <semen.protsenko@linaro.org>
Signed-off-by: Sangwook Shin <sw617.shin@samsung.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b595974b4a ]
The Asus Z13 folio has a multitouch touchpad that needs to bind
to the hid-multitouch driver in order to work properly. So bind
it to the HID_GROUP_GENERIC group to release the touchpad and
move it to the bottom so that the comment applies to it.
While at it, change the generic KEYBOARD3 name to Z13_FOLIO.
Reviewed-by: Luke D. Jones <luke@ljones.dev>
Signed-off-by: Antheas Kapenekakis <lkml@antheas.dev>
Signed-off-by: Jiri Kosina <jkosina@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit dc2f650f7e ]
netdev_WARN() uses WARN/WARN_ON to print a backtrace along with
file and line information. In this case, udp_tunnel_nic_register()
returning an error is just a failed operation, not a kernel bug.
udp_tunnel_nic_register() can fail due to a memory allocation
failure (kzalloc() or udp_tunnel_nic_alloc()).
This is a normal runtime error and not a kernel bug.
Replace netdev_WARN() with netdev_warn() accordingly.
Signed-off-by: Alok Tiwari <alok.a.tiwari@oracle.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250910195031.3784748-1-alok.a.tiwari@oracle.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit fa57032941 ]
Usually the autodefer helpers in lib.sh are expected to be run in context
where success is the expected outcome. However when using them for feature
detection, failure can legitimately occur. But the failed command still
schedules a cleanup, which will likely fail again.
Instead, only schedule deferred cleanup when the positive command succeeds.
This way of organizing the cleanup has the added benefit that now the
return code from these functions reflects whether the command passed.
Signed-off-by: Petr Machata <petrm@nvidia.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Link: https://patch.msgid.link/af10a5bb82ea11ead978cf903550089e006d7e70.1757004393.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 18282100d7 ]
tcp_recvmsg_dmabuf can export the following errors:
- EFAULT when linear copy fails
- ETOOSMALL when cmsg put fails
- ENODEV if one of the frags is readable
- ENOMEM on xarray failures
But they are all ignored and replaced by EFAULT in the caller
(tcp_recvmsg_locked). Expose real error to the userspace to
add more transparency on what specifically fails.
In non-devmem case (skb_copy_datagram_msg) doing `if (!copied)
copied=-EFAULT` is ok because skb_copy_datagram_msg can return only EFAULT.
Reviewed-by: David Ahern <dsahern@kernel.org>
Reviewed-by: Mina Almasry <almasrymina@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Stanislav Fomichev <sdf@fomichev.me>
Link: https://patch.msgid.link/20250910162429.4127997-1-sdf@fomichev.me
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 43adad382e ]
When 8139too is probing and 8139TOO_PIO=y it will call pci_iomap_range()
and from there __pci_ioport_map() for the PCI IO space.
If HAS_IOPORT_MAP=n and NO_GENERIC_PCI_IOPORT_MAP=n, like it is on my
m68k config, __pci_ioport_map() becomes NULL, pci_iomap_range() will
always fail and the driver will complain it couldn't map the PIO space
and return an error.
NO_IOPORT_MAP seems to cover the case where what 8139too is trying
to do cannot ever work so make 8139TOO_PIO depend on being it false
and avoid creating an unusable driver.
Signed-off-by: Daniel Palmer <daniel@thingy.jp>
Link: https://patch.msgid.link/20250907064349.3427600-1-daniel@thingy.jp
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit e414b10058 ]
All of the x86 KVM guest types (VMX, SEV and TDX) do some special context
tracking when entering guests. This means that the actual guest entry
sequence must be noinstr.
Part of entering a TDX guest is passing a physical address to the TDX
module. Right now, that physical address is stored as a 'struct page'
and converted to a physical address at guest entry. That page=>phys
conversion can be complicated, can vary greatly based on kernel
config, and it is definitely _not_ a noinstr path today.
There have been a number of tinkering approaches to try and fix this
up, but they all fall down due to some part of the page=>phys
conversion infrastructure not being noinstr friendly.
Precalculate the page=>phys conversion and store it in the existing
'tdx_vp' structure. Use the new field at every site that needs a
tdvpr physical address. Remove the now redundant tdx_tdvpr_pa().
Remove the __flatten remnant from the tinkering.
Note that only one user of the new field is actually noinstr. All
others can use page_to_phys(). But, they might as well save the effort
since there is a pre-calculated value sitting there for them.
[ dhansen: rewrite all the text ]
Signed-off-by: Kai Huang <kai.huang@intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Kiryl Shutsemau <kas@kernel.org>
Tested-by: Farrah Chen <farrah.chen@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 9605505039 ]
The commit b2798ba0b8 ("KVM: X86: Choose qspinlock when dedicated
physical CPUs are available") states that when PV_DEDICATED=1
(vCPU has dedicated pCPU), qspinlock should be preferred regardless of
PV_UNHALT. However, the current implementation doesn't reflect this: when
PV_UNHALT=0, we still use virt_spin_lock() even with dedicated pCPUs.
This is suboptimal because:
1. Native qspinlocks should outperform virt_spin_lock() for dedicated
vCPUs irrespective of HALT exiting
2. virt_spin_lock() should only be preferred when vCPUs may be preempted
(non-dedicated case)
So reorder the PV spinlock checks to:
1. First handle dedicated pCPU case (disable virt_spin_lock_key)
2. Second check single CPU, and nopvspin configuration
3. Only then check PV_UNHALT support
This ensures we always use native qspinlock for dedicated vCPUs, delivering
pretty performance gains at high contention levels.
Signed-off-by: Li RongQing <lirongqing@baidu.com>
Reviewed-by: Sean Christopherson <seanjc@google.com>
Tested-by: Wangyang Guo <wangyang.guo@intel.com>
Link: https://lore.kernel.org/r/20250722110005.4988-1-lirongqing@baidu.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit db99b2f2b3 ]
tcp reject code won't reply to a tcp reset.
But the icmp reject 'netdev' family versions will reply to icmp
dst-unreach errors, unlike icmp_send() and icmp6_send() which are used
by the inet family implementation (and internally by the REJECT target).
Check for the icmp(6) type and do not respond if its an unreachable error.
Without this, something like 'ip protocol icmp reject', when used
in a netdev chain attached to 'lo', cause a packet loop.
Same for two hosts that both use such a rule: each error packet
will be replied to.
Such situation persist until the (bogus) rule is amended to ratelimit or
checks the icmp type before the reject statement.
As the inet versions don't do this make the netdev ones follow along.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 9d0cb6d00b ]
To ensure the proper functioning of the jump_label test module, this patch
adds support for the R_OR1K_32_PCREL relocation type for any modules. The
implementation calculates the PC-relative offset by subtracting the
instruction location from the target value and stores the result at the
specified location.
Signed-off-by: chenmiao <chenmiao.ku@gmail.com>
Signed-off-by: Stafford Horne <shorne@gmail.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit c068ba9d3d ]
The test always returns success even if some tests were modified to
fail. Fix by converting the test to use the appropriate library
functions instead of using its own functions.
Before:
# ./traceroute.sh
TEST: IPV6 traceroute [FAIL]
TEST: IPV4 traceroute [ OK ]
Tests passed: 1
Tests failed: 1
$ echo $?
0
After:
# ./traceroute.sh
TEST: IPv6 traceroute [FAIL]
traceroute6 did not return 2000:102::2
TEST: IPv4 traceroute [ OK ]
$ echo $?
1
Reviewed-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Link: https://patch.msgid.link/20250908073238.119240-5-idosch@nvidia.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 47efbac9b7 ]
Use require_command() so that the test will return SKIP (4) when a
required command is not present.
Before:
# ./traceroute.sh
SKIP: Could not run IPV6 test without traceroute6
SKIP: Could not run IPV4 test without traceroute
$ echo $?
0
After:
# ./traceroute.sh
TEST: traceroute6 not installed [SKIP]
$ echo $?
4
Reviewed-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Link: https://patch.msgid.link/20250908073238.119240-6-idosch@nvidia.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit d82e3d2dd0 ]
Originally, the 'amd_pmf_get_custom_bios_inputs()' function was written
under the assumption that the BIOS would only send a single pending
request for the driver to process. However, following OEM enablement, it
became clear that multiple pending requests for custom BIOS inputs might
be sent at the same time, a scenario that the current code logic does not
support when it comes to handling multiple custom BIOS inputs.
To address this, the code logic needs to be improved to not only manage
multiple simultaneous custom BIOS inputs but also to ensure it is scalable
for future additional inputs.
Co-developed-by: Patil Rajesh Reddy <Patil.Reddy@amd.com>
Signed-off-by: Patil Rajesh Reddy <Patil.Reddy@amd.com>
Tested-by: Yijun Shen <Yijun.Shen@Dell.com>
Signed-off-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com>
Link: https://patch.msgid.link/20250901110140.2519072-3-Shyam-sundar.S-k@amd.com
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit ecba852dc9 ]
Change "ret" from u8 to int type in redrat3_enable_detector() to store
negative error codes or zero returned by redrat3_send_cmd() and
usb_submit_urb() - this better aligns with the coding standards and
maintains code consistency.
No effect on runtime.
Signed-off-by: Qianfeng Rong <rongqianfeng@vivo.com>
Signed-off-by: Sean Young <sean@mess.org>
Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 60e9f776b7 ]
The upstream mesa copy of the GPU regs has shifted more things to reg64
instead of seperate 32b HI/LO reg32's. This works better with the "new-
style" c++ builders that mesa has been migrating to for a6xx+ (to better
handle register shuffling between gens), but it leaves the C builders
with missing _HI/LO builders.
So handle the special case of reg64, automatically generating the
missing _HI/LO builders.
Signed-off-by: Rob Clark <robin.clark@oss.qualcomm.com>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
Patchwork: https://patchwork.freedesktop.org/patch/673559/
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit d73836cb85 ]
Address the issue where the host does not send adapt to the device after
PA_Init success. Ensure the adapt process is correctly initiated for
devices with IP version MT6899 and above, resolving communication issues
between the host and device.
Signed-off-by: Alice Chao <alice.chao@mediatek.com>
Reviewed-by: Peter Wang <peter.wang@mediatek.com>
Signed-off-by: Peter Wang <peter.wang@mediatek.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f5ca8d0c7a ]
Disable auto-hibern8 during power mode transitions to prevent unintended
entry into auto-hibern8. Restore the original auto-hibern8 timer value
after completing the power mode change to maintain system stability and
prevent potential issues during power state transitions.
Signed-off-by: Peter Wang <peter.wang@mediatek.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 77b96ef70b ]
Refine the system power management (PM) flow by skipping low power mode
(LPM) and MTCMOS settings if runtime PM is already applied. Prevent
redundant operations to ensure a more efficient PM process.
Signed-off-by: Peter Wang <peter.wang@mediatek.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit deb105f498 ]
The 88e1510 PHY has an erratum where the phy downshift counter is not
cleared after phy being suspended(BMCR_PDOWN set) and then later
resumed(BMCR_PDOWN cleared). This can cause the gigabit link to
intermittently downshift to a lower speed.
Disabling and re-enabling the downshift feature clears the counter,
allowing the PHY to retry gigabit link negotiation up to the programmed
retry count times before downshifting. This behavior has been observed
on copper links.
Signed-off-by: Rohan G Thomas <rohan.g.thomas@altera.com>
Reviewed-by: Matthew Gerlach <matthew.gerlach@altera.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/20250906-marvell_fix-v2-1-f6efb286937f@altera.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit faac32d4ec ]
Improve the recovery process for hibernation exit failures. Trigger the
error handler and break the suspend operation to ensure effective
recovery from hibernation errors. Activate the error handling mechanism
by ufshcd_force_error_recovery and scheduling the error handler work.
Signed-off-by: Peter Wang <peter.wang@mediatek.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 91cad911ed ]
Resolve the issue of unbalanced IRQ enablement by setting the
'is_mcq_intr_enabled' flag after the first successful IRQ enablement.
Ensure proper tracking of the IRQ state and prevent potential mismatches
in IRQ handling.
Signed-off-by: Peter Wang <peter.wang@mediatek.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 3126b5fd02 ]
Disabling the AES core in Shared ICE is not supported during power
collapse for UFS Host Controller v5.0, which may lead to data errors
after Hibern8 exit. To comply with hardware programming guidelines and
avoid this issue, issue a sync reset to ICE upon power collapse exit.
Hence follow below steps to reset the ICE upon exiting power collapse
and align with Hw programming guide.
a. Assert the ICE sync reset by setting both SYNC_RST_SEL and
SYNC_RST_SW bits in UFS_MEM_ICE_CFG
b. Deassert the reset by clearing SYNC_RST_SW in UFS_MEM_ICE_CFG
Signed-off-by: Palash Kambar <quic_pkambar@quicinc.com>
Reviewed-by: Manivannan Sadhasivam <mani@kernel.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 15ef3f5aa8 ]
Improve the recovery process for failed resume operations. Log the
device's power status and return 0 if both resume and recovery fail to
prevent I/O hang.
Signed-off-by: Peter Wang <peter.wang@mediatek.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 955f3bc4af ]
On DGFX, during init_post_hwconfig() step, we are reinitializing
CTB BO in VRAM and we have to replace cleanup action to disable CT
communication prior to release of underlying BO.
But that introduces some discrepancy between DGFX and iGFX, as for
iGFX we keep previously added disable CT action that would be called
during unwind much later.
To keep the same flow on both types of platforms, always replace old
cleanup action and register new one.
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Satyanarayana K V P <satyanarayana.k.v.p@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Satyanarayana K V P <satyanarayana.k.v.p@intel.com>
Link: https://lore.kernel.org/r/20250908102053.539-2-michal.wajdeczko@intel.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit bfbd5aa534 ]
The OmniVision OG01A1B image sensor is a monochrome sensor, it supports
8-bit and 10-bit RAW output formats only.
That said the planar greyscale Y8/Y10 media formats are more appropriate
for the sensor instead of the originally and arbitrary selected SGRBG one,
since there is no red, green or blue color components.
Signed-off-by: Vladimir Zapolskiy <vladimir.zapolskiy@linaro.org>
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit d5f6bd3ee3 ]
Currently, the test allocates BAR sizes according to fixed table bar_size.
This does not work with controllers which have fixed size BARs that are
smaller than the requested BAR size. One such controller is Renesas R-Car
V4H PCIe controller, which has BAR4 size limited to 256 bytes, which is
much less than one of the BAR size, 131072 currently requested by this
test. A lot of controllers drivers in-tree have fixed size BARs, and they
do work perfectly fine, but it is only because their fixed size is larger
than the size requested by pci-epf-test.c
Adjust the test such that in case a fixed size BAR is detected, the fixed
BAR size is used, as that is the only possible option.
This helps with test failures reported as follows:
pci_epf_test pci_epf_test.0: requested BAR size is larger than fixed size
pci_epf_test pci_epf_test.0: Failed to allocate space for BAR4
Signed-off-by: Marek Vasut <marek.vasut+renesas@mailbox.org>
[mani: reworded description]
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Niklas Cassel <cassel@kernel.org>
Link: https://patch.msgid.link/20250905184240.144431-1-marek.vasut+renesas@mailbox.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 27bc5eaf00 ]
Recent changes to make netlink socket memory accounting must
have broken the implicit assumption of the netlink-dump test
that we can fit exactly 64 dumps into the socket. Handle the
failure mode properly, and increase the dump count to 80
to make sure we still run into the error condition if
the default buffer size increases in the future.
Link: https://patch.msgid.link/20250906211351.3192412-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit bd74a86bc7 ]
Add the missing linking of NAPIs to netdev queues when enabling
interrupt vectors in order to support NAPI configuration and
interfaces requiring get_rx_queue()->napi to be set (like XSk
busy polling).
As currently, idpf_vport_{start,stop}() is called from several flows
with inconsistent RTNL locking, we need to synchronize them to avoid
runtime assertions. Notably:
* idpf_{open,stop}() -- regular NDOs, RTNL is always taken;
* idpf_initiate_soft_reset() -- usually called under RTNL;
* idpf_init_task -- called from the init work, needs RTNL;
* idpf_vport_dealloc -- called without RTNL taken, needs it.
Expand common idpf_vport_{start,stop}() to take an additional bool
telling whether we need to manually take the RTNL lock.
Suggested-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> # helper
Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Tested-by: Ramu R <ramu.r@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit a27d774045 ]
There are some special registers which are accessible even when GX power
domain is collapsed during an IFPC sleep. Accessing these registers
wakes up GPU from power collapse and allow programming these registers
without additional handshake with GMU. This patch adds support for this
special register write sequence.
Signed-off-by: Akhil P Oommen <akhilpo@oss.qualcomm.com>
Patchwork: https://patchwork.freedesktop.org/patch/673368/
Signed-off-by: Rob Clark <robin.clark@oss.qualcomm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit e28022873c ]
Currently, misc_deregister() uses list_del() to remove the device
from the list. After list_del(), the list pointers are set to
LIST_POISON1 and LIST_POISON2, which may help catch use-after-free bugs,
but does not reset the list head.
If misc_deregister() is called more than once on the same device,
list_empty() will not return true, and list_del() may be called again,
leading to undefined behavior.
Replace list_del() with list_del_init() to reinitialize the list head
after deletion. This makes the code more robust against double
deregistration and allows safe usage of list_empty() on the miscdevice
after deregistration.
[ Note, this seems to keep broken out-of-tree drivers from doing foolish
things. While this does not matter for any in-kernel drivers,
external drivers could use a bit of help to show them they shouldn't
be doing stuff like re-registering misc devices - gregkh ]
Signed-off-by: Xion Wang <xion.wang@mediatek.com>
Link: https://lore.kernel.org/r/20250904063714.28925-2-xion.wang@mediatek.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 1b434ed000 ]
Not all FRAM chips have a device ID and implement the corresponding read
command. For such chips this led to the following error on module
loading:
at25 spi2.0: Error: no Cypress FRAM (id 00)
The device ID contains the memory size, so devices without this ID are
supported now by setting the size manually in Devicetree using the
"size" property.
Tested with FM25L16B and "size = <2048>;":
at25 spi2.0: 2 KByte fm25 fram, pagesize 4096
According to Infineon/Cypress datasheets, these FRAMs have a device ID:
FM25V01A
FM25V02A
FM25V05
FM25V10
FM25V20A
FM25VN10
but these do not:
FM25040B
FM25640B
FM25C160B
FM25CL64B
FM25L04B
FM25L16B
FM25W256
So all "FM25V*" FRAMs and only these have a device ID. The letter after
"FM25" (V/C/L/W) only describes the voltage range, though.
Link: https://lore.kernel.org/all/20250401133148.38330-1-m.heidelberg@cab.de/
Signed-off-by: Markus Heidelberg <m.heidelberg@cab.de>
Reviewed-by: Alexander Sverdlin <alexander.sverdlin@gmail.com>
Link: https://lore.kernel.org/r/20250815095839.4219-3-m.heidelberg@cab.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit e3fa89f3a7 ]
Starting with commit f99508074e ("PM: domains: Detach on
device_unbind_cleanup()"), there is no longer a need to call
dev_pm_domain_detach() in the bus remove function. The
device_unbind_cleanup() function now handles this to avoid
invoking devres cleanup handlers while the PM domain is
powered off, which could otherwise lead to failures as
described in the above-mentioned commit.
Drop the explicit dev_pm_domain_detach() call and rely instead
on the flags passed to dev_pm_domain_attach() to power off the
domain.
Signed-off-by: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Link: https://lore.kernel.org/r/20250827101747.928265-1-claudiu.beznea.uj@bp.renesas.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit fc6a5b540c ]
GENI UART driver currently supports only non-DFS (Dynamic Frequency
Scaling) mode for source frequency selection. However, to operate correctly
in DFS mode, the GENI SCLK register must be programmed with the appropriate
DFS index. Failing to do so can result in incorrect frequency selection
Add support for Dynamic Frequency Scaling (DFS) mode in the GENI UART
driver by configuring the GENI_CLK_SEL register with the appropriate DFS
index. This ensures correct frequency selection when operating in DFS mode.
Replace the UART driver-specific logic for clock selection with the GENI
common driver function to obtain the desired frequency and corresponding
clock index. This improves maintainability and consistency across
GENI-based drivers.
Signed-off-by: Viken Dadhaniya <viken.dadhaniya@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20250903063136.3015237-1-viken.dadhaniya@oss.qualcomm.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 87c5ff5615 ]
In the __cdnsp_gadget_init() and cdnsp_gadget_exit() functions, the gadget
structure (pdev->gadget) was freed before its endpoints.
The endpoints are linked via the ep_list in the gadget structure.
Freeing the gadget first leaves dangling pointers in the endpoint list.
When the endpoints are subsequently freed, this results in a use-after-free.
Fix:
By separating the usb_del_gadget_udc() operation into distinct "del" and
"put" steps, cdnsp_gadget_free_endpoints() can be executed prior to the
final release of the gadget structure with usb_put_gadget().
A patch similar to bb9c74a5bd14("usb: dwc3: gadget: Free gadget structure
only after freeing endpoints").
Signed-off-by: Chen Yufeng <chenyufeng@iie.ac.cn>
Link: https://lore.kernel.org/r/20250905094842.1232-1-chenyufeng@iie.ac.cn
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 6f616757dd ]
The "usxgmii" phy-mode that the Felix switch ports support on LS1028A is
not quite USXGMII, it is defined by the USXGMII multiport specification
document as 10G-QXGMII. It uses the same signaling as USXGMII, but it
multiplexes 4 ports over the link, resulting in a maximum speed of 2.5G
per port.
This change is needed in preparation for the lynx-10g SerDes driver on
LS1028A, which will make a more clear distinction between usxgmii
(supported on lane 0) and 10g-qxgmii (supported on lane 1). These
protocols have their configuration in different PCCR registers (PCCRB vs
PCCR9).
Continue parsing and supporting single-port-per-lane USXGMII when found
in the device tree as usual (because it works), but add support for
10G-QXGMII too. Using phy-mode = "10g-qxgmii" will be required when
modifying the device trees to specify a "phys" phandle to the SerDes
lane. The result when the "phys" phandle is present but the phy-mode is
wrong is undefined.
The only PHY driver in known use with this phy-mode, AQR412C, will gain
logic to transition from "usxgmii" to "10g-qxgmii" in a future change.
Prepare the driver by also setting PHY_INTERFACE_MODE_10G_QXGMII in
supported_interfaces when PHY_INTERFACE_MODE_USXGMII is there, to
prevent breakage with existing device trees.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://patch.msgid.link/20250903130730.2836022-3-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 8c0b9ed240 ]
devmem test fails on NIPA. Most likely we get skb(s) with readable
frags (why?) but the failure manifests as an OOM. The OOM happens
because ncdevmem spams the following message:
recvmsg ret=-1
recvmsg: Bad address
As of today, ncdevmem can't deal with various reasons of EFAULT:
- falling back to regular recvmsg for non-devmem skbs
- increasing ctrl_data size (can't happen with ncdevmem's large buffer)
Exit (cleanly) with error when recvmsg returns EFAULT. This should at
least cause the test to cleanup its state.
Signed-off-by: Stanislav Fomichev <sdf@fomichev.me>
Reviewed-by: Mina Almasry <almasrymina@google.com>
Link: https://patch.msgid.link/20250904182710.1586473-1-sdf@fomichev.me
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 68f3c044f3 ]
[Why]
There is a `scale` sysfs attribute that can be used to indicate when
non-linear brightness scaling is in use. As Custom brightness curves
work by linear interpolation of points the scale is no longer linear.
[How]
Indicate non-linear scaling when custom brightness curves in use and
linear scaling otherwise.
Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Mario Limonciello <superm1@kernel.org>
Signed-off-by: Wayne Lin <wayne.lin@amd.com>
Tested-by: Dan Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit cf32515a70 ]
v1:
Returns different error codes based on the scenario to help the user app understand
the AMDGPU device status when an exception occurs.
v2:
change -NODEV to -EBUSY.
Signed-off-by: Yang Wang <kevinyang.wang@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 12cdfb61b3 ]
[Why]
dm_mst_get_pbn_divider() returns value integer coming from
the cast from fixed point, but the casted integer will then be used
in dfixed_const to be multiplied by 4096. The cast from fixed point to integer
causes the calculation error becomes bigger when multiplied by 4096.
That makes the calculated pbn_div value becomes smaller than
it should be, which leads to the req_slot number becomes bigger.
Such error is getting reflected in 8k30 timing,
where the correct and incorrect calculated req_slot 62.9 Vs 63.1.
That makes the wrong calculation failed to light up 8k30
after a dock under HBR3 x 4.
[How]
Restore the accuracy by keeping the fraction part
calculated for the left shift operation.
Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Fangzhi Zuo <Jerry.Zuo@amd.com>
Signed-off-by: Wayne Lin <wayne.lin@amd.com>
Tested-by: Dan Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit fa819e3a7c ]
Some SOCs which are part of the cyan skillfish family
rely on an explicit firmware for IP discovery. Add support
for the gpu_info firmware.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 38e5f33ee3 ]
After a panic if SNP is enabled in the previous kernel then the kdump
kernel boots with IOMMU SNP enforcement still enabled.
IOMMU device table register is locked and exclusive to the previous
kernel. Attempts to copy old device table from the previous kernel
fails in kdump kernel as hardware ignores writes to the locked device
table base address register as per AMD IOMMU spec Section 2.12.2.1.
This causes the IOMMU driver (OS) and the hardware to reference
different memory locations. As a result, the IOMMU hardware cannot
process the command which results in repeated "Completion-Wait loop
timed out" errors and a second kernel panic: "Kernel panic - not
syncing: timer doesn't work through Interrupt-remapped IO-APIC".
Reuse device table instead of copying device table in case of kdump
boot and remove all copying device table code.
Reviewed-by: Vasant Hegde <vasant.hegde@amd.com>
Tested-by: Sairaj Kodilkar <sarunkod@amd.com>
Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
Link: https://lore.kernel.org/r/3a31036fb2f7323e6b1a1a1921ac777e9f7bdddc.1756157913.git.ashish.kalra@amd.com
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 9be15fbfc6 ]
After a panic if SNP is enabled in the previous kernel then the kdump
kernel boots with IOMMU SNP enforcement still enabled.
IOMMU command buffers and event buffer registers remain locked and
exclusive to the previous kernel. Attempts to enable command and event
buffers in the kdump kernel will fail, as hardware ignores writes to
the locked MMIO registers as per AMD IOMMU spec Section 2.12.2.1.
Skip enabling command buffers and event buffers for kdump boot as they
are already enabled in the previous kernel.
Reviewed-by: Vasant Hegde <vasant.hegde@amd.com>
Tested-by: Sairaj Kodilkar <sarunkod@amd.com>
Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
Link: https://lore.kernel.org/r/576445eb4f168b467b0fc789079b650ca7c5b037.1756157913.git.ashish.kalra@amd.com
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f32fe7cb01 ]
After a panic if SNP is enabled in the previous kernel then the kdump
kernel boots with IOMMU SNP enforcement still enabled.
IOMMU completion wait buffers (CWBs), command buffers and event buffer
registers remain locked and exclusive to the previous kernel. Attempts
to allocate and use new buffers in the kdump kernel fail, as hardware
ignores writes to the locked MMIO registers as per AMD IOMMU spec
Section 2.12.2.1.
This results in repeated "Completion-Wait loop timed out" errors and a
second kernel panic: "Kernel panic - not syncing: timer doesn't work
through Interrupt-remapped IO-APIC"
The list of MMIO registers locked and which ignore writes after failed
SNP shutdown are mentioned in the AMD IOMMU specifications below:
Section 2.12.2.1.
https://docs.amd.com/v/u/en-US/48882_3.10_PUB
Reuse the pages of the previous kernel for completion wait buffers,
command buffers, event buffers and memremap them during kdump boot
and essentially work with an already enabled IOMMU configuration and
re-using the previous kernel’s data structures.
Reusing of command buffers and event buffers is now done for kdump boot
irrespective of SNP being enabled during kdump.
Re-use of completion wait buffers is only done when SNP is enabled as
the exclusion base register is used for the completion wait buffer
(CWB) address only when SNP is enabled.
Reviewed-by: Vasant Hegde <vasant.hegde@amd.com>
Tested-by: Sairaj Kodilkar <sarunkod@amd.com>
Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
Link: https://lore.kernel.org/r/ff04b381a8fe774b175c23c1a336b28bc1396511.1756157913.git.ashish.kalra@amd.com
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 2506af5f81 ]
The GuC communication protocol allows GuC to send NO_RESPONSE_RETRY
reply message to indicate that due to some interim condition it can
not handle incoming H2G request and the host shall resend it.
But in some cases, due to errors, this unsatisfied condition might
be final and this could lead to endless retries as it was recently
seen on the CI:
[drm] GT0: PF: VF1 FLR didn't finish in 5000 ms (-ETIMEDOUT)
[drm] GT0: PF: VF1 resource sanitizing failed (-ETIMEDOUT)
[drm] GT0: PF: VF1 FLR failed!
[drm:guc_ct_send_recv [xe]] GT0: H2G action 0x5503 retrying: reason 0x0
[drm:guc_ct_send_recv [xe]] GT0: H2G action 0x5503 retrying: reason 0x0
[drm:guc_ct_send_recv [xe]] GT0: H2G action 0x5503 retrying: reason 0x0
[drm:guc_ct_send_recv [xe]] GT0: H2G action 0x5503 retrying: reason 0x0
To avoid such dangerous loops allow only limited number of retries
(for now 50) and add some delays (n * 5ms) to slow down the rate of
resending this repeated request.
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Stuart Summers <stuart.summers@intel.com>
Reviewed-by: Julia Filipchuk <julia.filipchuk@intel.com>
Link: https://lore.kernel.org/r/20250903223330.6408-1-michal.wajdeczko@intel.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit c221cbf8dc ]
When the 3.3Vaux supply is present, fetch it at the probe time and keep it
enabled for the entire PCIe controller lifecycle so that the link can enter
L2 state and the devices can signal wakeup using either Beacon or WAKE#
mechanisms.
Signed-off-by: Richard Zhu <hongxing.zhu@nxp.com>
[mani: reworded the subject, description and error message]
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Link: https://patch.msgid.link/20250820022328.2143374-1-hongxing.zhu@nxp.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit a191224186 ]
In partitioned systems, the domain ID is unique in the partition and a
package can have multiple partitions.
Some user-space tools, such as turbostat, assume the domain ID is unique
per package. These tools map CPU power domains, which are unique to a
package. However, this approach does not work in partitioned systems.
There is no architectural definition of "partition" to present to user
space.
To support these tools, set the domain_id to be unique per package. For
compute die IDs, uniqueness can be achieved using the platform info
cdie_mask, mirroring the behavior observed in non-partitioned systems.
For IO dies, which lack a direct CPU relationship, any unique logical
ID can be assigned. Here domain IDs for IO dies are configured after all
compute domain IDs. During the probe, keep the index of the next IO
domain ID after the last IO domain ID of the current partition. Since
CPU packages are symmetric, partition information is same for all
packages.
The Intel Speed Select driver has already implemented a similar change
to make the domain ID unique, with compute dies listed first, followed
by I/O dies.
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Link: https://lore.kernel.org/r/20250903191154.1081159-1-srinivas.pandruvada@linux.intel.com
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit e53f8b12a2 ]
Currently, when adding the 6 GHz Band Capabilities element, the channel
list of the wiphy is checked to determine if 6 GHz is supported for a given
virtual interface. However, in a multi-radio wiphy (e.g., one that has
both lower bands and 6 GHz combined), the wiphy advertises support for
all bands. As a result, the 6 GHz Band Capabilities element is incorrectly
included in mesh beacon and station's association request frames of
interfaces operating in lower bands, without verifying whether the
interface is actually operating in a 6 GHz channel.
Fix this by verifying if the interface operates on 6 GHz channel
before adding the element. Note that this check cannot be placed
directly in ieee80211_put_he_6ghz_cap() as the same function is used to
add probe request elements while initiating scan in which case the
interface may not be operating in any band's channel.
Signed-off-by: Ramya Gnanasekar <ramya.gnanasekar@oss.qualcomm.com>
Signed-off-by: Rameshkumar Sundaram <rameshkumar.sundaram@oss.qualcomm.com>
Link: https://patch.msgid.link/20250606104436.326654-1-rameshkumar.sundaram@oss.qualcomm.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 91c5d7c849 ]
The .querystd callback should not program the device with the detected
standard, it should only report the standard to user-space. User-space
may then use .s_std to set the standard, if it wants to use it.
All that is required of .querystd is to setup the auto detection of
standards and report its findings.
While at it add some documentation on why this can't happen while
streaming and improve the error handling using a scoped guard.
Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 46c1e7814d ]
The .set_fmt callback should not write the new format directly do the
device, it should only store it and have it applied by .s_stream.
The .s_stream callback already calls adv7180_set_field_mode() so it's
safe to remove programming of the device and just store the format and
have .s_stream apply it.
Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Reviewed-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 878c496ac5 ]
The adv7180_set_power() utilizes adv7180_write() which in turn requires
the state mutex to be held, take it before calling adv7180_set_power()
to avoid tripping a lockdep_assert_held().
Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 21f82062d0 ]
An exchange with a NFC target must complete within NCI_DATA_TIMEOUT.
A delay of 700 ms is not sufficient for cryptographic operations on smart
cards. CardOS 6.0 may need up to 1.3 seconds to perform 256-bit ECDH
or 3072-bit RSA. To prevent brute-force attacks, passports and similar
documents introduce even longer delays into access control protocols
(BAC/PACE).
The timeout should be higher, but not too much. The expiration allows
us to detect that a NFC target has disappeared.
Signed-off-by: Juraj Šarinay <juraj@sarinay.com>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://patch.msgid.link/20250902113630.62393-1-juraj@sarinay.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 9f9581ba74 ]
While updating the binary min-len implementation, I noticed that
the only user, should AFAICT be using exact-len instead.
In net/ipv4/fou_core.c FOU_ATTR_LOCAL_V6 and FOU_ATTR_PEER_V6
are only used for singular IPv6 addresses, and there are AFAICT
no known implementations trying to send more, it therefore
appears safe to change it to an exact-len policy.
This patch therefore changes the local-v6/peer-v6 attributes to
use an exact-len check, instead of a min-len check.
Signed-off-by: Asbjørn Sloth Tønnesen <ast@fiberby.net>
Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Link: https://patch.msgid.link/20250902154640.759815-2-ast@fiberby.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 08a1af326a ]
Currently, during locating the CIVD section, the ixgbe driver loops
over the OROM area and at each iteration reads only OROM-datastruct-size
amount of data. This results in many small reads and is inefficient.
Optimize this by reading the entire OROM bank into memory once before
entering the loop. This significantly reduces the probing time.
Without this patch probing time may exceed over 25s, whereas with this
patch applied average time of probe is not greater than 5s.
without the patch:
[14:12:22] ixgbe: Copyright (c) 1999-2016 Intel Corporation.
[14:12:25] ixgbe 0000:21:00.0: Multiqueue Enabled: Rx Queue count = 63, Tx Queue count = 63 XDP Queue count = 0
[14:12:25] ixgbe 0000:21:00.0: 63.012 Gb/s available PCIe bandwidth (16.0 GT/s PCIe x4 link)
[14:12:26] ixgbe 0000:21:00.0: MAC: 7, PHY: 27, PBA No: N55484-001
[14:12:26] ixgbe 0000:21:00.0: 20:3a:43:09:3a:12
[14:12:26] ixgbe 0000:21:00.0: Intel(R) 10 Gigabit Network Connection
[14:12:50] ixgbe 0000:21:00.0 ens2f0np0: renamed from eth0
with the patch:
[14:18:18] ixgbe: Copyright (c) 1999-2016 Intel Corporation.
[14:18:19] ixgbe 0000:21:00.0: Multiqueue Enabled: Rx Queue count = 63, Tx Queue count = 63 XDP Queue count = 0
[14:18:19] ixgbe 0000:21:00.0: 63.012 Gb/s available PCIe bandwidth (16.0 GT/s PCIe x4 link)
[14:18:19] ixgbe 0000:21:00.0: MAC: 7, PHY: 27, PBA No: N55484-001
[14:18:19] ixgbe 0000:21:00.0: 20:3a:43:09:3a:12
[14:18:19] ixgbe 0000:21:00.0: Intel(R) 10 Gigabit Network Connection
[14:18:22] ixgbe 0000:21:00.0 ens2f0np0: renamed from eth0
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de>
Signed-off-by: Jedrzej Jagielski <jedrzej.jagielski@intel.com>
Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit d29da1a8f1 ]
We want to mount beneath the given location. For that operation to
make sense, location must be the root of some mount that has something
under it. Currently we let it proceed if those requirements are not met,
with rather meaningless results, and have that bogosity caught further
down the road; let's fail early instead - do_lock_mount() doesn't make
sense unless those conditions hold, and checking them there makes
things simpler.
Reviewed-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 813d13524a ]
The SMC can take an excessive amount of time to process some
messages under some conditions.
Background:
Sending a message to the SMC works by writing the message into
the mmSMC_MESSAGE_0 register and its optional parameter into
the mmSMC_SCRATCH0, and then polling mmSMC_RESP_0. Previously
the timeout was AMDGPU_MAX_USEC_TIMEOUT, ie. 100 ms.
Increase the timeout to 200 ms for all messages and to 1 sec for
a few messages which I've observed to be especially slow:
PPSMC_MSG_NoForcedLevel
PPSMC_MSG_SetEnabledLevels
PPSMC_MSG_SetForcedLevels
PPSMC_MSG_DisableULV
PPSMC_MSG_SwitchToSwState
This fixes the following problems on Tahiti when switching
from a lower clock power state to a higher clock state, such
as when DC turns on a display which was previously turned off.
* si_restrict_performance_levels_before_switch would fail
(if the user previously forced high clocks using sysfs)
* si_set_sw_state would fail (always)
It turns out that both of those failures were SMC timeouts and
that the SMC actually didn't fail or hang, just needs more time
to process those.
Add a warning when there is an SMC timeout to make it easier to
identify this type of problem in the future.
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 85705b18ae ]
The kfd CRIU checkpoint ioctl would return an error if trying
to checkpoint a process with no kfd buffer objects.
This is a normal case and should not be an error.
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: David Francis <David.Francis@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 5ddcb0cb9d ]
Driver unconditionally saves current state on first init in
dsi_pll_7nm_init(), but does not save the VCO rate, only some of the
divider registers. The state is then restored during probe/enable via
msm_dsi_phy_enable() -> msm_dsi_phy_pll_restore_state() ->
dsi_7nm_pll_restore_state().
Restoring calls dsi_pll_7nm_vco_set_rate() with
pll_7nm->vco_current_rate=0, which basically overwrites existing rate of
VCO and messes with clock hierarchy, by setting frequency to 0 to clock
tree. This makes anyway little sense - VCO rate was not saved, so
should not be restored.
If PLL was not configured configure it to minimum rate to avoid glitches
and configuring entire in clock hierarchy to 0 Hz.
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Patchwork: https://patchwork.freedesktop.org/patch/657827/
Link: https://lore.kernel.org/r/20250610-b4-sm8750-display-v6-9-ee633e3ddbff@linaro.org
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 3d95a2e016 ]
Now that nft_setelem_flush is not called with rcu read lock held or
disabled softinterrupts anymore this can now use GFP_KERNEL too.
This is the last atomic allocation of transaction elements, so remove
all gfp_t arguments and the wrapper function.
This makes attempts to delete large sets much more reliable, before
this was prone to transient memory allocation failures.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 5e742de97c ]
DMA Engine has support for the callback_result which provides
the status of the request and the residue. This helps in
determining the correct status of the request and in
efficient resource management of the request.
The 'callback_result' method is preferred over the deprecated
'callback' method.
Signed-off-by: Devendra K Verma <devverma@amd.com>
Link: https://lore.kernel.org/r/20250821121505.318179-1-devverma@amd.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit ee4b32220a ]
When a buffer object (BO) is allocated with the XE_BO_FLAG_GGTT_INVALIDATE
flag, the driver initiates TLB invalidation requests via the CTB mechanism
while releasing the BO. However a premature release of the CTB BO can lead
to system crashes, as observed in:
Oops: Oops: 0000 [#1] SMP NOPTI
RIP: 0010:h2g_write+0x2f3/0x7c0 [xe]
Call Trace:
guc_ct_send_locked+0x8b/0x670 [xe]
xe_guc_ct_send_locked+0x19/0x60 [xe]
send_tlb_invalidation+0xb4/0x460 [xe]
xe_gt_tlb_invalidation_ggtt+0x15e/0x2e0 [xe]
ggtt_invalidate_gt_tlb.part.0+0x16/0x90 [xe]
ggtt_node_remove+0x110/0x140 [xe]
xe_ggtt_node_remove+0x40/0xa0 [xe]
xe_ggtt_remove_bo+0x87/0x250 [xe]
Introduce a devm-managed release action during xe_guc_ct_init() and
xe_guc_ct_init_post_hwconfig() to ensure proper CTB disablement before
resource deallocation, preventing the use-after-free scenario.
Signed-off-by: Satyanarayana K V P <satyanarayana.k.v.p@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Summers Stuart <stuart.summers@intel.com>
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://lore.kernel.org/r/20250901072541.31461-1-satyanarayana.k.v.p@intel.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 5a8c02a6bf ]
Networking drivers implementing PTP clocks and kernel socket code
handling hardware timestamps use the 64-bit signed ktime_t type counting
nanoseconds. When a PTP clock reaches the maximum value in year 2262,
the timestamps returned to applications will overflow into year 1667.
The same thing happens when injecting a large offset with
clock_adjtime(ADJ_SETOFFSET).
The commit 7a8e61f847 ("timekeeping: Force upper bound for setting
CLOCK_REALTIME") limited the maximum accepted value setting the system
clock to 30 years before the maximum representable value (i.e. year
2232) to avoid the overflow, assuming the system will not run for more
than 30 years.
Enforce the same limit for PTP clocks. Don't allow negative values and
values closer than 30 years to the maximum value. Drivers may implement
an even lower limit if the hardware registers cannot represent the whole
interval between years 1970 and 2262 in the required resolution.
Signed-off-by: Miroslav Lichvar <mlichvar@redhat.com>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: John Stultz <jstultz@google.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev>
Link: https://patch.msgid.link/20250828103300.1387025-1-mlichvar@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit e61c35157d ]
Depending on which display that is connected to the controller, an
"1" means either a black or a white pixel.
The supported formats (R1/R2/XRGB8888) expects the pixels
to map against (4bit):
00 => Black
01 => Dark Gray
10 => Light Gray
11 => White
If this is not what the display map against, make it possible to invert
the pixels.
Reviewed-by: Javier Martinez Canillas <javierm@redhat.com>
Signed-off-by: Marcus Folkesson <marcus.folkesson@gmail.com>
Link: https://lore.kernel.org/r/20250721-st7571-format-v2-4-159f4134098c@gmail.com
Signed-off-by: Javier Martinez Canillas <javierm@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 65673c6e33 ]
The imx-mipi-csis driver sets the rate of the wrap clock to the value
specified in the device tree's "clock-frequency" property, and defaults
to 166 MHz otherwise. This is a historical mistake, as clock rate
selection should have been left to the assigned-clock-rates property.
Honouring the clock-frequency property can't be removed without breaking
backwards compatibility, and the corresponding code isn't very
intrusive. The 166 MHz default, on the other hand, prevents
configuration of the clock rate through assigned-clock-rates, as the
driver immediately overwrites the rate. This behaviour is confusing and
has cost debugging time.
There is little value in a 166 MHz default. All mainline device tree
sources that enable the CSIS specify a clock-frequency explicitly, and
the default wrap clock configuration on supported platforms is at least
as high as 166 MHz. Drop the default, and only set the clock rate
manually when the clock-frequency property is specified.
Link: https://lore.kernel.org/r/20250822002734.23516-10-laurent.pinchart@ideasonboard.com
Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit aead8e4cc0 ]
Make the "mclk" clock optional in the ad7124 driver. The MCLK is an
internal counter on the ADC, so it is not something that should be
coming from the devicetree. However, existing users may be using this
to essentially select the power mode of the ADC from the devicetree.
In order to not break those users, we have to keep the existing "mclk"
handling, but now it is optional.
Now, when the "mclk" clock is omitted from the devicetree, the driver
will default to the full power mode. Support for an external clock
and dynamic power mode switching can be added later if needed.
Signed-off-by: David Lechner <dlechner@baylibre.com>
Link: https://patch.msgid.link/20250828-iio-adc-ad7124-proper-clock-support-v3-2-0b317b4605e5@baylibre.com
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 34c21e9119 ]
Introduce support for standard MII ioctl operations in the LAN865x
Ethernet driver by implementing the .ndo_eth_ioctl callback. This allows
PHY-related ioctl commands to be handled via phy_do_ioctl_running() and
enables support for ethtool and other user-space tools that rely on ioctl
interface to perform PHY register access using commands like SIOCGMIIREG
and SIOCSMIIREG.
This feature enables improved diagnostics and PHY configuration
capabilities from userspace.
Signed-off-by: Parthiban Veerasooran <parthiban.veerasooran@microchip.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/20250828114549.46116-1-parthiban.veerasooran@microchip.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b0d04fe6a6 ]
Bindig requires a node name matching ‘^gpio@[0-9a-f]+$’. This patch
changes the clock name from “stp” to “gpio”.
Signed-off-by: Aleksander Jan Bajkowski <olek2@wp.pl>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 585b2f685c ]
Update the legacy (non-DC) display code to respect the maximum
pixel clock for HDMI and DVI-D. Reject modes that would require
a higher pixel clock than can be supported.
Also update the maximum supported HDMI clock value depending on
the ASIC type.
For reference, see the DC code:
check max_hdmi_pixel_clock in dce*_resource.c
v2:
Fix maximum clocks for DVI-D and DVI/HDMI adapters.
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 306cbcc6f6 ]
[Why & How]
Previously, when calculating dto phase, we would incorrectly fail when phase
<=0 without additionally checking for the integer value. This meant that
calculations would incorrectly fail when the desired pixel clock was an exact
multiple of the reference clock.
Reviewed-by: Dillon Varone <dillon.varone@amd.com>
Signed-off-by: Clay King <clayking@amd.com>
Signed-off-by: Alex Hung <alex.hung@amd.com>
Tested-by: Dan Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 002a612023 ]
[Why]
-Pipe splitting allows for clocks to be reduced, but when using TMDS 420,
reduced clocks lead to missed clocks cycles on clock resyncing
[How]
-Impose a minimum clock when using TMDS 420
Reviewed-by: Chris Park <chris.park@amd.com>
Signed-off-by: Relja Vojvodic <rvojvodi@amd.com>
Signed-off-by: Alex Hung <alex.hung@amd.com>
Tested-by: Dan Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 0750649b52 ]
Compare the whole v4l2_bt_timings struct, not just the width/height when
setting new timings. Timings with the same resolution and different
pixelclock can now be properly set.
Signed-off-by: Martin Tůma <martin.tuma@digiteqautomotive.com>
Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit c158b5a570 ]
Commit 0d6ccfe6b3 ("selftests: drv-net: rss_ctx: check for all-zero keys")
added a skip exception if NIC has fewer than 3 queues enabled,
but it's just constructing the object, it's not actually rising
this exception.
Before:
# Exception| net.lib.py.utils.CmdExitFailure: Command failed: ethtool -X enp1s0 equal 3 hkey d1:cc:77:47:9d:ea:15:f2:b9:6c:ef:68:62:c0:45:d5:b0:99:7d:cf:29:53:40:06:3d:8e:b9:bc:d4:70:89:b8:8d:59:04:ea:a9:c2:21:b3:55:b8:ab:6b:d9:48:b4:bd:4c:ff:a5:f0:a8:c2
not ok 1 rss_ctx.test_rss_key_indir
After:
ok 1 rss_ctx.test_rss_key_indir # SKIP Device has fewer than 3 queues (or doesn't support queue stats)
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250827173558.3259072-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 6d47b4f084 ]
A partitioned system configured with only one package and one compute
die, warning will be generated for duplicate sysfs entry. This typically
occurs during the platform bring-up phase.
Partitioned systems expose dies, equivalent to TPMI compute domains,
through the CPUID. Each partitioned system must contains at least one
compute die per partition, resulting in a minimum of two dies per
package. Hence the function topology_max_dies_per_package() returns at
least two, and the condition "topology_max_dies_per_package() > 1"
prevents the creation of a root domain.
In this case topology_max_dies_per_package() will return 1 and root
domain will be created for partition 0 and a duplicate sysfs warning
for partition 1 as both partitions have same package ID.
To address this also check for non zero partition in addition to
topology_max_dies_per_package() > 1.
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Link: https://lore.kernel.org/r/20250819211034.3776284-1-srinivas.pandruvada@linux.intel.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b1161b1863 ]
Upon experiencing a PCI error, fbnic reset the device to recover from
the failure. Reset the hardware stats as part of the device reset to
ensure accurate stats reporting.
Note that the reset is not really resetting the aggregate value to 0,
which may result in a spike for a system collecting deltas in stats.
Rather, the reset re-latches the current value as previous, in case HW
got reset.
Signed-off-by: Mohsin Bashir <mohsin.bashr@gmail.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://patch.msgid.link/20250825200206.2357713-3-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 97bcc5b6f4 ]
This patch fixes an issue where two different flows on the same RXq
produce the same hash resulting in continuous flow overwrites.
Flow #1: A packet for Flow #1 comes in, kernel calls the steering
function. The driver gives back a filter id. The kernel saves
this filter id in the selected slot. Later, the driver's
service task checks if any filters have expired and then
installs the rule for Flow #1.
Flow #2: A packet for Flow #2 comes in. It goes through the same steps.
But this time, the chosen slot is being used by Flow #1. The
driver gives a new filter id and the kernel saves it in the
same slot. When the driver's service task runs, it runs through
all the flows, checks if Flow #1 should be expired, the kernel
returns True as the slot has a different filter id, and then
the driver installs the rule for Flow #2.
Flow #1: Another packet for Flow #1 comes in. The same thing repeats.
The slot is overwritten with a new filter id for Flow #1.
This causes a repeated cycle of flow programming for missed packets,
wasting CPU cycles while not improving performance. This problem happens
at higher rates when the RPS table is small, but tests show it still
happens even with 12,000 connections and an RPS size of 16K per queue
(global table size = 144x16K = 64K).
This patch prevents overwriting an rps_dev_flow entry if it is active.
The intention is that it is better to do aRFS for the first flow instead
of hurting all flows on the same hash. Without this, two (or more) flows
on one RX queue with the same hash can keep overwriting each other. This
causes the driver to reprogram the flow repeatedly.
Changes:
1. Add a new 'hash' field to struct rps_dev_flow.
2. Add rps_flow_is_active(): a helper function to check if a flow is
active or not, extracted from rps_may_expire_flow(). It is further
simplified as per reviewer feedback.
3. In set_rps_cpu():
- Avoid overwriting by programming a new filter if:
- The slot is not in use, or
- The slot is in use but the flow is not active, or
- The slot has an active flow with the same hash, but target CPU
differs.
- Save the hash in the rps_dev_flow entry.
4. rps_may_expire_flow(): Use earlier extracted rps_flow_is_active().
Testing & results:
- Driver: ice (E810 NIC), Kernel: net-next
- #CPUs = #RXq = 144 (1:1)
- Number of flows: 12K
- Eight RPS settings from 256 to 32768. Though RPS=256 is not ideal,
it is still sufficient to cover 12K flows (256*144 rx-queues = 64K
global table slots)
- Global Table Size = 144 * RPS (effectively equal to 256 * RPS)
- Each RPS test duration = 8 mins (org code) + 8 mins (new code).
- Metrics captured on client
Legend for following tables:
Steer-C: #times ndo_rx_flow_steer() was Called by set_rps_cpu()
Steer-L: #times ice_arfs_flow_steer() Looped over aRFS entries
Add: #times driver actually programmed aRFS (ice_arfs_build_entry())
Del: #times driver deleted the flow (ice_arfs_del_flow_rules())
Units: K = 1,000 times, M = 1 million times
|-------|---------|------| Org Code |---------|---------|
| RPS | Latency | CPU | Add | Del | Steer-C | Steer-L |
|-------|---------|------|--------|--------|---------|---------|
| 256 | 227.0 | 93.2 | 1.6M | 1.6M | 121.7M | 267.6M |
| 512 | 225.9 | 94.1 | 11.5M | 11.2M | 65.7M | 199.6M |
| 1024 | 223.5 | 95.6 | 16.5M | 16.5M | 27.1M | 187.3M |
| 2048 | 222.2 | 96.3 | 10.5M | 10.5M | 12.5M | 115.2M |
| 4096 | 223.9 | 94.1 | 5.5M | 5.5M | 7.2M | 65.9M |
| 8192 | 224.7 | 92.5 | 2.7M | 2.7M | 3.0M | 29.9M |
| 16384 | 223.5 | 92.5 | 1.3M | 1.3M | 1.4M | 13.9M |
| 32768 | 219.6 | 93.2 | 838.1K | 838.1K | 965.1K | 8.9M |
|-------|---------|------| New Code |---------|---------|
| 256 | 201.5 | 99.1 | 13.4K | 5.0K | 13.7K | 75.2K |
| 512 | 202.5 | 98.2 | 11.2K | 5.9K | 11.2K | 55.5K |
| 1024 | 207.3 | 93.9 | 11.5K | 9.7K | 11.5K | 59.6K |
| 2048 | 207.5 | 96.7 | 11.8K | 11.1K | 15.5K | 79.3K |
| 4096 | 206.9 | 96.6 | 11.8K | 11.7K | 11.8K | 63.2K |
| 8192 | 205.8 | 96.7 | 11.9K | 11.8K | 11.9K | 63.9K |
| 16384 | 200.9 | 98.2 | 11.9K | 11.9K | 11.9K | 64.2K |
| 32768 | 202.5 | 98.0 | 11.9K | 11.9K | 11.9K | 64.2K |
|-------|---------|------|--------|--------|---------|---------|
Some observations:
1. Overall Latency improved: (1790.19-1634.94)/1790.19*100 = 8.67%
2. Overall CPU increased: (777.32-751.49)/751.45*100 = 3.44%
3. Flow Management (add/delete) remained almost constant at ~11K
compared to values in millions.
Signed-off-by: Krishna Kumar <krikku@gmail.com>
Link: https://patch.msgid.link/20250825031005.3674864-2-krikku@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit ec813f384b ]
We need to cancel any outstanding work at both suspend
and driver teardown. Move the cancel to hw_fini which
gets called in both cases.
Reviewed-by: David (Ming Qiang) Wu <David.Wu3@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f3820e9d35 ]
When KFD asks CP to preempt queues, other than preempt CP queues, CP
also requests SDMA to preempt SDMA queues with UNMAP_LATENCY timeout.
Currently queue_preemption_timeout_ms is 9000 ms by default but can be
configured via module parameter. KFD_UNMAP_LATENCY_MS is hard coded as
4000 ms though. This patch ties KFD_UNMAP_LATENCY_MS to
queue_preemption_timeout_ms so in a slow system such as emulator, both
CP and SDMA slowness are taken into account.
Signed-off-by: Amber Lin <Amber.Lin@amd.com>
Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 8a359f0f13 ]
[Why]
For the HW cursor, its current position in the pipe_ctx->stream struct is
not affected by the 180 rotation, i. e. the top left corner is still at
0,0. However, the DPP & HUBP set_cursor_position functions require rotated
position.
The current approach is hard-coded for ODM 2:1, thus it's failing for
ODM 4:1, resulting in a double cursor.
[How]
Instead of calculating the new cursor position relatively to the
viewports, we calculate it using a viewavable clip_rect of each plane.
The clip_rects are first offset and scaled to the same space as the
src_rect, i. e. Stream space -> Plane space.
In case of a pipe split, which divides the plane into 2 or more viewports,
the clip_rect is the union of all the viewports of the given plane.
With the assumption that the viewports in HUBP's set_cursor_position are
in the Plane space as well, it should produce a correct cursor position
for any number of pipe splits.
Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Ivan Lipski <ivan.lipski@amd.com>
Signed-off-by: Leo Li <sunpeng.li@amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Tested-by: Dan Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 93aa919ca0 ]
When it only allocates vram without va, which is 0, and a
SVM range allocated stays in this range, the vram allocation
returns failure. It should be skipped for this case from
SVM usage check.
Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>
Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit d8442bcad0 ]
By polling, poll ACA bank count to ensure that valid
ACA bank reg info can be obtained
v2: add corresponding delay before send msg to SMU to query mca bank info
(Stanley)
v3: the loop cannot exit. (Thomas)
v4: remove amdgpu_aca_clear_bank_count. (Kevin)
v5: continuously inject ce. If a creation interruption
occurs at this time, bank reg info will be lost. (Thomas)
v5: each cycle is delayed by 100ms. (Tao)
Signed-off-by: Ce Sun <cesun102@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit cc8e391067 ]
The pci_endpoint_test tests the 32-bit MSI range. However, the device might
not have all vectors configured. For example, if msi_interrupts is 8 in the
ep function space or if the MSI Multiple Message Capable value is
configured as 4 (maximum 16 vectors).
In this case, do not attempt to run the test to avoid timeouts and directly
return the error value.
Signed-off-by: Christian Bruel <christian.bruel@foss.st.com>
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Link: https://patch.msgid.link/20250804170916.3212221-2-christian.bruel@foss.st.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 7dbe644248 ]
The original commit be2ff42c5d ("fuse: Use hash table to link
processing request") converted fuse_pqueue->processing to a hash table,
but virtio_fs_enqueue_req() was not updated to use it correctly.
So use fuse_pqueue->processing as a hash table, this make the code
more coherent
Co-developed-by: Fushuai Wang <wangfushuai@baidu.com>
Signed-off-by: Fushuai Wang <wangfushuai@baidu.com>
Signed-off-by: Li RongQing <lirongqing@baidu.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit a0f849c1cc ]
fixed_phy_register() creates and registers the phy_device. To be
symmetric, we should not only unregister, but also free the phy_device
in fixed_phy_unregister(). This allows to simplify code in users.
Note wrt of_phy_deregister_fixed_link():
put_device(&phydev->mdio.dev) and phy_device_free(phydev) are identical.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/ad8dda9a-10ed-4060-916b-3f13bdbb899d@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit d6c8e8b7c9 ]
During enclosure reboot or expander reset, firmware may report a link
speed of 0 in "Device Add" events while the link is still coming up.
The driver drops such devices, leaving them missing even after the link
recovers.
Fix this by treating link speed 0 as 1.5 Gbps during device addition so
the device is exposed to the OS. The actual link speed will be updated
later when link-up events arrive.
Signed-off-by: Chandrakanth Patil <chandrakanth.patil@broadcom.com>
Link: https://lore.kernel.org/r/20250820084138.228471-2-chandrakanth.patil@broadcom.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 1b8c5fa0cb ]
Currently, ip_extract_route_hint uses RTN_BROADCAST to decide
whether to use the route dst hint mechanism.
This check is too strict, as it prevents directed broadcast
routes from using the hint, resulting in poor performance
during bursts of directed broadcast traffic.
Fix this in ip_extract_route_hint and modify ip_route_use_hint
to preserve the intended behaviour.
Signed-off-by: Oscar Maes <oscmaes92@gmail.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://patch.msgid.link/20250819174642.5148-2-oscmaes92@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 6e29c30d8d ]
Module aliases are used by userspace to identify the correct module to
load for a detected hardware. The currently supported RPMSG device IDs for
this module include "rpmsg-raw", but the module alias is "rpmsg_chrdev".
Use the helper macro MODULE_DEVICE_TABLE(rpmsg) to export the correct
supported IDs. And while here, to keep backwards compatibility we also add
the other ID "rpmsg_chrdev" so that it is also still exported as an alias.
This has the side benefit of adding support for some legacy firmware
which still uses the original "rpmsg_chrdev" ID. This was the ID used for
this driver before it was upstreamed (as reflected by the module alias).
Signed-off-by: Andrew Davis <afd@ti.com>
Acked-by: Hari Nagalla <hnagalla@ti.com>
Tested-by: Hari Nagalla <hnagalla@ti.com>
Link: https://lore.kernel.org/r/20250619205722.133827-1-afd@ti.com
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f5a2826cd5 ]
The IPU6 ISYS driver supported metadata formats but was missing correct
embedded data type in the receiver configuration. Add it now.
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit dc757dc157 ]
GPD devices originally used BMI160 sensors with the "BMI0160" PNP ID.
When they switched to BMI260 sensors in newer hardware, they reused
the existing Windows driver which accepts both "BMI0160" and "BMI0260"
IDs. Consequently, they kept "BMI0160" in DSDT tables for new BMI260
devices, causing driver mismatches in Linux.
1. GPD updated BIOS v0.40+[1] for newer devices to report "BMI0260" for
BMI260 sensors to avoid loading the bmi160 driver on Linux. While this
isn't Bosch's VID;
2. Bosch's official Windows driver uses "BMI0260" as a compatible ID
3. We're seeing real devices shipping with "BMI0260" in DSDT
The DSDT excerpt of GPD G1619-04 with BIOS v0.40:
Scope (_SB.I2CC)
{
Device (BMA2)
{
Name (_ADR, Zero) // _ADR: Address
Name (_HID, "BMI0260") // _HID: Hardware ID
Name (_CID, "BMI0260") // _CID: Compatible ID
Name (_DDN, "Accelerometer") // _DDN: DOS Device Name
Name (_UID, One) // _UID: Unique ID
Method (_CRS, 0, NotSerialized) // _CRS: Current Resource Settings
{
Name (RBUF, ResourceTemplate ()
{
I2cSerialBusV2 (0x0069, ControllerInitiated, 0x00061A80,
AddressingMode7Bit, "\\_SB.I2CC",
0x00, ResourceConsumer, , Exclusive,
)
})
Return (RBUF) /* \_SB_.I2CC.BMA2._CRS.RBUF */
}
# omit some noise
}
}
Link: http://download.softwincn.com/WIN%20Max%202024/Max2-7840-BIOS-V0.41.zip#1
Signed-off-by: Cryolitia PukNgae <cryolitia@uniontech.com>
Reviewed-by: Andy Shevchenko <andy@kernel.org>
Acked-by: Alex Lanzano <lanzano.alex@gmail.com>
Link: https://patch.msgid.link/20250821-bmi270-gpd-acpi-v4-1-5279b471d749@uniontech.com
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 5efa824920 ]
At least for panel-bridges, the atomic_enable call is defined as being
called right after the preceding element in the display pipe is enabled.
It is also stated that "The bridge can assume that the display pipe (i.e.
clocks and timing signals) feeding it is running when this callback is
called"
This means the DSI controller driving this display would have already
switched over to video-mode from command mode and thus dcs functions
should not be called anymore at this point.
This caused a non-working display for me, when trying to enable
the rk3576 dsi controller using a display using this controller.
Therefore move the display_on/off calls the more appropriate
prepare/unprepare callbacks.
Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org>
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Link: https://lore.kernel.org/r/20250707164906.1445288-3-heiko@sntech.de
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 2515d2b9ab ]
There are two registers filled in when reading data from
pcode besides the mailbox itself. Currently, we allow a NULL
value for the second of these two (data1) and assume the first
is defined. However, many of the routines that are calling
this function assume that pcode will ignore the value being
passed in and so leave that first value (data0) defined but
uninitialized. To be safe, make sure this value is always
initialized to something (0 generally) in the event pcode
behavior changes and starts using this value.
v2: Fix sob/author
Signed-off-by: Stuart Summers <stuart.summers@intel.com>
Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Link: https://lore.kernel.org/r/20250819201054.393220-1-stuart.summers@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit a7bd721580 ]
The NIX block receives traffic from multiple channels, including:
MAC block (RPM)
Loopback module (LBK)
CPT block
RPM
|
-----------------
LBK --| NIX |
-----------------
|
CPT
Due to a hardware errata, CN10k and earlier Octeon silicon series,
the hardware may incorrectly assert XOFF on certain channels during
reset. As a workaround, a write operation to the NIX_AF_RX_CHANX_CFG
register can be performed to broadcast XON signals on the affected
channels
Signed-off-by: Hariprasad Kelam <hkelam@marvell.com>
Link: https://patch.msgid.link/20250820064625.1464361-1-hkelam@marvell.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 2aec0b6a6b ]
Just add fixed struct size validations for UAC2 and UAC3 effect
units. The descriptor has a variable-length array, so it should be
validated with a proper function later once when the unit is really
parsed and used by the driver (currently only referred partially for
the input terminal parsing).
Link: https://patch.msgid.link/20250821151751.12100-1-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 6238784e50 ]
The error handling path in pci_p2pdma_add_resource() contains a bug in its
`pgmap_free` label.
Memory is allocated for the `p2p_pgmap` struct, and the pointer is stored
in `p2p_pgmap`. However, the error path calls devm_kfree() with `pgmap`,
which is a pointer to a member field within the `p2p_pgmap` struct, not the
base pointer of the allocation.
Correct the bug by passing the correct base pointer, `p2p_pgmap`, to
devm_kfree().
Signed-off-by: Sungho Kim <sungho.kim@furiosa.ai>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
Link: https://patch.msgid.link/20250820105714.2939896-1-sungho.kim@furiosa.ai
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 8fc6056dcf ]
As reported, on-disk footer.ino and footer.nid is the same and
out-of-range, let's add sanity check on f2fs_alloc_nid() to detect
any potential corruption in free_nid_list.
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 9d85c565a7 ]
Initially, trace_sock_exceed_buf_limit() was invoked when
__sk_mem_raise_allocated() failed due to the memcg limit or the
global limit.
However, commit d6f19938eb ("net: expose sk wmem in
sock_exceed_buf_limit tracepoint") somehow suppressed the event
only when memcg failed to charge for SK_MEM_RECV, although the
memcg failure for SK_MEM_SEND still triggers the event.
Let's restore the event for SK_MEM_RECV.
Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Shakeel Butt <shakeel.butt@linux.dev>
Link: https://patch.msgid.link/20250815201712.1745332-5-kuniyu@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit ee0aace5f8 ]
The stmmac_rx function would previously set skb->ip_summed to
CHECKSUM_UNNECESSARY if hardware checksum offload (CoE) was enabled
and the packet was of a known IP ethertype.
However, this logic failed to check if the hardware had actually
reported a checksum error. The hardware status, indicating a header or
payload checksum failure, was being ignored at this stage. This could
cause corrupt packets to be passed up the network stack as valid.
This patch corrects the logic by checking the `csum_none` status flag,
which is set when the hardware reports a checksum error. If this flag
is set, skb->ip_summed is now correctly set to CHECKSUM_NONE,
ensuring the kernel's network stack will perform its own validation and
properly handle the corrupt packet.
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Link: https://patch.msgid.link/20250818090217.2789521-2-o.rempel@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b0ac6d3b56 ]
When removing a nexthop, commit
90f33bffa3 ("nexthops: don't modify published nexthop groups") added a
call to synchronize_rcu() (later changed to _net()) to make sure
everyone sees the new nexthop-group before the rtnl-lock is released.
When one wants to delete a large number of groups and nexthops, it is
fastest to first flush the groups (ip nexthop flush groups) and then
flush the nexthops themselves (ip -6 nexthop flush). As that way the
groups don't need to be rebalanced.
However, `ip -6 nexthop flush` will still take a long time if there is
a very large number of nexthops because of the call to
synchronize_net(). Now, if there are no more groups, there is no point
in calling synchronize_net(). So, let's skip that entirely by checking
if nh->grp_list is empty.
This gives us a nice speedup:
BEFORE:
=======
$ time sudo ip -6 nexthop flush
Dump was interrupted and may be inconsistent.
Flushed 2097152 nexthops
real 1m45.345s
user 0m0.001s
sys 0m0.005s
$ time sudo ip -6 nexthop flush
Dump was interrupted and may be inconsistent.
Flushed 4194304 nexthops
real 3m10.430s
user 0m0.002s
sys 0m0.004s
AFTER:
======
$ time sudo ip -6 nexthop flush
Dump was interrupted and may be inconsistent.
Flushed 2097152 nexthops
real 0m17.545s
user 0m0.003s
sys 0m0.003s
$ time sudo ip -6 nexthop flush
Dump was interrupted and may be inconsistent.
Flushed 4194304 nexthops
real 0m35.823s
user 0m0.002s
sys 0m0.004s
Signed-off-by: Christoph Paasch <cpaasch@openai.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://patch.msgid.link/20250816-nexthop_dump-v2-2-491da3462118@openai.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 52e2bb5ff0 ]
For miscdevice who wants dynamic minor, it may fail to be registered again
without reinitialization after being de-registered, which is illustrated
by kunit test case miscdev_test_dynamic_reentry() newly added.
There is a real case found by cascardo when a part of minor range were
contained by range [0, 255):
1) wmi/dell-smbios registered minor 122, and acpi_thermal_rel registered
minor 123
2) unbind "int3400 thermal" driver from its device, this will de-register
acpi_thermal_rel
3) rmmod then insmod dell_smbios again, now wmi/dell-smbios is using minor
123
4) bind the device to "int3400 thermal" driver again, acpi_thermal_rel
fails to register.
Some drivers may reuse the miscdevice structure after they are deregistered
If the intention is to allocate a dynamic minor, if the minor number is not
reset to MISC_DYNAMIC_MINOR before calling misc_register(), it will try to
register a previously dynamically allocated minor number, which may have
been registered by a different driver.
One such case is the acpi_thermal_rel misc device, registered by the
int3400 thermal driver. If the device is unbound from the driver and later
bound, if there was another dynamic misc device registered in between, it
would fail to register the acpi_thermal_rel misc device. Other drivers
behave similarly.
Actually, this kind of issue is prone to happen if APIs
misc_register()/misc_deregister() are invoked by driver's
probe()/remove() separately.
Instead of fixing all the drivers, just reset the minor member to
MISC_DYNAMIC_MINOR in misc_deregister() in case it was a dynamically
allocated minor number, as error handling of misc_register() does.
Cc: Thadeu Lima de Souza Cascardo <cascardo@igalia.com>
Signed-off-by: Zijun Hu <zijun.hu@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20250714-rfc_miscdev-v6-5-2ed949665bde@oss.qualcomm.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 499cbe0f2f ]
Mark dm error as DM_TARGET_PASSES_INTEGRITY so that it can be stacked on
top of PI capable devices. The claim is strictly speaking as lie as dm
error fails all I/O and doesn't pass anything on, but doing the same for
integrity I/O work just fine :)
This helps to make about two dozen xfstests test cases pass on PI capable
devices.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 956606bafb ]
This fix is already present in f_ecm.c and was never
propagated to f_ncm.c
When creating multiple NCM ethernet devices
on a composite usb gadget device
each MAC address on the HOST side will be identical.
Having the same MAC on different network interfaces is bad.
This fix updates the MAC address inside the
ncm_strings_defs global during the ncm_bind call.
This ensures each device has a unique MAC.
In f_ecm.c ecm_string_defs is updated in the same way.
The defunct MAC assignment in ncm_alloc has been removed.
Signed-off-by: raub camaioni <raubcameo@gmail.com>
Link: https://lore.kernel.org/r/20250815131358.1047525-1-raubcameo@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 12c9b09e98 ]
ADC calibration might fail because of the noise on reference voltage.
To avoid calibration fail, need to meet the following requirement:
ADC reference voltage Noise < 1.8V * 1/2^ENOB
For the case which the ADC reference voltage on board do not meet
the requirement, still load the calibrated values, so ADC can also
work but maybe not that accurate.
Signed-off-by: Haibo Chen <haibo.chen@nxp.com>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Reviewed-by: Primoz Fiser <primoz.fiser@norik.com>
Link: https://patch.msgid.link/20250812-adc-v2-2-0260833f13b8@nxp.com
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit d2fa0ec6e0 ]
When a poison is consumed on the guest before the guest receives the host's poison creation msg, a corner case may occur to have poison_handler complete processing earlier than it should to cause the guest to hang waiting for the req_bad_pages reply during a VF FLR, resulting in the VM becoming inaccessible in stress tests.
To fix this issue, this patch refactored the mailbox sequence by seperating the bad_page_work into two parts req_bad_pages_work and handle_bad_pages_work.
Old sequence:
1.Stop data exchange work
2.Guest sends MB_REQ_RAS_BAD_PAGES to host and keep polling for IDH_RAS_BAD_PAGES_READY
3.If the IDH_RAS_BAD_PAGES_READY arrives within timeout limit, re-init the data exchange region for updated bad page info
else timeout with error message
New sequence:
req_bad_pages_work:
1.Stop data exhange work
2.Guest sends MB_REQ_RAS_BAD_PAGES to host
Once Guest receives IDH_RAS_BAD_PAGES_READY event
handle_bad_pages_work:
3.re-init the data exchange region for updated bad page info
Signed-off-by: Chenglei Xie <Chenglei.Xie@amd.com>
Reviewed-by: Shravan Kumar Gande <Shravankumar.Gande@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 0ed704d058 ]
HMM assumes that pages have READ permissions by default. Inside
svm_range_validate_and_map, we add READ permissions then add WRITE
permissions if the VMA isn't read-only. This will conflict with regions
that only have PROT_WRITE or have PROT_NONE. When that happens,
svm_range_restore_work will continue to retry, silently, giving the
impression of a hang if pr_debug isn't enabled to show the retries..
If pages don't have READ permissions, simply unmap them and continue. If
they weren't mapped in the first place, this would be a no-op. Since x86
doesn't support write-only, and PROT_NONE doesn't allow reads or writes
anyways, this will allow the svm range validation to continue without
getting stuck in a loop forever on mappings we can't use with HMM.
Signed-off-by: Kent Russell <kent.russell@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 859958a7fa ]
If a amdgpu_bo_va is fpriv->prt_va, the bo of this one is always NULL.
So, such kind of amdgpu_bo_va should be updated separately before
amdgpu_vm_handle_moved.
Signed-off-by: Heng Zhou <Heng.Zhou@amd.com>
Reviewed-by: Kasiviswanathan, Harish <Harish.Kasiviswanathan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit cb640b2ca5 ]
Detecting the monitor for DisplayPort targets is more complicated than
just reading the HPD pin level: it requires reading the DPCD in order to
check what kind of device is attached to the port and whether there is
an actual display attached.
In order to let DRM framework handle such configurations, disable
DRM_BRIDGE_OP_DETECT for dp-connector devices, letting the actual DP
driver perform detection. This still keeps DRM_BRIDGE_OP_HPD enabled, so
it is valid for the bridge to report HPD events.
Currently inside the kernel there are only two targets which list
hpd-gpios for dp-connector devices: arm64/qcom/qcs6490-rb3gen2 and
arm64/qcom/sa8295p-adp. Both should be fine with this change.
Cc: Bjorn Andersson <andersson@kernel.org>
Cc: Konrad Dybcio <konradybcio@kernel.org>
Cc: linux-arm-msm@vger.kernel.org
Acked-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
Link: https://lore.kernel.org/r/20250802-dp-conn-no-detect-v1-1-2748c2b946da@oss.qualcomm.com
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit c2dc9f0b36 ]
Fixes force feedback for devices built with MMOS firmware and many more
not yet detected devices.
Update quirks mask debug message to always contain all 32 bits of data.
Signed-off-by: Tomasz Pakuła <tomasz.pakula.oficjalny@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f345a4798d ]
The already fixed bug in SDL only affected conditional effects. This
should fix FFB in Forza Horizion 4/5 on Moza Devices as Forza Horizon
flips the constant force direction instead of using negative magnitude
values.
Changing the direction in the effect directly in pidff_upload_effect()
would affect it's value in further operations like comparing to the old
effect and/or just reading the effect values in the user application.
This, in turn, would lead to constant PID_SET_EFFECT spam as the effect
direction would constantly not match the value that's set by the
application.
This way, it's still transparent to any software/API.
Only affects conditional effects now so it's better for it to explicitly
state that in the name. If any HW ever needs fixed direction for other
effects, we'll add more quirks.
Signed-off-by: Tomasz Pakuła <tomasz.pakula.oficjalny@gmail.com>
Reviewed-by: Oleg Makarenko <oleg@makarenk.ooo>
Signed-off-by: Jiri Kosina <jkosina@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit eecd203ada ]
syzbot is reporting that imon has three problems which result in
hung tasks due to forever holding device lock [1].
First problem is that when usb_rx_callback_intf0() once got -EPROTO error
after ictx->dev_present_intf0 became true, usb_rx_callback_intf0()
resubmits urb after printk(), and resubmitted urb causes
usb_rx_callback_intf0() to again get -EPROTO error. This results in
printk() flooding (RCU stalls).
Alan Stern commented [2] that
In theory it's okay to resubmit _if_ the driver has a robust
error-recovery scheme (such as giving up after some fixed limit on the
number of errors or after some fixed time has elapsed, perhaps with a
time delay to prevent a flood of errors). Most drivers don't bother to
do this; they simply give up right away. This makes them more
vulnerable to short-term noise interference during USB transfers, but in
reality such interference is quite rare. There's nothing really wrong
with giving up right away.
but imon has a poor error-recovery scheme which just retries forever;
this behavior should be fixed.
Since I'm not sure whether it is safe for imon users to give up upon any
error code, this patch takes care of only union of error codes chosen from
modules in drivers/media/rc/ directory which handle -EPROTO error (i.e.
ir_toy, mceusb and igorplugusb).
Second problem is that when usb_rx_callback_intf0() once got -EPROTO error
before ictx->dev_present_intf0 becomes true, usb_rx_callback_intf0() always
resubmits urb due to commit 8791d63af0 ("[media] imon: don't wedge
hardware after early callbacks"). Move the ictx->dev_present_intf0 test
introduced by commit 6f6b90c923 ("[media] imon: don't parse scancodes
until intf configured") to immediately before imon_incoming_packet(), or
the first problem explained above happens without printk() flooding (i.e.
hung task).
Third problem is that when usb_rx_callback_intf0() is not called for some
reason (e.g. flaky hardware; the reproducer for this problem sometimes
prevents usb_rx_callback_intf0() from being called),
wait_for_completion_interruptible() in send_packet() never returns (i.e.
hung task). As a workaround for such situation, change send_packet() to
wait for completion with timeout of 10 seconds.
Link: https://syzkaller.appspot.com/bug?extid=592e2ab8775dbe0bf09a [1]
Link: https://lkml.kernel.org/r/d6da6709-d799-4be3-a695-850bddd6eb24@rowland.harvard.edu [2]
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Sean Young <sean@mess.org>
Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 2327a3d6f6 ]
Fix field-spanning memcpy warnings in ah6_output() and
ah6_output_done() where extension headers are copied to/from IPv6
address fields, triggering fortify-string warnings about writes beyond
the 16-byte address fields.
memcpy: detected field-spanning write (size 40) of single field "&top_iph->saddr" at net/ipv6/ah6.c:439 (size 16)
WARNING: CPU: 0 PID: 8838 at net/ipv6/ah6.c:439 ah6_output+0xe7e/0x14e0 net/ipv6/ah6.c:439
The warnings are false positives as the extension headers are
intentionally placed after the IPv6 header in memory. Fix by properly
copying addresses and extension headers separately, and introduce
helper functions to avoid code duplication.
Reported-by: syzbot+01b0667934cdceb4451c@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=01b0667934cdceb4451c
Signed-off-by: Charalampos Mitrodimas <charmitro@posteo.net>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit c5aeb264b6 ]
`offset` is a common field name, yet using it triggers a build error due
to the conflict between the uppercased field constant (which becomes
`OFFSET` in this case) containing the bitrange of the field, and the
`OFFSET` constant constaining the offset of the register.
Fix this by adding `_RANGE` the field's range constant to avoid the
name collision.
[acourbot@nvidia.com: fix merge conflict due to switch from `as u32` to
`u32::from`.]
Reported-by: Timur Tabi <ttabi@nvidia.com>
Reviewed-by: Daniel Almeida <daniel.almeida@collabora.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
Link: https://lore.kernel.org/r/20250718-nova-regs-v2-3-7b6a762aa1cd@nvidia.com
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit aa86602a48 ]
Move the configuration of the Auto-Hibern8 (AHIT) timer from the
post-link stage to the 'fixup_dev_quirks' function. This change allows
setting the AHIT based on the vendor requirements:
(a) Samsung: 3.5 ms
(b) Micron: 2 ms
(c) Others: 1 ms
Additionally, the clock gating timer is adjusted based on the AHIT
scale, with a maximum setting of 10 ms. This ensures that the clock
gating delay is appropriately configured to match the AHIT settings.
Signed-off-by: Peter Wang <peter.wang@mediatek.com>
Link: https://lore.kernel.org/r/20250811131423.3444014-3-peter.wang@mediatek.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit df979273bd ]
The following Vitesse/Microsemi/Microchip PHYs, among those supported by
this driver, have the host interface configurable as SGMII or QSGMII:
- VSC8504
- VSC8514
- VSC8552
- VSC8562
- VSC8572
- VSC8574
- VSC8575
- VSC8582
- VSC8584
All these PHYs are documented to have bit 7 of "MAC SerDes PCS Control"
as "MAC SerDes ANEG enable".
Out of these, I could test the VSC8514 quad PHY in QSGMII. This works
both with the in-band autoneg on and off, on the NXP LS1028A-RDB and
T1040-RDB boards.
Notably, the bit is sticky (survives soft resets), so giving Linux the
tools to read and modify this settings makes it robust to changes made
to it by previous boot layers (U-Boot).
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/20250813074454.63224-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f09fc24dd9 ]
On fast machines the tests run in quick succession so even
when tests clean up after themselves the carrier may need
some time to come back.
Specifically in NIPA when ping.py runs right after netpoll_basic.py
the first ping command fails.
Since the context manager callbacks are now common NetDrvEpEnv
gets an ip link up call as well.
Reviewed-by: Joe Damato <joe@dama.to>
Link: https://patch.msgid.link/20250812142054.750282-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 3d05b24429 ]
If a backup port is configured for a bridge port, the bridge will
redirect known unicast traffic towards the backup port when the primary
port is administratively up but without a carrier. This is useful, for
example, in MLAG configurations where a system is connected to two
switches and there is a peer link between both switches. The peer link
serves as the backup port in case one of the switches loses its
connection to the multi-homed system.
In order to avoid flooding when the primary port loses its carrier, the
bridge does not flush dynamic FDB entries pointing to the port upon STP
disablement, if the port has a backup port.
The above means that known unicast traffic destined to the primary port
will be blackholed when the port is put administratively down, until the
FDB entries pointing to it are aged-out.
Given that the current behavior is quite weird and unlikely to be
depended on by anyone, amend the bridge to redirect to the backup port
also when the primary port is administratively down and not only when it
does not have a carrier.
The change is motivated by a report from a user who expected traffic to
be redirected to the backup port when the primary port was put
administratively down while debugging a network issue.
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Link: https://patch.msgid.link/20250812080213.325298-2-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 5d03847175 ]
The thunderbolt driver sets up device link dependencies from hotplug ports
to the Host Router (aka Native Host Interface, NHI). When resuming from
system sleep, this allows the Host Router to re-establish tunnels to
attached Thunderbolt devices before the hotplug ports resume.
To identify the hotplug ports, the driver utilizes the is_hotplug_bridge
flag which also encompasses ACPI slots handled by the ACPI hotplug driver.
Thunderbolt hotplug ports are always Hot-Plug Capable PCIe ports, so it is
more apt to identify them with the is_pciehp flag.
Similarly, hotplug ports on older Thunderbolt controllers have broken MSI
support and are quirked to use legacy INTx interrupts instead. The quirk
identifies them with is_hotplug_bridge, even though all affected ports are
also matched by is_pciehp. So use is_pciehp here as well.
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 45bc82563d ]
After a Fatal Error has been reported by a device and has been recovered
through a Secondary Bus Reset, AER updates the device's error_state to
pci_channel_io_normal before invoking its driver's ->resume() callback.
By contrast, EEH updates the error_state earlier, namely after resetting
the device and before invoking its driver's ->slot_reset() callback.
Commit c58dc575f3 ("powerpc/pseries: Set error_state to
pci_channel_io_normal in eeh_report_reset()") explains in great detail
that the earlier invocation is necessitated by various drivers checking
accessibility of the device with pci_channel_offline() and avoiding
accesses if it returns true. It returns true for any other error_state
than pci_channel_io_normal.
The device should be accessible already after reset, hence the reasoning
is that it's safe to update the error_state immediately afterwards.
This deviation between AER and EEH seems problematic because drivers
behave differently depending on which error recovery mechanism the
platform uses. Three drivers have gone so far as to update the
error_state themselves, presumably to work around AER's behavior.
For consistency, amend AER to update the error_state at the same recovery
steps as EEH. Drop the now unnecessary workaround from the three drivers.
Keep updating the error_state before ->resume() in case ->error_detected()
or ->mmio_enabled() return PCI_ERS_RESULT_RECOVERED, which causes
->slot_reset() to be skipped. There are drivers doing this even for Fatal
Errors, e.g. mhi_pci_error_detected().
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://patch.msgid.link/4517af6359ffb9d66152b827a5d2833459144e3f.1755008151.git.lukas@wunner.de
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 2d240b124c ]
Both ACPI and DT-based systems are required to obtain the external
camera sensor clock using the new devm_v4l2_sensor_clk_get() helper
function.
Ensure a dependency on HAVE_CLK when config VIDEO_CAMERA_SENSOR is
enabled.
Signed-off-by: Mehdi Djait <mehdi.djait@linux.intel.com>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 44d69d3cf2 ]
Drain send WRs of the GSI QP on device removal.
In rare servicing scenarios, the hardware may delete the
state of the GSI QP, preventing it from generating CQEs
for pending send WRs. Since WRs submitted to the GSI QP
hold CM resources, the device cannot be removed until
those WRs are completed. This patch marks all pending
send WRs as failed, allowing the GSI QP to release the CM
resources and enabling safe device removal.
Signed-off-by: Konstantin Taranov <kotaranov@microsoft.com>
Link: https://patch.msgid.link/1753779618-23629-1-git-send-email-kotaranov@linux.microsoft.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit eea4f89b64 ]
The driver tries to calculate the value for REG_WAKEUP_TIME. However,
the calculation itself is not correct, and to add on it, the resulting
value is almost always larger than the field's size, so the actual
result is more or less random.
According to the docs, figuring out the value for REG_WAKEUP_TIME
requires HW characterization and there's no way to have a generic
algorithm to come up with the value. That doesn't help at all...
However, we know that the value must be smaller than the line time, and,
at least in my understanding, the proper value for it is quite small.
Testing shows that setting it to 1/10 of the line time seems to work
well. All video modes from my HDMI monitor work with this algorithm.
Hopefully we'll get more information on how to calculate the value, and
we can then update this.
Tested-by: Parth Pancholi <parth.pancholi@toradex.com>
Tested-by: Jayesh Choudhary <j-choudhary@ti.com>
Reviewed-by: Devarsh Thakkar <devarsht@ti.com>
Link: https://lore.kernel.org/r/20250723-cdns-dsi-impro-v5-11-e61cc06074c2@ideasonboard.com
Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 19fb9c5b81 ]
The v4l2_fh initialized and added in vpu_v4l2_open() is delete and
cleaned up when the last reference to the vpu_inst is released. This may
happen later than at vpu_v4l2_close() time.
Not deleting and cleaning up the v4l2_fh when closing the file handle to
the video device is not ideal, as the v4l2_fh will still be present in
the video device's fh_list, and will store a copy of events queued to
the video device. There may also be other side effects of keeping alive
an object that represents an open file handle after the file handle is
closed.
The v4l2_fh instance is embedded in the vpu_inst structure, and is
accessed in two different ways:
- in vpu_notify_eos() and vpu_notify_source_change(), to queue V4L2
events to the file handle ; and
- through the driver to access the v4l2_fh.m2m_ctx pointer.
The v4l2_fh.m2m_ctx pointer is not touched by v4l2_fh_del() and
v4l2_fh_exit(). It is set to NULL by the driver when closing the file
handle, in vpu_v4l2_close().
The vpu_notify_eos() and vpu_notify_source_change() functions are called
in vpu_set_last_buffer_dequeued() and vdec_handle_resolution_change()
respectively, only if the v4l2_fh.m2m_ctx pointer is not NULL. There is
therefore a guarantee that no new event will be queued to the v4l2_fh
after vpu_v4l2_close() destroys the m2m_ctx.
The vpu_notify_eos() function is also called from vpu_vb2_buf_finish(),
which is guaranteed to be called for all queued buffers when
vpu_v4l2_close() calls v4l2_m2m_ctx_release(), and will not be called
later.
It is therefore safe to assume that the driver will not touch the
v4l2_fh, except to check the m2m_ctx pointer, after vpu_v4l2_close()
destroys the m2m_ctx. We can safely delete and cleanup the v4l2_fh
synchronously in vpu_v4l2_close().
Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
Reviewed-by: Ming Qian <ming.qian@oss.nxp.com>
Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit cc6e8d1cce ]
The ivtv driver has a structure named ivtv_open_id that models an open
file handle for the device. It embeds a v4l2_fh instance for file
handles that correspond to a V4L2 video device, and stores a pointer to
that v4l2_fh in struct ivtv_stream to identify which open file handle
owns a particular stream.
In addition to video devices, streams can be owned by ALSA PCM devices.
Those devices do not make use of the v4l2_fh instance for obvious
reasons, but the snd_ivtv_pcm_capture_open() function still initializes
a "fake" v4l2_fh for the sole purpose of using it as an open file handle
identifier. The v4l2_fh is not properly destroyed when the ALSA PCM
device is closed, leading to possible resource leaks.
Fortunately, the v4l2_fh instance pointed to by ivtv_stream is not
accessed, only the pointer value is used for comparison. Replace it with
a pointer to the ivtv_open_id structure that embeds the v4l2_fh, and
don't initialize the v4l2_fh for ALSA PCM devices.
Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit bc4c0a48bd ]
The get_next_frame() function in psock_tpacket.c was missing a return
statement in its default switch case, leading to a compiler warning.
This was caused by a `bug_on(1)` call, which is defined as an
`assert()`, being compiled out because NDEBUG is defined during the
build.
Instead of adding a `return NULL;` which would silently hide the error
and could lead to crashes later, this change restores the original
author's intent. By adding `#undef NDEBUG` before including <assert.h>,
we ensure the assertion is active and will cause the test to abort if
this unreachable code is ever executed.
Signed-off-by: Wake Liu <wakel@google.com>
Link: https://patch.msgid.link/20250809062013.2407822-1-wakel@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit c36748e873 ]
The `__WORDSIZE` macro, defined in the non-standard `<bits/wordsize.h>`
header, is a GNU extension and not universally available with all
toolchains, such as Clang when used with musl libc.
This can lead to build failures in environments where this header is
missing.
The intention of the code is to determine the bit width of a C `long`.
Replace the non-portable `__WORDSIZE` with the standard and portable
`sizeof(long) * 8` expression to achieve the same result.
This change also removes the inclusion of the now-unused
`<bits/wordsize.h>` header.
Signed-off-by: Wake Liu <wakel@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 27738c3003 ]
Always set the RMDevidCheckIgnore registry key for GSP-RM so that it
will continue support newer variants of already supported GPUs.
GSP-RM maintains an internal list of PCI IDs of GPUs that it supports,
and checks if the current GPU is on this list. While the actual GPU
architecture (as specified in the BOOT_0/BOOT_42 registers) determines
how to enable the GPU, the PCI ID is used for the product name, e.g.
"NVIDIA GeForce RTX 5090".
Unfortunately, if there is no match, GSP-RM will refuse to initialize,
even if the device is fully supported. Nouveau will get an error
return code, but by then it's too late. This behavior may be corrected
in a future version of GSP-RM, but that does not help Nouveau today.
Fortunately, GSP-RM supports an undocumented registry key that tells it
to ignore the mismatch. In such cases, the product name returned will
be a blank string, but otherwise GSP-RM will continue.
Unlike Nvidia's proprietary driver, Nouveau cannot update to newer
firmware versions to keep up with every new hardware release. Instead,
we can permanently set this registry key, and GSP-RM will continue
to function the same with known hardware.
Signed-off-by: Timur Tabi <ttabi@nvidia.com>
Link: https://lore.kernel.org/r/20250808191340.1701983-1-ttabi@nvidia.com
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit ebc7086b39 ]
RDC PCI to PCIe bridges, present on Vortex86DX3 and Vortex86EX2 SoCs, do
not support MSIs. If enabled, interrupts generated by PCIe devices never
reach the processor.
I have contacted the manufacturer (DM&P) and they confirmed that PCI MSIs
need to be disabled for them.
Signed-off-by: Marcos Del Sol Vives <marcos@orca.pet>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://patch.msgid.link/20250705233209.721507-1-marcos@orca.pet
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 136c374d8c ]
Use DRM's shadow-plane helper to map and access the GEM object's buffer
within kernel address space. Encapsulates the vmap logic in the GEM-DMA
helpers.
The sharp-memory driver currently reads the vaddr field from the GME
buffer object directly. This only works because GEM code 'automagically'
sets vaddr.
Shadow-plane helpers perform the same steps, but with correct abstraction
behind drm_gem_vmap(). The shadow-plane state provides the buffer address
in kernel address space and the format-conversion state.
v2:
- fix typo in commit description
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Javier Martinez Canillas <javierm@redhat.com>
Link: https://lore.kernel.org/r/20250627152327.8244-1-tzimmermann@suse.de
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit fc973dcd73 ]
While we do need at least 3.6 for kernel-doc to work, and at least
3.7 for it to output functions and structs with parameters at the
right order, let the python binary be compatible with legacy
versions.
The rationale is that the Kernel build nowadays calls kernel-doc
with -none on some places. Better not to bail out when older
versions are found.
With that, potentially this will run with python 2.7 and 3.2+,
according with vermin:
$ vermin --no-tips -v ./scripts/kernel-doc
Detecting python files..
Analyzing using 24 processes..
2.7, 3.2 /new_devel/v4l/docs/scripts/kernel-doc
Minimum required versions: 2.7, 3.2
3.2 minimal requirement is due to argparse.
The minimal version I could check was version 3.4
(using anaconda). Anaconda doesn't support 3.2 or 3.3
anymore, and 3.2 doesn't even compile (I tested compiling
Python 3.2 on Fedora 42 and on Fedora 32 - no show).
With 3.4, the script didn't crash and emitted the right warning:
$ conda create -n py34 python=3.4
$ conda activate py34
python --version
Python 3.4.5
$ python ./scripts/kernel-doc --none include/media
Error: Python 3.6 or later is required by kernel-doc
$ conda deactivate
$ python --version
Python 3.13.5
$ python ./scripts/kernel-doc --none include/media
(no warnings and script ran properly)
Supporting 2.7 is out of scope, as it is EOL for 5 years, and
changing shebang to point to "python" instead of "python3"
would have a wider impact.
I did some extra checks about the differences from 3.2 and
3.4, and didn't find anything that would cause troubles:
grep -rE "yield from|asyncio|pathlib|async|await|enum" scripts/kernel-doc
Also, it doesn't use "@" operator. So, I'm confident that it
should run (producing the exit warning) since Python 3.2.
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Link: https://lore.kernel.org/r/87d55e76b0b1391cb7a83e3e965dbddb83fa9786.1753806485.git.mchehab+huawei@kernel.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 17593a69b7 ]
For non-leaf paging structures we end up selecting a random index
between [0, 3], depending on the first user if the page-table is shared,
since non-leaf structures only have two bits in the HW for encoding the
PAT index, and here we are just passing along the full user provided
index, which can be an index as large as ~31 on xe2+. The user provided
index is meant for the leaf node, which maps the actual BO pages where
we have more PAT bits, and not the non-leaf nodes which are only mapping
other paging structures, and so only needs a minimal PAT index range.
Also the chosen index might need to consider how the driver mapped the
paging structures on the host side, like wc vs wb, which is separate
from the user provided index.
With that move the PDE PAT index selection under driver control. For now
just use a coherent index on platforms with page-tables that are cached
on host side, and incoherent otherwise. Using a coherent index could
potentially be expensive, and would be overkill if we know the page-table
is always uncached on host side.
v2 (Stuart):
- Add some documentation and split into separate helper.
BSpec: 59510
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Stuart Summers <stuart.summers@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Stuart Summers <stuart.summers@intel.com>
Link: https://lore.kernel.org/r/20250808103455.462424-2-matthew.auld@intel.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit e7496c15d8 ]
[Why]
Customer reported an issue that OS starts and stops device multiple times
during driver installation. Frequently disabling and enabling OTG may
prevent OTG from being safely disabled and cause incorrect configuration
upon the next enablement.
[How]
Add a wait until OTG_CURRENT_MASTER_EN_STATE is cleared as a short term
solution.
Reviewed-by: Dillon Varone <dillon.varone@amd.com>
Signed-off-by: TungYu Lu <tungyu.lu@amd.com>
Signed-off-by: Tom Chung <chiahsuan.chung@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit ad335b5fc9 ]
[WHY&HOW]
The user closed the lid while the system was powering on and opened it
again before the “apply_seamless_boot_optimization” was set to false,
resulting in the eDP remaining blank.
Reset the “apply_seamless_boot_optimization” to false when dpms off.
Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Danny Wang <Danny.Wang@amd.com>
Signed-off-by: Tom Chung <chiahsuan.chung@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 6510b62fe9 ]
snprintf() returns the number of characters that *would* have been
written, which can overestimate how much you actually wrote to the
buffer in case of truncation. That leads to 'data += this' advancing
the pointer past the end of the buffer and size going negative.
Switching to scnprintf() prevents potential buffer overflows and ensures
consistent behavior when building the output string.
Signed-off-by: Seyediman Seyedarab <ImanDevel@gmail.com>
Link: https://lore.kernel.org/r/20250724195913.60742-1-ImanDevel@gmail.com
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 2f3b1ccf83 ]
Cached metrics data validity is 1ms on arcturus. It's not reasonable for
any client to query gpu_metrics at a faster rate and constantly
interrupt PMFW.
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Asad Kamal <asad.kamal@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit e87577ef6d ]
Cached metrics data validity is 1ms on aldebaran. It's not reasonable
for any client to query gpu_metrics at a faster rate and constantly
interrupt PMFW.
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Asad Kamal <asad.kamal@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 2e72fdba8a ]
[Why]
The reason some high-resolution monitors fail to display properly
is that this platform does not support sufficiently high DPP and
DISP clock frequencies
[How]
Update DISP and DPP clocks from the smu clock table then DML can
filter these mode if not support.
Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Paul Hsieh <Paul.Hsieh@amd.com>
Signed-off-by: Roman Li <roman.li@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit c210b757b4 ]
Accessing DC from amdgpu_dm is usually preceded by acquisition of
dc_lock mutex. Most of the DC API that DM calls are under a DC lock.
However, there are a few that are not. Some DC API called from interrupt
context end up sending DMUB commands via a DC API, while other threads were
using DMUB. This was apparent from a race between calls for setting idle
optimization enable/disable and the DC API to set vmin/vmax.
Offload the call to dc_stream_adjust_vmin_vmax() to a thread instead
of directly calling them from the interrupt handler such that it waits
for dc_lock.
Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Roman Li <roman.li@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit da46735229 ]
Move amdgpu_device_health_check into amdgpu_device_gpu_recover to
ensure that if the device is present can be checked before reset
The reason is:
1.During the dpc event, the device where the dpc event occurs is not
present on the bus
2.When both dpc event and ATHUB event occur simultaneously,the dpc thread
holds the reset domain lock when detecting error,and the gpu recover thread
acquires the hive lock.The device is simultaneously in the states of
amdgpu_ras_in_recovery and occurs_dpc,so gpu recover thread will not go to
amdgpu_device_health_check.It waits for the reset domain lock held by the
dpc thread, but dpc thread has not released the reset domain lock.In the dpc
callback slot_reset,to obtain the hive lock, the hive lock is held by the
gpu recover thread at this time.So a deadlock occurred
Signed-off-by: Ce Sun <cesun102@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 8e3967a71e ]
The variable `pm_suspend_target_state` is conditionally defined only when
`CONFIG_SUSPEND` is enabled (see `include/linux/suspend.h`). Directly
referencing it without guarding by `#ifdef CONFIG_SUSPEND` causes build
failures when suspend functionality is disabled (e.g., `CONFIG_SUSPEND=n`).
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Perry Yuan <perry.yuan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 1cda3c755b ]
I saw an oops in xe_gem_fault when running the xe-fast-feedback
testlist against the realtime kernel without debug options enabled.
The panic happens after core_hotunplug unbind-rebind finishes.
Presumably what happens is that a process mmaps, unlocks because
of the FAULT_FLAG_RETRY_NOWAIT logic, has no process memory left,
causing ttm_bo_vm_dummy_page() to return VM_FAULT_NOPAGE, since
there was nothing left to populate, and then oopses in
"mem_type_is_vram(tbo->resource->mem_type)" because tbo->resource
is NULL.
It's convoluted, but fits the data and explains the oops after
the test exits.
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://lore.kernel.org/r/20250715152057.23254-2-dev@lankhorst.se
Signed-off-by: Maarten Lankhorst <dev@lankhorst.se>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 45fbb51050 ]
The GuC load process will abort if certain status codes (which are
indicative of a fatal error) are reported. Otherwise, it keeps waiting
until the 'success' code is returned. New error codes have been added
in recent GuC releases, so add support for aborting on those as well.
v2: Shuffle HWCONFIG_START to the front of the switch to keep the
ordering as per the enum define for clarity (review feedback by
Jonathan). Also add a description for the basic 'invalid init data'
code which was missing.
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Stuart Summers <stuart.summers@intel.com>
Link: https://lore.kernel.org/r/20250726024337.4056272-1-John.C.Harrison@Intel.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f5b69101f9 ]
[WHY]
Last LT automation update can cause crash by referencing current_state and
calling into dc_update_planes_and_stream which may clobber current_state.
[HOW]
Cache relevant stream pointers and iterate through them instead of relying
on the current_state.
Reviewed-by: Wenjing Liu <wenjing.liu@amd.com>
Signed-off-by: Michael Strauss <michael.strauss@amd.com>
Signed-off-by: Ivan Lipski <ivan.lipski@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 327aba7f55 ]
[why & how]
Header misalignment in struct dmub_cmd_replay_copy_settings_data and
struct dmub_alpm_auxless_data causes incorrect data read between driver
and dmub.
Fix the misalignment and ensure that everything is aligned to 4-byte
boundaries.
Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Ovidiu Bunea <ovidiu.bunea@amd.com>
Signed-off-by: Ivan Lipski <ivan.lipski@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit e3419e1e44 ]
[WHY]
In the worst case, AUX intra-hop done can take hundreds of milliseconds as
each retimer in a link might have to wait a full AUX_RD_INTERVAL to send
LT abort downstream.
[HOW]
Wait 300ms for each retimer in a link to allow time to propagate a LT abort
without infinitely waiting on intra-hop done.
For no-retimer case, keep the max duration at 10ms.
Reviewed-by: Wenjing Liu <wenjing.liu@amd.com>
Signed-off-by: Michael Strauss <michael.strauss@amd.com>
Signed-off-by: Ivan Lipski <ivan.lipski@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 2681bf4ae8 ]
[WHY]
If symclk RCO is enabled, stream encoder may not be receiving an ungated
clock by the time we attempt to set stream attributes when setting dpms
on. Since the clock is gated, register writes to the stream encoder fail.
[HOW]
Move set_stream_attribute call into enable_stream, just after the point
where symclk32_se is ungated.
Logically there is no need to set stream attributes as early as is
currently done in link_set_dpms_on, so this should have no impact beyond
the RCO fix.
Reviewed-by: Ovidiu (Ovi) Bunea <ovidiu.bunea@amd.com>
Signed-off-by: Michael Strauss <michael.strauss@amd.com>
Signed-off-by: Ivan Lipski <ivan.lipski@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit ca74cc428f ]
[Why]
When transitioning between topologies such as multi-display to single
display ODM 2:1, pipes might not be freed before use.
[How]
In dc_commit_streams, commit an additional, minimal transition if
original transition is not seamless to ensure pipes are freed.
Reviewed-by: Alvin Lee <alvin.lee2@amd.com>
Signed-off-by: Clay King <clayking@amd.com>
Signed-off-by: Wayne Lin <wayne.lin@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 400a6da1e9 ]
While we expect config directory names to match PCI device name,
currently we are only scanning provided names for domain, bus,
device and function numbers, without checking their format.
This would pass slightly broken entries like:
/sys/kernel/config/xe/
├── 0000:00:02.0000000000000
│ └── ...
├── 0000:00:02.0x
│ └── ...
├── 0: 0: 2. 0
│ └── ...
└── 0:0:2.0
└── ...
To avoid such mistakes, check if the name provided exactly matches
the canonical PCI device address format, which we recreated from
the parsed BDF data. Also simplify scanf format as it can't really
catch all formatting errors.
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://lore.kernel.org/r/20250722141059.30707-3-michal.wajdeczko@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 62aec8a0a5 ]
As pm_runtime_force_suspend() will force the device state to suspend,
the driver needs to ensure no IRQ handlers are currently running. If not
those handlers may find they are now running on suspended hardware
despite holding a PM runtime reference. disable_irq() will sync any
currently running handlers, so move the IRQ disabling to cover the whole
of the forced suspend state to avoid such race conditions.
Signed-off-by: Charles Keepax <ckeepax@opensource.cirrus.com>
Link: https://lore.kernel.org/r/20250903094549.271068-6-ckeepax@opensource.cirrus.com
Signed-off-by: Lee Jones <lee@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 5f4bbee069 ]
When an MFD device is added, a platform_device is allocated. If this
device is linked to a DT description, the corresponding OF node is linked
to the new platform device but the OF node's refcount isn't incremented.
As of_node_put() is called during the platform device release, it leads
to a refcount underflow.
Call of_node_get() to increment the OF node's refcount when the node is
linked to the newly created platform device.
Signed-off-by: Bastien Curutchet <bastien.curutchet@bootlin.com>
Link: https://lore.kernel.org/r/20250820-mfd-refcount-v1-1-6dcb5eb41756@bootlin.com
Signed-off-by: Lee Jones <lee@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 9ac4890ac3 ]
We observed the initial probe of the da9063 failing in
da9063_get_device_type in about 30% of boots on a Xilinx ZynqMP based
board. The problem originates in da9063_i2c_blockreg_read, which uses
a single bus transaction to turn the register page and then read a
register. On the bus, this should translate to a write to register 0,
followed by a read to the target register, separated by a repeated
start. However, we found that after the write to register 0, the
controller sometimes continues directly with the register address of
the read request, without sending the chip address or a repeated start
in between, which makes the read request invalid.
To fix this, separate turning the page and reading the register into
two separate transactions. This brings the initialization code in line
with the rest of the driver, which uses register maps (which to my
knowledge do not use repeated starts after turning the page). This has
been included in our kernel for several months and was recently
included in a shipped product. For us, it reliably fixes the issue,
and we have not observed any new issues.
While the underlying problem is probably with the i2c controller or
its driver, I still propose a change here in the interest of
robustness: First, I'm not sure this issue can be fixed on the
controller side, since there are other issues related to repeated
start which can't (AR# 60695, AR# 61664). Second, similar problems
might exist with other controllers.
Signed-off-by: Jens Kehne <jens.kehne@agilent.com>
Link: https://lore.kernel.org/r/20250804133754.3496718-1-jens.kehne@agilent.com
Signed-off-by: Lee Jones <lee@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 364752aa0c ]
clang-21 warns about one uninitialized variable getting dereferenced
in madera_dev_init:
drivers/mfd/madera-core.c:739:10: error: variable 'mfd_devs' is uninitialized when used here [-Werror,-Wuninitialized]
739 | mfd_devs, n_devs,
| ^~~~~~~~
drivers/mfd/madera-core.c:459:33: note: initialize the variable 'mfd_devs' to silence this warning
459 | const struct mfd_cell *mfd_devs;
| ^
| = NULL
The code is actually correct here because n_devs is only nonzero
when mfd_devs is a valid pointer, but this is impossible for the
compiler to see reliably.
Change the logic to check for the pointer as well, to make this easier
for the compiler to follow.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Richard Fitzgerald <rf@opensource.cirrus.com>
Link: https://lore.kernel.org/r/20250807071932.4085458-1-arnd@kernel.org
Signed-off-by: Lee Jones <lee@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 5e1c886791 ]
Relying on other components to include those basic types is unreliable
and may cause compile errors like:
../include/linux/mfd/qnap-mcu.h:13:9: error: unknown type name ‘u32’
13 | u32 baud_rate;
| ^~~
../include/linux/mfd/qnap-mcu.h:17:9: error: unknown type name ‘bool’
17 | bool usb_led;
| ^~~~
So make sure, the types used in the header are available.
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Link: https://lore.kernel.org/r/20250804130726.3180806-2-heiko@sntech.de
Signed-off-by: Lee Jones <lee@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 81a2c31257 ]
The QIXIS FPGA found on Layerscape boards such as LX2160AQDS, LS1028AQDS
etc deals with power-on-reset timing, muxing etc. Use the simple-mfd-i2c
as its core driver by adding its compatible string (already found in
some dt files). By using the simple-mfd-i2c driver, any child device
will have access to the i2c regmap created by it.
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Link: https://lore.kernel.org/r/20250707153120.1371719-1-ioana.ciornei@nxp.com
Signed-off-by: Lee Jones <lee@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 2734fdbc9b ]
When we are successful in using cpufreq min/max limits,
skip setting the raw MSR limits entirely.
This is necessary to avoid undoing any modification that
the cpufreq driver makes to our sysfs request.
eg. intel_pstate may take our request for a limit
that is valid according to HWP.CAP.MIN/MAX and clip
it to be within the range available in PLATFORM_INFO.
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit c97c057d35 ]
On enabling HWP, preserve the reserved bits in MSR_PM_ENABLE.
Also, skip writing the MSR_PM_ENABLE if HWP is already enabled.
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 62127655b7 ]
The fopen_or_die() function was previously hardcoded
to open files in read-only mode ("r"), ignoring the
mode parameter passed to it. This patch corrects
fopen_or_die() to use the provided mode argument,
allowing for flexible file access as intended.
Additionally, the call to fopen_or_die() in
err_on_hypervisor() incorrectly used the mode
"ro", which is not a valid fopen mode. This is
fixed to use the correct "r" mode.
Signed-off-by: Kaushlendra Kumar <kaushlendra.kumar@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit cafb47be3f ]
The pmt_telemdir_sort() comparison function was returning a boolean
value (0 or 1) instead of the required negative, zero, or positive
value for proper sorting. This caused unpredictable and incorrect
ordering of telemetry directories named telem0, telem1, ..., telemN.
Update the comparison logic to return -1, 0, or 1 based on the
numerical value extracted from the directory name, ensuring correct
numerical ordering when using scandir.
This change improves stability and correctness when iterating PMT
telemetry directories.
Signed-off-by: Kaushlendra Kumar <kaushlendra.kumar@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 105eb5dc74 ]
bpf_cookie can fail on perf_event_open(), when it runs after the task_work
selftest. The task_work test causes perf to lower
sysctl_perf_event_sample_rate, and bpf_cookie uses sample_freq,
which is validated against that sysctl. As a result,
perf_event_open() rejects the attr if the (now tighter) limit is
exceeded.
>From perf_event_open():
if (attr.freq) {
if (attr.sample_freq > sysctl_perf_event_sample_rate)
return -EINVAL;
} else {
if (attr.sample_period & (1ULL << 63))
return -EINVAL;
}
Switch bpf_cookie to use sample_period, which is not checked against
sysctl_perf_event_sample_rate.
Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20250925215230.265501-1-mykyta.yatsenko5@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 23199d2aa6 ]
Fix incorrect size parameter passed to cpuidle_state_write_file() in
cpuidle_state_disable().
The function was incorrectly using sizeof(disable) which returns the
size of the unsigned int variable (4 bytes) instead of the actual
length of the string stored in the 'value' buffer.
Since 'value' is populated with snprintf() to contain the string
representation of the disable value, we should use the length
returned by snprintf() to get the correct string length for
writing to the sysfs file.
This ensures the correct number of bytes is written to the cpuidle
state disable file in sysfs.
Link: https://lore.kernel.org/r/20250917050820.1785377-1-kaushlendra.kumar@intel.com
Signed-off-by: Kaushlendra Kumar <kaushlendra.kumar@intel.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit ddb61e737f ]
It turns out the second fan on the Dell Precision 490 does not
really support I8K_FAN_TURBO. Setting the fan state to 3 enables
automatic fan control, just like on the other two fans.
The reason why this was misinterpreted as turbo mode was that
the second fan normally spins faster in automatic mode than
in the previous fan states. Yet when in state 3, the fan speed
reacts to heat exposure, exposing the automatic mode setting.
Link: https://github.com/lm-sensors/lm-sensors/pull/383
Signed-off-by: Armin Wolf <W_Armin@gmx.de>
Link: https://lore.kernel.org/r/20250917181036.10972-2-W_Armin@gmx.de
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 4363264111 ]
If uprobe handler changes instruction pointer we still execute single
step) or emulate the original instruction and increment the (new) ip
with its length.
This makes the new instruction pointer bogus and application will
likely crash on illegal instruction execution.
If user decided to take execution elsewhere, it makes little sense
to execute the original instruction, so let's skip it.
Acked-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20250916215301.664963-3-jolsa@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 2e48265501 ]
The NVMe Base Specification 2.1 states that:
"""
A host requests an explicit persistent connection ... by specifying a
non-zero Keep Alive Timer value in the Connect command.
"""
As such if we are starting a persistent connection to a discovery
controller and the KATO is currently 0 we need to update KATO to a non
zero value to avoid continuous timeouts on the target.
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 8f12d1137c ]
It is possible for bpf_xdp_adjust_tail() to free all fragments. The
kfunc currently clears the XDP_FLAGS_HAS_FRAGS bit, but not
XDP_FLAGS_FRAGS_PF_MEMALLOC. So far, this has not caused a issue when
building sk_buff from xdp_buff since all readers of xdp_buff->flags
use the flag only when there are fragments. Clear the
XDP_FLAGS_FRAGS_PF_MEMALLOC bit as well to make the flags correct.
Signed-off-by: Amery Hung <ameryhung@gmail.com>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Link: https://patch.msgid.link/20250922233356.3356453-2-ameryhung@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit d0bf7cd5df ]
In the __arch_prepare_bpf_trampoline() function, retval_off is only
meaningful when save_ret is true, so the current logic is correct.
However, in the original logic, retval_off is only initialized under
certain conditions; for example, in the fmod_ret logic, the compiler is
not aware that the flags of the fmod_ret program (prog) have set
BPF_TRAMP_F_CALL_ORIG, which results in an uninitialized symbol
compilation warning.
So initialize retval_off unconditionally to fix it.
Signed-off-by: Chenghao Duan <duanchenghao@kylinos.cn>
Reviewed-by: Pu Lehui <pulehui@huawei.com>
Link: https://lore.kernel.org/r/20250922062244.822937-2-duanchenghao@kylinos.cn
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 5d726c4dbe ]
Following deadlock can be triggered easily by lockdep:
WARNING: possible circular locking dependency detected
6.17.0-rc3-00124-ga12c2658ced0 #1665 Not tainted
------------------------------------------------------
check/1334 is trying to acquire lock:
ff1100011d9d0678 (&q->sysfs_lock){+.+.}-{4:4}, at: blk_unregister_queue+0x53/0x180
but task is already holding lock:
ff1100011d9d00e0 (&q->q_usage_counter(queue)#3){++++}-{0:0}, at: del_gendisk+0xba/0x110
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #2 (&q->q_usage_counter(queue)#3){++++}-{0:0}:
blk_queue_enter+0x40b/0x470
blkg_conf_prep+0x7b/0x3c0
tg_set_limit+0x10a/0x3e0
cgroup_file_write+0xc6/0x420
kernfs_fop_write_iter+0x189/0x280
vfs_write+0x256/0x490
ksys_write+0x83/0x190
__x64_sys_write+0x21/0x30
x64_sys_call+0x4608/0x4630
do_syscall_64+0xdb/0x6b0
entry_SYSCALL_64_after_hwframe+0x76/0x7e
-> #1 (&q->rq_qos_mutex){+.+.}-{4:4}:
__mutex_lock+0xd8/0xf50
mutex_lock_nested+0x2b/0x40
wbt_init+0x17e/0x280
wbt_enable_default+0xe9/0x140
blk_register_queue+0x1da/0x2e0
__add_disk+0x38c/0x5d0
add_disk_fwnode+0x89/0x250
device_add_disk+0x18/0x30
virtblk_probe+0x13a3/0x1800
virtio_dev_probe+0x389/0x610
really_probe+0x136/0x620
__driver_probe_device+0xb3/0x230
driver_probe_device+0x2f/0xe0
__driver_attach+0x158/0x250
bus_for_each_dev+0xa9/0x130
driver_attach+0x26/0x40
bus_add_driver+0x178/0x3d0
driver_register+0x7d/0x1c0
__register_virtio_driver+0x2c/0x60
virtio_blk_init+0x6f/0xe0
do_one_initcall+0x94/0x540
kernel_init_freeable+0x56a/0x7b0
kernel_init+0x2b/0x270
ret_from_fork+0x268/0x4c0
ret_from_fork_asm+0x1a/0x30
-> #0 (&q->sysfs_lock){+.+.}-{4:4}:
__lock_acquire+0x1835/0x2940
lock_acquire+0xf9/0x450
__mutex_lock+0xd8/0xf50
mutex_lock_nested+0x2b/0x40
blk_unregister_queue+0x53/0x180
__del_gendisk+0x226/0x690
del_gendisk+0xba/0x110
sd_remove+0x49/0xb0 [sd_mod]
device_remove+0x87/0xb0
device_release_driver_internal+0x11e/0x230
device_release_driver+0x1a/0x30
bus_remove_device+0x14d/0x220
device_del+0x1e1/0x5a0
__scsi_remove_device+0x1ff/0x2f0
scsi_remove_device+0x37/0x60
sdev_store_delete+0x77/0x100
dev_attr_store+0x1f/0x40
sysfs_kf_write+0x65/0x90
kernfs_fop_write_iter+0x189/0x280
vfs_write+0x256/0x490
ksys_write+0x83/0x190
__x64_sys_write+0x21/0x30
x64_sys_call+0x4608/0x4630
do_syscall_64+0xdb/0x6b0
entry_SYSCALL_64_after_hwframe+0x76/0x7e
other info that might help us debug this:
Chain exists of:
&q->sysfs_lock --> &q->rq_qos_mutex --> &q->q_usage_counter(queue)#3
Possible unsafe locking scenario:
CPU0 CPU1
---- ----
lock(&q->q_usage_counter(queue)#3);
lock(&q->rq_qos_mutex);
lock(&q->q_usage_counter(queue)#3);
lock(&q->sysfs_lock);
Root cause is that queue_usage_counter is grabbed with rq_qos_mutex
held in blkg_conf_prep(), while queue should be freezed before
rq_qos_mutex from other context.
The blk_queue_enter() from blkg_conf_prep() is used to protect against
policy deactivation, which is already protected with blkcg_mutex, hence
convert blk_queue_enter() to blkcg_mutex to fix this problem. Meanwhile,
consider that blkcg_mutex is held after queue is freezed from policy
deactivation, also convert blkg_alloc() to use GFP_NOIO.
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit e7a2510633 ]
The OpenWrt distribution has switched from kernel longterm 6.6 to
6.12. Reports show that devices with the Realtek Otto switch platform
die during operation and are rebooted by the watchdog. Sorting out
other possible reasons the Otto timer is to blame. The platform
currently consists of 4 targets with different hardware revisions.
It is not 100% clear which devices and revisions are affected.
Analysis shows:
A more aggressive sched/deadline handling leads to more timer starts
with small intervals. This increases the bug chances. See
https://marc.info/?l=linux-kernel&m=175276556023276&w=2
Focusing on the real issue a hardware limitation on some devices was
found. There is a minimal chance that a timer ends without firing an
interrupt if it is reprogrammed within the 5us before its expiration
time. Work around this issue by introducing a bounce() function. It
restarts the timer directly before the normal restart functions as
follows:
- Stop timer
- Restart timer with a slow frequency.
- Target time will be >5us
- The subsequent normal restart is outside the critical window
Downstream has already tested and confirmed a patch. See
https://github.com/openwrt/openwrt/pull/19468https://forum.openwrt.org/t/support-for-rtl838x-based-managed-switches/57875/3788
Signed-off-by: Markus Stockhausen <markus.stockhausen@gmx.de>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Tested-by: Stephen Howell <howels@allthatwemight.be>
Tested-by: Bjørn Mork <bjorn@mork.no>
Link: https://lore.kernel.org/r/20250804080328.2609287-2-markus.stockhausen@gmx.de
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit a3835a4410 ]
Some ublk selftests have strange behavior when fio is not installed.
While most tests behave correctly (run if they don't need fio, or skip
if they need fio), the following tests have different behavior:
- test_null_01, test_null_02, test_generic_01, test_generic_02, and
test_generic_12 try to run fio without checking if it exists first,
and fail on any failure of the fio command (including "fio command
not found"). So these tests fail when they should skip.
- test_stress_05 runs fio without checking if it exists first, but
doesn't fail on fio command failure. This test passes, but that pass
is misleading as the test doesn't do anything useful without fio
installed. So this test passes when it should skip.
Fix these issues by adding _have_program fio checks to the top of all of
these tests.
Signed-off-by: Uday Shankar <ushankar@purestorage.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 6b54082c3e ]
sys_get_robust_list() and compat_get_robust_list() use ptrace_may_access()
to check if the calling task is allowed to access another task's
robust_list pointer. This check is racy against a concurrent exec() in the
target process.
During exec(), a task may transition from a non-privileged binary to a
privileged one (e.g., setuid binary) and its credentials/memory mappings
may change. If get_robust_list() performs ptrace_may_access() before
this transition, it may erroneously allow access to sensitive information
after the target becomes privileged.
A racy access allows an attacker to exploit a window during which
ptrace_may_access() passes before a target process transitions to a
privileged state via exec().
For example, consider a non-privileged task T that is about to execute a
setuid-root binary. An attacker task A calls get_robust_list(T) while T
is still unprivileged. Since ptrace_may_access() checks permissions
based on current credentials, it succeeds. However, if T begins exec
immediately afterwards, it becomes privileged and may change its memory
mappings. Because get_robust_list() proceeds to access T->robust_list
without synchronizing with exec() it may read user-space pointers from a
now-privileged process.
This violates the intended post-exec access restrictions and could
expose sensitive memory addresses or be used as a primitive in a larger
exploit chain. Consequently, the race can lead to unauthorized
disclosure of information across privilege boundaries and poses a
potential security risk.
Take a read lock on signal->exec_update_lock prior to invoking
ptrace_may_access() and accessing the robust_list/compat_robust_list.
This ensures that the target task's exec state remains stable during the
check, allowing for consistent and synchronized validation of
credentials.
Suggested-by: Jann Horn <jann@thejh.net>
Signed-off-by: Pranav Tyagi <pranav.tyagi03@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/linux-fsdevel/1477863998-3298-5-git-send-email-jann@thejh.net/
Link: https://github.com/KSPP/linux/issues/119
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 7b1b796117 ]
Refuse to register a cpuidle device if the given CPU has a cpuidle
device already and print a message regarding it.
Without this, an attempt to register a new cpuidle device without
unregistering the existing one leads to the removal of the existing
cpuidle device without removing its sysfs interface.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 5612ea8b55 ]
This fixes the build with -Werror -Wall.
btf_dumper.c:71:31: error: variable 'finfo' is uninitialized when passed as a const pointer argument here [-Werror,-Wuninitialized-const-pointer]
71 | info.func_info = ptr_to_u64(&finfo);
| ^~~~~
prog.c:2294:31: error: variable 'func_info' is uninitialized when passed as a const pointer argument here [-Werror,-Wuninitialized-const-pointer]
2294 | info.func_info = ptr_to_u64(&func_info);
|
v2:
- Initialize instead of using memset.
Signed-off-by: Tom Stellard <tstellar@redhat.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Quentin Monnet <qmo@kernel.org>
Link: https://lore.kernel.org/bpf/20250917183847.318163-1-tstellar@redhat.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 41307ec7df ]
The X1E80100 battery management firmware sends a notification with
code 0x83 when the battery charging state changes, such as switching
between fast charge, taper charge, end of charge, or any other error
charging states.
The same notification code is used with bit[8] set when charging stops
because the charge control end threshold is reached. Additionally,
a 2-bit value is included in bit[10:9] with the same code to indicate
the charging source capability, which is determined by the calculated
power from voltage and current readings from PDOs: 2 means a strong
charger over 60W, 1 indicates a weak charger, and 0 means there is no
charging source.
These 3-MSB [10:8] in the notification code is not much useful for now,
hence just ignore them and trigger a power supply change event whenever
0x83 notification code is received. This helps to eliminate the unknown
notification error messages.
Reported-by: Sebastian Reichel <sebastian.reichel@collabora.com>
Closes: https://lore.kernel.org/all/r65idyc4of5obo6untebw4iqfj2zteiggnnzabrqtlcinvtddx@xc4aig5abesu/
Signed-off-by: Fenglin Wu <fenglin.wu@oss.qualcomm.com>
Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 57b100d4cf ]
The cpupower_write_sysfs() function currently returns -1 on
write failure, but the function signature indicates it should
return an unsigned int. Returning -1 from an unsigned function
results in a large positive value rather than indicating
an error condition.
Fix this by returning 0 on failure, which is more appropriate
for an unsigned return type and maintains consistency with typical
success/failure semantics where 0 indicates failure and non-zero
indicates success (bytes written).
Link: https://lore.kernel.org/r/20250828063000.803229-1-kaushlendra.kumar@intel.com
Signed-off-by: Kaushlendra Kumar <kaushlendra.kumar@intel.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit d7ae46b454 ]
Add a warning if io_populate_area_dma() can't fill in all net_iovs, it
should never happen.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 2c89513395 ]
The bpf_cgroup_from_id kfunc relies on cgroup_get_from_id to obtain the
cgroup corresponding to a given cgroup ID. This helper can be called in
a lot of contexts where the current thread can be random. A recent
example was its use in sched_ext's ops.tick(), to obtain the root cgroup
pointer. Since the current task can be whatever random user space task
preempted by the timer tick, this makes the behavior of the helper
unreliable.
Refactor out __cgroup_get_from_id as the non-namespace aware version of
cgroup_get_from_id, and change bpf_cgroup_from_id to make use of it.
There is no compatibility breakage here, since changing the namespace
against which the lookup is being done to the root cgroup namespace only
permits a wider set of lookups to succeed now. The cgroup IDs across
namespaces are globally unique, and thus don't need to be retranslated.
Reported-by: Dan Schatzberg <dschatzberg@meta.com>
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Acked-by: Tejun Heo <tj@kernel.org>
Link: https://lore.kernel.org/r/20250915032618.1551762-2-memxor@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit a9d4e9f0e8 ]
For systems having CONFIG_NR_CPUS set to > 1024 in kernel config
the selftest fails as arena_spin_lock_irqsave() returns EOPNOTSUPP.
(eg - incase of powerpc default value for CONFIG_NR_CPUS is 8192)
The selftest is skipped incase bpf program returns EOPNOTSUPP,
with a descriptive message logged.
Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
Signed-off-by: Saket Kumar Bhaskar <skb99@linux.ibm.com>
Link: https://lore.kernel.org/r/20250913091337.1841916-1-skb99@linux.ibm.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 031cdd3bc3 ]
Various KUnit tests require PCI infrastructure to work. All normal
platforms enable PCI by default, but UML does not. Enabling PCI from
.kunitconfig files is problematic as it would not be portable. So in
commit 6fc3a8636a ("kunit: tool: Enable virtio/PCI by default on UML")
PCI was enabled by way of CONFIG_UML_PCI_OVER_VIRTIO=y. However
CONFIG_UML_PCI_OVER_VIRTIO requires additional configuration of
CONFIG_UML_PCI_OVER_VIRTIO_DEVICE_ID or will otherwise trigger a WARN() in
virtio_pcidev_init(). However there is no one correct value for
UML_PCI_OVER_VIRTIO_DEVICE_ID which could be used by default.
This warning is confusing when debugging test failures.
On the other hand, the functionality of CONFIG_UML_PCI_OVER_VIRTIO is not
used at all, given that it is completely non-functional as indicated by
the WARN() in question. Instead it is only used as a way to enable
CONFIG_UML_PCI which itself is not directly configurable.
Instead of going through CONFIG_UML_PCI_OVER_VIRTIO, introduce a custom
configuration option which enables CONFIG_UML_PCI without triggering
warnings or building dead code.
Link: https://lore.kernel.org/r/20250908-kunit-uml-pci-v2-1-d8eba5f73c9d@linutronix.de
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Reviewed-by: Johannes Berg <johannes@sipsolutions.net>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f2537be4f8 ]
When forcefully shutting down a port via the configfs interface,
nvmet_port_subsys_drop_link() first calls nvmet_port_del_ctrls() and
then nvmet_disable_port(). Both functions will eventually schedule all
remaining associations for deletion.
The current implementation checks whether an association is about to be
removed, but only after the work item has already been scheduled. As a
result, it is possible for the first scheduled work item to free all
resources, and then for the same work item to be scheduled again for
deletion.
Because the association list is an RCU list, it is not possible to take
a lock and remove the list entry directly, so it cannot be looked up
again. Instead, a flag (terminating) must be used to determine whether
the association is already in the process of being deleted.
Reported-by: Shinichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Closes: https://lore.kernel.org/all/rsdinhafrtlguauhesmrrzkybpnvwantwmyfq2ih5aregghax5@mhr7v3eryci3/
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Daniel Wagner <wagi@kernel.org>
Signed-off-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 6dbcd5a9ab ]
A TEE driver doesn't always need to provide a pool if it doesn't
support memory sharing ioctls and can allocate memory for TEE
messages in another way. Although this is mentioned in the
documentation for tee_device_alloc(), it is not handled correctly.
Reviewed-by: Sumit Garg <sumit.garg@oss.qualcomm.com>
Signed-off-by: Amirreza Zarrabi <amirreza.zarrabi@oss.qualcomm.com>
Signed-off-by: Jens Wiklander <jens.wiklander@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit e9dff11a7a ]
When deleting the previous walkstate operand stack
acpi_ds_call_control_method() was deleting obj_desc->Method.param_count
operands. But Method.param_count does not necessarily match
this_walk_state->num_operands, it may be either less or more.
After correcting the for loop to check `i < this_walk_state->num_operands`
the code is identical to acpi_ds_clear_operands(), so just outright
replace the code with acpi_ds_clear_operands() to fix this.
Link: https://github.com/acpica/acpica/commit/53fc0220
Signed-off-by: Hans de Goede <hansg@kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 08b68ca543 ]
For Qualcomm SoCs which needs level shifter for SD card, extra delay is
seen on receiver data path.
To compensate this delay enable tuning for SDR50 mode for targets which
has level shifter. SDHCI_SDR50_NEEDS_TUNING caps will be set for targets
with level shifter on Qualcomm SOC's.
Signed-off-by: Sarthak Garg <quic_sartgarg@quicinc.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 7f3cfb7943 ]
IO time is considered busy by default for modern Intel processors. The
current check covers recent Family 6 models but excludes the brand new
Families 18 and 19.
According to Arjan van de Ven, the model check was mainly due to a lack
of testing on systems before INTEL_CORE2_MEROM. He suggests considering
all Intel processors as having an efficient idle.
Extend the IO busy classification to all Intel processors starting with
Family 6, including Family 15 (Pentium 4s) and upcoming Families 18/19.
Use an x86 VFM check and move the function to the header file to avoid
using arch-specific #ifdefs in the C file.
Signed-off-by: Sohil Mehta <sohil.mehta@intel.com>
Link: https://patch.msgid.link/20250908230655.2562440-1-sohil.mehta@intel.com
[ rjw: Added empty line after #include ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit c33c43f71b ]
On certain Loongson platforms, drivers attempting to request a legacy
ISA IRQ directly via request_irq() (e.g., IRQ 4) may fail. The
virtual IRQ descriptor is not fully initialized and lacks a valid irqchip.
This issue does not affect ACPI-enumerated devices described in DSDT,
as their interrupts are properly mapped via the GSI translation path.
This indicates the LPC irqdomain itself is functional but is not correctly
handling direct VIRQ-to-HWIRQ mappings.
The root cause is the use of irq_domain_create_linear(). This API sets
up a domain for dynamic, on-demand mapping, typically triggered by a GSI
request. It does not pre-populate the mappings for the legacy VIRQ range
(0-15). Consequently, if no ACPI device claims a specific GSI
(e.g., GSI 4), the corresponding VIRQ (e.g., VIRQ 4) is never mapped to
the LPC domain. A direct call to request_irq(4, ...) then fails because
the kernel cannot resolve this VIRQ to a hardware interrupt managed by
the LPC controller.
The PCH-LPC interrupt controller is an i8259-compatible legacy device
that requires a deterministic, static 1-to-1 mapping for IRQs 0-15 to
support legacy drivers.
Fix this by replacing irq_domain_create_linear() with
irq_domain_create_legacy(). This API is specifically designed for such
controllers. It establishes the required static 1-to-1 VIRQ-to-HWIRQ
mapping for the entire legacy range (0-15) immediately upon domain
creation. This ensures that any VIRQ in this range is always resolvable,
making direct calls to request_irq() for legacy IRQs function correctly.
Signed-off-by: Ming Wang <wangming01@loongson.cn>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit fec2e70572 ]
We're already iterating every segment, so check these for a valid IO
lengths at the same time. Individual segment lengths will not be checked
on passthrough commands. The read/write command segments must be sized
to the dma alignment.
Signed-off-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 2f076a453f ]
io_ring_ctx's enabled with IORING_SETUP_SINGLE_ISSUER are only allowed
a single task submitting to the ctx. Although the documentation only
mentions this restriction applying to io_uring_enter() syscalls,
commit d7cce96c44 ("io_uring: limit registration w/ SINGLE_ISSUER")
extends it to io_uring_register(). Ensuring only one task interacts
with the io_ring_ctx will be important to allow this task to avoid
taking the uring_lock.
There is, however, one gap in these checks: io_register_clone_buffers()
may take the uring_lock on a second (source) io_ring_ctx, but
__io_uring_register() only checks the current thread against the
*destination* io_ring_ctx's submitter_task. Fail the
IORING_REGISTER_CLONE_BUFFERS with -EEXIST if the source io_ring_ctx has
a registered submitter_task other than the current task.
Signed-off-by: Caleb Sander Mateos <csander@purestorage.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit aae7a2876c ]
Unlike all the other allocations in this driver, the memory for storing
the pin function descriptions allocated with kcalloc() and later resized
with krealloc() is never freed. Use devres like elsewhere to handle
that. While at it - replace krealloc() with more suitable
devm_krealloc_array().
Note: the logic in this module is pretty convoluted and could probably
use some revisiting, we should probably be able to calculate the exact
amount of memory needed in advance or even skip the allocation
altogether and just add each function to the radix tree separately.
Tested-by: Neil Armstrong <neil.armstrong@linaro.org>
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit d9d61f1da3 ]
Many AMD CPUs can support this feature now. We would get a wrong CPU DIE
temperature if don't consider this. In low-temperature environments,
the CPU die temperature can drop below zero. So many platforms would like
to make extended temperature range as their default configuration.
Default temperature range (0C to 255.875C).
Extended temperature range (-49C to +206.875C).
Ref Doc: AMD V3000 PPR (Doc ID #56558).
Signed-off-by: Chuande Chen <chuachen@cisco.com>
Link: https://lore.kernel.org/r/20250814053940.96764-1-chenchuande@gmail.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 2e82368359 ]
The current behavior of the Step-wise thermal governor is to increase
the cooling level one step at a time after trip point threshold passing
by thermal zone temperature until the temperature stops to rise. Then,
nothing is done until the temperature decreases below the (possibly
updated) trip point threshold, at which point the cooling level is
reduced straight to the applicable minimum.
While this generally works, it is not in agreement with the throttling
logic description comment in step_wise_manage() any more after some
relatively recent changes, and in the case of passive cooling, it may
lead to undesirable performance oscillations between high and low
levels.
For this reason, modify the governor's cooling device state selection
function, get_target_state(), to reduce cooling by one level even if
the temperature is still above the thermal zone threshold, but the
temperature has started to fall down. However, ensure that the cooling
level will remain above the applicable minimum in that case to pull
the zone temperature further down, possibly until it falls below the
trip threshold (which may now be equal to the low temperature of the
trip).
Doing so should help higher performance to be restored earlier in some
cases which is desirable especially for passive trip points with
relatively high hysteresis values.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Link: https://patch.msgid.link/1947735.tdWV9SEqCh@rafael.j.wysocki
[ rjw: Changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 4405a214df ]
Some x86/ACPI laptops with MIPI cameras have a INTC10DE or INTC10E0 ACPI
device in the _DEP dependency list of the ACPI devices for the camera-
sensors (which have flags.honor_deps set).
These devices are for an Intel Vision CVS chip for which an out of tree
driver is available [1].
The camera sensor works fine without a driver being loaded for this
ACPI device on the 2 laptops this was tested on:
ThinkPad X1 Carbon Gen 12 (Meteor Lake)
ThinkPad X1 2-in-1 Gen 10 (Arrow Lake)
For now add these HIDs to acpi_ignore_dep_ids[] so that
acpi_dev_ready_for_enumeration() will return true once the other _DEP
dependencies are met and an i2c_client for the camera sensor will get
instantiated.
Link: https://github.com/intel/vision-drivers/ [1]
Signed-off-by: Hans de Goede <hansg@kernel.org>
Link: https://patch.msgid.link/20250829142748.21089-1-hansg@kernel.org
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 3a351de0d9 ]
Just like the other Vivobooks here, the N6506CU has its keyboard IRQ
described as ActiveLow in the DSDT, which the kernel overrides to
EdgeHigh, causing the internal keyboard not to work.
Add the N6506CU to the irq1_level_low_skip_override[] quirk table to fix
this.
Signed-off-by: Sam van Kampen <sam@tehsvk.net>
Link: https://patch.msgid.link/20250829145221.2294784-2-sam@tehsvk.net
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 2ef3886ce6 ]
The PCI Local Bus Specification 3.0 (section 6.8.1.6) allows modifying the
low-order bits of the MSI Message DATA register to encode nr_irqs interrupt
numbers in the log2(nr_irqs) bits for the domain.
The problem arises if the base vector (GICV2m base spi) is not aligned with
nr_irqs; in this case, the low-order log2(nr_irqs) bits from the base
vector conflict with the nr_irqs masking, causing the wrong MSI interrupt
to be identified.
To fix this, use bitmap_find_next_zero_area_off() instead of
bitmap_find_free_region() to align the initial base vector with nr_irqs.
Signed-off-by: Christian Bruel <christian.bruel@foss.st.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/all/20250902091045.220847-1-christian.bruel@foss.st.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit a3fecb9160 ]
While tracking down a problem where constant expressions used by
BUILD_BUG_ON() suddenly stopped working[1], we found that an added static
initializer was convincing the compiler that it couldn't track the state
of the prior statically initialized value. Tracing this down found that
ffs() was used in the initializer macro, but since it wasn't marked with
__attribute__const__, the compiler had to assume the function might
change variable states as a side-effect (which is not true for ffs(),
which provides deterministic math results).
For arc architecture with CONFIG_ISA_ARCV2=y, the __fls() function
uses __builtin_arc_fls() which lacks GCC's const attribute, preventing
compile-time constant folding, and KUnit testing of ffs/fls fails on
arc[3]. A patch[2] to GCC to solve this has been sent.
Add a fix for this by handling compile-time constants with the standard
__builtin_clzl() builtin (which has const attribute) while preserving
the optimized arc-specific builtin for runtime cases. This has the added
benefit of skipping runtime calculation of compile-time constant values.
Even with the GCC bug fixed (which is about "attribute const") this is a
good change to avoid needless runtime costs, and should be done
regardless of the state of GCC's bug.
Build tested ARCH=arc allyesconfig with GCC arc-linux 15.2.0.
Link: https://github.com/KSPP/linux/issues/364 [1]
Link: https://gcc.gnu.org/pipermail/gcc-patches/2025-August/693273.html
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202508031025.doWxtzzc-lkp@intel.com/ [3]
Signed-off-by: Kees Cook <kees@kernel.org>
Acked-by: Vineet Gupta <vgupta@kernel.org>
Signed-off-by: Yury Norov (NVIDIA) <yury.norov@gmail.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 592532a77b ]
longhaul_exit() was calling cpufreq_cpu_get(0) without checking
for a NULL policy pointer. On some systems, this could lead to a
NULL dereference and a kernel warning or panic.
This patch adds a check using unlikely() and returns early if the
policy is NULL.
Bugzilla: #219962
Signed-off-by: Dennis Beier <nanovim@gmail.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b60b74f82e ]
As per the design specification
"The 16-bit Seconds Calibration Value represents the number of
Oscillator Ticks that are required to measure the largest time period
that is less than or equal to 1 second.
For an oscillator that is 32.768kHz, this value will be 0x7FFF."
Signed-off-by: Harini T <harini.t@amd.com>
Link: https://lore.kernel.org/r/20250710061309.25601-1-harini.t@amd.com
Signed-off-by: Michal Simek <michal.simek@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 758acb9ccf ]
On x86-64, USDT arguments can be specified using Scale-Index-Base (SIB)
addressing, e.g. "1@-96(%rbp,%rax,8)". The current USDT implementation
in libbpf cannot parse this format, causing `bpf_program__attach_usdt()`
to fail with -ENOENT (unrecognized register).
This patch fixes this by implementing the necessary changes:
- add correct handling for SIB-addressed arguments in `bpf_usdt_arg`.
- add adaptive support to `__bpf_usdt_arg_type` and
`__bpf_usdt_arg_spec` to represent SIB addressing parameters.
Signed-off-by: Jiawei Zhao <phoenix500526@163.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20250827053128.1301287-2-phoenix500526@163.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 7fb83eb664 ]
Interrupt controller eiointc routes interrupts to CPU interface IP0 - IP7.
It is currently hard-coded that eiointc routes interrupts to the CPU
starting from IP1, but it should base that decision on the parent
interrupt, which is provided by ACPI or DTS.
Retrieve the parent's hardware interrupt number and store it in the
descriptor of the eointc instance, so that the routing function can utilize
it for the correct route settings.
[ tglx: Massaged change log ]
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20250804081946.1456573-2-maobibo@loongson.cn
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 0fdd3240fe ]
The PM co-processor (device manager or DM) adds the ability to abort
entry to a low power mode by clearing the mode selection in the
latest version of its firmware (11.01.09) [1].
Enable the ti_sci driver to support the LPM abort call which clears the
low power mode selection of the DM. This fixes an issue where failed
system suspend attempts would cause subsequent suspends to fail.
After system suspend completes, regardless of if system suspend succeeds
or fails, the ->complete() hook in TI SCI will be called. In the
->complete() hook, a message will be sent to the DM to clear the current
low power mode selection. Clearing the low power mode selection
unconditionally will not cause any error in the DM.
[1] https://software-dl.ti.com/tisci/esd/latest/2_tisci_msgs/pm/lpm.html
Signed-off-by: Kendall Willis <k-willis@ti.com>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Link: https://patch.msgid.link/20250819195453.1094520-1-k-willis@ti.com
Signed-off-by: Nishanth Menon <nm@ti.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f41345f47f ]
In the following toy program (reg states minimized for readability), R0
and R1 always have different values at instruction 6. This is obvious
when reading the program but cannot be guessed from ranges alone as
they overlap (R0 in [0; 0xc0000000], R1 in [1024; 0xc0000400]).
0: call bpf_get_prandom_u32#7 ; R0_w=scalar()
1: w0 = w0 ; R0_w=scalar(var_off=(0x0; 0xffffffff))
2: r0 >>= 30 ; R0_w=scalar(var_off=(0x0; 0x3))
3: r0 <<= 30 ; R0_w=scalar(var_off=(0x0; 0xc0000000))
4: r1 = r0 ; R1_w=scalar(var_off=(0x0; 0xc0000000))
5: r1 += 1024 ; R1_w=scalar(var_off=(0x400; 0xc0000000))
6: if r1 != r0 goto pc+1
Looking at tnums however, we can deduce that R1 is always different from
R0 because their tnums don't agree on known bits. This patch uses this
logic to improve is_scalar_branch_taken in case of BPF_JEQ and BPF_JNE.
This change has a tiny impact on complexity, which was measured with
the Cilium complexity CI test. That test covers 72 programs with
various build and load time configurations for a total of 970 test
cases. For 80% of test cases, the patch has no impact. On the other
test cases, the patch decreases complexity by only 0.08% on average. In
the best case, the verifier needs to walk 3% less instructions and, in
the worst case, 1.5% more. Overall, the patch has a small positive
impact, especially for our largest programs.
Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Acked-by: Shung-Hsi Yu <shung-hsi.yu@suse.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/be3ee70b6e489c49881cb1646114b1d861b5c334.1755694147.git.paul.chaignon@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b5af45302e ]
Add support for TI K3 AM62D2 SoC to read speed and revision values
from hardware and pass to OPP layer. AM62D shares the same configuations
as AM62A so use existing am62a7_soc_data.
Signed-off-by: Paresh Bhagat <p-bhagat@ti.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 236152dd9b ]
In the pin_config_set function, when handling PIN_CONFIG_BIAS_PULL_DOWN or
PIN_CONFIG_BIAS_PULL_UP, the function calls pcs_pinconf_clear_bias()
which writes the register. However, the subsequent operations continue
using the stale 'data' value from before the register write, effectively
causing the bias clear operation to be overwritten and not take effect.
Fix this by reading the 'data' value from the register after calling
pcs_pinconf_clear_bias().
This bug seems to have existed when this code was first merged in commit
9dddb4df90 ("pinctrl: single: support generic pinconf").
Signed-off-by: Chi Zhang <chizhang@asrmicro.com>
Link: https://lore.kernel.org/20250807062038.13610-1-chizhang@asrmicro.com
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 5f755ba95a ]
Per the SD Host Controller Simplified Specification v4.20 §3.2.3, change
the SD card clock parameters only after first disabling the external card
clock. Doing this fixes a spurious clock pulse on Baytrail and Apollo Lake
SD controllers which otherwise breaks voltage switching with a specific
Swissbit SD card. This change is limited to Intel host controllers to
avoid an issue reported on ARM64 devices.
Signed-off-by: Kyle Roeschley <kyle.roeschley@ni.com>
Signed-off-by: Brad Mouring <brad.mouring@ni.com>
Signed-off-by: Erick Shepherd <erick.shepherd@ni.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Link: https://lore.kernel.org/r/20250724185354.815888-1-erick.shepherd@ni.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 2caa6b88e0 ]
In the past %pK was preferable to %p as it would not leak raw pointer
values into the kernel log.
Since commit ad67b74d24 ("printk: hash addresses printed with %p")
the regular %p has been improved to avoid this issue.
Furthermore, restricted pointers ("%pK") were never meant to be used
through printk(). They can still unintentionally leak raw pointers or
acquire sleeping locks in atomic contexts.
Switch to the regular pointer formatting which is safer and
easier to reason about.
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20250811-restricted-pointers-bpf-v1-1-a1d7cc3cb9e7@linutronix.de
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit a5039648f8 ]
In the past %pK was preferable to %p as it would not leak raw pointer
values into the kernel log.
Since commit ad67b74d24 ("printk: hash addresses printed with %p")
the regular %p has been improved to avoid this issue.
Furthermore, restricted pointers ("%pK") were never meant to be used
through printk(). They can still unintentionally leak raw pointers or
acquire sleeping locks in atomic contexts.
Switch to the regular pointer formatting which is safer and
easier to reason about.
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Link: https://lore.kernel.org/r/20250811-restricted-pointers-soc-v2-1-7af7ed993546@linutronix.de
Signed-off-by: Nishanth Menon <nm@ti.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 9c45f95222 ]
During raw read, neither the status of the ECC correction nor the erased
state of the codeword gets checked by the qcom_spi_read_cw_raw() function,
so in case of raw access reading the corresponding registers via DMA is
superfluous.
Extend the qcom_spi_config_cw_read() function to evaluate the existing
(but actually unused) 'use_ecc' parameter, and configure reading only
the flash status register when ECC is not used.
With the change, the code gets in line with the corresponding part of
the config_nand_cw_read() function in the qcom_nandc driver.
Signed-off-by: Gabor Juhos <j4g8y7@gmail.com>
Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com>
Link: https://patch.msgid.link/20250808-qpic-snand-handle-use_ecc-v1-1-67289fbb5e2f@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b832b19318 ]
In the past %pK was preferable to %p as it would not leak raw pointer
values into the kernel log.
Since commit ad67b74d24 ("printk: hash addresses printed with %p")
the regular %p has been improved to avoid this issue.
Furthermore, restricted pointers ("%pK") were never meant to be used
through printk(). They can still unintentionally leak raw pointers or
acquire sleeping locks in atomic contexts.
Switch to the regular pointer formatting which is safer and
easier to reason about.
There are still a few users of %pK left, but these use it through seq_file,
for which its usage is safe.
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Link: https://patch.msgid.link/20250811-restricted-pointers-spi-v1-1-32c47f954e4d@linutronix.de
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit cfd6f1a7b4 upstream.
A race condition occurs when ffs_func_eps_enable() runs concurrently
with ffs_data_reset(). The ffs_data_clear() called in ffs_data_reset()
sets ffs->epfiles to NULL before resetting ffs->eps_count to 0, leading
to a NULL pointer dereference when accessing epfile->ep in
ffs_func_eps_enable() after successful usb_ep_enable().
The ffs->epfiles pointer is set to NULL in both ffs_data_clear() and
ffs_data_close() functions, and its modification is protected by the
spinlock ffs->eps_lock. And the whole ffs_func_eps_enable() function
is also protected by ffs->eps_lock.
Thus, add NULL pointer handling for ffs->epfiles in the
ffs_func_eps_enable() function to fix issues
Signed-off-by: Owen Gu <guhuinan@xiaomi.com>
Link: https://lore.kernel.org/r/20250915092907.17802-1-guhuinan@xiaomi.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 35e4a69b20 ]
Allow pm_restrict_gfp_mask() to be called many times in a row to avoid
issues with calling dpm_suspend_start() when the GFP mask has been
already restricted.
Only the first invocation of pm_restrict_gfp_mask() will actually
restrict the GFP mask and the subsequent calls will warn if there is
a mismatch between the expected allowed GFP mask and the actual one.
Moreover, if pm_restrict_gfp_mask() is called many times in a row,
pm_restore_gfp_mask() needs to be called matching number of times in
a row to actually restore the GFP mask. Calling it when the GFP mask
has not been restricted will cause it to warn.
This is necessary for the GFP mask restriction starting in
hibernation_snapshot() to continue throughout the entire hibernation
flow until it completes or it is aborted (either by a wakeup event or
by an error).
Fixes: 449c9c0253 ("PM: hibernate: Restrict GFP mask in hibernation_snapshot()")
Fixes: 469d80a371 ("PM: hibernate: Fix hybrid-sleep")
Reported-by: Askar Safin <safinaskar@gmail.com>
Closes: https://lore.kernel.org/linux-pm/20251025050812.421905-1-safinaskar@gmail.com/
Link: https://lore.kernel.org/linux-pm/20251028111730.2261404-1-safinaskar@gmail.com/
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org>
Tested-by: Mario Limonciello (AMD) <superm1@kernel.org>
Cc: 6.16+ <stable@vger.kernel.org> # 6.16+
Link: https://patch.msgid.link/5935682.DvuYhMxLoT@rafael.j.wysocki
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 382bd6a792 upstream.
Before commit 33056a97ae ("drm/amd/display: Remove double checks for
`debug.enable_mem_low_power.bits.cm`"), dpp3_program_blnd_lut(NULL)
checked the low-power debug flag before calling
dpp3_power_on_blnd_lut(false).
After commit 33056a97ae ("drm/amd/display: Remove double checks for
`debug.enable_mem_low_power.bits.cm`"), dpp3_program_blnd_lut(NULL)
unconditionally calls dpp3_power_on_blnd_lut(false). The BLNDGAM power
helper writes BLNDGAM_MEM_PWR_FORCE when CM low-power is disabled, causing
immediate SRAM power toggles instead of deferring at vupdate. This can
disrupt atomic color/LUT sequencing during transitions between
direct scanout and composition within gamescope's DRM backend on
Steam Deck OLED.
To fix this, leave the BLNDGAM power state unchanged when low-power is
disabled, matching dpp3_power_on_hdr3dlut and dpp3_power_on_shaper.
Fixes: 33056a97ae ("drm/amd/display: Remove double checks for `debug.enable_mem_low_power.bits.cm`")
Signed-off-by: Matthew Schwartz <matthew.schwartz@linux.dev>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 13ff4f63fcddfc84ec8632f1443936b00aa26725)
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ba10f8d92a upstream.
[Why]
Newer VPE microcode has functionality that will decrease DPM level
only when a workload has run for 2 or more seconds. If VPE is turned
off before this DPM decrease and the PMFW doesn't reset it when
power gating VPE, the SOC can get stuck with a higher DPM level.
This can happen from amdgpu's ring buffer test because it's a short
quick workload for VPE and VPE is turned off after 1s.
[How]
In idle handler besides checking fences are drained check PMFW version
to determine if it will reset DPM when power gating VPE. If PMFW will
not do this, then check VPE DPM level. If it is not DPM0 reschedule
delayed work again until it is.
v2: squash in return fix (Alex)
Cc: Peyton.Lee@amd.com
Reported-by: Sultan Alsawaf <sultan@kerneltoast.com>
Reviewed-by: Sultan Alsawaf <sultan@kerneltoast.com>
Tested-by: Sultan Alsawaf <sultan@kerneltoast.com>
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4615
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 3ac635367eb589bee8edcc722f812a89970e14b7)
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e0023c8a74 upstream.
nouveau_sched_fini() uses a memory barrier before wait_event().
wait_event(), however, is a macro which expands to a loop which might
check the passed condition several times. The barrier would only take
effect for the first check.
Replace the barrier with a function which takes the spinlock.
Cc: stable@vger.kernel.org # v6.8+
Fixes: 5f03a507b2 ("drm/nouveau: implement 1:1 scheduler - entity relationship")
Acked-by: Danilo Krummrich <dakr@kernel.org>
Signed-off-by: Philipp Stanner <phasta@kernel.org>
Link: https://patch.msgid.link/20251024161221.196155-2-phasta@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 07ad45e06b upstream.
The function has a memory leak when kvrealloc() fails.
The function directly assigns NULL to the markers pointer, losing the
reference to the previously allocated memory. This causes kvfree() in
pt_dump_init() to free NULL instead of the leaked memory.
Fix by:
1. Using kvrealloc() uniformly for all allocations
2. Using a temporary variable to preserve the original pointer until
allocation succeeds
3. Removing the error path that sets markers_cnt=0 to keep
consistency between markers and markers_cnt
Found via static analysis and this is similar to commit 42378a9ca5
("bpf, verifier: Fix memory leak in array reallocation for stack state")
Fixes: d0e7915d2a ("s390/mm/ptdump: Generate address marker array dynamically")
Cc: stable@vger.kernel.org
Signed-off-by: Miaoqian Lin <linmq006@gmail.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 64e2f60f35 upstream.
As reported by Luiz Capitulino enabling HVO on s390 leads to reproducible
crashes. The problem is that kernel page tables are modified without
flushing corresponding TLB entries.
Even if it looks like the empty flush_tlb_all() implementation on s390 is
the problem, it is actually a different problem: on s390 it is not allowed
to replace an active/valid page table entry with another valid page table
entry without the detour over an invalid entry. A direct replacement may
lead to random crashes and/or data corruption.
In order to invalidate an entry special instructions have to be used
(e.g. ipte or idte). Alternatively there are also special instructions
available which allow to replace a valid entry with a different valid
entry (e.g. crdte or cspg).
Given that the HVO code currently does not provide the hooks to allow for
an implementation which is compliant with the s390 architecture
requirements, disable ARCH_WANT_OPTIMIZE_HUGETLB_VMEMMAP again, which is
basically a revert of the original patch which enabled it.
Reported-by: Luiz Capitulino <luizcap@redhat.com>
Closes: https://lore.kernel.org/all/20251028153930.37107-1-luizcap@redhat.com/
Fixes: 00a34d5a99 ("s390: select ARCH_WANT_HUGETLB_PAGE_OPTIMIZE_VMEMMAP")
Cc: stable@vger.kernel.org
Tested-by: Luiz Capitulino <luizcap@redhat.com>
Reviewed-by: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0fd20f65df upstream.
Do not block PCI config accesses through pci_cfg_access_lock() when
executing the s390 variant of PCI error recovery: Acquire just
device_lock() instead of pci_dev_lock() as powerpc's EEH and
generig PCI AER processing do.
During error recovery testing a pair of tasks was reported to be hung:
mlx5_core 0000:00:00.1: mlx5_health_try_recover:338:(pid 5553): health recovery flow aborted, PCI reads still not working
INFO: task kmcheck:72 blocked for more than 122 seconds.
Not tainted 5.14.0-570.12.1.bringup7.el9.s390x #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:kmcheck state:D stack:0 pid:72 tgid:72 ppid:2 flags:0x00000000
Call Trace:
[<000000065256f030>] __schedule+0x2a0/0x590
[<000000065256f356>] schedule+0x36/0xe0
[<000000065256f572>] schedule_preempt_disabled+0x22/0x30
[<0000000652570a94>] __mutex_lock.constprop.0+0x484/0x8a8
[<000003ff800673a4>] mlx5_unload_one+0x34/0x58 [mlx5_core]
[<000003ff8006745c>] mlx5_pci_err_detected+0x94/0x140 [mlx5_core]
[<0000000652556c5a>] zpci_event_attempt_error_recovery+0xf2/0x398
[<0000000651b9184a>] __zpci_event_error+0x23a/0x2c0
INFO: task kworker/u1664:6:1514 blocked for more than 122 seconds.
Not tainted 5.14.0-570.12.1.bringup7.el9.s390x #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:kworker/u1664:6 state:D stack:0 pid:1514 tgid:1514 ppid:2 flags:0x00000000
Workqueue: mlx5_health0000:00:00.0 mlx5_fw_fatal_reporter_err_work [mlx5_core]
Call Trace:
[<000000065256f030>] __schedule+0x2a0/0x590
[<000000065256f356>] schedule+0x36/0xe0
[<0000000652172e28>] pci_wait_cfg+0x80/0xe8
[<0000000652172f94>] pci_cfg_access_lock+0x74/0x88
[<000003ff800916b6>] mlx5_vsc_gw_lock+0x36/0x178 [mlx5_core]
[<000003ff80098824>] mlx5_crdump_collect+0x34/0x1c8 [mlx5_core]
[<000003ff80074b62>] mlx5_fw_fatal_reporter_dump+0x6a/0xe8 [mlx5_core]
[<0000000652512242>] devlink_health_do_dump.part.0+0x82/0x168
[<0000000652513212>] devlink_health_report+0x19a/0x230
[<000003ff80075a12>] mlx5_fw_fatal_reporter_err_work+0xba/0x1b0 [mlx5_core]
No kernel log of the exact same error with an upstream kernel is
available - but the very same deadlock situation can be constructed there,
too:
- task: kmcheck
mlx5_unload_one() tries to acquire devlink lock while the PCI error
recovery code has set pdev->block_cfg_access by way of
pci_cfg_access_lock()
- task: kworker
mlx5_crdump_collect() tries to set block_cfg_access through
pci_cfg_access_lock() while devlink_health_report() had acquired
the devlink lock.
A similar deadlock situation can be reproduced by requesting a
crdump with
> devlink health dump show pci/<BDF> reporter fw_fatal
while PCI error recovery is executed on the same <BDF> physical function
by mlx5_core's pci_error_handlers. On s390 this can be injected with
> zpcictl --reset-fw <BDF>
Tests with this patch failed to reproduce that second deadlock situation,
the devlink command is rejected with "kernel answers: Permission denied" -
and we get a kernel log message of:
mlx5_core 1ed0:00:00.1: mlx5_crdump_collect:50:(pid 254382): crdump: failed to lock vsc gw err -5
because the config read of VSC_SEMAPHORE is rejected by the underlying
hardware.
Two prior attempts to address this issue have been discussed and
ultimately rejected [see link], with the primary argument that s390's
implementation of PCI error recovery is imposing restrictions that
neither powerpc's EEH nor PCI AER handling need. Tests show that PCI
error recovery on s390 is running to completion even without blocking
access to PCI config space.
Link: https://lore.kernel.org/all/20251007144826.2825134-1-gbayer@linux.ibm.com/
Cc: stable@vger.kernel.org
Fixes: 4cdf2f4e24 ("s390/pci: implement minimal PCI error recovery")
Reviewed-by: Niklas Schnelle <schnelle@linux.ibm.com>
Signed-off-by: Gerd Bayer <gbayer@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 434f7349a1 upstream.
Commit 4e65bda827 ("ASoC: wcd934x: fix error handling in
wcd934x_codec_parse_data()") revealed the problem in the slimbus regmap.
That commit breaks audio playback, for instance, on sdm845 Thundercomm
Dragonboard 845c board:
Unable to handle kernel paging request at virtual address ffff8000847cbad4
...
CPU: 5 UID: 0 PID: 776 Comm: aplay Not tainted 6.18.0-rc1-00028-g7ea30958b305 #11 PREEMPT
Hardware name: Thundercomm Dragonboard 845c (DT)
...
Call trace:
slim_xfer_msg+0x24/0x1ac [slimbus] (P)
slim_read+0x48/0x74 [slimbus]
regmap_slimbus_read+0x18/0x24 [regmap_slimbus]
_regmap_raw_read+0xe8/0x174
_regmap_bus_read+0x44/0x80
_regmap_read+0x60/0xd8
_regmap_update_bits+0xf4/0x140
_regmap_select_page+0xa8/0x124
_regmap_raw_write_impl+0x3b8/0x65c
_regmap_bus_raw_write+0x60/0x80
_regmap_write+0x58/0xc0
regmap_write+0x4c/0x80
wcd934x_hw_params+0x494/0x8b8 [snd_soc_wcd934x]
snd_soc_dai_hw_params+0x3c/0x7c [snd_soc_core]
__soc_pcm_hw_params+0x22c/0x634 [snd_soc_core]
dpcm_be_dai_hw_params+0x1d4/0x38c [snd_soc_core]
dpcm_fe_dai_hw_params+0x9c/0x17c [snd_soc_core]
snd_pcm_hw_params+0x124/0x464 [snd_pcm]
snd_pcm_common_ioctl+0x110c/0x1820 [snd_pcm]
snd_pcm_ioctl+0x34/0x4c [snd_pcm]
__arm64_sys_ioctl+0xac/0x104
invoke_syscall+0x48/0x104
el0_svc_common.constprop.0+0x40/0xe0
do_el0_svc+0x1c/0x28
el0_svc+0x34/0xec
el0t_64_sync_handler+0xa0/0xf0
el0t_64_sync+0x198/0x19c
The __devm_regmap_init_slimbus() started to be used instead of
__regmap_init_slimbus() after the commit mentioned above and turns out
the incorrect bus_context pointer (3rd argument) was used in
__devm_regmap_init_slimbus(). It should be just "slimbus" (which is equal
to &slimbus->dev). Correct it. The wcd934x codec seems to be the only or
the first user of devm_regmap_init_slimbus() but we should fix it till
the point where __devm_regmap_init_slimbus() was introduced therefore
two "Fixes" tags.
While at this, also correct the same argument in __regmap_init_slimbus().
Fixes: 4e65bda827 ("ASoC: wcd934x: fix error handling in wcd934x_codec_parse_data()")
Fixes: 7d6f7fb053 ("regmap: add SLIMbus support")
Cc: stable@vger.kernel.org
Cc: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
Cc: Ma Ke <make24@iscas.ac.cn>
Cc: Steev Klimaszewski <steev@kali.org>
Cc: Srinivas Kandagatla <srini@kernel.org>
Reviewed-by: Abel Vesa <abel.vesa@linaro.org>
Signed-off-by: Alexey Klimov <alexey.klimov@linaro.org>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
Link: https://patch.msgid.link/20251022201013.1740211-1-alexey.klimov@linaro.org
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0ba6502ce1 upstream.
When running "perf mem record" command on CWF, the below KASAN
global-out-of-bounds warning is seen.
==================================================================
BUG: KASAN: global-out-of-bounds in cmt_latency_data+0x176/0x1b0
Read of size 4 at addr ffffffffb721d000 by task dtlb/9850
Call Trace:
kasan_report+0xb8/0xf0
cmt_latency_data+0x176/0x1b0
setup_arch_pebs_sample_data+0xf49/0x2560
intel_pmu_drain_arch_pebs+0x577/0xb00
handle_pmi_common+0x6c4/0xc80
The issue is caused by below code in __grt_latency_data(). The code
tries to access x86_hybrid_pmu structure which doesn't exist on
non-hybrid platform like CWF.
WARN_ON_ONCE(hybrid_pmu(event->pmu)->pmu_type == hybrid_big)
So add is_hybrid() check before calling this WARN_ON_ONCE to fix the
global-out-of-bounds access issue.
Fixes: 090262439f ("perf/x86/intel: Rename model-specific pebs_latency_data functions")
Reported-by: Xudong Hao <xudong.hao@intel.com>
Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Zide Chen <zide.chen@intel.com>
Cc: stable@vger.kernel.org
Link: https://patch.msgid.link/20251028064214.1451968-1-dapeng1.mi@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d50f210913 upstream.
Previously linker scripts would always generate vmlinuz that has sections
aligned. And thus padded (correct Authenticode calculation) and unpadded
calculation would be same. As in https://github.com/rhboot/pesign userspace
tool would produce the same authenticode digest for both of the following
commands:
pesign --padding --hash --in ./arch/x86_64/boot/bzImage
pesign --nopadding --hash --in ./arch/x86_64/boot/bzImage
The commit 3e86e4d74c ("kbuild: keep .modinfo section in
vmlinux.unstripped") added .modinfo section of variable length. Depending
on kernel configuration it may or may not be aligned.
All userspace signing tooling correctly pads such section to calculation
spec compliant authenticode digest.
However, if bzImage is not further processed and is attempted to be loaded
directly by EDK2 firmware, it calculates unpadded Authenticode digest and
fails to correct accept/reject such kernel builds even when propoer
Authenticode values are enrolled in db/dbx. One can say EDK2 requires
aligned/padded kernels in Secureboot.
Thus add ALIGN(8) to the .modinfo section, to esure kernels irrespective of
modinfo contents can be loaded by all existing EDK2 firmware builds.
Fixes: 3e86e4d74c ("kbuild: keep .modinfo section in vmlinux.unstripped")
Cc: stable@vger.kernel.org
Signed-off-by: Dimitri John Ledkov <dimitri.ledkov@surgut.co.uk>
Link: https://patch.msgid.link/20251026202100.679989-1-dimitri.ledkov@surgut.co.uk
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 19de03b312 upstream.
A REQ_OP_OPEN_ZONE request changes the condition of a sequential zone of
a zoned block device to the explicitly open condition
(BLK_ZONE_COND_EXP_OPEN). As such, it should be considered a write
operation.
Change this operation code to be an odd number to reflect this. The
following operation numbers are changed to keep the numbering compact.
No problems were reported without this change as this operation has no
data. However, this unifies the zone operation to reflect that they
modify the device state and also allows strengthening checks in the
block layer, e.g. checking if this operation is not issued against a
read-only device.
Fixes: 6c1b1da58f ("block: add zone open, close and finish operations")
Cc: stable@vger.kernel.org
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 12a1c9353c upstream.
REQ_OP_ZONE_RESET_ALL is a zone management request. Fix
op_is_zone_mgmt() to return true for that operation, like it already
does for REQ_OP_ZONE_RESET.
While no problems were reported without this fix, this change allows
strengthening checks in various block device drivers (scsi sd,
virtioblk, DM) where op_is_zone_mgmt() is used to verify that a zone
management command is not being issued to a regular block device.
Fixes: 6c1b1da58f ("block: add zone open, close and finish operations")
Cc: stable@vger.kernel.org
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 58764259eb upstream.
Usage of the ACPI device should be phased out in the future, as
the driver itself is now using the platform bus.
Replace any usage of struct acpi_device in acpi_fan_get_fst() to
allow users to drop usage of struct acpi_device.
Also extend the integer check to all three package elements.
Signed-off-by: Armin Wolf <W_Armin@gmx.de>
Link: https://patch.msgid.link/20251007234149.2769-2-W_Armin@gmx.de
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 501672e3c1 ]
Previously this was initialized with zero which represented PCIe Gen
1.0 instead of using the
maximum value from the speed table which is the behaviour of all other
smumgr implementations.
Fixes: 18aafc59b1 ("drm/amd/powerplay: implement fw related smu interface for iceland.")
Signed-off-by: John Smith <itistotalbotnet@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 92b0a6ae6672857ddeabf892223943d2f0e06c97)
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 07a13f913c ]
Previously this was initialized with zero which represented PCIe Gen
1.0 instead of using the
maximum value from the speed table which is the behaviour of all other
smumgr implementations.
Fixes: 18edef19ea ("drm/amd/powerplay: implement fw image related smu interface for Fiji.")
Signed-off-by: John Smith <itistotalbotnet@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit c52238c9fb414555c68340cd80e487d982c1921c)
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 238d468d3e ]
'table_index' is a variable defined by the smu driver (kmd)
'table_id' is a variable defined by the hw smu (pmfw)
This code should use table_index as a bounds check.
Fixes: caad2613dc ("drm/amd/powerplay: move table setting common code to smu_cmn.c")
Signed-off-by: Yang Wang <kevinyang.wang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit fca0c66b22303de0d1d6313059baf4dc960a4753)
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 745bae76ac ]
Since the allocation of the drivers main structure was changed to
devm_drm_dev_alloc() drm_put_dev()'ing to trigger it to be free'd
should be done by devres.
However, drm_put_dev() is still in the probe error and device remove
paths. When the driver fails to probe warnings like the following are
shown because devres is trying to drm_put_dev() after the driver
already did it.
[ 5.642230] radeon 0000:01:05.0: probe with driver radeon failed with error -22
[ 5.649605] ------------[ cut here ]------------
[ 5.649607] refcount_t: underflow; use-after-free.
[ 5.649620] WARNING: CPU: 0 PID: 357 at lib/refcount.c:28 refcount_warn_saturate+0xbe/0x110
Fixes: a9ed2f052c ("drm/radeon: change drm_dev_alloc to devm_drm_dev_alloc")
Signed-off-by: Daniel Palmer <daniel@0x0f.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 3eb8c0b4c091da0a623ade0d3ee7aa4a93df1ea4)
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 3328443363 ]
Since the allocation of the drivers main structure was changed to
devm_drm_dev_alloc() rdev is managed by devres and we shouldn't be calling
kfree() on it.
This fixes things exploding if the driver probe fails and devres cleans up
the rdev after we already free'd it.
Fixes: a9ed2f052c ("drm/radeon: change drm_dev_alloc to devm_drm_dev_alloc")
Signed-off-by: Daniel Palmer <daniel@0x0f.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 16c0681617b8a045773d4d87b6140002fa75b03b)
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b2dd1d0d32 ]
When configured for default synchronisation (Rx syncs to Tx) and the
SAI operates in consumer mode (clocks provided externally to Tx), a
synchronisation error occurs on Tx on the first attempt after device
initialisation when the playback stream is started while a capture
stream is already active. This results in channel shift/swap on the
playback stream.
Subsequent streams (ie after that first failing one) always work
correctly, no matter the order, with or without the other stream active.
This issue was observed (and fix tested) on an i.MX6UL board connected
to an ADAU1761 codec, where the codec provides both frame and bit clock
(connected to TX pins).
To fix this, always initialize the 'other' xCR4 and xCR5 registers when
we're starting a stream which is synced to the opposite one, irregardless
of the producer/consumer status.
Fixes: 51659ca069 ("ASoC: fsl-sai: set xCR4/xCR5/xMR for SAI master mode")
Signed-off-by: Maarten Zanders <maarten@zanders.be>
Reviewed-by: Shengjiu Wang <shengjiu.wang@gmail.com>
Link: https://patch.msgid.link/20251024135716.584265-1-maarten@zanders.be
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 520ad9e969 ]
The dpll.yaml spec incorrectly omitted module-name and clock-id from the
pin-get operation reply specification, even though the kernel DPLL
implementation has always included these attributes in pin-get responses
since the initial implementation.
This spec inconsistency caused issues with the C YNL code generator.
The generated dpll_pin_get_rsp structure was missing these fields.
Fix the spec by adding module-name and clock-id to the pin-attrs reply
specification to match the actual kernel behavior.
Fixes: 3badff3a25 ("dpll: spec: Add Netlink spec in YAML")
Signed-off-by: Petr Oros <poros@redhat.com>
Reviewed-by: Ivan Vecera <ivecera@redhat.com>
Link: https://patch.msgid.link/20251024185512.363376-1-poros@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit e396694055 ]
When request a none support device operation, there will be no reply.
In this case, the len(desc) check will always be true, causing print_field
to enter an infinite loop and crash the program. Example reproducer:
# ethtool.py -c veth0
To fix this, return immediately if there is no reply.
Fixes: f3d07b02b2 ("tools: ynl: ethtool testing tool")
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Link: https://patch.msgid.link/20251024125853.102916-1-liuhangbin@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 65f9c4c588 ]
The ynl_attr_put_str() function was not including the null terminator
in the attribute length calculation. This caused kernel to reject
CTRL_CMD_GETFAMILY requests with EINVAL:
"Attribute failed policy validation".
For a 4-character family name like "dpll":
- Sent: nla_len=8 (4 byte header + 4 byte string without null)
- Expected: nla_len=9 (4 byte header + 5 byte string with null)
The bug was introduced in commit 15d2540e0d ("tools: ynl: check for
overflow of constructed messages") when refactoring from stpcpy() to
strlen(). The original code correctly included the null terminator:
end = stpcpy(ynl_attr_data(attr), str);
attr->nla_len = NLA_HDRLEN + NLA_ALIGN(end -
(char *)ynl_attr_data(attr));
Since stpcpy() returns a pointer past the null terminator, the length
included it. The refactored version using strlen() omitted the +1.
The fix also removes NLA_ALIGN() from nla_len calculation, since
nla_len should contain actual attribute length, not aligned length.
Alignment is only for calculating next attribute position. This makes
the code consistent with ynl_attr_put().
CTRL_ATTR_FAMILY_NAME uses NLA_NUL_STRING policy which requires
null terminator. Kernel validates with memchr() and rejects if not
found.
Fixes: 15d2540e0d ("tools: ynl: check for overflow of constructed messages")
Signed-off-by: Petr Oros <poros@redhat.com>
Tested-by: Ivan Vecera <ivecera@redhat.com>
Reviewed-by: Ivan Vecera <ivecera@redhat.com>
Link: https://lore.kernel.org/20251018151737.365485-3-zahari.doychev@linux.com
Link: https://patch.msgid.link/20251024132438.351290-1-poros@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit a042beac6e ]
The current logic uses the flush sequence from the current address
space. This is harmless when deducing the flush requirements for the
current submit, as either the incoming address space is the same one
as the currently active one or we switch context, in which case the
flush is unconditional.
However, this sequence is also stored as the current flush sequence
of the GPU. If we switch context the stored flush sequence will no
longer belong to the currently active address space. This incoherency
can then cause missed flushes, resulting in translation errors.
Fixes: 27b67278e0 ("drm/etnaviv: rework MMU handling")
Signed-off-by: Tomeu Vizoso <tomeu@tomeuvizoso.net>
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
Link: https://lore.kernel.org/r/20251021093723.3887980-1-l.stach@pengutronix.de
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 75cdae446d ]
The log messages for the PreSonus STUDIO 1810c about
device_setup are not applicable to the 1824c, and should
not be logged when 1824c initializes.
Refactor from if statement to switch statement as there
might be more STUDIO series devices added later.
Fixes: 080564558e ("ALSA: usb-audio: enable support for Presonus Studio 1824c within 1810c file")
Signed-off-by: Roy Vegard Ovesen <roy.vegard.ovesen@gmail.com>
Link: https://patch.msgid.link/aPaYTP7ceuABf8c7@ark
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 659169c4eb ]
The 1824c does not have the A/B switch that the 1810c has,
but instead it has a mono main switch that sums the two
main output channels to mono.
Signed-off-by: Roy Vegard Ovesen <roy.vegard.ovesen@gmail.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Stable-dep-of: 75cdae446d ("ALSA: usb-audio: don't log messages meant for 1810c when initializing 1824c")
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 751463ceef ]
Periodic advertising enabled flag cannot be tracked by the enabled
flag since advertising and periodic advertising each can be
enabled/disabled separately from one another causing the states to be
inconsistent when for example an advertising set is disabled its
enabled flag is set to false which is then used for periodic which has
not being disabled.
Fixes: eca0ae4aea ("Bluetooth: Add initial implementation of BIS connections")
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 857eb0fabc ]
This fixes bis_cleanup not considering connections in BT_OPEN state
before attempting to remove the BIG causing the following error:
btproxy[20110]: < HCI Command: LE Terminate Broadcast Isochronous Group (0x08|0x006a) plen 2
BIG Handle: 0x01
Reason: Connection Terminated By Local Host (0x16)
> HCI Event: Command Status (0x0f) plen 4
LE Terminate Broadcast Isochronous Group (0x08|0x006a) ncmd 1
Status: Unknown Advertising Identifier (0x42)
Fixes: fa224d0c09 ("Bluetooth: ISO: Reassociate a socket with an active BIS")
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 057b6ca596 ]
In the current btintel_pcie driver implementation, when an interrupt is
received, the driver checks for the alive cause before the TX/RX cause.
Handling the alive cause involves resetting the TX/RX queue indices.
This flow works correctly when the causes are mutually exclusive.
However, if both cause bits are set simultaneously, the alive cause
resets the queue indices, resulting in an event packet drop and a
command timeout. To fix this issue, the driver is modified to handle all
other causes before checking for the alive cause.
Test case:
Issue is seen with stress reboot scenario - 50x run
[20.337589] Bluetooth: hci0: Device revision is 0
[20.346750] Bluetooth: hci0: Secure boot is enabled
[20.346752] Bluetooth: hci0: OTP lock is disabled
[20.346752] Bluetooth: hci0: API lock is enabled
[20.346752] Bluetooth: hci0: Debug lock is disabled
[20.346753] Bluetooth: hci0: Minimum firmware build 1 week 10 2014
[20.346754] Bluetooth: hci0: Bootloader timestamp 2023.43 buildtype 1 build 11631
[20.359070] Bluetooth: hci0: Found device firmware: intel/ibt-00a0-00a1-iml.sfi
[20.371499] Bluetooth: hci0: Boot Address: 0xb02ff800
[20.385769] Bluetooth: hci0: Firmware Version: 166-34.25
[20.538257] Bluetooth: hci0: Waiting for firmware download to complete
[20.554424] Bluetooth: hci0: Firmware loaded in 178651 usecs
[21.081588] Bluetooth: hci0: Timeout (500 ms) on tx completion
[21.096541] Bluetooth: hci0: Failed to send frame (-62)
[21.110240] Bluetooth: hci0: sending frame failed (-62)
[21.138551] Bluetooth: hci0: Failed to send Intel Reset command
[21.170153] Bluetooth: hci0: Intel Soft Reset failed (-62)
Signed-off-by: Kiran K <kiran.k@intel.com>
Signed-off-by: Sai Teja Aluvala <aluvala.sai.teja@intel.com>
Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de>
Fixes: c2b636b3f7 ("Bluetooth: btintel_pcie: Add support for PCIe transport")
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit c403da5e98 ]
Socket dst_type cannot be directly assigned to hci_conn->type since
there domain is different which may lead to the wrong address type being
used.
Fixes: 6a5ad251b7 ("Bluetooth: ISO: Fix possible circular locking dependency")
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit e8785404de ]
There is a BUG: KASAN: stack-out-of-bounds in set_mesh_sync due to
memcpy from badly declared on-stack flexible array.
Another crash is in set_mesh_complete() due to double list_del via
mgmt_pending_valid + mgmt_pending_remove.
Use DEFINE_FLEX to declare the flexible array right, and don't memcpy
outside bounds.
As mgmt_pending_valid removes the cmd from list, use mgmt_pending_free,
and also report status on error.
Fixes: 302a1f674c ("Bluetooth: MGMT: Fix possible UAFs")
Signed-off-by: Pauli Virtanen <pav@iki.fi>
Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 0d92808024 ]
This fixes the state tracking of advertisement set/instance 0x00 which
is considered a legacy instance and is not tracked individually by
adv_instances list, previously it was assumed that hci_dev itself would
track it via HCI_LE_ADV but that is a global state not specifc to
instance 0x00, so to fix it a new flag is introduced that only tracks the
state of instance 0x00.
Fixes: 1488af7b8b ("Bluetooth: hci_sync: Fix hci_resume_advertising_sync")
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 77343b8b4f ]
This patch adds logic to handle power management control when the
Bluetooth function is closed during the SDIO reset sequence.
Specifically, if BT is closed before reset, the driver enables the
SDIO function and sets driver pmctrl. After reset, if BT remains
closed, the driver sets firmware pmctrl and disables the SDIO function.
These changes ensure proper power management and device state consistency
across the reset flow.
Fixes: 8fafe70225 ("Bluetooth: mt7921s: support bluetooth reset mechanism")
Signed-off-by: Chris Lu <chris.lu@mediatek.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f0c200a4a5 ]
Socket dst_type cannot be directly assigned to hci_conn->type since
there domain is different which may lead to the wrong address type being
used.
Fixes: 6a5ad251b7 ("Bluetooth: ISO: Fix possible circular locking dependency")
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 09b0cd1297 ]
hci_cmd_sync_dequeue_once() does lookup and then cancel
the entry under two separate lock sections. Meanwhile,
hci_cmd_sync_work() can also delete the same entry,
leading to double list_del() and "UAF".
Fix this by holding cmd_sync_work_lock across both
lookup and cancel, so that the entry cannot be removed
concurrently.
Fixes: 505ea2b295 ("Bluetooth: hci_sync: Add helper functions to manipulate cmd_sync queue")
Reported-by: Cen Zhang <zzzccc427@163.com>
Signed-off-by: Cen Zhang <zzzccc427@163.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 420c84c330 ]
The root cause of this issue are:
1. When probing the usbnet device, executing usbnet_link_change(dev, 0, 0);
put the kevent work in global workqueue. However, the kevent has not yet
been scheduled when the usbnet device is unregistered. Therefore, executing
free_netdev() results in the "free active object (kevent)" error reported
here.
2. Another factor is that when calling usbnet_disconnect()->unregister_netdev(),
if the usbnet device is up, ndo_stop() is executed to cancel the kevent.
However, because the device is not up, ndo_stop() is not executed.
The solution to this problem is to cancel the kevent before executing
free_netdev().
Fixes: a69e617e53 ("usbnet: Fix linkwatch use-after-free on disconnect")
Reported-by: Sam Sun <samsun1006219@gmail.com>
Closes: https://syzkaller.appspot.com/bug?extid=8bfd7bcc98f7300afb84
Signed-off-by: Lizhi Xu <lizhi.xu@windriver.com>
Link: https://patch.msgid.link/20251022024007.1831898-1-lizhi.xu@windriver.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 79a6f2da16 ]
Both mt8195-afe-pcm and mt8365-afe-pcm drivers use devm_pm_runtime_enable()
in probe function, which automatically calls pm_runtime_disable() on device
removal via devres mechanism. However, the remove callbacks explicitly call
pm_runtime_disable() again, resulting in double pm_runtime_disable() calls.
Fix by removing the redundant pm_runtime_disable() calls from remove
functions, letting the devres framework handle it automatically.
Fixes: 2ca0ec01d4 ("ASoC: mediatek: mt8195-afe-pcm: Simplify runtime PM during probe")
Fixes: e1991d102b ("ASoC: mediatek: mt8365: Add the AFE driver support")
Signed-off-by: Haotian Zhang <vulab@iscas.ac.cn>
Link: https://patch.msgid.link/20251020170440.585-1-vulab@iscas.ac.cn
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit cfca1637bc ]
The pcm->prepare() function may be called multiple times in a row by the
userspace, as mentioned in the documentation. The driver shall take that
into account and prevent redundancy. However, the exact same function is
called during XRUNs and in such case, the particular stream shall be
reset and setup anew.
Fixes: 9114700b49 ("ASoC: Intel: avs: Generic PCM FE operations")
Signed-off-by: Cezary Rojewski <cezary.rojewski@intel.com>
Link: https://patch.msgid.link/20251023092348.3119313-2-cezary.rojewski@intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 3c9bf72cc1 ]
The clock obtained via devm_clk_get_enabled() is automatically managed
by devres and will be disabled and freed on driver detach. Manually
calling clk_disable_unprepare() in error path and remove function
causes double free.
Remove the manual clock cleanup in both aspeed_acry_probe()'s error
path and aspeed_acry_remove().
Fixes: 2f1cf4e50c ("crypto: aspeed - Add ACRY RSA driver")
Signed-off-by: Haotian Zhang <vulab@iscas.ac.cn>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 3ac2939bc4 ]
The phmac implementation used the req->nbytes field on combined
operations (finup, digest) to track the state:
with req->nbytes > 0 the update needs to be processed,
while req->nbytes == 0 means to do the final operation. For
this purpose the req->nbytes field was set to 0 after successful
update operation. However, aead uses the req->nbytes field after a
successful hash operation to determine the amount of data to
en/decrypt. So an implementation must not modify the nbytes field.
Fixed by a slight rework on the phmac implementation. There is
now a new field async_op in the request context which tracks
the (asynch) operation to process. So the 'state' via req->nbytes
is not needed any more and now this field is untouched and may
be evaluated even after a request is processed by the phmac
implementation.
Fixes: cbbc675506 ("crypto: s390 - New s390 specific protected key hash phmac")
Reported-by: Ingo Franzki <ifranzki@linux.ibm.com>
Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Tested-by: Ingo Franzki <ifranzki@linux.ibm.com>
Reviewed-by: Ingo Franzki <ifranzki@linux.ibm.com>
Reviewed-by: Holger Dengler <dengler@linux.ibm.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 60ad1de8e5 ]
The target code should set the sc_c bit in calculating the host response
based on the status of the 'concat' setting, otherwise we'll get an
authentication mismatch for hosts setting that bit correctly.
Fixes: 7e091add9c ("nvme-auth: update sc_c in host response")
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 881a9c9cb7 ]
The failure of this check only results in a security mitigation being
applied, slightly affecting performance of the compiled BPF program. It
doesn't result in a failed syscall, an thus auditing a failed LSM
permission check for it is unwanted. For example with SELinux, it causes
a denial to be reported for confined processes running as root, which
tends to be flagged as a problem to be fixed in the policy. Yet
dontauditing or allowing CAP_SYS_ADMIN to the domain may not be
desirable, as it would allow/silence also other checks - either going
against the principle of least privilege or making debugging potentially
harder.
Fix it by changing it from capable() to ns_capable_noaudit(), which
instructs the LSMs to not audit the resulting denials.
Link: https://bugzilla.redhat.com/show_bug.cgi?id=2369326
Fixes: d4e89d212d ("x86/bpf: Call branch history clearing sequence on exit")
Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
Reviewed-by: Paul Moore <paul@paul-moore.com>
Link: https://lore.kernel.org/r/20251021122758.2659513-1-omosnace@redhat.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 2551a1eedc ]
The previous implementation incorrectly assumed the original type of
'priv' was void**, leading to an unnecessary and misleading
cast. Correct the cast of the 'priv' pointer in test_dev_action() to
its actual type, long*, removing an unnecessary cast.
As an additional benefit, this fixes an out-of-bounds CHERI fault on
hardware with architectural capabilities. The original implementation
tried to store a capability-sized pointer using the priv
pointer. However, the priv pointer's capability only granted access to
the memory region of its original long type, leading to a bounds
violation since the size of a long is smaller than the size of a
capability. This change ensures that the pointer usage respects the
capabilities' bounds.
Link: https://lore.kernel.org/r/20251017092814.80022-1-florian.schmaus@codasip.com
Fixes: d03c720e03 ("kunit: Add APIs for managing devices")
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Florian Schmaus <florian.schmaus@codasip.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 6078447614 ]
When ieee80211_stop_ap() deletes the FILS discovery and unsolicited
broadcast probe response templates, the associated interval values
are not reset. This can lead to drivers subsequently operating with
the non-zero values, leading to unexpected behavior.
Trigger repeated retrieval attempts of the FILS discovery template in
ath12k, resulting in excessive log messages such as:
mac vdev 0 failed to retrieve FILS discovery template
mac vdev 4 failed to retrieve FILS discovery template
Fix this by resetting the intervals in ieee80211_stop_ap() to ensure
proper cleanup of FILS discovery and unsolicited broadcast probe
response templates.
Fixes: 295b02c4be ("mac80211: Add FILS discovery support")
Fixes: 632189a018 ("mac80211: Unsolicited broadcast probe response support")
Signed-off-by: Aloka Dixit <aloka.dixit@oss.qualcomm.com>
Signed-off-by: Aaradhana Sahu <aaradhana.sahu@oss.qualcomm.com>
Link: https://patch.msgid.link/20250924130014.2575533-1-aaradhana.sahu@oss.qualcomm.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 9c78e747dd ]
Bitwise operations with WMI_KEY_PAIRWISE (defined as 0) are ineffective
and misleading. This results in pairwise key validations added in
commit 97acb0259c ("wifi: ath11k: fix group data packet drops
during rekey") to always evaluate false and clear key commands for
pairwise keys are not honored.
Since firmware supports overwriting the new key without explicitly
clearing the previous one, there is no visible impact currently.
However, to restore consistency with the previous behavior and improve
clarity, replace bitwise operations with direct assignments and
comparisons for key flags.
Tested-on: QCN9074 hw1.0 PCI WLAN.HK.2.9.0.1-02146-QCAHKSWPL_SILICONZ-1
Tested-on: WCN6855 hw2.1 PCI WLAN.HSP.1.1-03125-QCAHSPSWPL_V1_V2_SILICONZ_LITE-3.6510.41
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Closes: https://lore.kernel.org/linux-wireless/aLlaetkalDvWcB7b@stanley.mountain
Fixes: 97acb0259c ("wifi: ath11k: fix group data packet drops during rekey")
Signed-off-by: Rameshkumar Sundaram <rameshkumar.sundaram@oss.qualcomm.com>
Reviewed-by: Vasanthakumar Thiagarajan <vasanthakumar.thiagarajan@oss.qualcomm.com>
Link: https://patch.msgid.link/20251003092158.1080637-1-rameshkumar.sundaram@oss.qualcomm.com
[update copyright per current guidance]
Signed-off-by: Jeff Johnson <jeff.johnson@oss.qualcomm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 92282074e1 ]
ath12k just like ath11k [1] did not handle skb cleanup during idr
cleanup callback. Both ath12k_mac_vif_txmgmt_idr_remove() and
ath12k_mac_tx_mgmt_pending_free() performed idr cleanup and DMA
unmapping for skb but only ath12k_mac_tx_mgmt_pending_free() freed
skb. As a result, during vdev deletion a memory leak occurs.
Refactor all clean up steps into a new function. New function
ath12k_mac_tx_mgmt_free() creates a centralized area where idr
cleanup, DMA unmapping for skb and freeing skb is performed. Utilize
skb pointer given by idr_remove(), instead of passed as a function
argument because IDR will be protected by locking. This will prevent
concurrent modification of the same IDR.
Now ath12k_mac_tx_mgmt_pending_free() and
ath12k_mac_vif_txmgmt_idr_remove() call ath12k_mac_tx_mgmt_free().
Tested-on: QCN9274 hw2.0 PCI WLAN.WBE.1.4.1-00199-QCAHKSWPL_SILICONZ-1
Link: https://lore.kernel.org/r/1637832614-13831-1-git-send-email-quic_srirrama@quicinc.com > # [1]
Fixes: d889913205 ("wifi: ath12k: driver for Qualcomm Wi-Fi 7 devices")
Signed-off-by: Karthik M <quic_karm@quicinc.com>
Signed-off-by: Muna Sinada <muna.sinada@oss.qualcomm.com>
Reviewed-by: Vasanthakumar Thiagarajan <vasanthakumar.thiagarajan@oss.qualcomm.com>
Reviewed-by: Baochen Qiang <baochen.qiang@oss.qualcomm.com>
Link: https://patch.msgid.link/20250923220316.1595758-1-muna.sinada@oss.qualcomm.com
Signed-off-by: Jeff Johnson <jeff.johnson@oss.qualcomm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit 388eff894d upstream.
Sean reported [1] the following splat when running KVM tests:
WARNING: CPU: 232 PID: 15391 at xfd_validate_state+0x65/0x70
Call Trace:
<TASK>
fpu__clear_user_states+0x9c/0x100
arch_do_signal_or_restart+0x142/0x210
exit_to_user_mode_loop+0x55/0x100
do_syscall_64+0x205/0x2c0
entry_SYSCALL_64_after_hwframe+0x4b/0x53
Chao further identified [2] a reproducible scenario involving signal
delivery: a non-AMX task is preempted by an AMX-enabled task which
modifies the XFD MSR.
When the non-AMX task resumes and reloads XSTATE with init values,
a warning is triggered due to a mismatch between fpstate::xfd and the
CPU's current XFD state. fpu__clear_user_states() does not currently
re-synchronize the XFD state after such preemption.
Invoke xfd_update_state() which detects and corrects the mismatch if
there is a dynamic feature.
This also benefits the sigreturn path, as fpu__restore_sig() may call
fpu__clear_user_states() when the sigframe is inaccessible.
[ dhansen: minor changelog munging ]
Closes: https://lore.kernel.org/lkml/aDCo_SczQOUaB2rS@google.com [1]
Fixes: 672365477a ("x86/fpu: Update XFD state where required")
Reported-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Chao Gao <chao.gao@intel.com>
Tested-by: Chao Gao <chao.gao@intel.com>
Link: https://lore.kernel.org/all/aDWbctO%2FRfTGiCg3@intel.com [2]
Cc:stable@vger.kernel.org
Link: https://patch.msgid.link/20250610001700.4097-1-chang.seok.bae%40intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 607b9fb2ce upstream.
There's an issue with RDSEED's 16-bit and 32-bit register output
variants on Zen5 which return a random value of 0 "at a rate inconsistent
with randomness while incorrectly signaling success (CF=1)". Search the
web for AMD-SB-7055 for more detail.
Add a fix glue which checks microcode revisions.
[ bp: Add microcode revisions checking, rewrite. ]
Cc: stable@vger.kernel.org
Signed-off-by: Gregory Price <gourry@gourry.net>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20251018024010.4112396-1-gourry@gourry.net
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0d6e9ec80c upstream.
Leyvi Rose reported that his X86_NATIVE_CPU=y build is failing because our
instruction decoder doesn't support SSE4a and the AMDGPU code seems to be
generating those with his compiler of choice (CLANG+LTO).
Now, our normal build flags disable SSE MMX SSE2 3DNOW AVX, but then
CC_FLAGS_FPU re-enable SSE SSE2.
Since nothing mentions SSE3 or SSE4, I'm assuming that -msse (or its negative)
control all SSE variants -- but why then explicitly enumerate SSE2 ?
Anyway, until the instruction decoder gets fixed, explicitly disallow SSE4a
(an AMD specific SSE4 extension).
Fixes: ea1dcca1de ("x86/kbuild/64: Add the CONFIG_X86_NATIVE_CPU option to locally optimize the kernel with '-march=native'")
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Acked-by: Borislav Petkov (AMD) <bp@alien8.de>
Acked-by: Arisu Tachibana <arisu.tachibana@miraclelinux.com>
Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Harry Wentland <harry.wentland@amd.com>
Cc: <stable@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5c76f9961c upstream.
When smb2_query_info_compound() retries, a previously allocated cfid may
have been freed in the first attempt.
Because cfid wasn't reset on replay, later cleanup could act on a stale
pointer, leading to a potential use-after-free.
Reinitialize cfid to NULL under the replay label.
Example trace (trimmed):
refcount_t: underflow; use-after-free.
WARNING: CPU: 1 PID: 11224 at ../lib/refcount.c:28 refcount_warn_saturate+0x9c/0x110
[...]
RIP: 0010:refcount_warn_saturate+0x9c/0x110
[...]
Call Trace:
<TASK>
smb2_query_info_compound+0x29c/0x5c0 [cifs f90b72658819bd21c94769b6a652029a07a7172f]
? step_into+0x10d/0x690
? __legitimize_path+0x28/0x60
smb2_queryfs+0x6a/0xf0 [cifs f90b72658819bd21c94769b6a652029a07a7172f]
smb311_queryfs+0x12d/0x140 [cifs f90b72658819bd21c94769b6a652029a07a7172f]
? kmem_cache_alloc+0x18a/0x340
? getname_flags+0x46/0x1e0
cifs_statfs+0x9f/0x2b0 [cifs f90b72658819bd21c94769b6a652029a07a7172f]
statfs_by_dentry+0x67/0x90
vfs_statfs+0x16/0xd0
user_statfs+0x54/0xa0
__do_sys_statfs+0x20/0x50
do_syscall_64+0x58/0x80
Cc: stable@kernel.org
Fixes: 4f1fffa237 ("cifs: commands that are retried should have replay flag set")
Reviewed-by: Paulo Alcantara (Red Hat) <pc@manguebit.com>
Acked-by: Shyam Prasad N <sprasad@microsoft.com>
Reviewed-by: Enzo Matsumiya <ematsumiya@suse.de>
Signed-off-by: Henrique Carvalho <henrique.carvalho@suse.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b45873c3f0 upstream.
Commit c1e18c17bd ("s390/pci: add zpci_set_irq()/zpci_clear_irq()"),
introduced the zpci_set_irq() and zpci_clear_irq(), to be used while
resetting a zPCI device.
Commit da995d538d ("s390/pci: implement reset_slot for hotplug
slot"), mentions zpci_clear_irq() being called in the path for
zpci_hot_reset_device(). But that is not the case anymore and these
functions are not called outside of this file. Instead
zpci_hot_reset_device() relies on zpci_disable_device() also clearing
the IRQs, but misses to reset the zdev->irqs_registered flag.
However after a CLP disable/enable reset, the device's IRQ are
unregistered, but the flag zdev->irq_registered does not get cleared. It
creates an inconsistent state and so arch_restore_msi_irqs() doesn't
correctly restore the device's IRQ. This becomes a problem when a PCI
driver tries to restore the state of the device through
pci_restore_state(). Restore IRQ unconditionally for the device and remove
the irq_registered flag as its redundant.
Fixes: c1e18c17bd ("s390/pci: add zpci_set_irq()/zpci_clear_irq()")
Cc: stable@vger.kernnel.org
Reviewed-by: Niklas Schnelle <schnelle@linux.ibm.com>
Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com>
Signed-off-by: Farhan Ali <alifm@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 22897e5686 upstream.
When the driver supports DMA, it enqueues four DMA descriptors per
substream before the substream is started. New descriptors are enqueued in
the DMA completion callback, and each time a new descriptor is queued, the
dma_buffer_pos is incremented.
During suspend, the DMA transactions are terminated. There might be cases
where the four extra enqueued DMA descriptors are not completed and are
instead canceled on suspend. However, the cancel operation does not take
into account that the dma_buffer_pos was already incremented.
Previously, the suspend code reinitialized dma_buffer_pos to zero, but this
is not always correct.
To avoid losing any audio periods during suspend/resume and to prevent
clip sound, save the completed DMA buffer position in the DMA callback and
reinitialize dma_buffer_pos on resume.
Cc: stable@vger.kernel.org
Fixes: 1fc778f7c8 ("ASoC: renesas: rz-ssi: Add suspend to RAM support")
Signed-off-by: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com>
Link: https://patch.msgid.link/20251029141134.2556926-3-claudiu.beznea.uj@bp.renesas.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 27b0e701d3 upstream.
Accessing the transmit queue without owning the msk socket lock is
inherently racy, hence __mptcp_check_push() could actually quit early
even when there is pending data.
That in turn could cause unexpected tx lock and timeout.
Dropping the early check avoids the race, implicitly relaying on later
tests under the relevant lock. With such change, all the other
mptcp_send_head() call sites are now under the msk socket lock and we
can additionally drop the now unneeded annotation on the transmit head
pointer accesses.
Fixes: 6e628cd3a8 ("mptcp: use mptcp release_cb for delayed tasks")
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Geliang Tang <geliang@kernel.org>
Tested-by: Geliang Tang <geliang@kernel.org>
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20251028-net-mptcp-send-timeout-v1-1-38ffff5a9ec8@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit eb53368f8d upstream.
The of_find_node_by_name() function returns a device tree node with its
reference count incremented. The caller is responsible for calling
of_node_put() to release this reference when done.
Found via static analysis.
Fixes: cc5d0189b9 ("[PATCH] powerpc: Remove device_node addrs/n_addr")
Cc: stable@vger.kernel.org
Signed-off-by: Miaoqian Lin <linmq006@gmail.com>
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5f566c0ac5 upstream.
Commit e24cca19ba ("sh: Kill off MAX_DMA_ADDRESS leftovers.") removed
the define ONCHIP_NR_DMA_CHANNELS. So that the leftover reference needs
to be replaced by CONFIG_NR_ONCHIP_DMA_CHANNELS to compile successfully
with CONFIG_PVR2_DMA enabled.
Signed-off-by: Florian Fuchs <fuchsfl@gmail.com>
Reviewed-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Signed-off-by: Helge Deller <deller@gmx.de>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3776c685eb upstream.
Currently, whenever there is a need to transmit an Action frame,
the brcmfmac driver always uses the P2P vif to send the "actframe" IOVAR to
firmware. The P2P interfaces were available when wpa_supplicant is managing
the wlan interface.
However, the P2P interfaces are not created/initialized when only hostapd
is managing the wlan interface. And if hostapd receives an ANQP Query REQ
Action frame even from an un-associated STA, the brcmfmac driver tries
to use an uninitialized P2P vif pointer for sending the IOVAR to firmware.
This NULL pointer dereferencing triggers a driver crash.
[ 1417.074538] Unable to handle kernel NULL pointer dereference at virtual
address 0000000000000000
[...]
[ 1417.075188] Hardware name: Raspberry Pi 4 Model B Rev 1.5 (DT)
[...]
[ 1417.075653] Call trace:
[ 1417.075662] brcmf_p2p_send_action_frame+0x23c/0xc58 [brcmfmac]
[ 1417.075738] brcmf_cfg80211_mgmt_tx+0x304/0x5c0 [brcmfmac]
[ 1417.075810] cfg80211_mlme_mgmt_tx+0x1b0/0x428 [cfg80211]
[ 1417.076067] nl80211_tx_mgmt+0x238/0x388 [cfg80211]
[ 1417.076281] genl_family_rcv_msg_doit+0xe0/0x158
[ 1417.076302] genl_rcv_msg+0x220/0x2a0
[ 1417.076317] netlink_rcv_skb+0x68/0x140
[ 1417.076330] genl_rcv+0x40/0x60
[ 1417.076343] netlink_unicast+0x330/0x3b8
[ 1417.076357] netlink_sendmsg+0x19c/0x3f8
[ 1417.076370] __sock_sendmsg+0x64/0xc0
[ 1417.076391] ____sys_sendmsg+0x268/0x2a0
[ 1417.076408] ___sys_sendmsg+0xb8/0x118
[ 1417.076427] __sys_sendmsg+0x90/0xf8
[ 1417.076445] __arm64_sys_sendmsg+0x2c/0x40
[ 1417.076465] invoke_syscall+0x50/0x120
[ 1417.076486] el0_svc_common.constprop.0+0x48/0xf0
[ 1417.076506] do_el0_svc+0x24/0x38
[ 1417.076525] el0_svc+0x30/0x100
[ 1417.076548] el0t_64_sync_handler+0x100/0x130
[ 1417.076569] el0t_64_sync+0x190/0x198
[ 1417.076589] Code: f9401e80 aa1603e2 f9403be1 5280e483 (f9400000)
Fix this, by always using the vif corresponding to the wdev on which the
Action frame Transmission request was initiated by the userspace. This way,
even if P2P vif is not available, the IOVAR is sent to firmware on AP vif
and the ANQP Query RESP Action frame is transmitted without crashing the
driver.
Move init_completion() for "send_af_done" from brcmf_p2p_create_p2pdev()
to brcmf_p2p_attach(). Because the former function would not get executed
when only hostapd is managing wlan interface, and it is not safe to do
reinit_completion() later in brcmf_p2p_tx_action_frame(), without any prior
init_completion().
And in the brcmf_p2p_tx_action_frame() function, the condition check for
P2P Presence response frame is not needed, since the wpa_supplicant is
properly sending the P2P Presense Response frame on the P2P-GO vif instead
of the P2P-Device vif.
Cc: stable@vger.kernel.org
Fixes: 18e2f61db3 ("brcmfmac: P2P action frame tx")
Signed-off-by: Gokul Sivakumar <gokulkumar.sivakumar@infineon.com>
Acked-by: Arend van Spriel <arend.vanspriel@broadcom.com>
Link: https://patch.msgid.link/20251013102819.9727-1-gokulkumar.sivakumar@infineon.com
[Cc stable]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 91d35ec9b3 upstream.
The RFCOMM driver confuses the local and remote modem control signals,
which specifically means that the reported DTR and RTS state will
instead reflect the remote end (i.e. DSR and CTS).
This issue dates back to the original driver (and a follow-on update)
merged in 2002, which resulted in a non-standard implementation of
TIOCMSET that allowed controlling also the TS07.10 IC and DV signals by
mapping them to the RI and DCD input flags, while TIOCMGET failed to
return the actual state of DTR and RTS.
Note that the bogus control of input signals in tiocmset() is just
dead code as those flags will have been masked out by the tty layer
since 2003.
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Cc: stable@vger.kernel.org
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f12b69d8f2 upstream.
Trying to dump the originators or the neighbors via netlink for a meshif
with an inactive primary interface is not allowed. The dump functions were
checking this correctly but they didn't handle non-existing primary
interfaces and existing _inactive_ interfaces differently.
(Primary) batadv_hard_ifaces hold a references to a net_device. And
accessing them is only allowed when either being in a RCU/spinlock
protected section or when holding a valid reference to them. The netlink
dump functions use the latter.
But because the missing specific error handling for inactive primary
interfaces, the reference was never dropped. This reference counting error
was only detected when the interface should have been removed from the
system:
unregister_netdevice: waiting for batadv_slave_0 to become free. Usage count = 2
Cc: stable@vger.kernel.org
Fixes: 6ecc4fd6c2 ("batman-adv: netlink: reduce duplicate code by returning interfaces")
Reported-by: syzbot+881d65229ca4f9ae8c84@syzkaller.appspotmail.com
Reported-by: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1ab6658174 upstream.
In virtio-net, we have not yet supported multi-buffer XDP packet in
zerocopy mode when there is a binding XDP program. However, in that
case, when receiving multi-buffer XDP packet, we skip the XDP program
and return XDP_PASS. As a result, the packet is passed to normal network
stack which is an incorrect behavior (e.g. a XDP program for packet
count is installed, multi-buffer XDP packet arrives and does go through
XDP program. As a result, the packet count does not increase but the
packet is still received from network stack).This commit instead returns
XDP_ABORTED in that case.
Fixes: 99c861b44e ("virtio_net: xsk: rx: support recv merge mode")
Cc: stable@vger.kernel.org
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Signed-off-by: Bui Quang Minh <minhquangbui99@gmail.com>
Link: https://patch.msgid.link/20251022155630.49272-1-minhquangbui99@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d91a1d129b upstream.
Device-managed resources are cleaned up when the driver unbinds from
the underlying device. In our case this is the platform device as this
driver is a platform driver. Registering device-managed resources on
the associated ACPI device will thus result in a resource leak when
this driver unbinds.
Ensure that any device-managed resources are only registered on the
platform device to ensure that they are cleaned up during removal.
Fixes: 35c50d853a ("ACPI: fan: Add hwmon support")
Signed-off-by: Armin Wolf <W_Armin@gmx.de>
Cc: 6.11+ <stable@vger.kernel.org> # 6.11+
Link: https://patch.msgid.link/20251007234149.2769-4-W_Armin@gmx.de
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8f067aa594 upstream.
The switch_brightness_work delayed work accesses device->brightness
and device->backlight, freed by acpi_video_dev_unregister_backlight()
during device removal.
If the work executes after acpi_video_bus_unregister_backlight()
frees these resources, it causes a use-after-free when
acpi_video_switch_brightness() dereferences device->brightness or
device->backlight.
Fix this by calling cancel_delayed_work_sync() for each device's
switch_brightness_work in acpi_video_bus_remove_notify_handler()
after removing the notify handler that queues the work. This ensures
the work completes before the memory is freed.
Fixes: 8ab58e8e7e ("ACPI / video: Fix backlight taking 2 steps on a brightness up/down keypress")
Cc: All applicable <stable@vger.kernel.org>
Signed-off-by: Yuhao Jiang <danisjiang@gmail.com>
Reviewed-by: Hans de Goede <hansg@kernel.org>
[ rjw: Changelog edit ]
Link: https://patch.msgid.link/20251022200704.2655507-1-danisjiang@gmail.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7073c7fc8d upstream.
Actually check the return value from pll_ops->init_pll()
as it can return an error.
If the card's BIOS didn't run because it's not the primary VGA card
the fact that the xclk source is unsupported is printed as shown
below but the driver continues on regardless and on my machine causes
a hard lock up.
[ 61.470088] atyfb 0000:03:05.0: enabling device (0080 -> 0083)
[ 61.476191] atyfb: using auxiliary register aperture
[ 61.481239] atyfb: 3D RAGE XL (Mach64 GR, PCI-33) [0x4752 rev 0x27]
[ 61.487569] atyfb: 512K SGRAM (1:1), 14.31818 MHz XTAL, 230 MHz PLL, 83 Mhz MCLK, 63 MHz XCLK
[ 61.496112] atyfb: Unsupported xclk source: 5.
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Daniel Palmer <daniel@0x0f.com>
Signed-off-by: Helge Deller <deller@gmx.de>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a1f3058930 upstream.
Recently, we discovered the following issue through syzkaller:
BUG: KASAN: slab-use-after-free in fb_mode_is_equal+0x285/0x2f0
Read of size 4 at addr ff11000001b3c69c by task syz.xxx
...
Call Trace:
<TASK>
dump_stack_lvl+0xab/0xe0
print_address_description.constprop.0+0x2c/0x390
print_report+0xb9/0x280
kasan_report+0xb8/0xf0
fb_mode_is_equal+0x285/0x2f0
fbcon_mode_deleted+0x129/0x180
fb_set_var+0xe7f/0x11d0
do_fb_ioctl+0x6a0/0x750
fb_ioctl+0xe0/0x140
__x64_sys_ioctl+0x193/0x210
do_syscall_64+0x5f/0x9c0
entry_SYSCALL_64_after_hwframe+0x76/0x7e
Based on experimentation and analysis, during framebuffer unregistration,
only the memory of fb_info->modelist is freed, without setting the
corresponding fb_display[i]->mode to NULL for the freed modes. This leads
to UAF issues during subsequent accesses. Here's an example of reproduction
steps:
1. With /dev/fb0 already registered in the system, load a kernel module
to register a new device /dev/fb1;
2. Set fb1's mode to the global fb_display[] array (via FBIOPUT_CON2FBMAP);
3. Switch console from fb to VGA (to allow normal rmmod of the ko);
4. Unload the kernel module, at this point fb1's modelist is freed, leaving
a wild pointer in fb_display[];
5. Trigger the bug via system calls through fb0 attempting to delete a mode
from fb0.
Add a check in do_unregister_framebuffer(): if the mode to be freed exists
in fb_display[], set the corresponding mode pointer to NULL.
Signed-off-by: Quanmin Yan <yanquanmin1@huawei.com>
Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de>
Signed-off-by: Helge Deller <deller@gmx.de>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3e7f011c25 upstream.
I've found that pynfs COMP6 now leaves the connection or lease in a
strange state, which causes CLOSE9 to hang indefinitely. I've dug
into it a little, but I haven't been able to root-cause it yet.
However, I bisected to commit 48aab1606f ("NFSD: Remove the cap on
number of operations per NFSv4 COMPOUND").
Tianshuo Han also reports a potential vulnerability when decoding
an NFSv4 COMPOUND. An attacker can place an arbitrarily large op
count in the COMPOUND header, which results in:
[ 51.410584] nfsd: vmalloc error: size 1209533382144, exceeds total
pages, mode:0xdc0(GFP_KERNEL|__GFP_ZERO),
nodemask=(null),cpuset=/,mems_allowed=0
when NFSD attempts to allocate the COMPOUND op array.
Let's restore the operation-per-COMPOUND limit, but increased to 200
for now.
Reported-by: tianshuo han <hantianshuo233@gmail.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Cc: stable@vger.kernel.org
Tested-by: Tianshuo Han <hantianshuo233@gmail.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4f76435fd5 upstream.
NFSv4 clients won't send legitimate GETATTR requests for these new
attributes because they are intended to be used only with CB_GETATTR
and SETATTR. But NFSD has to do something besides crashing if it
ever sees a GETATTR request that queries these attributes.
RFC 8881 Section 18.7.3 states:
> The server MUST return a value for each attribute that the client
> requests if the attribute is supported by the server for the
> target file system. If the server does not support a particular
> attribute on the target file system, then it MUST NOT return the
> attribute value and MUST NOT set the attribute bit in the result
> bitmap. The server MUST return an error if it supports an
> attribute on the target but cannot obtain its value. In that case,
> no attribute values will be returned.
Further, RFC 9754 Section 5 states:
> These new attributes are invalid to be used with GETATTR, VERIFY,
> and NVERIFY, and they can only be used with CB_GETATTR and SETATTR
> by a client holding an appropriate delegation.
Thus there does not appear to be a specific server response mandated
by specification. Taking the guidance that querying these attributes
via GETATTR is "invalid", NFSD will return nfserr_inval, failing the
request entirely.
Reported-by: Robert Morris <rtm@csail.mit.edu>
Closes: https://lore.kernel.org/linux-nfs/7819419cf0cb50d8130dc6b747765d2b8febc88a.camel@kernel.org/T/#t
Fixes: 51c0d4f7e3 ("nfsd: add support for FATTR4_OPEN_ARGUMENTS")
Cc: stable@vger.kernel.org
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 54e96258a6 upstream.
scx_bpf_dsq_move_set_slice() and scx_bpf_dsq_move_set_vtime() take a DSQ
iterator argument which has to be valid. Mark them with KF_RCU.
Fixes: 4c30f5ce4f ("sched_ext: Implement scx_bpf_dispatch[_vtime]_from_dsq()")
Cc: stable@vger.kernel.org # v6.12+
Acked-by: Andrea Righi <arighi@nvidia.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 76e20da0bd upstream.
This reverts commit c9d84da18d. It
replaces in L2CAP calls to msecs_to_jiffies() to secs_to_jiffies()
and updates the constants accordingly. But the constants are also
used in LCAP Configure Request and L2CAP Configure Response which
expect values in milliseconds.
This may prevent correct usage of L2CAP channel.
To fix it, keep those constants in milliseconds and so revert this
change.
Fixes: c9d84da18d ("Bluetooth: L2CAP: convert timeouts to secs_to_jiffies()")
Signed-off-by: Frédéric Danis <frederic.danis@collabora.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e92c294120 upstream.
The parentheses for the unlikely() annotation were put in the wrong
place so it means that the condition is basically never true and the
bounds checking is skipped.
Fixes: aab9458b9f ("btrfs: tree-checker: add inode extref checks")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 35561bab76 ]
The include/generated/asm-offsets.h is generated in Kbuild during
compiling from arch/SRCARCH/kernel/asm-offsets.c. When we want to
generate another similar offset header file, circular dependency can
happen.
For example, we want to generate a offset file include/generated/test.h,
which is included in include/sched/sched.h. If we generate asm-offsets.h
first, it will fail, as include/sched/sched.h is included in asm-offsets.c
and include/generated/test.h doesn't exist; If we generate test.h first,
it can't success neither, as include/generated/asm-offsets.h is included
by it.
In x86_64, the macro COMPILE_OFFSETS is used to avoid such circular
dependency. We can generate asm-offsets.h first, and if the
COMPILE_OFFSETS is defined, we don't include the "generated/test.h".
And we define the macro COMPILE_OFFSETS for all the asm-offsets.c for this
purpose.
Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit d452972858 ]
The qmap dump operation was destructively consuming queue entries while
displaying them. As dump can be triggered anytime, this can easily lead to
stalls. Add a temporary dump_store queue and modify the dump logic to pop
entries, display them, and then restore them back to the original queue.
This allows dump operations to be performed without affecting the
scheduler's queue state.
Note that if racing against new enqueues during dump, ordering can get
mixed up, but this is acceptable for debugging purposes.
Acked-by: Andrea Righi <arighi@nvidia.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 45c222468d ]
After setting the BTRFS_ROOT_FORCE_COW flag on the root we are doing a
full write barrier, smp_wmb(), but we don't need to, all we need is a
smp_mb__after_atomic(). The use of the smp_wmb() is from the old days
when we didn't use a bit and used instead an int field in the root to
signal if cow is forced. After the int field was changed to a bit in
the root's state (flags field), we forgot to update the memory barrier
in create_pending_snapshot() to smp_mb__after_atomic(), but we did the
change in commit_fs_roots() after clearing BTRFS_ROOT_FORCE_COW. That
happened in commit 27cdeb7096 ("Btrfs: use bitfield instead of integer
data type for the some variants in btrfs_root"). On the reader side, in
should_cow_block(), we also use the counterpart smp_mb__before_atomic()
which generates further confusion.
So change the smp_wmb() to smp_mb__after_atomic(). In fact we don't
even need any barrier at all since create_pending_snapshot() is called
in the critical section of a transaction commit and therefore no one
can concurrently join/attach the transaction, or start a new one, until
the transaction is unblocked. By the time someone starts a new transaction
and enters should_cow_block(), a lot of implicit memory barriers already
took place by having acquired several locks such as fs_info->trans_lock
and extent buffer locks on the root node at least. Nevertlheless, for
consistency use smp_mb__after_atomic() after setting the force cow bit
in create_pending_snapshot().
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit aab9458b9f ]
Like inode refs, inode extrefs have a variable length name, which means
we have to do a proper check to make sure no header nor name can exceed
the item limits.
The check itself is very similar to check_inode_ref(), just a different
structure (btrfs_inode_extref vs btrfs_inode_ref).
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 5a0565cad3 ]
If we fail to update the inode at link_to_fixup_dir(), we don't abort the
transaction and propagate the error up the call chain, which makes it hard
to pinpoint the error to the inode update. So abort the transaction if the
inode update call fails, so that if it happens we known immediately.
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 6cb7f0b8c9 ]
We already have the extent buffer's level in an argument, there's no need
to first ensure the extent buffer's data is loaded (by calling
btrfs_read_extent_buffer()) and then call btrfs_header_level() to check
the level. So use the level argument and do the check before calling
btrfs_read_extent_buffer().
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 2f5b8095ea ]
Currently we have this odd behaviour:
1) At btrfs_replay_log() we drop the reference of the log root tree if
the call to btrfs_recover_log_trees() failed;
2) But if the call to btrfs_recover_log_trees() did not fail, we don't
drop the reference in btrfs_replay_log() - we expect that
btrfs_recover_log_trees() does it in case it returns success.
Let's simplify this and make btrfs_replay_log() always drop the reference
on the log root tree, not only this simplifies code as it's what makes
sense since it's btrfs_replay_log() who grabbed the reference in the first
place.
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit a7f3dfb829 ]
Replace max_t() followed by min_t() with a single clamp().
As was pointed by David Laight in
https://lore.kernel.org/linux-btrfs/20250906122458.75dfc8f0@pumpkin/
the calculation may overflow u32 when the input value is too large, so
clamp_t() is not used. In practice the expected values are in range of
megabytes to gigabytes (throughput limit) so the bug would not happen.
Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev>
Reviewed-by: David Sterba <dsterba@suse.com>
[ Use clamp() and add explanation. ]
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 0d703963d2 ]
The hint block group selection in the extent allocator is wrong in the
first place, as it can select the dedicated data relocation block group for
the normal data allocation.
Since we separated the normal data space_info and the data relocation
space_info, we can easily identify a block group is for data relocation or
not. Do not choose it for the normal data allocation.
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 3c44cd3c79 ]
Now that btrfs_zone_finish_endio_workfn() is directly calling
do_zone_finish() the only caller of btrfs_zone_finish_endio() is
btrfs_finish_one_ordered().
btrfs_finish_one_ordered() already has error handling in-place so
btrfs_zone_finish_endio() can return an error if the block group lookup
fails.
Also as btrfs_zone_finish_endio() already checks for zoned filesystems and
returns early, there's no need to do this in the caller.
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit e6dd405b66 ]
In the process_one_buffer() log tree walk callback we return errors to the
log tree walk caller and then the caller aborts the transaction, if we
have one, or turns the fs into error state if we don't have one. While
this reduces code it makes it harder to figure out where exactly an error
came from. So add the transaction aborts after every failure inside the
process_one_buffer() callback, so that it helps figuring out why failures
happen.
Reviewed-by: Boris Burkov <boris@bur.io>
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 6ebd726b10 ]
We do several things while walking a log tree (for replaying and for
freeing a log tree) like reading extent buffers and cleaning them up,
but we don't immediately abort the transaction, or turn the fs into an
error state, when one of these things fails. Instead we the transaction
abort or turn the fs into error state in the caller of the entry point
function that walks a log tree - walk_log_tree() - which means we don't
get to know exactly where an error came from.
Improve on this by doing a transaction abort / turn fs into error state
after each such failure so that when it happens we have a better
understanding where the failure comes from. This deliberately leaves
the transaction abort / turn fs into error state in the callers of
walk_log_tree() as to ensure we don't get into an inconsistent state in
case we forget to do it deeper in call chain. It also deliberately does
not do it after errors from the calls to the callback defined in
struct walk_control::process_func(), as we will do it later on another
patch.
Reviewed-by: Boris Burkov <boris@bur.io>
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 59d5de3655 ]
A previous patch fixed a bug where new_prs should be assigned before
checking housekeeping conflicts. This patch addresses another potential
issue: the nocpu error check currently uses the xcpus which is not updated.
Although no issue has been observed so far, the check should be performed
using the new effective exclusive cpus.
The comment has been removed because the function returns an error if
nocpu checking fails, which is unrelated to the parent.
Signed-off-by: Chen Ridong <chenridong@huawei.com>
Reviewed-by: Waiman Long <longman@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 6e1c2c6c2c ]
Newer AMD systems can support up to 16 channels per EDAC "mc" device.
These are detected by the EDAC module running on the device, and the
current EDAC interface is appropriately enumerated.
The legacy EDAC sysfs interface however, provides device attributes for
channels 0 through 11 only. Consequently, the last four channels, 12
through 15, will not be enumerated and will not be visible through the
legacy sysfs interface.
Add additional device attributes to ensure that all 16 channels, if
present, are enumerated by and visible through the legacy EDAC sysfs
interface.
Signed-off-by: Avadhut Naik <avadhut.naik@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/20250916203242.1281036-1-avadhut.naik@amd.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit d1cc1baef6 ]
The LFENCE retpoline mitigation is not secure but the kernel prints
inconsistent messages about this fact. The dmesg log says 'Mitigation:
LFENCE', implying the system is mitigated. But sysfs reports 'Vulnerable:
LFENCE' implying the system (correctly) is not mitigated.
Fix this by printing a consistent 'Vulnerable: LFENCE' string everywhere
when this mitigation is selected.
Signed-off-by: David Kaplan <david.kaplan@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/20250915134706.3201818-1-david.kaplan@amd.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 4a1d9d73aa ]
scx_enable() turns on the bypass mode while enable is in progress. If
enabling fails, it turns off the bypass mode and then triggers scx_error().
scx_error() will trigger scx_disable_workfn() which will turn on the bypass
mode again and unload the failed scheduler.
This moves the system out of bypass mode between the enable error path and
the disable path, which is unnecessary and can be brittle - e.g. the thread
running scx_enable() may already be on the failed scheduler and can be
switched out before it triggers scx_error() leading to a stall. The watchdog
would eventually kick in, so the situation isn't critical but is still
suboptimal.
There is nothing to be gained by turning off the bypass mode between
scx_enable() failure and scx_disable_workfn(). Keep bypass on.
Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Andrea Righi <arighi@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 71965cae7d ]
Three EDAC source files were mistakenly marked as executable when adding the
EDAC scrub controls.
These are plain C source files and should not carry the executable bit.
Correcting their modes follows the principle of least privilege and avoids
unnecessary execute permissions in the repository.
[ bp: Massage commit message. ]
Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/20250828191954.903125-1-visitorckw@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit ce8370e2e6 ]
When no audit rules are in place, fanotify event results are
unconditionally dropped due to an explicit check for the existence of
any audit rules. Given this is a report from another security
sub-system, allow it to be recorded regardless of the existence of any
audit rules.
To test, install and run the fapolicyd daemon with default config. Then
as an unprivileged user, create and run a very simple binary that should
be denied. Then check for an event with
ausearch -m FANOTIFY -ts recent
Link: https://issues.redhat.com/browse/RHEL-9065
Signed-off-by: Richard Guy Briggs <rgb@redhat.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 204ced4108 ]
When retbleed mitigation is disabled, the kernel already prints an info
message that the system is vulnerable. Recent code restructuring also
inadvertently led to RETBLEED_INTEL_MSG being printed as an error, which is
unnecessary as retbleed mitigation was already explicitly disabled (by config
option, cmdline, etc.).
Qualify this print statement so the warning is not printed unless an actual
retbleed mitigation was selected and is being disabled due to incompatibility
with spectre_v2.
Fixes: e3b78a7ad5 ("x86/bugs: Restructure retbleed mitigation")
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=220624
Signed-off-by: David Kaplan <david.kaplan@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://patch.msgid.link/20251003171936.155391-1-david.kaplan@amd.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 930f2361fe ]
On Intel CPUs, the default retbleed mitigation is IBRS/eIBRS but this
requires that a similar spectre_v2 mitigation is applied. If the user
selects a different spectre_v2 mitigation (like spectre_v2=retpoline) a
warning is printed but sysfs will still report 'Mitigation: IBRS' or
'Mitigation: Enhanced IBRS'. This is incorrect because retbleed is not
mitigated, and IBRS is not actually set.
Fix this by choosing RETBLEED_MITIGATION_NONE in this scenario so the
kernel correctly reports the system as vulnerable to retbleed.
Signed-off-by: David Kaplan <david.kaplan@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/20250915134706.3201818-1-david.kaplan@amd.com
Stable-dep-of: 204ced4108 ("x86/bugs: Qualify RETBLEED_INTEL_MSG")
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 39a9ed0fb6 ]
The loop in tk_aux_sysfs_init() uses `i <= MAX_AUX_CLOCKS` as the
termination condition, which results in 9 iterations (i=0 to 8) when
MAX_AUX_CLOCKS is defined as 8. However, the kernel is designed to support
only up to 8 auxiliary clocks.
This off-by-one error causes the creation of a 9th sysfs entry that exceeds
the intended auxiliary clock range.
Fix the loop bound to use `i < MAX_AUX_CLOCKS` to ensure exactly 8
auxiliary clock entries are created, matching the design specification.
Fixes: 7b95663a3d ("timekeeping: Provide interface to control auxiliary clocks")
Signed-off-by: Haofeng Li <lihaofeng@kylinos.cn>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://patch.msgid.link/tencent_2376993D9FC06A3616A4F981B3DE1C599607@qq.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit efeeaac9ae ]
By the time scx_sched_free_rcu_work() runs, the scx_sched is no longer
reachable. However, a previously queued error_irq_work may still be pending or
running. Ensure it completes before proceeding with teardown.
Fixes: bff3b5aec1 ("sched_ext: Move disable machinery into scx_sched")
Acked-by: Andrea Righi <arighi@nvidia.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit bcb7c23056 ]
scx_sched.event_stats_cpu is the percpu counters that are used to track
stats. Introduce struct scx_sched_pcpu and move the counters inside. This
will ease adding more per-cpu fields. No functional changes.
Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Andrea Righi <arighi@nvidia.com>
Stable-dep-of: efeeaac9ae ("sched_ext: Sync error_irq_work before freeing scx_sched")
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 0c2b8356e4 ]
There currently isn't a place to place SCX-internal types and accessors to
be shared between ext.c and ext_idle.c. Create kernel/sched/ext_internal.h
and move internal type and accessor definitions there. This trims ext.c a
bit and makes future additions easier. Pure code reorganization. No
functional changes.
Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Andrea Righi <arighi@nvidia.com>
Stable-dep-of: efeeaac9ae ("sched_ext: Sync error_irq_work before freeing scx_sched")
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit 6f40e50ceb upstream.
handle_response() dereferences the payload as a 4-byte handle without
verifying that the declared payload size is at least 4 bytes. A malformed
or truncated message from ksmbd.mountd can lead to a 4-byte read past the
declared payload size. Validate the size before dereferencing.
This is a minimal fix to guard the initial handle read.
Fixes: 0626e6641f ("cifsd: add server handler for central processing and tranport layers")
Cc: stable@vger.kernel.org
Reported-by: Qianchang Zhao <pioooooooooip@gmail.com>
Signed-off-by: Qianchang Zhao <pioooooooooip@gmail.com>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 00aaae60fa ]
There are GPIO controllers such as the one present in the LX2160ARDB
QIXIS FPGA which have fixed-direction input and output GPIO lines mixed
together in a single register. This cannot be modeled using the
gpio-regmap as-is since there is no way to present the true direction of
a GPIO line.
In order to make this use case possible, add a new configuration
parameter - fixed_direction_output - into the gpio_regmap_config
structure. This will enable user drivers to provide a bitmap that
represents the fixed direction of the GPIO lines.
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Acked-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Reviewed-by: Michael Walle <mwalle@kernel.org>
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Stable-dep-of: 2ba5772e53 ("gpio: idio-16: Define fixed direction of the GPIO lines")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 630785bfbe ]
The deprecation of the 'attr2' mount option in 6.18 wasn't entirely
successful because nobody noticed that the kernel never printed a
warning about attr2 being set in fstab if the only xfs filesystem is the
root fs; the initramfs mounts the root fs with no mount options; and the
init scripts only conveyed the fstab options by remounting the root fs.
Fix this by making it complain all the time.
Cc: stable@vger.kernel.org # v5.13
Fixes: 92cf7d3638 ("xfs: Skip repetitive warnings about mount options")
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
[ Update existing xfs_fs_warn_deprecated() callers ]
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 4ba5a8a7fa ]
When migrating a balloon page, we first deflate the old page to then
inflate the new page.
However, if inflating the new page succeeded, we effectively deflated the
old page, reducing the balloon size.
In that case, the migration actually worked: similar to migrating+
immediately deflating the new page. The old page will be freed back to
the buddy.
Right now, the core will leave the page be marked as isolated (as we
returned an error). When later trying to putback that page, we will run
into the WARN_ON_ONCE() in balloon_page_putback().
That handling was changed in commit 3544c4facc ("mm/balloon_compaction:
stop using __ClearPageMovable()"); before that change, we would have
tolerated that way of handling it.
To fix it, let's just return 0 in that case, making the core effectively
just clear the "isolated" flag + freeing it back to the buddy as if the
migration succeeded. Note that the new page will also get freed when the
core puts the last reference.
Note that this also makes it all be more consistent: we will no longer
unisolate the page in the balloon driver while keeping it marked as being
isolated in migration core.
This was found by code inspection.
Link: https://lkml.kernel.org/r/20251014124455.478345-1-david@redhat.com
Fixes: 3544c4facc ("mm/balloon_compaction: stop using __ClearPageMovable()")
Signed-off-by: David Hildenbrand <david@redhat.com>
Cc: Jerrin Shaji George <jerrin.shaji-george@broadcom.com>
Cc: Broadcom internal kernel review list <bcm-kernel-feedback-list@broadcom.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 92a2b74a6b upstream.
This driver was not sending device clear or trigger events when the
board entered the DCAS or DTAS state respectively in device mode.
DCAS is the Device Clear Active State which is entered on receiving a
selective device clear message (SDC) or universal device clear message
(DCL) from the controller in charge.
DTAS is the Device Trigger Active State which is entered on receiving
a group execute trigger (GET) message from the controller.
In order for an application, implementing a particular device, to
detect when one of these states is entered the driver needs to send
the appropriate event.
Send the appropriate gpib_event when DCAS or DTAS is set in the
reported status word. This sets the DCAS or DTAS bits in the board's
status word which can be monitored by the application.
Fixes: 4e127de14f ("staging: gpib: Add National Instruments USB GPIB driver")
Cc: stable <stable@kernel.org>
Tested-by: Dave Penkler <dpenkler@gmail.com>
Signed-off-by: Dave Penkler <dpenkler@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit aaf2af1ed1 upstream.
When the ATN (Attention) line is asserted during a read we get a
NIUSB_ATN_STATE_ERROR during a read. For the controller to send a
device clear it asserts ATN. Normally this is an error but in the case
of a device clear it should be regarded as an interrupt.
Return -EINTR when the Device Clear Active State (DCAS) is entered
else signal an error with dev_dbg with status instead of just dev_err.
Signed-off-by: Dave Penkler <dpenkler@gmail.com>
Cc: stable <stable@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d3c4c1f29a upstream.
EOI (End Or Identify) is a hardware line on the GPIB bus that can be
asserted with the last byte of a message to indicate the end of the
transfer to the receiving device.
In this driver, a write with send_eoi true is done in 3 parts:
Send first byte directly
Send remaining but 1 bytes using the fifo
Send the last byte directly with EOI asserted
The first byte in a write is always sent by writing to the tms9914
chip directly to setup for the subsequent fifo transfer. We were not
checking for a 1 byte write with send_eoi true resulting in EOI not
being asserted. Since the fifo transfer was not executed
(fifotransfersize == 0) the retval in the test after the fifo transfer
code was still 1 from the preceding direct write. This caused it to
return without executing the final direct write which would have sent
an unsollicited extra byte.
For a 2 byte message the first byte was sent directly. But since the
fifo transfer was not executed (fifotransfersize == 1) and the retval
in the test after the fifo transfer code was still 1 from the
preceding first byte write it returned before the final direct byte
write with send_eoi true. The second byte was then sent as a separate
1 byte write to complete the 2 byte write count again without EOI
being asserted as above.
Only send the first byte directly if more than 1 byte is to be
transferred with send_eoi true.
Also check for retval < 0 for the error return in case the fifo code
is not used (1 or 2 byte message with send_eoi true).
Fixes: 09a4655ee1 ("staging: gpib: Add HP/Agilent/Keysight 8235xx PCI GPIB driver")
Cc: stable <stable@kernel.org>
Tested-by: Dave Penkler <dpenkler@gmail.com>
Signed-off-by: Dave Penkler <dpenkler@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b1aabb8ef0 upstream.
The fmh_gpib driver contains a device reference count leak in
fmh_gpib_attach_impl() where driver_find_device() increases the
reference count of the device by get_device() when matching but this
reference is not properly decreased. Add put_device() in
fmh_gpib_detach(), which ensures that the reference count of the
device is correctly managed.
Found by code review.
Cc: stable <stable@kernel.org>
Fixes: 8e4841a088 ("staging: gpib: Add Frank Mori Hess FPGA PCI GPIB driver")
Signed-off-by: Ma Ke <make24@iscas.ac.cn>
Reviewed-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1c05bf6c02 upstream.
Commit 43c51bb573 ("sc16is7xx: make sure device is in suspend once
probed") permanently enabled access to the enhanced features in
sc16is7xx_probe(), and it is never disabled after that.
Therefore, remove re-enable of enhanced features in
sc16is7xx_set_baud(). This eliminates a potential useless read + write
cycle each time the baud rate is reconfigured.
Fixes: 43c51bb573 ("sc16is7xx: make sure device is in suspend once probed")
Cc: stable <stable@kernel.org>
Signed-off-by: Hugo Villeneuve <hvilleneuve@dimonoff.com>
Link: https://patch.msgid.link/20251006142002.177475-1-hugo@hugovil.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d518314a1f upstream.
Some MediaTek SoCs got a gated UART baud clock, which currently gets
disabled as the clk subsystem believes it would be unused. This results in
the uart freezing right after "clk: Disabling unused clocks" on those
platforms.
Request the baud clock to be prepared and enabled during probe, and to
restore run-time power management capabilities to what it was before commit
e32a83c70c ("serial: 8250-mtk: modify mtk uart power and clock
management") disable and unprepare the baud clock when suspending the UART,
prepare and enable it again when resuming it.
Fixes: e32a83c70c ("serial: 8250-mtk: modify mtk uart power and clock management")
Fixes: b6c7ff2693 ("serial: 8250_mtk: Simplify clock sequencing and runtime PM")
Cc: stable <stable@kernel.org>
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Signed-off-by: Daniel Golle <daniel@makrotopia.org>
Link: https://patch.msgid.link/de5197ccc31e1dab0965cabcc11ca92e67246cf6.1758058441.git.daniel@makrotopia.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit daeb4037ad upstream.
Check the return value of reset_control_deassert() in the probe
function to prevent continuing probe when reset deassertion fails.
Previously, reset_control_deassert() was called without checking its
return value, which could lead to probe continuing even when the
device reset wasn't properly deasserted.
The fix checks the return value and returns an error with dev_err_probe()
if reset deassertion fails, providing better error handling and
diagnostics.
Fixes: acbdad8dd1 ("serial: 8250_dw: simplify optional reset handling")
Cc: stable <stable@kernel.org>
Signed-off-by: Artem Shimko <a.shimko.dev@gmail.com>
Link: https://patch.msgid.link/20251019095131.252848-1-a.shimko.dev@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 51cb04abd3 upstream.
Add the missing multiport controller binding to target list.
Fix minItems for interrupt-names to avoid the following error on High
Speed controller:
usb@a200000: interrupt-names: ['dwc_usb3', 'pwr_event', 'dp_hs_phy_irq', 'dm_hs_phy_irq'] is too short
Fixes: 6e762f7b8e ("dt-bindings: usb: Introduce qcom,snps-dwc3")
Cc: stable@vger.kernel.org
Signed-off-by: Krishna Kurapati <krishna.kurapati@oss.qualcomm.com>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ef8fef45c7 upstream.
The receive error handling code is shared between RSCI and all other
SCIF port types, but the RSCI overrun_reg is specified as a memory
offset, while for other SCIF types it is an enum value used to index
into the sci_port_params->regs array, as mentioned above the
sci_serial_in() function.
For RSCI, the overrun_reg is CSR (0x48), causing the sci_getreg() call
inside the sci_handle_fifo_overrun() function to index outside the
bounds of the regs array, which currently has a size of 20, as specified
by SCI_NR_REGS.
Because of this, we end up accessing memory outside of RSCI's
rsci_port_params structure, which, when interpreted as a plat_sci_reg,
happens to have a non-zero size, causing the following WARN when
sci_serial_in() is called, as the accidental size does not match the
supported register sizes.
The existence of the overrun_reg needs to be checked because
SCIx_SH3_SCIF_REGTYPE has overrun_reg set to SCLSR, but SCLSR is not
present in the regs array.
Avoid calling sci_getreg() for port types which don't use standard
register handling.
Use the ops->read_reg() and ops->write_reg() functions to properly read
and write registers for RSCI, and change the type of the status variable
to accommodate the 32-bit CSR register.
sci_getreg() and sci_serial_in() are also called with overrun_reg in the
sci_mpxed_interrupt() interrupt handler, but that code path is not used
for RSCI, as it does not have a muxed interrupt.
------------[ cut here ]------------
Invalid register access
WARNING: CPU: 0 PID: 0 at drivers/tty/serial/sh-sci.c:522 sci_serial_in+0x38/0xac
Modules linked in: renesas_usbhs at24 rzt2h_adc industrialio_adc sha256 cfg80211 bluetooth ecdh_generic ecc rfkill fuse drm backlight ipv6
CPU: 0 UID: 0 PID: 0 Comm: swapper/0 Not tainted 6.17.0-rc1+ #30 PREEMPT
Hardware name: Renesas RZ/T2H EVK Board based on r9a09g077m44 (DT)
pstate: 604000c5 (nZCv daIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : sci_serial_in+0x38/0xac
lr : sci_serial_in+0x38/0xac
sp : ffff800080003e80
x29: ffff800080003e80 x28: ffff800082195b80 x27: 000000000000000d
x26: ffff8000821956d0 x25: 0000000000000000 x24: ffff800082195b80
x23: ffff000180e0d800 x22: 0000000000000010 x21: 0000000000000000
x20: 0000000000000010 x19: ffff000180e72000 x18: 000000000000000a
x17: ffff8002bcee7000 x16: ffff800080000000 x15: 0720072007200720
x14: 0720072007200720 x13: 0720072007200720 x12: 0720072007200720
x11: 0000000000000058 x10: 0000000000000018 x9 : ffff8000821a6a48
x8 : 0000000000057fa8 x7 : 0000000000000406 x6 : ffff8000821fea48
x5 : ffff00033ef88408 x4 : ffff8002bcee7000 x3 : ffff800082195b80
x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff800082195b80
Call trace:
sci_serial_in+0x38/0xac (P)
sci_handle_fifo_overrun.isra.0+0x70/0x134
sci_er_interrupt+0x50/0x39c
__handle_irq_event_percpu+0x48/0x140
handle_irq_event+0x44/0xb0
handle_fasteoi_irq+0xf4/0x1a0
handle_irq_desc+0x34/0x58
generic_handle_domain_irq+0x1c/0x28
gic_handle_irq+0x4c/0x140
call_on_irq_stack+0x30/0x48
do_interrupt_handler+0x80/0x84
el1_interrupt+0x34/0x68
el1h_64_irq_handler+0x18/0x24
el1h_64_irq+0x6c/0x70
default_idle_call+0x28/0x58 (P)
do_idle+0x1f8/0x250
cpu_startup_entry+0x34/0x3c
rest_init+0xd8/0xe0
console_on_rootfs+0x0/0x6c
__primary_switched+0x88/0x90
---[ end trace 0000000000000000 ]---
Cc: stable <stable@kernel.org>
Fixes: 0666e3fe95 ("serial: sh-sci: Add support for RZ/T2H SCI")
Signed-off-by: Cosmin Tanislav <cosmin-gabriel.tanislav.xa@renesas.com>
Link: https://patch.msgid.link/20250923154707.1089900-1-cosmin-gabriel.tanislav.xa@renesas.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2d8713f807 upstream.
When there is no port entry in the tcpci entry itself, the driver will
trigger an error message "OF: graph: no port node found in /...../typec" .
It is documented that the dts node should contain an connector entry
with ports and several port pointing to devices with usb-role-switch
property set. Only when those connector entry is missing, it should
check for port entries in the main node.
We switch the search order for looking after ports, which will avoid the
failure message while there are explicit connector entries.
Fixes: d56de8c9a1 ("usb: typec: tcpm: try to get role switch from tcpc fwnode")
Cc: stable <stable@kernel.org>
Signed-off-by: Michael Grzeschik <m.grzeschik@pengutronix.de>
Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Reviewed-by: Badhri Jagan Sridharan <badhri@google.com>
Link: https://patch.msgid.link/20251013-b4-ml-topic-tcpm-v2-1-63c9b2ab8a0b@pengutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a8cc9e5fcb upstream.
The early error path in hdm_probe() can jump to err_free_mdev before
&mdev->dev has been initialized with device_initialize(). Calling
put_device(&mdev->dev) there triggers a device core WARN and ends up
invoking kref_put(&kobj->kref, kobject_release) on an uninitialized
kobject.
In this path the private struct was only kmalloc'ed and the intended
release is effectively kfree(mdev) anyway, so free it directly instead
of calling put_device() on an uninitialized device.
This removes the WARNING and fixes the pre-initialization error path.
Fixes: 97a6f772f3 ("drivers: most: add USB adapter driver")
Cc: stable <stable@kernel.org>
Signed-off-by: Victoria Votokina <Victoria.Votokina@kaspersky.com>
Link: https://patch.msgid.link/20251010105241.4087114-3-Victoria.Votokina@kaspersky.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4b12709026 upstream.
hdm_disconnect() calls most_deregister_interface(), which eventually
unregisters the MOST interface device with device_unregister(iface->dev).
If that drops the last reference, the device core may call release_mdev()
immediately while hdm_disconnect() is still executing.
The old code also freed several mdev-owned allocations in
hdm_disconnect() and then performed additional put_device() calls.
Depending on refcount order, this could lead to use-after-free or
double-free when release_mdev() ran (or when unregister paths also
performed puts).
Fix by moving the frees of mdev-owned allocations into release_mdev(),
so they happen exactly once when the device is truly released, and by
dropping the extra put_device() calls in hdm_disconnect() that are
redundant after device_unregister() and most_deregister_interface().
This addresses the KASAN slab-use-after-free reported by syzbot in
hdm_disconnect(). See report and stack traces in the bug link below.
Reported-by: syzbot+916742d5d24f6c254761@syzkaller.appspotmail.com
Cc: stable <stable@kernel.org>
Closes: https://syzkaller.appspot.com/bug?extid=916742d5d24f6c254761
Fixes: 97a6f772f3 ("drivers: most: add USB adapter driver")
Signed-off-by: Victoria Votokina <Victoria.Votokina@kaspersky.com>
Link: https://patch.msgid.link/20251010105241.4087114-2-Victoria.Votokina@kaspersky.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit dbdf2a7feb upstream.
Between Rust 1.79 and 1.86, under `CONFIG_RUST_KERNEL_DOCTESTS=y`,
`objtool` may report:
rust/doctests_kernel_generated.o: warning: objtool:
rust_doctest_kernel_alloc_kbox_rs_13() falls through to next
function rust_doctest_kernel_alloc_kvec_rs_0()
(as well as in rust_doctest_kernel_alloc_kvec_rs_0) due to calls to the
`noreturn` symbol:
core::option::expect_failed
from code added in commits 779db37373 ("rust: alloc: kvec: implement
AsPageIter for VVec") and 671618432f ("rust: alloc: kbox: implement
AsPageIter for VBox").
Thus add the mangled one to the list so that `objtool` knows it is
actually `noreturn`.
This can be reproduced as well in other versions by tweaking the code,
such as the latest stable Rust (1.90.0).
Stable does not have code that triggers this, but it could have it in
the future. Downstream forks could too. Thus tag it for backport.
See commit 56d680dd23 ("objtool/rust: list `noreturn` Rust functions")
for more details.
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Cc: stable@vger.kernel.org # Needed in 6.12.y and later.
Link: https://patch.msgid.link/20251020020714.2511718-1-ojeda@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d90eeb8ecd upstream.
There are no scenarios where a weak increment is invalid on binder_node.
The only possible case where it could be invalid is if the kernel
delivers BR_DECREFS to the process that owns the node, and then
increments the weak refcount again, effectively "reviving" a dead node.
However, that is not possible: when the BR_DECREFS command is delivered,
the kernel removes and frees the binder_node. The fact that you were
able to call binder_inc_node_nilocked() implies that the node is not yet
destroyed, which implies that BR_DECREFS has not been delivered to
userspace, so incrementing the weak refcount is valid.
Note that it's currently possible to trigger this condition if the owner
calls BINDER_THREAD_EXIT while node->has_weak_ref is true. This causes
BC_INCREFS on binder_ref instances to fail when they should not.
Cc: stable@vger.kernel.org
Fixes: 457b9a6f09 ("Staging: android: add binder driver")
Reported-by: Yu-Ting Tseng <yutingtseng@google.com>
Signed-off-by: Alice Ryhl <aliceryhl@google.com>
Link: https://patch.msgid.link/20251015-binder-weak-inc-v1-1-7914b092c371@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f3d12ec847 upstream.
DbC may add 1024 bogus bytes to the beginneing of the receiving endpoint
if DbC hw triggers a STALL event before any Transfer Blocks (TRBs) for
incoming data are queued, but driver handles the event after it queued
the TRBs.
This is possible as xHCI DbC hardware may trigger spurious STALL transfer
events even if endpoint is empty. The STALL event contains a pointer
to the stalled TRB, and "remaining" untransferred data length.
As there are no TRBs queued yet the STALL event will just point to first
TRB position of the empty ring, with '0' bytes remaining untransferred.
DbC driver is polling for events, and may not handle the STALL event
before /dev/ttyDBC0 is opened and incoming data TRBs are queued.
The DbC event handler will now assume the first queued TRB (length 1024)
has stalled with '0' bytes remaining untransferred, and copies the data
This race situation can be practically mitigated by making sure the event
handler handles all pending transfer events when DbC reaches configured
state, and only then create dev/ttyDbC0, and start queueing transfers.
The event handler can this way detect the STALL events on empty rings
and discard them before any transfers are queued.
This does in practice solve the issue, but still leaves a small possible
gap for the race to trigger.
We still need a way to distinguish spurious STALLs on empty rings with '0'
bytes remaing, from actual STALL events with all bytes transmitted.
Cc: stable <stable@kernel.org>
Fixes: dfba2174dc ("usb: xhci: Add DbC support in xHCI driver")
Tested-by: Łukasz Bartosik <ukaszb@chromium.org>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2bbd38fcd2 upstream.
DbC is currently only enabled back if it's in configured state during
suspend.
If system is suspended after DbC is enabled, but before the device is
properly enumerated by the host, then DbC would not be enabled back in
resume.
Always enable DbC back in resume if it's suspended in enabled,
connected, or configured state
Cc: stable <stable@kernel.org>
Fixes: dfba2174dc ("usb: xhci: Add DbC support in xHCI driver")
Tested-by: Łukasz Bartosik <ukaszb@chromium.org>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 37b9dd0d11 upstream.
Drop the check on the maximum transfer length in Raw Gadget for both
control and non-control transfers.
Limiting the transfer length causes a problem with emulating USB devices
whose full configuration descriptor exceeds PAGE_SIZE in length.
Overall, there does not appear to be any reason to enforce any kind of
transfer length limit on the Raw Gadget side for either control or
non-control transfers, so let's just drop the related check.
Cc: stable <stable@kernel.org>
Fixes: f2c2e71764 ("usb: gadget: add raw-gadget interface")
Signed-off-by: Andrey Konovalov <andreyknvl@gmail.com>
Link: https://patch.msgid.link/a6024e8eab679043e9b8a5defdb41c4bda62f02b.1761085528.git.andreyknvl@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 53abe3e1c1 ]
Clang is not happy with set but unused variable (this is visible
with `make W=1` build:
kernel/sched/sched.h:3744:18: error: variable 'cpumask' set but not used [-Werror,-Wunused-but-set-variable]
It seems like the variable was never used along with the assignment
that does not have side effects as far as I can see. Remove those
altogether.
Fixes: 223baf9d17 ("sched: Fix performance regression introduced by mm_cid")
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Tested-by: Eric Biggers <ebiggers@kernel.org>
Reviewed-by: Breno Leitao <leitao@debian.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit c5efc6a0b3 ]
The __must_hold annotation references &req->ctx->uring_lock, but req
is not in scope in io_install_fixed_file. This change updates the
annotation to reference the correct ctx->uring_lock.
improving code clarity.
Fixes: f110ed8498 ("io_uring: split out fixed file installation and removal")
Signed-off-by: Alok Tiwari <alok.a.tiwari@oracle.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 4c4e6ea4a1 ]
The generic_handle_domain_irq() function resolves the hardware IRQ
internally. The driver performed a duplicative mapping by calling
irq_find_mapping() first, which could lead to an RCU stall.
Delete the redundant irq_find_mapping() call and pass the hardware IRQ
directly to generic_handle_domain_irq().
Fixes: c5a4b6fd31 ("gpio: Add support for Intel LJCA USB GPIO driver")
Signed-off-by: Haotian Zhang <vulab@iscas.ac.cn>
Link: https://lore.kernel.org/r/20251023070231.1305-1-vulab@iscas.ac.cn
[Bartosz: remove unused variable]
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 4c8cf6bd28 ]
The block layer PI generation / verification code expects the bio_vecs
to have at least LBA size (or more correctly integrity internal)
granularity. With the direct I/O alignment relaxation in 2022, user
space can now feed bios with less alignment than that, leading to
scribbling outside the PI buffers. Apparently this wasn't noticed so far
because none of the tests generate such buffers, but since 851c4c96db
("xfs: implement XFS_IOC_DIOINFO in terms of vfs_getattr"), xfstests
generic/013 by default generates such I/O now that the relaxed alignment
is advertised by the XFS_IOC_DIOINFO ioctl.
Fix this by increasing the required alignment when using PI, although
handling arbitrary alignment in the long run would be even nicer.
Fixes: bf8d08532b ("iomap: add support for dma aligned direct-io")
Fixes: b1a000d3b8 ("block: relax direct io memory alignment")
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 119aaeed0b ]
In some legacy platforms the MSI controller for a PCI host bridge is
identified by an msi-parent property whose phandle points at an MSI
controller node with no #msi-cells property, that implicitly
means #msi-cells == 0.
For such platforms, mapping a device ID and retrieving the MSI controller
node becomes simply a matter of checking whether in the device hierarchy
there is an msi-parent property pointing at an MSI controller node with
such characteristics.
Add a helper function to of_msi_xlate() to check the msi-parent property in
addition to msi-map and retrieve the MSI controller node (with a 1:1 ID
deviceID-IN<->deviceID-OUT mapping) to provide support for deviceID
mapping and MSI controller node retrieval for such platforms.
Fixes: 57d72196df ("irqchip/gic-v5: Add GICv5 ITS support")
Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Cc: Sascha Bischoff <sascha.bischoff@arm.com>
Cc: Rob Herring <robh@kernel.org>
Cc: Marc Zyngier <maz@kernel.org>
Link: https://patch.msgid.link/20251021124103.198419-2-lpieralisi@kernel.org
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit a576a849d5 ]
With the introduction of the of_msi_xlate() function, the OF layer
provides an API to map a device ID and retrieve the MSI controller
node the ID is mapped to with a single call.
of_msi_map_id() is currently used to map a deviceID to a specific
MSI controller node; of_msi_xlate() can be used for that purpose
too, there is no need to keep the two functions.
Convert of_msi_map_id() to of_msi_xlate() calls and update the
of_msi_xlate() documentation to describe how the struct device_node
pointer passed in should be set-up to either provide the MSI controller
node target or receive its pointer upon mapping completion.
Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Rob Herring <robh@kernel.org>
Cc: Marc Zyngier <maz@kernel.org>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Link: https://lore.kernel.org/r/20250805133443.936955-1-lpieralisi@kernel.org
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
Stable-dep-of: 119aaeed0b ("of/irq: Add msi-parent check to of_msi_xlate()")
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 23437509a6 ]
When using page list framebuffer, and using RGB888 format, some
pixels can cross the page boundaries, and this case was not handled,
leading to writing 1 or 2 bytes on the next virtual address.
Add a check and a specific function to handle this case.
Fixes: c9ff280879 ("drm/panic: Add support to scanout buffer as array of pages")
Reviewed-by: Javier Martinez Canillas <javierm@redhat.com>
Link: https://lore.kernel.org/r/20251009122955.562888-7-jfalempe@redhat.com
Signed-off-by: Jocelyn Falempe <jfalempe@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 81ccca3121 ]
sock_{send,recv}msg() internally calls security_socket_{send,recv}msg(),
which does security checks (e.g. SELinux) for socket access against the
current task. However, _sock_xmit() in drivers/block/nbd.c may be called
indirectly from a userspace syscall, where the NBD socket access would
be incorrectly checked against the calling userspace task (which simply
tries to read/write a file that happens to reside on an NBD device).
To fix this, temporarily override creds to kernel ones before calling
the sock_*() functions. This allows the security modules to recognize
this as internal access by the kernel, which will normally be allowed.
A way to trigger the issue is to do the following (on a system with
SELinux set to enforcing):
### Create nbd device:
truncate -s 256M /tmp/testfile
nbd-server localhost:10809 /tmp/testfile
### Connect to the nbd server:
nbd-client localhost
### Create mdraid array
mdadm --create -l 1 -n 2 /dev/md/testarray /dev/nbd0 missing
After these steps, assuming the SELinux policy doesn't allow the
unexpected access pattern, errors will be visible on the kernel console:
[ 142.204243] nbd0: detected capacity change from 0 to 524288
[ 165.189967] md: async del_gendisk mode will be removed in future, please upgrade to mdadm-4.5+
[ 165.252299] md/raid1:md127: active with 1 out of 2 mirrors
[ 165.252725] md127: detected capacity change from 0 to 522240
[ 165.255434] block nbd0: Send control failed (result -13)
[ 165.255718] block nbd0: Request send failed, requeueing
[ 165.256006] block nbd0: Dead connection, failed to find a fallback
[ 165.256041] block nbd0: Receive control failed (result -32)
[ 165.256423] block nbd0: shutting down sockets
[ 165.257196] I/O error, dev nbd0, sector 2048 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 2
[ 165.257736] Buffer I/O error on dev md127, logical block 0, async page read
[ 165.258263] I/O error, dev nbd0, sector 2048 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 2
[ 165.259376] Buffer I/O error on dev md127, logical block 0, async page read
[ 165.259920] I/O error, dev nbd0, sector 2048 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 2
[ 165.260628] Buffer I/O error on dev md127, logical block 0, async page read
[ 165.261661] ldm_validate_partition_table(): Disk read failed.
[ 165.262108] I/O error, dev nbd0, sector 2048 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 2
[ 165.262769] Buffer I/O error on dev md127, logical block 0, async page read
[ 165.263697] I/O error, dev nbd0, sector 2048 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 2
[ 165.264412] Buffer I/O error on dev md127, logical block 0, async page read
[ 165.265412] I/O error, dev nbd0, sector 2048 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 2
[ 165.265872] Buffer I/O error on dev md127, logical block 0, async page read
[ 165.266378] I/O error, dev nbd0, sector 2048 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 2
[ 165.267168] Buffer I/O error on dev md127, logical block 0, async page read
[ 165.267564] md127: unable to read partition table
[ 165.269581] I/O error, dev nbd0, sector 0 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 2
[ 165.269960] Buffer I/O error on dev nbd0, logical block 0, async page read
[ 165.270316] I/O error, dev nbd0, sector 0 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 2
[ 165.270913] Buffer I/O error on dev nbd0, logical block 0, async page read
[ 165.271253] I/O error, dev nbd0, sector 0 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 2
[ 165.271809] Buffer I/O error on dev nbd0, logical block 0, async page read
[ 165.272074] ldm_validate_partition_table(): Disk read failed.
[ 165.272360] nbd0: unable to read partition table
[ 165.289004] ldm_validate_partition_table(): Disk read failed.
[ 165.289614] nbd0: unable to read partition table
The corresponding SELinux denial on Fedora/RHEL will look like this
(assuming it's not silenced):
type=AVC msg=audit(1758104872.510:116): avc: denied { write } for pid=1908 comm="mdadm" laddr=::1 lport=32772 faddr=::1 fport=10809 scontext=system_u:system_r:mdadm_t:s0-s0:c0.c1023 tcontext=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 tclass=tcp_socket permissive=0
The respective backtrace looks like this:
@security[mdadm, -13,
handshake_exit+221615650
handshake_exit+221615650
handshake_exit+221616465
security_socket_sendmsg+5
sock_sendmsg+106
handshake_exit+221616150
sock_sendmsg+5
__sock_xmit+162
nbd_send_cmd+597
nbd_handle_cmd+377
nbd_queue_rq+63
blk_mq_dispatch_rq_list+653
__blk_mq_do_dispatch_sched+184
__blk_mq_sched_dispatch_requests+333
blk_mq_sched_dispatch_requests+38
blk_mq_run_hw_queue+239
blk_mq_dispatch_plug_list+382
blk_mq_flush_plug_list.part.0+55
__blk_flush_plug+241
__submit_bio+353
submit_bio_noacct_nocheck+364
submit_bio_wait+84
__blkdev_direct_IO_simple+232
blkdev_read_iter+162
vfs_read+591
ksys_read+95
do_syscall_64+92
entry_SYSCALL_64_after_hwframe+120
]: 1
The issue has started to appear since commit 060406c61c ("block: add
plug while submitting IO").
Cc: Ming Lei <ming.lei@redhat.com>
Link: https://bugzilla.redhat.com/show_bug.cgi?id=2348878
Fixes: 060406c61c ("block: add plug while submitting IO")
Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
Acked-by: Paul Moore <paul@paul-moore.com>
Acked-by: Stephen Smalley <stephen.smalley.work@gmail.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Tested-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 4ec703ec0c ]
The negation operator is incorrectly placed outside the unlikely()
macro:
if (!unlikely(iwa))
This inverts the compiler branch prediction hint, marking the NULL case
as likely instead of unlikely. The intent is to indicate that allocation
failures are rare, consistent with common kernel patterns.
Moving the negation inside unlikely():
if (unlikely(!iwa))
Fixes: 2b4fc4cd43 ("io_uring/waitid: setup async data in the prep handler")
Signed-off-by: Alok Tiwari <alok.a.tiwari@oracle.com>
Reviewed-by: Gabriel Krisman Bertazi <krisman@suse.de>
Reviewed-by: Caleb Sander Mateos <csander@purestorage.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 8dcc66ad37 ]
Handling of errors when reading status, temperature, and humidity returns
the error number as negative attribute value. Fix it up by returning
the error as return value.
Fixes: a0ac418c60 ("hwmon: (sht3x) convert some of sysfs interface to hwmon")
Cc: JuenKit Yip <JuenKit_Yip@hotmail.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b7776a802f ]
Resolve this smatch warning:
arch/riscv/kernel/sys_hwprobe.c:50 hwprobe_arch_id() error: uninitialized symbol 'cpu_id'.
This could happen if hwprobe_arch_id() was called with a key ID of
something other than MVENDORID, MIMPID, and MARCHID. This does not
happen in the current codebase. The only caller of hwprobe_arch_id()
is a function that only passes one of those three key IDs.
For the sake of reducing static analyzer warning noise, and in the
unlikely event that hwprobe_arch_id() is someday called with some
other key ID, validate hwprobe_arch_id()'s input to ensure that
'cpu_id' is always initialized before use.
Fixes: ea3de9ce8a ("RISC-V: Add a syscall for HW probing")
Cc: Evan Green <evan@rivosinc.com>
Signed-off-by: Paul Walmsley <pjw@kernel.org>
Link: https://lore.kernel.org/r/cf5a13ec-19d0-9862-059b-943f36107bf3@kernel.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit ca525d53f9 ]
The pgprot_dmacoherent() is used when allocating memory for
non-coherent devices and by default pgprot_dmacoherent() is
same as pgprot_noncached() unless architecture overrides it.
Currently, there is no pgprot_dmacoherent() definition for
RISC-V hence non-coherent device memory is being mapped as
IO thereby making CPU access to such memory slow.
Define pgprot_dmacoherent() to be same as pgprot_writecombine()
for RISC-V so that CPU access non-coherent device memory as
NOCACHE which is better than accessing it as IO.
Fixes: ff689fd21c ("riscv: add RISC-V Svpbmt extension support")
Signed-off-by: Anup Patel <apatel@ventanamicro.com>
Tested-by: Han Gao <rabenda.cn@gmail.com>
Tested-by: Guo Ren (Alibaba DAMO Academy) <guoren@kernel.org>
Link: https://lore.kernel.org/r/20250820152316.1012757-1-apatel@ventanamicro.com
Signed-off-by: Paul Walmsley <pjw@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 0b7d9b25e4 ]
Attaching UBI on the flash with more than one plane per lun will lead to
the following error:
[ 2.980989] spi-nand spi0.0: Micron SPI NAND was found.
[ 2.986309] spi-nand spi0.0: 256 MiB, block size: 128 KiB, page size: 2048, OOB size: 128
[ 2.994978] 2 fixed-partitions partitions found on MTD device spi0.0
[ 3.001350] Creating 2 MTD partitions on "spi0.0":
[ 3.006159] 0x000000000000-0x000000020000 : "bl2"
[ 3.011663] 0x000000020000-0x000010000000 : "ubi"
...
[ 6.391748] ubi0: attaching mtd1
[ 6.412545] ubi0 error: ubi_attach: PEB 0 contains corrupted VID header, and the data does not contain all 0xFF
[ 6.422677] ubi0 error: ubi_attach: this may be a non-UBI PEB or a severe VID header corruption which requires manual inspection
[ 6.434249] Volume identifier header dump:
[ 6.438349] magic 55424923
[ 6.441482] version 1
[ 6.444007] vol_type 0
[ 6.446539] copy_flag 0
[ 6.449068] compat 0
[ 6.451594] vol_id 0
[ 6.454120] lnum 1
[ 6.456651] data_size 4096
[ 6.459442] used_ebs 1061644134
[ 6.462748] data_pad 0
[ 6.465274] sqnum 0
[ 6.467805] hdr_crc 61169820
[ 6.470943] Volume identifier header hexdump:
[ 6.475308] hexdump of PEB 0 offset 4096, length 126976
[ 6.507391] ubi0 warning: ubi_attach: valid VID header but corrupted EC header at PEB 4
[ 6.515415] ubi0 error: ubi_compare_lebs: unsupported on-flash UBI format
[ 6.522222] ubi0 error: ubi_attach_mtd_dev: failed to attach mtd1, error -22
[ 6.529294] UBI error: cannot attach mtd1
Non dirmap reading works good. Looking to spi_mem_no_dirmap_read() code we'll see:
static ssize_t spi_mem_no_dirmap_read(struct spi_mem_dirmap_desc *desc,
u64 offs, size_t len, void *buf)
{
struct spi_mem_op op = desc->info.op_tmpl;
int ret;
// --- see here ---
op.addr.val = desc->info.offset + offs;
//-----------------
op.data.buf.in = buf;
op.data.nbytes = len;
ret = spi_mem_adjust_op_size(desc->mem, &op);
if (ret)
return ret;
ret = spi_mem_exec_op(desc->mem, &op);
if (ret)
return ret;
return op.data.nbytes;
}
The similar happens for spi_mem_no_dirmap_write(). Thus the address
passed to the flash should take in the account the value of
desc->info.offset.
This patch fix dirmap reading/writing of flashes with more than one
plane per lun.
Fixes: a403997c12 ("spi: airoha: add SPI-NAND Flash controller driver")
Signed-off-by: Mikhail Kshevetskiy <mikhail.kshevetskiy@iopsys.eu>
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Link: https://patch.msgid.link/20251012121707.2296160-7-mikhail.kshevetskiy@iopsys.eu
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit edd2e261b1 ]
Booting without this patch and disabled dirmap support results in
[ 2.980719] spi-nand spi0.0: Micron SPI NAND was found.
[ 2.986040] spi-nand spi0.0: 256 MiB, block size: 128 KiB, page size: 2048, OOB size: 128
[ 2.994709] 2 fixed-partitions partitions found on MTD device spi0.0
[ 3.001075] Creating 2 MTD partitions on "spi0.0":
[ 3.005862] 0x000000000000-0x000000020000 : "bl2"
[ 3.011272] 0x000000020000-0x000010000000 : "ubi"
...
[ 6.195594] ubi0: attaching mtd1
[ 13.338398] ubi0: scanning is finished
[ 13.342188] ubi0 error: ubi_read_volume_table: the layout volume was not found
[ 13.349784] ubi0 error: ubi_attach_mtd_dev: failed to attach mtd1, error -22
[ 13.356897] UBI error: cannot attach mtd1
If dirmap is disabled or not supported in the spi driver, the dirmap requests
will be executed via exec_op() handler. Thus, if the hardware supports
dual/quad spi modes, then corresponding requests will be sent to exec_op()
handler. Current driver does not support such requests, so error is arrised.
As result the flash can't be read/write.
This patch adds support of dual and quad wires spi modes to exec_op() handler.
Fixes: a403997c12 ("spi: airoha: add SPI-NAND Flash controller driver")
Signed-off-by: Mikhail Kshevetskiy <mikhail.kshevetskiy@iopsys.eu>
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Link: https://patch.msgid.link/20251012121707.2296160-4-mikhail.kshevetskiy@iopsys.eu
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 20b93a0088 ]
The SCMI_XFER_FLAG_IS_RAW flag was being cleared prematurely in
scmi_xfer_raw_put() before the transfer completion was properly
acknowledged by the raw message handlers.
Move the clearing of SCMI_XFER_FLAG_IS_RAW and SCMI_XFER_FLAG_CHAN_SET
from scmi_xfer_raw_put() to __scmi_xfer_put() to ensure the flags remain
set throughout the entire raw message processing pipeline until the
transfer is returned to the free pool.
Fixes: 3095a3e25d ("firmware: arm_scmi: Add xfer helpers to provide raw access")
Suggested-by: Cristian Marussi <cristian.marussi@arm.com>
Signed-off-by: Artem Shimko <a.shimko.dev@gmail.com>
Reviewed-by: Cristian Marussi <cristian.marussi@arm.com>
Message-Id: <20251008091057.1969260-1-a.shimko.dev@gmail.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 2290ab43b9 ]
When the SCMI debug subsystem fails to initialize, the related debug root
will be missing, and the underlying descriptor will be NULL.
Handle this fault condition in the SCMI debug helpers that maintain
metrics counters.
Fixes: 0b3d48c472 ("firmware: arm_scmi: Track basic SCMI communication debug metrics")
Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
Message-Id: <20251014115346.2391418-1-cristian.marussi@arm.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 278b6cabf1 ]
Add missing address-cells 0 to GIC interrupt node to silence W=1
warning:
bcm2712.dtsi:494.4-497.31: Warning (interrupt_map): /axi/pcie@1000110000:interrupt-map:
Missing property '#address-cells' in node /soc@107c000000/interrupt-controller@7fff9000, using 0 as fallback
Value '0' is correct because:
1. GIC interrupt controller does not have children,
2. interrupt-map property (in PCI node) consists of five components and
the fourth component "parent unit address", which size is defined by
'#address-cells' of the node pointed to by the interrupt-parent
component, is not used (=0)
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20250822133407.312505-2-krzysztof.kozlowski@linaro.org
Signed-off-by: Florian Fainelli <florian.fainelli@broadcom.com>
Stable-dep-of: aa960b5976 ("arm64: dts: broadcom: bcm2712: Define VGIC interrupt")
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 8735696ace ]
In csqspi_probe(), when cqspi_request_mmap_dma() returns -EPROBE_DEFER,
we handle the error by jumping to probe_setup_failed.
In that label, we call pm_runtime_disable(), even if we never called
pm_runtime_enable() before.
Because of this, the driver cannot probe:
[ 2.690018] cadence-qspi 47040000.spi: No Rx DMA available
[ 2.699735] spi-nor spi0.0: resume failed with -13
[ 2.699741] spi-nor: probe of spi0.0 failed with error -13
Only call pm_runtime_disable() if it was enabled by adding a new
label to handle cqspi_request_mmap_dma() failures.
Fixes: b07f349d18 ("spi: spi-cadence-quadspi: Fix pm runtime unbalance")
Signed-off-by: Mattijs Korpershoek <mkorpershoek@kernel.org>
Reviewed-by: Dan Carpenter <dan.carpenter@linaro.org>
Link: https://patch.msgid.link/20251009-cadence-quadspi-fix-pm-runtime-v2-1-8bdfefc43902@kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit c07f270323 ]
flexspi define four mode for sample clock source selection.
Here is the list of modes:
mode 0: Dummy Read strobe generated by FlexSPI Controller and loopback
internally
mode 1: Dummy Read strobe generated by FlexSPI Controller and loopback
from DQS pad
mode 2: Reserved
mode 3: Flash provided Read strobe and input from DQS pad
In default, flexspi use mode 0 after reset. And for DTR mode, flexspi
only support 8D-8D-8D mode. For 8D-8D-8D mode, IC suggest to use mode 3,
otherwise read always get incorrect data.
For DTR mode, flexspi will automatically div 2 of the root clock
and output to device. the formula is:
device_clock = root_clock / (is_dtr ? 2 : 1)
So correct the clock rate setting for DTR mode to get the max
performance.
Signed-off-by: Haibo Chen <haibo.chen@nxp.com>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Link: https://patch.msgid.link/20250917-flexspi-ddr-v2-4-bb9fe2a01889@nxp.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Stable-dep-of: a89103f671 ("spi: spi-nxp-fspi: re-config the clock rate when operation require new clock rate")
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 11fb1a82ae ]
FF-A v1.2 introduced 16 byte IMPLEMENTATION DEFINED value in the endpoint
memory access descriptor to allow any sender could to specify an its any
custom value for each receiver. Also this value must be specified by the
receiver when retrieving the memory region. The sender must ensure it
informs the receiver of this value via an IMPLEMENTATION DEFINED mechanism
such as a partition message.
So the FF-A driver can use the message interfaces to communicate the value
and set the same in the ffa_mem_region_attributes structures when using
the memory interfaces.
The driver ensure that the size of the endpoint memory access descriptors
is set correctly based on the FF-A version.
Fixes: 9fac08d9d9 ("firmware: arm_ffa: Upgrade FF-A version to v1.2 in the driver")
Reported-by: Lixiang Mao <liximao@qti.qualcomm.com>
Tested-by: Lixiang Mao <liximao@qti.qualcomm.com>
Message-Id: <20250923150927.1218364-1-sudeep.holla@arm.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit f0c5118ebb upstream.
Patch series "mm/damon/sysfs: fix commit test damon_ctx [de]allocation".
DAMON sysfs interface dynamically allocates and uses a damon_ctx object
for testing if given inputs for online DAMON parameters update is valid.
The object is being used without an allocation failure check, and leaked
when the test succeeds. Fix the two bugs.
This patch (of 2):
The damon_ctx for testing online DAMON parameters commit inputs is used
without its allocation failure check. This could result in an invalid
memory access. Fix it by directly returning an error when the allocation
failed.
Link: https://lkml.kernel.org/r/20251003201455.41448-1-sj@kernel.org
Link: https://lkml.kernel.org/r/20251003201455.41448-2-sj@kernel.org
Fixes: 4c9ea539ad ("mm/damon/sysfs: validate user inputs from damon_sysfs_commit_input()")
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: <stable@vger.kernel.org> [6.15+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c3fa5b1bfd upstream.
Each damon_ctx maintains callback requests using a linked list
(damon_ctx->call_controls). When a new callback request is received via
damon_call(), the new request should be added to the list. However, the
function is making a mistake at list_add_tail() invocation: putting the
new item to add and the list head to add it before, in the opposite order.
Because of the linked list manipulation implementation, the new request
can still be reached from the context's list head. But the list items
that were added before the new request are dropped from the list.
As a result, the callbacks are unexpectedly not invocated. Worse yet, if
the dropped callback requests were dynamically allocated, the memory is
leaked. Actually DAMON sysfs interface is using a dynamically allocated
repeat-mode callback request for automatic essential stats update. And
because the online DAMON parameters commit is using a non-repeat-mode
callback request, the issue can easily be reproduced, like below.
# damo start --damos_action stat --refresh_stat 1s
# damo tune --damos_action stat --refresh_stat 1s
The first command dynamically allocates the repeat-mode callback request
for automatic essential stat update. Users can see the essential stats
are automatically updated for every second, using the sysfs interface.
The second command calls damon_commit() with a new callback request that
was made for the commit. As a result, the previously added repeat-mode
callback request is dropped from the list. The automatic stats refresh
stops working, and the memory for the repeat-mode callback request is
leaked. It can be confirmed using kmemleak.
Fix the mistake on the list_add_tail() call.
Link: https://lkml.kernel.org/r/20251014205939.1206-1-sj@kernel.org
Fixes: 004ded6bee ("mm/damon: accept parallel damon_call() requests")
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: <stable@vger.kernel.org> [6.17+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7eca961dd7 upstream.
When damos_commit_quota_goals() is called for adding new DAMOS quota goals
of DAMOS_QUOTA_USER_INPUT metric, current_value fields of the new goals
should be also set as requested.
However, damos_commit_quota_goals() is not updating the field for the
case, since it is setting only metrics and target values using
damos_new_quota_goal(), and metric-optional union fields using
damos_commit_quota_goal_union(). As a result, users could see the first
current_value parameter that committed online with a new quota goal is
ignored. Users are assumed to commit the current_value for
DAMOS_QUOTA_USER_INPUT quota goals, since it is being used as a feedback.
Hence the real impact would be subtle. That said, this is obviously not
intended behavior.
Fix the issue by using damos_commit_quota_goal() which sets all quota goal
parameters, instead of damos_commit_quota_goal_union(), which sets only
the union fields.
Link: https://lkml.kernel.org/r/20251014001846.279282-1-sj@kernel.org
Fixes: 1aef9df0ee ("mm/damon/core: commit damos_quota_goal->nid")
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: <stable@vger.kernel.org> [6.16+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0e59f47c15 upstream.
Commit b714ccb02a ("mm/mremap: complete refactor of move_vma()")
mistakenly introduced a new behaviour - clearing the VM_ACCOUNT flag of
the old mapping when a mapping is mremap()'d with the MREMAP_DONTUNMAP
flag set.
While we always clear the VM_LOCKED and VM_LOCKONFAULT flags for the old
mapping (the page tables have been moved, so there is no data that could
possibly be locked in memory), there is no reason to touch any other VMA
flags.
This is because after the move the old mapping is in a state as if it were
freshly mapped. This implies that the attributes of the mapping ought to
remain the same, including whether or not the mapping is accounted.
Link: https://lkml.kernel.org/r/20251013165836.273113-1-lorenzo.stoakes@oracle.com
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Fixes: b714ccb02a ("mm/mremap: complete refactor of move_vma()")
Reviewed-by: Pedro Falcato <pfalcato@suse.de>
Cc: Jann Horn <jannh@google.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 841a8bfcba upstream.
When performing memory error injection on a THP (Transparent Huge Page)
mapped to userspace on an x86 server, the kernel panics with the following
trace. The expected behavior is to terminate the affected process instead
of panicking the kernel, as the x86 Machine Check code can recover from an
in-userspace #MC.
mce: [Hardware Error]: CPU 0: Machine Check Exception: f Bank 3: bd80000000070134
mce: [Hardware Error]: RIP 10:<ffffffff8372f8bc> {memchr_inv+0x4c/0xf0}
mce: [Hardware Error]: TSC afff7bbff88a ADDR 1d301b000 MISC 80 PPIN 1e741e77539027db
mce: [Hardware Error]: PROCESSOR 0:d06d0 TIME 1758093249 SOCKET 0 APIC 0 microcode 80000320
mce: [Hardware Error]: Run the above through 'mcelog --ascii'
mce: [Hardware Error]: Machine check: Data load in unrecoverable area of kernel
Kernel panic - not syncing: Fatal local machine check
The root cause of this panic is that handling a memory failure triggered
by an in-userspace #MC necessitates splitting the THP. The splitting
process employs a mechanism, implemented in
try_to_map_unused_to_zeropage(), which reads the pages in the THP to
identify zero-filled pages. However, reading the pages in the THP results
in a second in-kernel #MC, occurring before the initial memory_failure()
completes, ultimately leading to a kernel panic. See the kernel panic
call trace on the two #MCs.
First Machine Check occurs // [1]
memory_failure() // [2]
try_to_split_thp_page()
split_huge_page()
split_huge_page_to_list_to_order()
__folio_split() // [3]
remap_page()
remove_migration_ptes()
remove_migration_pte()
try_to_map_unused_to_zeropage() // [4]
memchr_inv() // [5]
Second Machine Check occurs // [6]
Kernel panic
[1] Triggered by accessing a hardware-poisoned THP in userspace, which is
typically recoverable by terminating the affected process.
[2] Call folio_set_has_hwpoisoned() before try_to_split_thp_page().
[3] Pass the RMP_USE_SHARED_ZEROPAGE remap flag to remap_page().
[4] Try to map the unused THP to zeropage.
[5] Re-access pages in the hw-poisoned THP in the kernel.
[6] Triggered in-kernel, leading to a panic kernel.
In Step[2], memory_failure() sets the poisoned flag on the page in the THP
by TestSetPageHWPoison() before calling try_to_split_thp_page().
As suggested by David Hildenbrand, fix this panic by not accessing to the
poisoned page in the THP during zeropage identification, while continuing
to scan unaffected pages in the THP for possible zeropage mapping. This
prevents a second in-kernel #MC that would cause kernel panic in Step[4].
Thanks to Andrew Zaborowski for his initial work on fixing this issue.
Link: https://lkml.kernel.org/r/20251015064926.1887643-1-qiuxu.zhuo@intel.com
Link: https://lkml.kernel.org/r/20251011075520.320862-1-qiuxu.zhuo@intel.com
Fixes: b1f202060a ("mm: remap unused subpages to shared zeropage when splitting isolated thp")
Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Reported-by: Farrah Chen <farrah.chen@intel.com>
Suggested-by: David Hildenbrand <david@redhat.com>
Acked-by: David Hildenbrand <david@redhat.com>
Tested-by: Farrah Chen <farrah.chen@intel.com>
Tested-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Acked-by: Lance Yang <lance.yang@linux.dev>
Reviewed-by: Wei Yang <richard.weiyang@gmail.com>
Acked-by: Zi Yan <ziy@nvidia.com>
Reviewed-by: Miaohe Lin <linmiaohe@huawei.com>
Cc: Barry Song <baohua@kernel.org>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Jiaqi Yan <jiaqiyan@google.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Mariano Pache <npache@redhat.com>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: Naoya Horiguchi <nao.horiguchi@gmail.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e84cb860ac upstream.
The special C-flag case expects the ADD_ADDR to be received when
switching to 'fully-established'. But for various reasons, the ADD_ADDR
could be sent after the "4th ACK", and the special case doesn't work.
On NIPA, the new test validating this special case for the C-flag failed
a few times, e.g.
102 default limits, server deny join id 0
syn rx [FAIL] got 0 JOIN[s] syn rx expected 2
Server ns stats
(...)
MPTcpExtAddAddrTx 1
MPTcpExtEchoAdd 1
Client ns stats
(...)
MPTcpExtAddAddr 1
MPTcpExtEchoAddTx 1
synack rx [FAIL] got 0 JOIN[s] synack rx expected 2
ack rx [FAIL] got 0 JOIN[s] ack rx expected 2
join Rx [FAIL] see above
syn tx [FAIL] got 0 JOIN[s] syn tx expected 2
join Tx [FAIL] see above
I had a suspicion about what the issue could be: the ADD_ADDR might have
been received after the switch to the 'fully-established' state. The
issue was not easy to reproduce. The packet capture shown that the
ADD_ADDR can indeed be sent with a delay, and the client would not try
to establish subflows to it as expected.
A simple fix is not to mark the endpoints as 'used' in the C-flag case,
when looking at creating subflows to the remote initial IP address and
port. In this case, there is no need to try.
Note: newly added fullmesh endpoints will still continue to be used as
expected, thanks to the conditions behind mptcp_pm_add_addr_c_flag_case.
Fixes: 4b1ff850e0 ("mptcp: pm: in-kernel: usable client side with C-flag")
Cc: stable@vger.kernel.org
Reviewed-by: Geliang Tang <geliang@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20251020-net-mptcp-c-flag-late-add-addr-v1-1-8207030cb0e8@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5370c31e84 upstream.
Ensure the TX descriptor type fields are published in a safe order so the
DMA engine never begins processing a descriptor chain before all descriptor
fields are fully initialised.
For multi-descriptor transmits the driver writes DT_FEND into the last
descriptor and DT_FSTART into the first. The DMA engine begins processing
when it observes DT_FSTART. Move the dma_wmb() barrier so it executes
immediately after DT_FEND and immediately before writing DT_FSTART
(and before DT_FSINGLE in the single-descriptor case). This guarantees
that all prior CPU writes to the descriptor memory are visible to the
device before DT_FSTART is seen.
This avoids a situation where compiler/CPU reordering could publish
DT_FSTART ahead of DT_FEND or other descriptor fields, allowing the DMA to
start on a partially initialised chain and causing corrupted transmissions
or TX timeouts. Such a failure was observed on RZ/G2L with an RT kernel as
transmit queue timeouts and device resets.
Fixes: 2f45d1902a ("ravb: minimize TX data copying")
Cc: stable@vger.kernel.org
Co-developed-by: Fabrizio Castro <fabrizio.castro.jz@renesas.com>
Signed-off-by: Fabrizio Castro <fabrizio.castro.jz@renesas.com>
Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Reviewed-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Link: https://patch.msgid.link/20251017151830.171062-4-prabhakar.mahadev-lad.rj@bp.renesas.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 75cea9860a upstream.
TX frames aren't padded and unknown memory is sent into the ether.
Theoretically, it isn't even guaranteed that the extra memory exists
and can be sent out, which could cause further problems. In practice,
I found that plenty of tailroom exists in the skb itself (in my test
with ping at least) and skb_padto() easily succeeds, so use it here.
In the event of -ENOMEM drop the frame like other drivers do.
The use of one more padding byte instead of a USB zero-length packet
is retained to avoid regression. I have a dodgy Etron xHCI controller
which doesn't seem to support sending ZLPs at all.
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Cc: stable@vger.kernel.org
Signed-off-by: Michal Pecio <michal.pecio@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20251014203528.3f9783c4.michal.pecio@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f7c877e753 upstream.
Syzbot reported a potential lock inversion deadlock between
vsock_register_mutex and sk_lock-AF_VSOCK when vsock_linger() is called.
The issue was introduced by commit 687aa0c558 ("vsock: Fix
transport_* TOCTOU") which added vsock_register_mutex locking in
vsock_assign_transport() around the transport->release() call, that can
call vsock_linger(). vsock_assign_transport() can be called with sk_lock
held. vsock_linger() calls sk_wait_event() that temporarily releases and
re-acquires sk_lock. During this window, if another thread hold
vsock_register_mutex while trying to acquire sk_lock, a circular
dependency is created.
Fix this by releasing vsock_register_mutex before calling
transport->release() and vsock_deassign_transport(). This is safe
because we don't need to hold vsock_register_mutex while releasing the
old transport, and we ensure the new transport won't disappear by
obtaining a module reference first via try_module_get().
Reported-by: syzbot+10e35716f8e4929681fa@syzkaller.appspotmail.com
Tested-by: syzbot+10e35716f8e4929681fa@syzkaller.appspotmail.com
Fixes: 687aa0c558 ("vsock: Fix transport_* TOCTOU")
Cc: mhal@rbox.co
Cc: stable@vger.kernel.org
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Link: https://patch.msgid.link/20251021121718.137668-1-sgarzare@redhat.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit bf5570590a upstream.
MIPS Malta platform code registers the PCI southbridge legacy port I/O
PS/2 keyboard range as a standard resource marked as busy. It prevents
the i8042 driver from registering as it fails to claim the resource in
a call to i8042_platform_init(). Consequently PS/2 keyboard and mouse
devices cannot be used with this platform.
Fix the issue by removing the busy marker from the standard reservation,
making the driver register successfully:
serio: i8042 KBD port at 0x60,0x64 irq 1
serio: i8042 AUX port at 0x60,0x64 irq 12
and the resource show up as expected among the legacy devices:
00000000-00ffffff : MSC PCI I/O
00000000-0000001f : dma1
00000020-00000021 : pic1
00000040-0000005f : timer
00000060-0000006f : keyboard
00000060-0000006f : i8042
00000070-00000077 : rtc0
00000080-0000008f : dma page reg
000000a0-000000a1 : pic2
000000c0-000000df : dma2
[...]
If the i8042 driver has not been configured, then the standard resource
will remain there preventing any conflicting dynamic assignment of this
PCI port I/O address range.
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Acked-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: stable@vger.kernel.org
Link: https://patch.msgid.link/alpine.DEB.2.21.2510211919240.8377@angie.orcam.me.uk
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8e93ac51e4 upstream.
Since the commit c1f3f9797c ("can: netlink: can_changelink(): fix NULL
pointer deref of struct can_priv::do_set_mode"), the automatic restart
delay can only be set for devices that implement the restart handler struct
can_priv::do_set_mode. As it makes no sense to configure a automatic
restart for devices that doesn't support it.
However, since systemd commit 13ce5d4632e3 ("network/can: properly handle
CAN.RestartSec=0") [1], systemd-networkd correctly handles a restart delay
of "0" (i.e. the restart is disabled). Which means that a disabled restart
is always configured in the kernel.
On systems with both changes active this causes that CAN interfaces that
don't implement a restart handler cannot be brought up by systemd-networkd.
Solve this problem by allowing a delay of "0" to be configured, even if the
device does not implement a restart handler.
[1] 13ce5d4632
Cc: stable@vger.kernel.org
Cc: Andrei Lalaev <andrey.lalaev@gmail.com>
Reported-by: Marc Kleine-Budde <mkl@pengutronix.de>
Closes: https://lore.kernel.org/all/20251020-certain-arrogant-vole-of-sunshine-141841-mkl@pengutronix.de
Fixes: c1f3f9797c ("can: netlink: can_changelink(): fix NULL pointer deref of struct can_priv::do_set_mode")
Link: https://patch.msgid.link/20251020-netlink-fix-restart-v1-1-3f53c7f8520b@pengutronix.de
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7f434e1d9a upstream.
If two competing threads enter alloc_slab_obj_exts(), and the one that
allocates the vector wins the cmpxchg(), the other thread that failed
allocation mistakenly assumes that slab->obj_exts is still empty due to
its own allocation failure. This will then trigger warnings with
CONFIG_MEM_ALLOC_PROFILING_DEBUG checks in the subsequent free path.
Therefore, let's check the result of cmpxchg() to see if marking the
allocation as failed was successful. If it wasn't, check whether the
winning side has succeeded its allocation (it might have been also
marking it as failed) and if yes, return success.
Suggested-by: Harry Yoo <harry.yoo@oracle.com>
Fixes: f7381b9116 ("slab: mark slab->obj_exts allocation failures unconditionally")
Cc: <stable@vger.kernel.org>
Signed-off-by: Hao Ge <gehao@kylinos.cn>
Link: https://patch.msgid.link/20251023143313.1327968-1-hao.ge@linux.dev
Reviewed-by: Suren Baghdasaryan <surenb@google.com>
Reviewed-by: Harry Yoo <harry.yoo@oracle.com>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6ed8bfd24c upstream.
If two competing threads enter alloc_slab_obj_exts() and one of them
fails to allocate the object extension vector, it might override the
valid slab->obj_exts allocated by the other thread with
OBJEXTS_ALLOC_FAIL. This will cause the thread that lost this race and
expects a valid pointer to dereference a NULL pointer later on.
Update slab->obj_exts atomically using cmpxchg() to avoid
slab->obj_exts overrides by racing threads.
Thanks for Vlastimil and Suren's help with debugging.
Fixes: f7381b9116 ("slab: mark slab->obj_exts allocation failures unconditionally")
Cc: <stable@vger.kernel.org>
Suggested-by: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Hao Ge <gehao@kylinos.cn>
Reviewed-by: Harry Yoo <harry.yoo@oracle.com>
Reviewed-by: Suren Baghdasaryan <surenb@google.com>
Link: https://patch.msgid.link/20251021010353.1187193-1-hao.ge@linux.dev
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit cfec502b3d upstream.
Regardless of the DeviceContext of a device, we can't give any
guarantees about the DeviceContext of its parent device.
This is very subtle, since it's only caused by a simple typo, i.e.
Self::from_raw(parent)
which preserves the DeviceContext in this case, vs.
Device::from_raw(parent)
which discards the DeviceContext.
(I should have noticed it doing the correct thing in auxiliary::Device
subsequently, but somehow missed it.)
Hence, fix both Device::parent() and auxiliary::Device::parent().
Cc: stable@vger.kernel.org
Fixes: a4c9f71e34 ("rust: device: implement Device::parent()")
Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Reviewed-by: Alexandre Courbot <acourbot@nvidia.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2dc99ea272 upstream.
In has_thead_homogeneous_vlenb(), smatch detected that the vlenb variable
could be used while uninitialized. It appears that this could happen if
no CPUs described in DT have the "thead,vlenb" property.
Fix by initializing vlenb to 0, which will keep thead_vlenb_of set to 0
(as it was statically initialized). This in turn will cause
riscv_v_setup_vsize() to fall back to CSR probing - the desired result if
thead,vlenb isn't provided in the DT data.
While here, fix a nearby comment typo.
Cc: stable@vger.kernel.org
Cc: Charlie Jenkins <charlie@rivosinc.com>
Fixes: 377be47f90 ("riscv: vector: Use vlenb from DT for thead")
Signed-off-by: Paul Walmsley <pjw@kernel.org>
Link: https://lore.kernel.org/r/22674afb-2fe8-2a83-1818-4c37bd554579@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 10fad40122 upstream.
It is reported that commit 85975daeaa ("cpuidle: menu: Avoid discarding
useful information") led to a performance regression on Intel Jasper Lake
systems because it reduced the time spent by CPUs in idle state C7 which
is correlated to the maximum frequency the CPUs can get to because of an
average running power limit [1].
Before that commit, get_typical_interval() would have returned UINT_MAX
whenever it had been unable to make a high-confidence prediction which
had led to selecting the deepest available idle state too often and
both power and performance had been inadequate as a result of that on
some systems. However, this had not been a problem on systems with
relatively aggressive average running power limits, like the Jasper Lake
systems in question, because on those systems it was compensated by the
ability to run CPUs faster.
It was addressed by causing get_typical_interval() to return a number
based on the recent idle duration information available to it even if it
could not make a high-confidence prediction, but that clearly did not
take the possible correlation between idle power and available CPU
capacity into account.
For this reason, revert most of the changes made by commit 85975daeaa,
except for one cosmetic cleanup, and add a comment explaining the
rationale for returning UINT_MAX from get_typical_interval() when it
is unable to make a high-confidence prediction.
Fixes: 85975daeaa ("cpuidle: menu: Avoid discarding useful information")
Closes: https://lore.kernel.org/linux-pm/36iykr223vmcfsoysexug6s274nq2oimcu55ybn6ww4il3g3cv@cohflgdbpnq7/ [1]
Reported-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: All applicable <stable@vger.kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://patch.msgid.link/3663603.iIbC2pHGDl@rafael.j.wysocki
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f477af0cfa upstream.
On a filesystem with parent pointers, xchk_nlinks_collect_dir walks both
the directory entries (data fork) and the parent pointers (attr fork) to
determine the correct link count. Unfortunately I forgot to update the
lock mode logic to handle the case of a directory whose attr fork is in
btree format and has not yet been loaded *and* whose data fork doesn't
need loading.
This leads to a bunch of assertions from xfs/286 in xfs_iread_extents
because we only took ILOCK_SHARED, not ILOCK_EXCL. You'd need the rare
happenstance of a directory with a large number of non-pptr extended
attributes set and enough memory pressure to cause the directory to be
evicted and partially reloaded from disk.
I /think/ this only started in 6.18-rc1 because I've started seeing OOM
errors with the maple tree slab using 70% of memory, and this didn't
happen in 6.17. Yay dynamic systems!
Cc: stable@vger.kernel.org # v6.10
Fixes: 77ede5f44b ("xfs: walk directory parent pointers to determine backref count")
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1fabe43b4e upstream.
Commit 29d6d30f5c ("Btrfs: send, don't send rmdir for same target
multiple times") has fixed an issue that a send stream contained a rmdir
operation for the same directory multiple times. After that fix we keep
track of the last directory for which we sent a rmdir operation and
compare with it before sending a rmdir for the parent inode of a deleted
hardlink we are processing. But there is still a corner case that in
between rmdir dir operations for the same inode we find deleted hardlinks
for other parent inodes, so tracking just the last inode for which we sent
a rmdir operation is not enough.
Hardlinks of a file in the same directory are stored in the same INODE_REF
item, but if the number of hardlinks is too large and can not fit in a
leaf, we use INODE_EXTREF items to store them. The key of an INODE_EXTREF
item is (inode_id, INODE_EXTREF, hash[name, parent ino]), so between two
hardlinks for the same parent directory, we can find others for other
parent directories. For example for the reproducer below we get the
following (from a btrfs inspect-internal dump-tree output):
item 0 key (259 INODE_EXTREF 2309449) itemoff 16257 itemsize 26
index 6925 parent 257 namelen 8 name: foo.6923
item 1 key (259 INODE_EXTREF 2311350) itemoff 16231 itemsize 26
index 6588 parent 258 namelen 8 name: foo.6587
item 2 key (259 INODE_EXTREF 2457395) itemoff 16205 itemsize 26
index 6611 parent 257 namelen 8 name: foo.6609
(...)
So tracking the last directory's inode number does not work in this case
since we process a link for parent inode 257, then for 258 and then back
again for 257, and that second time we process a deleted link for 257 we
think we have not yet sent a rmdir operation.
Fix this by using a rbtree to keep track of all the directories for which
we have already sent rmdir operations, and add those directories to the
'check_dirs' ref list in process_recorded_refs() only if the directory is
not yet in the rbtree, otherwise skip it since it means we have already
sent a rmdir operation for that directory.
The following test script reproduces the problem:
$ cat test.sh
#!/bin/bash
DEV=/dev/sdi
MNT=/mnt/sdi
mkfs.btrfs -f $DEV
mount $DEV $MNT
mkdir $MNT/a $MNT/b
echo 123 > $MNT/a/foo
for ((i = 1; i <= 1000; i++)); do
ln $MNT/a/foo $MNT/a/foo.$i
ln $MNT/a/foo $MNT/b/foo.$i
done
btrfs subvolume snapshot -r $MNT $MNT/snap1
btrfs send $MNT/snap1 -f /tmp/base.send
rm -r $MNT/a $MNT/b
btrfs subvolume snapshot -r $MNT $MNT/snap2
btrfs send -p $MNT/snap1 $MNT/snap2 -f /tmp/incremental.send
umount $MNT
mkfs.btrfs -f $DEV
mount $DEV $MNT
btrfs receive $MNT -f /tmp/base.send
btrfs receive $MNT -f /tmp/incremental.send
rm -f /tmp/base.send /tmp/incremental.send
umount $MNT
When running it, it fails like this:
$ ./test.sh
(...)
At subvol snap1
At snapshot snap2
ERROR: rmdir o257-9-0 failed: No such file or directory
CC: <stable@vger.kernel.org>
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Ting-Chang Hou <tchou@synology.com>
[ Updated changelog ]
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a94e065726 upstream.
The current approach is a bit naive, and hence calls the time querying
way too often. Only start the "doing work" timer when there's actual
work to do, and then use that information to terminate (and account) the
work time once done. This greatly reduces the frequency of these calls,
when they cannot have changed anyway.
Running a basic random reader that is setup to use SQPOLL, a profile
before this change shows these as the top cycle consumers:
+ 32.60% iou-sqp-1074 [kernel.kallsyms] [k] thread_group_cputime_adjusted
+ 19.97% iou-sqp-1074 [kernel.kallsyms] [k] thread_group_cputime
+ 12.20% io_uring io_uring [.] submitter_uring_fn
+ 4.13% iou-sqp-1074 [kernel.kallsyms] [k] getrusage
+ 2.45% iou-sqp-1074 [kernel.kallsyms] [k] io_submit_sqes
+ 2.18% iou-sqp-1074 [kernel.kallsyms] [k] __pi_memset_generic
+ 2.09% iou-sqp-1074 [kernel.kallsyms] [k] cputime_adjust
and after this change, top of profile looks as follows:
+ 36.23% io_uring io_uring [.] submitter_uring_fn
+ 23.26% iou-sqp-819 [kernel.kallsyms] [k] io_sq_thread
+ 10.14% iou-sqp-819 [kernel.kallsyms] [k] io_sq_tw
+ 6.52% iou-sqp-819 [kernel.kallsyms] [k] tctx_task_work_run
+ 4.82% iou-sqp-819 [kernel.kallsyms] [k] nvme_submit_cmds.part.0
+ 2.91% iou-sqp-819 [kernel.kallsyms] [k] io_submit_sqes
[...]
0.02% iou-sqp-819 [kernel.kallsyms] [k] cputime_adjust
where it's spending the cycles on things that actually matter.
Reported-by: Fengnan Chang <changfengnan@bytedance.com>
Cc: stable@vger.kernel.org
Fixes: 3fcb9d1720 ("io_uring/sqpoll: statistics of the true utilization of sq threads")
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8ac9b0d33e upstream.
getrusage() does a lot more than what the SQPOLL accounting needs, the
latter only cares about (and uses) the stime. Rather than do a full
RUSAGE_SELF summation, just query the used stime instead.
Cc: stable@vger.kernel.org
Fixes: 3fcb9d1720 ("io_uring/sqpoll: statistics of the true utilization of sq threads")
Reviewed-by: Gabriel Krisman Bertazi <krisman@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5d15d2ad36 upstream.
The hwprobe vDSO data for some keys, like MISALIGNED_VECTOR_PERF,
is determined by an asynchronous kthread. This can create a race
condition where the kthread finishes after the vDSO data has
already been populated, causing userspace to read stale values.
To fix this race, a new 'ready' flag is added to the vDSO data,
initialized to 'false' during arch_initcall_sync. This flag is
checked by both the vDSO's user-space code and the riscv_hwprobe
syscall. The syscall serves as a one-time gate, using a completion
to wait for any pending probes before populating the data and
setting the flag to 'true', thus ensuring userspace reads fresh
values on its first request.
Reported-by: Tsukasa OI <research_trasio@irq.a4lg.com>
Closes: https://lore.kernel.org/linux-riscv/760d637b-b13b-4518-b6bf-883d55d44e7f@irq.a4lg.com/
Fixes: e7c9d66e31 ("RISC-V: Report vector unaligned access speed hwprobe")
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Alexandre Ghiti <alexghiti@rivosinc.com>
Cc: Olof Johansson <olof@lixom.net>
Cc: stable@vger.kernel.org
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Co-developed-by: Palmer Dabbelt <palmer@dabbelt.com>
Signed-off-by: Palmer Dabbelt <palmer@dabbelt.com>
Signed-off-by: Jingwei Wang <wangjingwei@iscas.ac.cn>
Link: https://lore.kernel.org/r/20250811142035.105820-1-wangjingwei@iscas.ac.cn
[pjw@kernel.org: fix checkpatch issues]
Signed-off-by: Paul Walmsley <pjw@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2eead19334 upstream.
Fix incorrect use of PTR_ERR_OR_ZERO() in topology_parse_cpu_capacity()
which causes the code to proceed with NULL clock pointers. The current
logic uses !PTR_ERR_OR_ZERO(cpu_clk) which evaluates to true for both
valid pointers and NULL, leading to potential NULL pointer dereference
in clk_get_rate().
Per include/linux/err.h documentation, PTR_ERR_OR_ZERO(ptr) returns:
"The error code within @ptr if it is an error pointer; 0 otherwise."
This means PTR_ERR_OR_ZERO() returns 0 for both valid pointers AND NULL
pointers. Therefore !PTR_ERR_OR_ZERO(cpu_clk) evaluates to true (proceed)
when cpu_clk is either valid or NULL, causing clk_get_rate(NULL) to be
called when of_clk_get() returns NULL.
Replace with !IS_ERR_OR_NULL(cpu_clk) which only proceeds for valid
pointers, preventing potential NULL pointer dereference in clk_get_rate().
Cc: stable <stable@kernel.org>
Signed-off-by: Kaushlendra Kumar <kaushlendra.kumar@intel.com>
Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
Fixes: b8fe128dad ("arch_topology: Adjust initial CPU capacities with current freq")
Link: https://patch.msgid.link/20250923174308.1771906-1-kaushlendra.kumar@intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 03521c892b upstream.
Commit 370645f41e ("dma-mapping: force bouncing if the kmalloc() size is
not cache-line-aligned") introduced DMA_BOUNCE_UNALIGNED_KMALLOC feature
and permitted architecture specific code configure kmalloc slabs with
sizes smaller than the value of dma_get_cache_alignment().
When that feature is enabled, the physical address of some small
kmalloc()-ed buffers might be not aligned to the CPU cachelines, thus not
really suitable for typical DMA. To properly handle that case a SWIOTLB
buffer bouncing is used, so no CPU cache corruption occurs. When that
happens, there is no point reporting a false-positive DMA-API warning that
the buffer is not properly aligned, as this is not a client driver fault.
[m.szyprowski@samsung.com: replace is_swiotlb_allocated() with is_swiotlb_active(), per Catalin]
Link: https://lkml.kernel.org/r/20251010173009.3916215-1-m.szyprowski@samsung.com
Link: https://lkml.kernel.org/r/20251009141508.2342138-1-m.szyprowski@samsung.com
Fixes: 370645f41e ("dma-mapping: force bouncing if the kmalloc() size is not cache-line-aligned")
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Inki Dae <m.szyprowski@samsung.com>
Cc: Robin Murohy <robin.murphy@arm.com>
Cc: "Isaac J. Manjarres" <isaacmanjarres@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a7c4bb43bf upstream.
Calling intotify_show_fdinfo() on fd watching an overlayfs inode, while
the overlayfs is being unmounted, can lead to dereferencing NULL ptr.
This issue was found by syzkaller.
Race Condition Diagram:
Thread 1 Thread 2
-------- --------
generic_shutdown_super()
shrink_dcache_for_umount
sb->s_root = NULL
|
| vfs_read()
| inotify_fdinfo()
| * inode get from mark *
| show_mark_fhandle(m, inode)
| exportfs_encode_fid(inode, ..)
| ovl_encode_fh(inode, ..)
| ovl_check_encode_origin(inode)
| * deref i_sb->s_root *
|
|
v
fsnotify_sb_delete(sb)
Which then leads to:
[ 32.133461] Oops: general protection fault, probably for non-canonical address 0xdffffc0000000006: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN NOPTI
[ 32.134438] KASAN: null-ptr-deref in range [0x0000000000000030-0x0000000000000037]
[ 32.135032] CPU: 1 UID: 0 PID: 4468 Comm: systemd-coredum Not tainted 6.17.0-rc6 #22 PREEMPT(none)
<snip registers, unreliable trace>
[ 32.143353] Call Trace:
[ 32.143732] ovl_encode_fh+0xd5/0x170
[ 32.144031] exportfs_encode_inode_fh+0x12f/0x300
[ 32.144425] show_mark_fhandle+0xbe/0x1f0
[ 32.145805] inotify_fdinfo+0x226/0x2d0
[ 32.146442] inotify_show_fdinfo+0x1c5/0x350
[ 32.147168] seq_show+0x530/0x6f0
[ 32.147449] seq_read_iter+0x503/0x12a0
[ 32.148419] seq_read+0x31f/0x410
[ 32.150714] vfs_read+0x1f0/0x9e0
[ 32.152297] ksys_read+0x125/0x240
IOW ovl_check_encode_origin derefs inode->i_sb->s_root, after it was set
to NULL in the unmount path.
Fix it by protecting calling exportfs_encode_fid() from
show_mark_fhandle() with s_umount lock.
This form of fix was suggested by Amir in [1].
[1]: https://lore.kernel.org/all/CAOQ4uxhbDwhb+2Brs1UdkoF0a3NSdBAOQPNfEHjahrgoKJpLEw@mail.gmail.com/
Fixes: c45beebfde ("ovl: support encoding fid from inode with no alias")
Signed-off-by: Jakub Acs <acsjakub@amazon.de>
Cc: Jan Kara <jack@suse.cz>
Cc: Amir Goldstein <amir73il@gmail.com>
Cc: Miklos Szeredi <miklos@szeredi.hu>
Cc: Christian Brauner <brauner@kernel.org>
Cc: linux-unionfs@vger.kernel.org
Cc: linux-fsdevel@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: stable@vger.kernel.org
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 399d109347 ]
Currently, during the LAN8814 PTP probe shared->phydev is only set if PTP
clock gets actually set, otherwise the function will return before setting
it.
This is an issue as shared->phydev is unconditionally being used when IRQ
is being handled, especially in lan8814_gpio_process_cap and since it was
not set it will cause a NULL pointer exception and crash the kernel.
So, simply always set shared->phydev to avoid the NULL pointer exception.
Fixes: b3f1a08fcf ("net: phy: micrel: Add support for PTP_PF_EXTTS for lan8814")
Signed-off-by: Robert Marko <robert.marko@sartura.hr>
Tested-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Link: https://patch.msgid.link/20251021132034.983936-1-robert.marko@sartura.hr
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit efd729408b ]
openvpn TCP encapsulation uses a custom queue to deliver packets to
userspace. Currently it relies on datagram_poll, which checks
sk_receive_queue, leading to false readiness signals when that queue
contains non-userspace packets.
Switch ovpn_tcp_poll to use datagram_poll_queue with the peer's
user_queue, ensuring poll only signals readiness when userspace data is
actually available. Also refactor ovpn_tcp_poll in order to enforce the
assumption we can make on the lifetime of ovpn_sock and peer.
Fixes: 11851cbd60 ("ovpn: implement TCP transport")
Signed-off-by: Antonio Quartulli <antonio@openvpn.net>
Signed-off-by: Ralf Lici <ralf@mandelbit.com>
Reviewed-by: Sabrina Dubroca <sd@queasysnail.net>
Link: https://patch.msgid.link/20251021100942.195010-4-ralf@mandelbit.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f6ceec6434 ]
Some protocols using TCP encapsulation (e.g., espintcp, openvpn) deliver
userspace-bound packets through a custom skb queue rather than the
standard sk_receive_queue.
Introduce datagram_poll_queue that accepts an explicit receive queue,
and convert datagram_poll into a wrapper around datagram_poll_queue.
This allows protocols with custom skb queues to reuse the core polling
logic without relying on sk_receive_queue.
Cc: Sabrina Dubroca <sd@queasysnail.net>
Cc: Antonio Quartulli <antonio@openvpn.net>
Signed-off-by: Ralf Lici <ralf@mandelbit.com>
Reviewed-by: Sabrina Dubroca <sd@queasysnail.net>
Reviewed-by: Antonio Quartulli <antonio@openvpn.net>
Link: https://patch.msgid.link/20251021100942.195010-2-ralf@mandelbit.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Stable-dep-of: efd729408b ("ovpn: use datagram_poll_queue for socket readiness in TCP")
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 0fc3e32c2c ]
espintcp uses a custom queue (ike_queue) to deliver packets to
userspace. The polling logic relies on datagram_poll, which checks
sk_receive_queue, which can lead to false readiness signals when that
queue contains non-userspace packets.
Switch espintcp_poll to use datagram_poll_queue with ike_queue, ensuring
poll only signals readiness when userspace data is actually available.
Fixes: e27cca96cd ("xfrm: add espintcp (RFC 8229)")
Signed-off-by: Ralf Lici <ralf@mandelbit.com>
Reviewed-by: Sabrina Dubroca <sd@queasysnail.net>
Link: https://patch.msgid.link/20251021100942.195010-3-ralf@mandelbit.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit c0178eec88 ]
HSR/PRP driver does not handle correctly having slaves/interlink devices
in a different net namespace. Currently, it is possible to create a HSR
link in a different net namespace than the slaves/interlink with the
following command:
ip link add hsr0 netns hsr-ns type hsr slave1 eth1 slave2 eth2
As there is no use-case on supporting this scenario, enforce that HSR
device link matches netns defined by IFLA_LINK_NETNSID.
The iproute2 command mentioned above will throw the following error:
Error: hsr: HSR slaves/interlink must be on the same net namespace than HSR link.
Fixes: f421436a59 ("net/hsr: Add support for the High-availability Seamless Redundancy protocol (HSRv0)")
Signed-off-by: Fernando Fernandez Mancera <fmancera@suse.de>
Link: https://patch.msgid.link/20251020135533.9373-1-fmancera@suse.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit e13d315ae0 ]
Robert reported an infinite loop observed by two crafted images.
The root cause is that `clusterofs` can be larger than `lclustersize`
for !NONHEAD `lclusters` in corrupted subpage compact indexes, e.g.:
blocksize = lclustersize = 512 lcn = 6 clusterofs = 515
Move the corresponding check for full compress indexes to
`z_erofs_load_lcluster_from_disk()` to also cover subpage compact
compress indexes.
It also fixes the position of `m->type >= Z_EROFS_LCLUSTER_TYPE_MAX`
check, since it should be placed right after
`z_erofs_load_{compact,full}_lcluster()`.
Fixes: 8d2517aaee ("erofs: fix up compacted indexes for block size < 4096")
Fixes: 1a5223c182 ("erofs: do sanity check on m->type in z_erofs_load_compact_lcluster()")
Reported-by: Robert Morris <rtm@csail.mit.edu>
Closes: https://lore.kernel.org/r/35167.1760645886@localhost
Reviewed-by: Hongbo Li <lihongbo22@huawei.com>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 143937ca51 ]
Current pte_mkwrite_novma() makes PTE dirty unconditionally. This may
mark some pages that are never written dirty wrongly. For example,
do_swap_page() may map the exclusive pages with writable and clean PTEs
if the VMA is writable and the page fault is for read access.
However, current pte_mkwrite_novma() implementation always dirties the
PTE. This may cause unnecessary disk writing if the pages are
never written before being reclaimed.
So, change pte_mkwrite_novma() to clear the PTE_RDONLY bit only if the
PTE_DIRTY bit is set to make it possible to make the PTE writable and
clean.
The current behavior was introduced in commit 73e86cb03c ("arm64:
Move PTE_RDONLY bit handling out of set_pte_at()"). Before that,
pte_mkwrite() only sets the PTE_WRITE bit, while set_pte_at() only
clears the PTE_RDONLY bit if both the PTE_WRITE and the PTE_DIRTY bits
are set.
To test the performance impact of the patch, on an arm64 server
machine, run 16 redis-server processes on socket 1 and 16
memtier_benchmark processes on socket 0 with mostly get
transactions (that is, redis-server will mostly read memory only).
The memory footprint of redis-server is larger than the available
memory, so swap out/in will be triggered. Test results show that the
patch can avoid most swapping out because the pages are mostly clean.
And the benchmark throughput improves ~23.9% in the test.
Fixes: 73e86cb03c ("arm64: Move PTE_RDONLY bit handling out of set_pte_at()")
Signed-off-by: Huang Ying <ying.huang@linux.alibaba.com>
Cc: Will Deacon <will@kernel.org>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Gavin Shan <gshan@redhat.com>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Yicong Yang <yangyicong@hisilicon.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 49d34f3dd8 ]
Resolve race conditions in timestamp events list handling between TX
and RX paths causing missed timestamps.
The current implementation uses a single events list for both TX and RX
timestamps. The am65_cpts_find_ts() function acquires the lock,
splices all events (TX as well as RX events) to a temporary list,
and releases the lock. This function performs matching of timestamps
for TX packets only. Before it acquires the lock again to put the
non-TX events back to the main events list, a concurrent RX
processing thread could acquire the lock (as observed in practice),
find an empty events list, and fail to attach timestamp to it,
even though a relevant event exists in the spliced list which is yet to
be restored to the main list.
Fix this by creating separate events lists to handle TX and RX
timestamps independently.
Fixes: c459f606f6 ("net: ethernet: ti: am65-cpts: Enable RX HW timestamp for PTP packets using CPTS FIFO")
Signed-off-by: Aksh Garg <a-garg7@ti.com>
Reviewed-by: Siddharth Vadapalli <s-vadapalli@ti.com>
Link: https://patch.msgid.link/20251016115755.1123646-1-a-garg7@ti.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 87bcef158a ]
XDP programs can change the layout of an xdp_buff through
bpf_xdp_adjust_tail() and bpf_xdp_adjust_head(). Therefore, the driver
cannot assume the size of the linear data area nor fragments. Fix the
bug in mlx5 by generating skb according to xdp_buff after XDP programs
run.
Currently, when handling multi-buf XDP, the mlx5 driver assumes the
layout of an xdp_buff to be unchanged. That is, the linear data area
continues to be empty and fragments remain the same. This may cause
the driver to generate erroneous skb or triggering a kernel
warning. When an XDP program added linear data through
bpf_xdp_adjust_head(), the linear data will be ignored as
mlx5e_build_linear_skb() builds an skb without linear data and then
pull data from fragments to fill the linear data area. When an XDP
program has shrunk the non-linear data through bpf_xdp_adjust_tail(),
the delta passed to __pskb_pull_tail() may exceed the actual nonlinear
data size and trigger the BUG_ON in it.
To fix the issue, first record the original number of fragments. If the
number of fragments changes after the XDP program runs, rewind the end
fragment pointer by the difference and recalculate the truesize. Then,
build the skb with the linear data area matching the xdp_buff. Finally,
only pull data in if there is non-linear data and fill the linear part
up to 256 bytes.
Fixes: f52ac7028b ("net/mlx5e: RX, Add XDP multi-buffer support in Striding RQ")
Signed-off-by: Amery Hung <ameryhung@gmail.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/1760644540-899148-3-git-send-email-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit afd5ba577c ]
XDP programs can release xdp_buff fragments when calling
bpf_xdp_adjust_tail(). The driver currently assumes the number of
fragments to be unchanged and may generate skb with wrong truesize or
containing invalid frags. Fix the bug by generating skb according to
xdp_buff after the XDP program runs.
Fixes: ea5d49bdae ("net/mlx5e: Add XDP multi buffer support to the non-linear legacy RQ")
Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com>
Signed-off-by: Amery Hung <ameryhung@gmail.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/1760644540-899148-2-git-send-email-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit a73ca0449b ]
sctp_vrf.sh could fail:
TEST 12: bind vrf-2 & 1 in server, connect from client 1 & 2, N [FAIL]
not ok 1 selftests: net: sctp_vrf.sh # exit=3
The failure happens when the server bind in a new run conflicts with an
existing association from the previous run:
[1] ip netns exec $SERVER_NS ./sctp_hello server ...
[2] ip netns exec $CLIENT_NS ./sctp_hello client ...
[3] ip netns exec $SERVER_NS pkill sctp_hello ...
[4] ip netns exec $SERVER_NS ./sctp_hello server ...
It occurs if the client in [2] sends a message and closes immediately.
With the message unacked, no SHUTDOWN is sent. Killing the server in [3]
triggers a SHUTDOWN the client also ignores due to the unacked message,
leaving the old association alive. This causes the bind at [4] to fail
until the message is acked and the client responds to a second SHUTDOWN
after the server’s T2 timer expires (3s).
This patch fixes the issue by preventing the client from sending data.
Instead, the client blocks on recv() and waits for the server to close.
It also waits until both the server and the client sockets are fully
released in stop_server and wait_client before restarting.
Additionally, replace 2>&1 >/dev/null with -q in sysctl and grep, and
drop other redundant 2>&1 >/dev/null redirections, and fix a typo from
N to Y (connect successfully) in the description of the last test.
Fixes: a61bd7b9fe ("selftests: add a selftest for sctp vrf")
Reported-by: Hangbin Liu <liuhangbin@gmail.com>
Tested-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Link: https://patch.msgid.link/be2dacf52d0917c4ba5e2e8c5a9cb640740ad2b6.1760731574.git.lucien.xin@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 902e81e679 ]
The blamed commit increased the needed headroom to account for
alignment. This means that the size required to always align a Tx buffer
was added inside the dpaa2_eth_needed_headroom() function. By doing
that, a manual adjustment of the pointer passed to PTR_ALIGN() was no
longer correct since the 'buffer_start' variable was already pointing
to the start of the skb's memory.
The behavior of the dpaa2-eth driver without this patch was to drop
frames on Tx even when the headroom was matching the 128 bytes
necessary. Fix this by removing the manual adjust of 'buffer_start' from
the PTR_MODE call.
Closes: https://lore.kernel.org/netdev/70f0dcd9-1906-4d13-82df-7bbbbe7194c6@app.fastmail.com/T/#u
Fixes: f422abe3f2 ("dpaa2-eth: increase the needed headroom to account for alignment")
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Tested-by: Mathew McBride <matt@traverse.com.au>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20251016135807.360978-1-ioana.ciornei@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit e59bc32df2 ]
The ENETC RX ring uses the page halves flipping mechanism, each page is
split into two halves for the RX ring to use. And ENETC_RXB_TRUESIZE is
defined to 2048 to indicate the size of half a page. However, the page
size is configurable, for ARM64 platform, PAGE_SIZE is default to 4K,
but it could be configured to 16K or 64K.
When PAGE_SIZE is set to 16K or 64K, ENETC_RXB_TRUESIZE is not correct,
and the RX ring will always use the first half of the page. This is not
consistent with the description in the relevant kernel doc and commit
messages.
This issue is invisible in most cases, but if users want to increase
PAGE_SIZE to receive a Jumbo frame with a single buffer for some use
cases, it will not work as expected, because the buffer size of each
RX BD is fixed to 2048 bytes.
Based on the above two points, we expect to correct ENETC_RXB_TRUESIZE
to (PAGE_SIZE >> 1), as described in the comment.
Fixes: d4fd0404c1 ("enetc: Introduce basic PF and VF ENETC ethernet drivers")
Signed-off-by: Wei Fang <wei.fang@nxp.com>
Reviewed-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Link: https://patch.msgid.link/20251016080131.3127122-1-wei.fang@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 50bd33f6b3 ]
After applying the workaround for err050089, the LS1028A platform
experiences RCU stalls on RT kernel. This issue is caused by the
recursive acquisition of the read lock enetc_mdio_lock. Here list some
of the call stacks identified under the enetc_poll path that may lead to
a deadlock:
enetc_poll
-> enetc_lock_mdio
-> enetc_clean_rx_ring OR napi_complete_done
-> napi_gro_receive
-> enetc_start_xmit
-> enetc_lock_mdio
-> enetc_map_tx_buffs
-> enetc_unlock_mdio
-> enetc_unlock_mdio
After enetc_poll acquires the read lock, a higher-priority writer attempts
to acquire the lock, causing preemption. The writer detects that a
read lock is already held and is scheduled out. However, readers under
enetc_poll cannot acquire the read lock again because a writer is already
waiting, leading to a thread hang.
Currently, the deadlock is avoided by adjusting enetc_lock_mdio to prevent
recursive lock acquisition.
Fixes: 6d36ecdbc4 ("net: enetc: take the MDIO lock only once per NAPI poll cycle")
Signed-off-by: Jianpeng Chang <jianpeng.chang.cn@windriver.com>
Acked-by: Wei Fang <wei.fang@nxp.com>
Link: https://patch.msgid.link/20251015021427.180757-1-jianpeng.chang.cn@windriver.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit a429b76114 ]
Robert recently reported two corrupted images that can cause system
crashes, which are related to the new encoded extents introduced
in Linux 6.15:
- The first one [1] has plen != 0 (e.g. plen == 0x2000000) but
(plen & Z_EROFS_EXTENT_PLEN_MASK) == 0. It is used to represent
special extents such as sparse extents (!EROFS_MAP_MAPPED), but
previously only plen == 0 was handled;
- The second one [2] has pa 0xffffffffffdcffed and plen 0xb4000,
then "cur [0xfffffffffffff000] += bvec.bv_len [0x1000]" in
"} while ((cur += bvec.bv_len) < end);" wraps around, causing an
out-of-bound access of pcl->compressed_bvecs[] in
z_erofs_submit_queue(). EROFS only supports 48-bit physical block
addresses (up to 1EiB for 4k blocks), so add a sanity check to
enforce this.
Fixes: 1d191b4ca5 ("erofs: implement encoded extent metadata")
Reported-by: Robert Morris <rtm@csail.mit.edu>
Closes: https://lore.kernel.org/r/75022.1759355830@localhost [1]
Closes: https://lore.kernel.org/r/80524.1760131149@localhost [2]
Reviewed-by: Hongbo Li <lihongbo22@huawei.com>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit bf29555f5b ]
Creating FDB entries is possible from a non-initial user namespace when
having CAP_NET_ADMIN, yet, when deleting FDB entries, processes receive
an EPERM because the capability is always checked against the initial
user namespace. This restricts the FDB management from unprivileged
containers.
Drop the netlink_capable check in rtnl_fdb_del as it was originally
dropped in c5c351088a and reintroduced in 1690be63a2 without
intention.
This patch was tested using a container on GyroidOS, where it was
possible to delete FDB entries from an unprivileged user namespace and
private network namespace.
Fixes: 1690be63a2 ("bridge: Add vlan support to static neighbors")
Reviewed-by: Michael Weiß <michael.weiss@aisec.fraunhofer.de>
Tested-by: Harshal Gohel <hg@simonwunderlich.de>
Signed-off-by: Johannes Wiesböck <johannes.wiesboeck@aisec.fraunhofer.de>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org>
Link: https://patch.msgid.link/20251015201548.319871-1-johannes.wiesboeck@aisec.fraunhofer.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit aaf043a568 ]
When building with Clang 20 or newer, there are some objtool warnings
from unexpected fallthroughs to other functions:
vmlinux.o: warning: objtool: mlx5e_mpwrq_mtts_per_wqe() falls through to next function mlx5e_mpwrq_max_num_entries()
vmlinux.o: warning: objtool: mlx5e_mpwrq_max_log_rq_size() falls through to next function mlx5e_get_linear_rq_headroom()
LLVM 20 contains an (admittedly problematic [1]) optimization [2] to
convert divide by zero into the equivalent of __builtin_unreachable(),
which invokes undefined behavior and destroys code generation when it is
encountered in a control flow graph.
mlx5e_mpwrq_umr_entry_size() returns 0 in the default case of an
unrecognized mlx5e_mpwrq_umr_mode value. mlx5e_mpwrq_mtts_per_wqe(),
which is inlined into mlx5e_mpwrq_max_log_rq_size(), uses the result of
mlx5e_mpwrq_umr_entry_size() in a divide operation without checking for
zero, so LLVM is able to infer there will be a divide by zero in this
case and invokes undefined behavior. While there is some proposed work
to isolate this undefined behavior and avoid the destructive code
generation that results in these objtool warnings, code should still be
defensive against divide by zero.
As the WARN_ONCE() implies that an invalid value should be handled
gracefully, return 1 instead of 0 in the default case so that the
results of this division operation is always valid.
Fixes: 168723c1f8 ("net/mlx5e: xsk: Use umr_mode to calculate striding RQ parameters")
Link: https://lore.kernel.org/CAGG=3QUk8-Ak7YKnRziO4=0z=1C_7+4jF+6ZeDQ9yF+kuTOHOQ@mail.gmail.com/ [1]
Link: 37932643ab [2]
Closes: https://github.com/ClangBuiltLinux/linux/issues/2131
Closes: https://github.com/ClangBuiltLinux/linux/issues/2132
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20251014-mlx5e-avoid-zero-div-from-mlx5e_mpwrq_umr_entry_size-v1-1-dc186b8819ef@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 85d7dda5a9 ]
After resuming from S4, all CPUs except the boot CPU have the wrong EPP
hint programmed. This is because when the CPUs were offlined the EPP value
was reset to 0.
This is a similar problem as fixed by
commit ba3319e590 ("cpufreq/amd-pstate: Fix a regression leading to EPP
0 after resume") and the solution is also similar. When offlining rather
than reset the values to zero, reset them to match those chosen by the
policy. When the CPUs are onlined again these values will be restored.
Closes: https://community.frame.work/t/increased-power-usage-after-resuming-from-suspend-on-ryzen-7040-kernel-6-15-regression/74531/20?u=mario_limonciello
Fixes: 608a76b652 ("cpufreq/amd-pstate: Add support for the "Requested CPU Min frequency" BIOS option")
Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit a7b4747d8e ]
The lock-related debug logic (CONFIG_LOCK_STAT) in the kernel is noting
the following warning when the BlueField-3 SOC is booted:
BUG: key ffff00008a3402a8 has not been registered!
------------[ cut here ]------------
DEBUG_LOCKS_WARN_ON(1)
WARNING: CPU: 4 PID: 592 at kernel/locking/lockdep.c:4801 lockdep_init_map_type+0x1d4/0x2a0
<snip>
Call trace:
lockdep_init_map_type+0x1d4/0x2a0
__kernfs_create_file+0x84/0x140
sysfs_add_file_mode_ns+0xcc/0x1cc
internal_create_group+0x110/0x3d4
internal_create_groups.part.0+0x54/0xcc
sysfs_create_groups+0x24/0x40
device_add+0x6e8/0x93c
device_register+0x28/0x40
__hwmon_device_register+0x4b0/0x8a0
devm_hwmon_device_register_with_groups+0x7c/0xe0
mlxbf_pmc_probe+0x1e8/0x3e0 [mlxbf_pmc]
platform_probe+0x70/0x110
The mlxbf_pmc driver must call sysfs_attr_init() during the
initialization of the "count_clock" data structure to avoid
this warning.
Fixes: 5efc800975 ("platform/mellanox: mlxbf-pmc: Add support for monitoring cycle count")
Reviewed-by: Shravan Kumar Ramani <shravankr@nvidia.com>
Signed-off-by: David Thompson <davthompson@nvidia.com>
Link: https://patch.msgid.link/20251013155605.3589770-1-davthompson@nvidia.com
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit ee916dccd4 ]
This pattern isn't very documented, and apparently not used much outside
of 'make tools/help', but it has existed for over a decade (since commit
ea01fa9f63: "tools: Connect to the kernel build system").
However, it doesn't work very well for most cases, particularly the
useful "tools/all" target, because it overrides the LDFLAGS value with
an empty one.
And once overridden, 'make' will then not honor the tooling makefiles
trying to change it - which then makes any LDFLAGS use in the tooling
directory break, typically causing odd link errors.
Remove that LDFLAGS override, since it seems to be entirely historical.
The core kernel makefiles no longer modify LDFLAGS as part of the build,
and use kernel-specific link flags instead (eg 'KBUILD_LDFLAGS' and
friends).
This allows more of the 'make tools/*' cases to work. I say 'more',
because some of the tooling build rules make various other assumptions
or have other issues, so it's still a bit hit-or-miss. But those issues
tend to show up with the 'make -C tools xyz' pattern too, so now it's no
longer an issue of this particular 'tools/*' build rule being special.
Acked-by: Nathan Chancellor <nathan@kernel.org>
Cc: Nicolas Schier <nicolas@fjasle.eu>
Cc: Borislav Petkov <bp@alien8.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 43de0ac332 ]
Event ID is only using the attr::config bit [7, 0] but we check the
event range using the whole 64bit field. It blocks the usage of the
rest field of attr::config. Relax the check by only using the
bit [7, 0].
Acked-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Signed-off-by: Yushan Wang <wangyushan12@huawei.com>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 17e9521044 ]
Some RISC-V implementations may hang when attempting to write an
unsupported SATP mode, even though the latest RISC-V specification
states such writes should have no effect. To avoid this issue, the
logic for selecting SATP mode has been refined:
The kernel now determines the SATP mode limit by taking the minimum of
the value specified by the kernel command line (noXlvl) and the
"mmu-type" property in the device tree (FDT). If only one is specified,
use that.
- If the resulting limit is sv48 or higher, the kernel will probe SATP
modes from this limit downward until a supported mode is found.
- If the limit is sv39, the kernel will directly use sv39 without
probing.
This ensures SATP mode selection is safe and compatible with both
hardware and user configuration, minimizing the risk of hangs.
Signed-off-by: Junhui Liu <junhui.liu@pigmoral.tech>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Reviewed-by: Nutty Liu <liujingqi@lanxincomputing.com>
Link: https://lore.kernel.org/r/20250722-satp-from-fdt-v1-2-5ba22218fa5f@pigmoral.tech
Signed-off-by: Paul Walmsley <pjw@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 9316512b71 ]
PAGE_KERNEL_TEXT is an old macro that is used to tell kernel whether
kernel text has to be mapped read-only or read-write based on build
time options.
But nowadays, with functionnalities like jump_labels, static links,
etc ... more only less all kernels need to be read-write at some
point, and some combinations of configs failed to work due to
innacurate setting of PAGE_KERNEL_TEXT. On the other hand, today
we have CONFIG_STRICT_KERNEL_RWX which implements a more controlled
access to kernel modifications.
Instead of trying to keep PAGE_KERNEL_TEXT accurate with all
possible options that may imply kernel text modification, always
set kernel text read-write at startup and rely on
CONFIG_STRICT_KERNEL_RWX to provide accurate protection.
Do this by passing PAGE_KERNEL_X to map_kernel_page() in
__maping_ram_chunk() instead of passing PAGE_KERNEL_TEXT. Once
this is done, the only remaining user of PAGE_KERNEL_TEXT is
mmu_mark_initmem_nx() which uses it in a call to setibat().
As setibat() ignores the RW/RO, we can seamlessly replace
PAGE_KERNEL_TEXT by PAGE_KERNEL_X here as well and get rid of
PAGE_KERNEL_TEXT completely.
Reported-by: Erhard Furtner <erhard_f@mailbox.org>
Closes: https://lore.kernel.org/all/342b4120-911c-4723-82ec-d8c9b03a8aef@mailbox.org/
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Tested-by: Andrew Donnellan <ajd@linux.ibm.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/8e2d793abf87ae3efb8f6dce10f974ac0eda61b8.1757412205.git.christophe.leroy@csgroup.eu
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 28c4d9bc07 ]
In gdlm_put_lock(), there is a small window of time in which the
DFL_UNMOUNT flag has been set but the lockspace hasn't been released,
yet. In that window, dlm may still call gdlm_ast() and gdlm_bast().
To prevent it from dereferencing freed glock objects, only free the
glock if the lockspace has actually been released.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Reviewed-by: Andrew Price <anprice@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f4d4ebc849 ]
The `ID_AA64MMFR4_EL1.EIESB` field, is an unsigned enumeration, but was
incorrectly defined as a `SignedEnum` when introduced in commit
cfc680bb04 ("arm64: sysreg: Add layout for ID_AA64MMFR4_EL1"). This is
corrected to `UnsignedEnum`.
Conversely, the `ID_AA64DFR0_EL1.DoubleLock` field, is a signed
enumeration, but was incorrectly defined as an `UnsignedEnum`. This is
corrected to `SignedEnum`, which wasn't correctly set when annotated as
such in commit ad16d4cf0b ("arm64/sysreg: Initial unsigned annotations
for ID registers").
Signed-off-by: Fuad Tabba <tabba@google.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 00e58ff924 ]
In preparation for the future commit ("bitops: Add __attribute_const__ to generic
ffs()-family implementations"), which allows GCC's value range tracker
to see past ffs(), GCC 8 on ARM thinks that it might be possible that
"ffs(rq) - 8" used here:
v = FIELD_PREP(PCI_EXP_DEVCTL_READRQ, ffs(rq) - 8);
could wrap below 0, leading to a very large value, which would be out of
range for the FIELD_PREP() usage:
drivers/pci/pci.c: In function 'pcie_set_readrq':
include/linux/compiler_types.h:572:38: error: call to '__compiletime_assert_471' declared with attribute error: FIELD_PREP: value too large for the field
...
drivers/pci/pci.c:5896:6: note: in expansion of macro 'FIELD_PREP'
v = FIELD_PREP(PCI_EXP_DEVCTL_READRQ, ffs(rq) - 8);
^~~~~~~~~~
If the result of the ffs() is bounds checked before being used in
FIELD_PREP(), the value tracker seems happy again. :)
Reported-by: Linux Kernel Functional Testing <lkft@linaro.org>
Closes: https://lore.kernel.org/linux-pci/CA+G9fYuysVr6qT8bjF6f08WLyCJRG7aXAeSd2F7=zTaHHd7L+Q@mail.gmail.com/
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20250905052836.work.425-kees@kernel.org
Signed-off-by: Kees Cook <kees@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 2048ec5b98 ]
The syzbot reported issue in hfs_find_set_zero_bits():
=====================================================
BUG: KMSAN: uninit-value in hfs_find_set_zero_bits+0x74d/0xb60 fs/hfs/bitmap.c:45
hfs_find_set_zero_bits+0x74d/0xb60 fs/hfs/bitmap.c:45
hfs_vbm_search_free+0x13c/0x5b0 fs/hfs/bitmap.c:151
hfs_extend_file+0x6a5/0x1b00 fs/hfs/extent.c:408
hfs_get_block+0x435/0x1150 fs/hfs/extent.c:353
__block_write_begin_int+0xa76/0x3030 fs/buffer.c:2151
block_write_begin fs/buffer.c:2262 [inline]
cont_write_begin+0x10e1/0x1bc0 fs/buffer.c:2601
hfs_write_begin+0x85/0x130 fs/hfs/inode.c:52
cont_expand_zero fs/buffer.c:2528 [inline]
cont_write_begin+0x35a/0x1bc0 fs/buffer.c:2591
hfs_write_begin+0x85/0x130 fs/hfs/inode.c:52
hfs_file_truncate+0x1d6/0xe60 fs/hfs/extent.c:494
hfs_inode_setattr+0x964/0xaa0 fs/hfs/inode.c:654
notify_change+0x1993/0x1aa0 fs/attr.c:552
do_truncate+0x28f/0x310 fs/open.c:68
do_ftruncate+0x698/0x730 fs/open.c:195
do_sys_ftruncate fs/open.c:210 [inline]
__do_sys_ftruncate fs/open.c:215 [inline]
__se_sys_ftruncate fs/open.c:213 [inline]
__x64_sys_ftruncate+0x11b/0x250 fs/open.c:213
x64_sys_call+0xfe3/0x3db0 arch/x86/include/generated/asm/syscalls_64.h:78
do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
do_syscall_64+0xd9/0x210 arch/x86/entry/syscall_64.c:94
entry_SYSCALL_64_after_hwframe+0x77/0x7f
Uninit was created at:
slab_post_alloc_hook mm/slub.c:4154 [inline]
slab_alloc_node mm/slub.c:4197 [inline]
__kmalloc_cache_noprof+0x7f7/0xed0 mm/slub.c:4354
kmalloc_noprof include/linux/slab.h:905 [inline]
hfs_mdb_get+0x1cc8/0x2a90 fs/hfs/mdb.c:175
hfs_fill_super+0x3d0/0xb80 fs/hfs/super.c:337
get_tree_bdev_flags+0x6e3/0x920 fs/super.c:1681
get_tree_bdev+0x38/0x50 fs/super.c:1704
hfs_get_tree+0x35/0x40 fs/hfs/super.c:388
vfs_get_tree+0xb0/0x5c0 fs/super.c:1804
do_new_mount+0x738/0x1610 fs/namespace.c:3902
path_mount+0x6db/0x1e90 fs/namespace.c:4226
do_mount fs/namespace.c:4239 [inline]
__do_sys_mount fs/namespace.c:4450 [inline]
__se_sys_mount+0x6eb/0x7d0 fs/namespace.c:4427
__x64_sys_mount+0xe4/0x150 fs/namespace.c:4427
x64_sys_call+0xfa7/0x3db0 arch/x86/include/generated/asm/syscalls_64.h:166
do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
do_syscall_64+0xd9/0x210 arch/x86/entry/syscall_64.c:94
entry_SYSCALL_64_after_hwframe+0x77/0x7f
CPU: 1 UID: 0 PID: 12609 Comm: syz.1.2692 Not tainted 6.16.0-syzkaller #0 PREEMPT(none)
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 07/12/2025
=====================================================
The HFS_SB(sb)->bitmap buffer is allocated in hfs_mdb_get():
HFS_SB(sb)->bitmap = kmalloc(8192, GFP_KERNEL);
Finally, it can trigger the reported issue because kmalloc()
doesn't clear the allocated memory. If allocated memory contains
only zeros, then everything will work pretty fine.
But if the allocated memory contains the "garbage", then
it can affect the bitmap operations and it triggers
the reported issue.
This patch simply exchanges the kmalloc() on kzalloc()
with the goal to guarantee the correctness of bitmap operations.
Because, newly created allocation bitmap should have all
available blocks free. Potentially, initialization bitmap's read
operation could not fill the whole allocated memory and
"garbage" in the not initialized memory will be the reason of
volume coruptions and file system driver bugs.
Reported-by: syzbot <syzbot+773fa9d79b29bd8b6831@syzkaller.appspotmail.com>
Closes: https://syzkaller.appspot.com/bug?extid=773fa9d79b29bd8b6831
Signed-off-by: Viacheslav Dubeyko <slava@dubeyko.com>
cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
cc: Yangtao Li <frank.li@vivo.com>
cc: linux-fsdevel@vger.kernel.org
Link: https://lore.kernel.org/r/20250820230636.179085-1-slava@dubeyko.com
Signed-off-by: Viacheslav Dubeyko <slava@dubeyko.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit c62663a986 ]
Potenatially, __hfs_ext_read_extent() could operate by
not initialized values of fd->key after hfs_brec_find() call:
static inline int __hfs_ext_read_extent(struct hfs_find_data *fd, struct hfs_extent *extent,
u32 cnid, u32 block, u8 type)
{
int res;
hfs_ext_build_key(fd->search_key, cnid, block, type);
fd->key->ext.FNum = 0;
res = hfs_brec_find(fd);
if (res && res != -ENOENT)
return res;
if (fd->key->ext.FNum != fd->search_key->ext.FNum ||
fd->key->ext.FkType != fd->search_key->ext.FkType)
return -ENOENT;
if (fd->entrylength != sizeof(hfs_extent_rec))
return -EIO;
hfs_bnode_read(fd->bnode, extent, fd->entryoffset, sizeof(hfs_extent_rec));
return 0;
}
This patch changes kmalloc() on kzalloc() in hfs_find_init()
and intializes fd->record, fd->keyoffset, fd->keylength,
fd->entryoffset, fd->entrylength for the case if hfs_brec_find()
has been found nothing in the b-tree node.
Signed-off-by: Viacheslav Dubeyko <slava@dubeyko.com>
cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
cc: Yangtao Li <frank.li@vivo.com>
cc: linux-fsdevel@vger.kernel.org
Link: https://lore.kernel.org/r/20250818225252.126427-1-slava@dubeyko.com
Signed-off-by: Viacheslav Dubeyko <slava@dubeyko.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 18b07c44f2 ]
Currently, hfs_brec_remove() executes moving records
towards the location of deleted record and it updates
offsets of moved records. However, the hfs_brec_remove()
logic ignores the "mess" of b-tree node's free space and
it doesn't touch the offsets out of records number.
Potentially, it could confuse fsck or driver logic or
to be a reason of potential corruption cases.
This patch reworks the logic of hfs_brec_remove()
by means of clearing freed space of b-tree node
after the records moving. And it clear the last
offset that keeping old location of free space
because now the offset before this one is keeping
the actual offset to the free space after the record
deletion.
Signed-off-by: Viacheslav Dubeyko <slava@dubeyko.com>
cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
cc: Yangtao Li <frank.li@vivo.com>
cc: linux-fsdevel@vger.kernel.org
Link: https://lore.kernel.org/r/20250815194918.38165-1-slava@dubeyko.com
Signed-off-by: Viacheslav Dubeyko <slava@dubeyko.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 11aa54ba4c ]
The pkey ioctl PKEY_CLR2SECK2 describes in the pkey.h header file
the parameter 'keygenflags' which is forwarded to the handler
functions which actually deal with the clear key to secure key
operation. The ep11 handler module function ep11_clr2keyblob()
function receives this parameter but does not forward it to the
underlying function ep11_unwrapkey() on invocation. So in the end
the user of this ioctl could not forward additional key generation
flags to the ep11 implementation and thus was unable to modify the
key generation process in any way. So now call ep11_unwrapkey()
with the real keygenflags instead of 0 and thus the user of this
ioctl can for example via keygenflags provide valid combinations
of XCP_BLOB_* flags.
Suggested-by: Ingo Franzki <ifranzki@linux.ibm.com>
Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Ingo Franzki <ifranzki@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit a20b83cf45 ]
On nios2, with CONFIG_FLATMEM set, the kernel relies on
memblock_get_current_limit() to determine the limits of mem_map, in
particular for max_low_pfn.
Unfortunately, memblock.current_limit is only default initialized to
MEMBLOCK_ALLOC_ANYWHERE at this point of the bootup, potentially leading
to situations where max_low_pfn can erroneously exceed the value of
max_pfn and, thus, the valid range of available DRAM.
This can in turn cause kernel-level paging failures, e.g.:
[ 76.900000] Unable to handle kernel paging request at virtual address 20303000
[ 76.900000] ea = c0080890, ra = c000462c, cause = 14
[ 76.900000] Kernel panic - not syncing: Oops
[ 76.900000] ---[ end Kernel panic - not syncing: Oops ]---
This patch fixes this by pre-calculating memblock.current_limit
based on the upper limits of the available memory ranges via
adjust_lowmem_bounds, a simplified version of the equivalent
implementation within the arm architecture.
Signed-off-by: Simon Schuster <schuster.simon@siemens-energy.com>
Signed-off-by: Andreas Oetken <andreas.oetken@siemens-energy.com>
Signed-off-by: Dinh Nguyen <dinguyen@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit a8abcff174 ]
Since commit f74dacb4c8 ("dlm: fix recovery of middle conversions")
we introduced additional debugging information if we hit the middle
conversion by using log_limit(). The DLM log_limit() functionality
requires a DLM debug option being enabled. As this case is so rarely and
excempt any potential introduced new issue with recovery we switching it
to log_rinfo() ad this is ratelimited under normal DLM loglevel.
Signed-off-by: Alexander Aring <aahringo@redhat.com>
Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 0fbbcab7f9 ]
Format the kernel-doc for SCALE_HW_CALIB_INVALID correctly to
avoid a kernel-doc warning:
Warning: include/linux/misc_cgroup.h:26 Enum value
'MISC_CG_RES_TDX' not described in enum 'misc_res_type'
Fixes: 7c035bea94 ("KVM: TDX: Register TDX host key IDs to cgroup misc controller")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 48b77733d0 ]
After commit 5402c4d4d2 ("exportfs: require ->fh_to_parent() to encode
connectable file handles") we will fail to create non-decodable file
handles for filesystems without export operations. Fix it.
Fixes: 5402c4d4d2 ("exportfs: require ->fh_to_parent() to encode connectable file handles")
Reviewed-by: Christian Brauner <brauner@kernel.org>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Dequeuing a fair task on a throttled hierarchy returns early on
encountering a throttled cfs_rq since the throttle path has already
dequeued the hierarchy above and has adjusted the h_nr_* accounting till
the root cfs_rq.
dequeue_entities() crucially misses calling __block_task() for delayed
tasks being dequeued on the throttled hierarchies, but this was mostly
harmless until commit b7ca5743a2 ("sched/core: Tweak
wait_task_inactive() to force dequeue sched_delayed tasks") since all
existing cases would re-enqueue the task if task_on_rq_queued() returned
true and the task would eventually be blocked at pick after the
hierarchy was unthrottled.
wait_task_inactive() is special as it expects the delayed task on
throttled hierarchy to reach the blocked state on dequeue but since
__block_task() is never called, task_on_rq_queued() continues to return
true. Furthermore, since the task is now off the hierarchy, the pick
never reaches it to fully block the task even after unthrottle leading
to wait_task_inactive() looping endlessly.
Remedy this by calling __block_task() if a delayed task is being
dequeued on a throttled hierarchy.
This fix is only required for stabled kernels implementing delay dequeue
(>= v6.12) before v6.18 since upstream commit e1fad12dcb ("sched/fair:
Switch to task based throttle model") indirectly fixes this by removing
the early return conditions in dequeue_entities() as part of the per-task
throttle feature.
Cc: stable@vger.kernel.org
Reported-by: Matt Fleming <matt@readmodwrite.com>
Closes: https://lore.kernel.org/all/20250925133310.1843863-1-matt@readmodwrite.com/
Fixes: b7ca5743a2 ("sched/core: Tweak wait_task_inactive() to force dequeue sched_delayed tasks")
Tested-by: Matt Fleming <mfleming@cloudflare.com>
Signed-off-by: K Prateek Nayak <kprateek.nayak@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit bbfe987c5a upstream.
Commit 495c8d3503 ("PM: hibernate: Add pm_hibernation_mode_is_suspend()")
that introduced pm_hibernation_mode_is_suspend() did not define it in
the case when CONFIG_HIBERNATION is unset, but CONFIG_SUSPEND is set.
Subsequent commit 0a6e9e098f ("drm/amd: Fix hybrid sleep") made the
amdgpu driver use that function which led to kernel build breakage in
the case mentioned above [1].
Address this by using appropriate #ifdeffery around the definition of
pm_hibernation_mode_is_suspend().
Fixes: 0a6e9e098f ("drm/amd: Fix hybrid sleep")
Reported-by: KernelCI bot <bot@kernelci.org>
Closes: https://groups.io/g/kernelci-results/topic/regression_pm_testing/115439919 [1]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit d30203739b ]
There may be cases in which the BAR0 also needs to move to accommodate
the bigger BAR2. However if it's not released, the BAR2 resize fails.
During the vram probe it can't be released as it's already in use by
xe_mmio for early register access.
Add a new function in xe_vram and let xe_pci call it directly before
even early device probe. This allows the BAR2 to resize in cases BAR0
also needs to move, assuming there aren't other reasons to hold that
move:
[] xe 0000:03:00.0: vgaarb: deactivate vga console
[] xe 0000:03:00.0: [drm] Attempting to resize bar from 8192MiB -> 16384MiB
[] xe 0000:03:00.0: BAR 0 [mem 0x83000000-0x83ffffff 64bit]: releasing
[] xe 0000:03:00.0: BAR 2 [mem 0x4000000000-0x41ffffffff 64bit pref]: releasing
[] pcieport 0000:02:01.0: bridge window [mem 0x4000000000-0x41ffffffff 64bit pref]: releasing
[] pcieport 0000:01:00.0: bridge window [mem 0x4000000000-0x41ffffffff 64bit pref]: releasing
[] pcieport 0000:01:00.0: bridge window [mem 0x4000000000-0x43ffffffff 64bit pref]: assigned
[] pcieport 0000:02:01.0: bridge window [mem 0x4000000000-0x43ffffffff 64bit pref]: assigned
[] xe 0000:03:00.0: BAR 2 [mem 0x4000000000-0x43ffffffff 64bit pref]: assigned
[] xe 0000:03:00.0: BAR 0 [mem 0x83000000-0x83ffffff 64bit]: assigned
[] pcieport 0000:00:01.0: PCI bridge to [bus 01-04]
[] pcieport 0000:00:01.0: bridge window [mem 0x83000000-0x840fffff]
[] pcieport 0000:00:01.0: bridge window [mem 0x4000000000-0x44007fffff 64bit pref]
[] pcieport 0000:01:00.0: PCI bridge to [bus 02-04]
[] pcieport 0000:01:00.0: bridge window [mem 0x83000000-0x840fffff]
[] pcieport 0000:01:00.0: bridge window [mem 0x4000000000-0x43ffffffff 64bit pref]
[] pcieport 0000:02:01.0: PCI bridge to [bus 03]
[] pcieport 0000:02:01.0: bridge window [mem 0x83000000-0x83ffffff]
[] pcieport 0000:02:01.0: bridge window [mem 0x4000000000-0x43ffffffff 64bit pref]
[] xe 0000:03:00.0: [drm] BAR2 resized to 16384M
[] xe 0000:03:00.0: [drm:xe_pci_probe [xe]] BATTLEMAGE e221:0000 dgfx:1 gfx:Xe2_HPG (20.02) ...
For BMG there are additional fix needed in the PCI side, but this
helps getting it to a working resize.
All the rebar logic is more pci-specific than xe-specific and can be
done very early in the probe sequence. In future it would be good to
move it out of xe_vram.c, but this refactor is left for later.
Cc: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Cc: stable@vger.kernel.org # 6.12+
Link: https://lore.kernel.org/intel-xe/fafda2a3-fc63-ce97-d22b-803f771a4d19@linux.intel.com
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Link: https://lore.kernel.org/r/20250918-xe-pci-rebar-2-v1-2-6c094702a074@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
(cherry picked from commit 45e33f220fd625492c11e15733d8e9b4f9db82a4)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 4b0a5f5ce7 ]
Currently in the drivers we have defined VRAM regions per device and per
tile. Initialization of these regions is done in two completely different
ways. To simplify the logic of the code and make it easier to add new
regions in the future, let's unify the way we initialize VRAM regions.
v2:
- fix doc comments in struct xe_vram_region
- remove unnecessary includes (Jani)
v3:
- move code from xe_vram_init_regions_managers to xe_tile_init_noalloc
(Matthew)
- replace ioremap_wc to devm_ioremap_wc for mapping VRAM BAR
(Matthew)
- Replace the tile id parameter with vram region in the xe_pf_begin
function.
v4:
- remove tile back pointer from struct xe_vram_region
- add new back pointers: xe and migarte to xe_vram_region
Signed-off-by: Piotr Piórkowski <piotr.piorkowski@intel.com>
Cc: Stuart Summers <stuart.summers@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com> # rev3
Acked-by: Matthew Brost <matthew.brost@intel.com>
Link: https://lore.kernel.org/r/20250714184818.89201-6-piotr.piorkowski@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Stable-dep-of: d30203739b ("drm/xe: Move rebar to be done earlier")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 2c27aaee93 ]
Do read-modify-write so that we re-use the characterized reset value as
specified in TRM [1] to program calibration wait time which defines number
of cycles to wait for after startup state machine is in bandgap enable
state.
This fixes PLL lock timeout error faced while using RPi DSI Panel on TI's
AM62L and J721E SoC since earlier calibration wait time was getting
overwritten to zero value thus failing the PLL to lockup and causing
timeout.
[1] AM62P TRM (Section 14.8.6.3.2.1.1 DPHY_TX_DPHYTX_CMN0_CMN_DIG_TBIT2):
Link: https://www.ti.com/lit/pdf/spruj83
Cc: stable@vger.kernel.org
Fixes: 7a343c8bf4 ("phy: Add Cadence D-PHY support")
Signed-off-by: Devarsh Thakkar <devarsht@ti.com>
Tested-by: Harikrishna Shenoy <h-shenoy@ti.com>
Reviewed-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com>
Link: https://lore.kernel.org/r/20250704125915.1224738-3-devarsht@ti.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit f4d027921c ]
match_region_by_range() is not using the helper function that also takes
extended linear cache size into account when comparing regions. This
causes a x2 region to show up as 2 partial incomplete regions rather
than a single CXL region with extended linear cache support. Replace
the open coded compare logic with the proper helper function for
comparison. User visible impact is that when 'cxl list' is issued,
no activa CXL region(s) are shown. There may be multiple idle regions
present. No actual active CXL region is present in the kernel.
[dj: Fix stable address]
Fixes: 0ec9849b63 ("acpi/hmat / cxl: Add extended linear cache support for CXL")
Cc: stable@vger.kernel.org
Reviewed-by: Gregory Price <gourry@gourry.net>
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
[ constify struct range ]
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 15292f1b4c ]
Users can create as many monitoring groups as the number of RMIDs supported
by the hardware. However, on AMD systems, only a limited number of RMIDs
are guaranteed to be actively tracked by the hardware. RMIDs that exceed
this limit are placed in an "Unavailable" state.
When a bandwidth counter is read for such an RMID, the hardware sets
MSR_IA32_QM_CTR.Unavailable (bit 62). When such an RMID starts being tracked
again the hardware counter is reset to zero. MSR_IA32_QM_CTR.Unavailable
remains set on first read after tracking re-starts and is clear on all
subsequent reads as long as the RMID is tracked.
resctrl miscounts the bandwidth events after an RMID transitions from the
"Unavailable" state back to being tracked. This happens because when the
hardware starts counting again after resetting the counter to zero, resctrl
in turn compares the new count against the counter value stored from the
previous time the RMID was tracked.
This results in resctrl computing an event value that is either undercounting
(when new counter is more than stored counter) or a mistaken overflow (when
new counter is less than stored counter).
Reset the stored value (arch_mbm_state::prev_msr) of MSR_IA32_QM_CTR to
zero whenever the RMID is in the "Unavailable" state to ensure accurate
counting after the RMID resets to zero when it starts to be tracked again.
Example scenario that results in mistaken overflow
==================================================
1. The resctrl filesystem is mounted, and a task is assigned to a
monitoring group.
$mount -t resctrl resctrl /sys/fs/resctrl
$mkdir /sys/fs/resctrl/mon_groups/test1/
$echo 1234 > /sys/fs/resctrl/mon_groups/test1/tasks
$cat /sys/fs/resctrl/mon_groups/test1/mon_data/mon_L3_*/mbm_total_bytes
21323 <- Total bytes on domain 0
"Unavailable" <- Total bytes on domain 1
Task is running on domain 0. Counter on domain 1 is "Unavailable".
2. The task runs on domain 0 for a while and then moves to domain 1. The
counter starts incrementing on domain 1.
$cat /sys/fs/resctrl/mon_groups/test1/mon_data/mon_L3_*/mbm_total_bytes
7345357 <- Total bytes on domain 0
4545 <- Total bytes on domain 1
3. At some point, the RMID in domain 0 transitions to the "Unavailable"
state because the task is no longer executing in that domain.
$cat /sys/fs/resctrl/mon_groups/test1/mon_data/mon_L3_*/mbm_total_bytes
"Unavailable" <- Total bytes on domain 0
434341 <- Total bytes on domain 1
4. Since the task continues to migrate between domains, it may eventually
return to domain 0.
$cat /sys/fs/resctrl/mon_groups/test1/mon_data/mon_L3_*/mbm_total_bytes
17592178699059 <- Overflow on domain 0
3232332 <- Total bytes on domain 1
In this case, the RMID on domain 0 transitions from "Unavailable" state to
active state. The hardware sets MSR_IA32_QM_CTR.Unavailable (bit 62) when
the counter is read and begins tracking the RMID counting from 0.
Subsequent reads succeed but return a value smaller than the previously
saved MSR value (7345357). Consequently, the resctrl's overflow logic is
triggered, it compares the previous value (7345357) with the new, smaller
value and incorrectly interprets this as a counter overflow, adding a large
delta.
In reality, this is a false positive: the counter did not overflow but was
simply reset when the RMID transitioned from "Unavailable" back to active
state.
Here is the text from APM [1] available from [2].
"In PQOS Version 2.0 or higher, the MBM hardware will set the U bit on the
first QM_CTR read when it begins tracking an RMID that it was not
previously tracking. The U bit will be zero for all subsequent reads from
that RMID while it is still tracked by the hardware. Therefore, a QM_CTR
read with the U bit set when that RMID is in use by a processor can be
considered 0 when calculating the difference with a subsequent read."
[1] AMD64 Architecture Programmer's Manual Volume 2: System Programming
Publication # 24593 Revision 3.41 section 19.3.3 Monitoring L3 Memory
Bandwidth (MBM).
[ bp: Split commit message into smaller paragraph chunks for better
consumption. ]
Fixes: 4d05bf71f1 ("x86/resctrl: Introduce AMD QOS feature")
Signed-off-by: Babu Moger <babu.moger@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Tested-by: Reinette Chatre <reinette.chatre@intel.com>
Cc: stable@vger.kernel.org # needs adjustments for <= v6.17
Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537 # [2]
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 7c9ac605e2 ]
resctrl_arch_rmid_read() adjusts the value obtained from MSR_IA32_QM_CTR to
account for the overflow for MBM events and apply counter scaling for all the
events. This logic is common to both reading an RMID and reading a hardware
counter directly.
Refactor the hardware value adjustment logic into get_corrected_val() to
prepare for support of reading a hardware counter.
Signed-off-by: Babu Moger <babu.moger@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Link: https://lore.kernel.org/cover.1757108044.git.babu.moger@amd.com
Stable-dep-of: 15292f1b4c ("x86/resctrl: Fix miscount of bandwidth event when reactivating previously unavailable RMID")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 284fb19a3f ]
PLL lockup and O_CMN_READY assertion can only happen after common state
machine gets enabled by programming DPHY_CMN_SSM register, but driver was
polling them before the common state machine was enabled which is
incorrect. This is as per the DPHY initialization sequence as mentioned in
J721E TRM [1] at section "12.7.2.4.1.2.1 Start-up Sequence Timing Diagram".
It shows O_CMN_READY polling at the end after common configuration pin
setup where the common configuration pin setup step enables state machine
as referenced in "Table 12-1533. Common Configuration-Related Setup
mentions state machine"
To fix this :
- Add new function callbacks for polling on PLL lock and O_CMN_READY
assertion.
- As state machine and clocks get enabled in power_on callback only, move
the clock related programming part from configure callback to power_on
callback and poll for the PLL lockup and O_CMN_READY assertion after state
machine gets enabled.
- The configure callback only saves the PLL configuration received from the
client driver which will be applied later on in power_on callback.
- Add checks to ensure configure is called before power_on and state
machine is in disabled state before power_on callback is called.
- Disable state machine in power_off so that client driver can re-configure
the PLL by following up a power_off, configure, power_on sequence.
[1]: https://www.ti.com/lit/zip/spruil1
Cc: stable@vger.kernel.org
Fixes: 7a343c8bf4 ("phy: Add Cadence D-PHY support")
Signed-off-by: Devarsh Thakkar <devarsht@ti.com>
Tested-by: Harikrishna Shenoy <h-shenoy@ti.com>
Reviewed-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com>
Link: https://lore.kernel.org/r/20250704125915.1224738-2-devarsht@ti.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit d68886bae7 ]
The data type of loca_last_write_offset is newoffset4 and is switched
on a boolean value, no_newoffset, that indicates if a previous write
occurred or not. If no_newoffset is FALSE, an offset is not given.
This means that client does not try to update the file size. Thus,
server should not try to calculate new file size and check if it fits
into the segment range. See RFC 8881, section 12.5.4.2.
Sometimes the current incorrect logic may cause clients to hang when
trying to sync an inode. If layoutcommit fails, the client marks the
inode as dirty again.
Fixes: 9cf514ccfa ("nfsd: implement pNFS operations")
Cc: stable@vger.kernel.org
Co-developed-by: Konstantin Evtushenko <koevtushenko@yandex.com>
Signed-off-by: Konstantin Evtushenko <koevtushenko@yandex.com>
Signed-off-by: Sergey Bashirov <sergeybashirov@gmail.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit f963cf2b91 ]
When pNFS client in the block or scsi layout mode sends layoutcommit
to MDS, a variable length array of modified extents is supplied within
the request. This patch allows the server to accept such extent arrays
if they do not fit within single memory page.
The issue can be reproduced when writing to a 1GB file using FIO with
O_DIRECT, 4K block and large I/O depth without preallocation of the
file. In this case, the server returns NFSERR_BADXDR to the client.
Co-developed-by: Konstantin Evtushenko <koevtushenko@yandex.com>
Signed-off-by: Konstantin Evtushenko <koevtushenko@yandex.com>
Signed-off-by: Sergey Bashirov <sergeybashirov@gmail.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Stable-dep-of: d68886bae7 ("NFSD: Fix last write offset handling in layoutcommit")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 274365a51d ]
Remove dprintk in nfsd4_layoutcommit. These are not needed
in day to day usage, and the information is also available
in Wireshark when capturing NFS traffic.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sergey Bashirov <sergeybashirov@gmail.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Stable-dep-of: d68886bae7 ("NFSD: Fix last write offset handling in layoutcommit")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 832738e4b3 ]
Compilers may optimize the layout of C structures, so we should not rely
on sizeof struct and memcpy to encode and decode XDR structures. The byte
order of the fields should also be taken into account.
This patch adds the correct functions to handle the deviceid4 structure
and removes the pad field, which is currently not used by NFSD, from the
runtime state. The server's byte order is preserved because the deviceid4
blob on the wire is only used as a cookie by the client.
Signed-off-by: Sergey Bashirov <sergeybashirov@gmail.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Stable-dep-of: d68886bae7 ("NFSD: Fix last write offset handling in layoutcommit")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit e747883c7d ]
When mounting file systems with a log that was dirtied on i386 on
other architectures or vice versa, log recovery is unhappy:
[ 11.068052] XFS (vdb): Torn write (CRC failure) detected at log block 0x2. Truncating head block from 0xc.
This is because the CRCs generated by i386 and other architectures
always diff. The reason for that is that sizeof(struct xlog_rec_header)
returns different values for i386 vs the rest (324 vs 328), because the
struct is not sizeof(uint64_t) aligned, and i386 has odd struct size
alignment rules.
This issue goes back to commit 13cdc853c519 ("Add log versioning, and new
super block field for the log stripe") in the xfs-import tree, which
adds log v2 support and the h_size field that causes the unaligned size.
At that time it only mattered for the crude debug only log header
checksum, but with commit 0e446be448 ("xfs: add CRC checks to the log")
it became a real issue for v5 file system, because now there is a proper
CRC, and regular builds actually expect it match.
Fix this by allowing checksums with and without the padding.
Fixes: 0e446be448 ("xfs: add CRC checks to the log")
Cc: <stable@vger.kernel.org> # v3.8
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 0b737f4ac1 ]
old_crc is a very misleading name. Rename it to expected_crc as that
described the usage much better.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
Stable-dep-of: e747883c7d ("xfs: fix log CRC mismatches between i386 and other architectures")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ea0d55ae4b upstream.
We intend that EL0 exception handlers unmask all DAIF exceptions
before calling exit_to_user_mode().
When completing single-step of a suspended breakpoint, we do not call
local_daif_restore(DAIF_PROCCTX) before calling exit_to_user_mode(),
leaving all DAIF exceptions masked.
When pseudo-NMIs are not in use this is benign.
When pseudo-NMIs are in use, this is unsound. At this point interrupts
are masked by both DAIF.IF and PMR_EL1, and subsequent irq flag
manipulation may not work correctly. For example, a subsequent
local_irq_enable() within exit_to_user_mode_loop() will only unmask
interrupts via PMR_EL1 (leaving those masked via DAIF.IF), and
anything depending on interrupts being unmasked (e.g. delivery of
signals) will not work correctly.
This was detected by CONFIG_ARM64_DEBUG_PRIORITY_MASKING.
Move the call to `try_step_suspended_breakpoints()` outside of the check
so that interrupts can be unmasked even if we don't call the step handler.
Fixes: 0ac7584c08 ("arm64: debug: split single stepping exception entry")
Cc: <stable@vger.kernel.org> # 6.17
Signed-off-by: Ada Couprie Diaz <ada.coupriediaz@arm.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
[catalin.marinas@arm.com: added Mark's rewritten commit log and some whitespace]
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
[ada.coupriediaz@arm.com: Fix conflict for v6.17 stable]
Signed-off-by: Ada Couprie Diaz <ada.coupriediaz@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 5a869d0177 ]
With TLS enabled, records that are encrypted and appended to TLS TX
list can fail to see a retry if the underlying TCP socket is busy, for
example, hitting an EAGAIN from tcp_sendmsg_locked(). This is not known
to the NVMe TCP driver, as the TLS layer successfully generated a record.
Typically, the TLS write_space() callback would ensure such records are
retried, but in the NVMe TCP Host driver, write_space() invokes
nvme_tcp_write_space(). This causes a partially sent record in the TLS TX
list to timeout after not being retried.
This patch fixes the above by calling queue->write_space(), which calls
into the TLS layer to retry any pending records.
Fixes: be8e82caa6 ("nvme-tcp: enable TLS handshake upcall")
Signed-off-by: Wilfred Mallawa <wilfred.mallawa@wdc.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 0c1999ed33 ]
test_parse_test_list_file writes some data to
/tmp/bpf_arg_parsing_test.XXXXXX and parse_test_list_file() will read
the data back. However, after writing data to that file, we forget to
call fsync() and it's causing testing failure in my laptop. This patch
helps fix it by adding the missing fsync() call.
Fixes: 64276f01dc ("selftests/bpf: Test_progs can read test lists from file")
Signed-off-by: Xing Guo <higuoxing@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20251016035330.3217145-1-higuoxing@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit aa4daea418 ]
HID_DG_PEN devices should have a suffix of "Stylus", as pointed out by
commit c0ee1d5716 ("HID: hid-input: Add suffix also for HID_DG_PEN").
However, on multitouch devices, these suffixes may be overridden. Before
that commit, HID_DG_PEN devices would get the "Stylus" suffix, but after
that, multitouch would override them to have an "UNKNOWN" suffix. Just add
HID_DG_PEN to the list of non-overriden suffixes in multitouch.
Before this fix:
[ 0.470981] input: ELAN9008:00 04F3:2E14 UNKNOWN as /devices/pci0000:00/0000:00:15.1/i2c_designware.1/i2c-16/i2c-ELAN9008:00/0018:04F3:2E14.0001/input/input8
ELAN9008:00 04F3:2E14 UNKNOWN
After this fix:
[ 0.474332] input: ELAN9008:00 04F3:2E14 Stylus as /devices/pci0000:00/0000:00:15.1/i2c_designware.1/i2c-16/i2c-ELAN9008:00/0018:04F3:2E14.0001/input/input8
ELAN9008:00 04F3:2E14 Stylus
Fixes: c0ee1d5716 ("HID: hid-input: Add suffix also for HID_DG_PEN")
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@igalia.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Jiri Kosina <jkosina@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 0187c08058 ]
Commit 581c448476 ("HID: input: map digitizer battery usage") added
handling of battery events for digitizers (typically for batteries
presented in stylii). Digitizers typically report correct battery levels
only when stylus is actively touching the surface, and in other cases
they may report battery level of 0. To avoid confusing consumers of the
battery information the code was added to filer out reports with 0
battery levels.
However there exist other kinds of devices that may legitimately report
0 battery levels. Fix this by filtering out 0-level reports only for
digitizer usages, and continue reporting them for other kinds of devices
(Smart Batteries, etc).
Reported-by: 卢国宏 <luguohong@xiaomi.com>
Fixes: 581c448476 ("HID: input: map digitizer battery usage")
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 08823e89e3 ]
Remove the acquisition and release of q->elevator_lock in the
blkg_conf_open_bdev_frozen() and blkg_conf_exit_frozen() functions. The
elevator lock is no longer needed in these code paths since commit
78c271344b ("block: move wbt_enable_default() out of queue freezing
from sched ->exit()") which introduces `disk->rqos_state_mutex` for
protecting wbt state change, and not necessary to abuse elevator_lock
for this purpose.
This change helps to solve the lockdep warning reported from Yu Kuai[1].
Pass blktests/throtl with lockdep enabled.
Links: https://lore.kernel.org/linux-block/e5e7ac3f-2063-473a-aafb-4d8d43e5576e@yukuai.org.cn/ [1]
Fixes: commit 78c271344b ("block: move wbt_enable_default() out of queue freezing from sched ->exit()")
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Nilay Shroff <nilay@linux.ibm.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 28412b489b ]
In try_to_register_card(), the return value of usb_ifnum_to_if() is
passed directly to usb_interface_claimed() without a NULL check, which
will lead to a NULL pointer dereference when creating an invalid
USB audio device. Fix this by adding a check to ensure the interface
pointer is valid before passing it to usb_interface_claimed().
Fixes: 39efc9c8a9 ("ALSA: usb-audio: Fix last interface check for registration")
Closes: https://lore.kernel.org/all/CANypQFYtQxHL5ghREs-BujZG413RPJGnO5TH=xjFBKpPts33tA@mail.gmail.com/
Signed-off-by: Jiaming Zhang <r772577952@gmail.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit e603a342cf ]
We started getting a crash in BPF CI, which seems to originate from
test_parse_test_list_file() test and is happening at this line:
ASSERT_OK(strcmp("test_with_spaces", set.tests[0].name), "test 0 name");
One way we can crash there is if set.cnt zero, which is checked for with
ASSERT_EQ() above, but we proceed after this regardless of the outcome.
Instead of crashing, we should bail out with test failure early.
Similarly, if parse_test_list_file() fails, we shouldn't be even looking
at set, so bail even earlier if ASSERT_OK() fails.
Fixes: 64276f01dc ("selftests/bpf: Test_progs can read test lists from file")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Tested-by: Ihor Solodrai <ihor.solodrai@linux.dev>
Link: https://lore.kernel.org/r/20251014202037.72922-1-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit a4bbb493a3 ]
Traces of cxl_poison events include an hpa_alias0 field if the poison
address is in a region configured with an ELC, Extended Linear Cache.
Since the ELC always comes first in the region, the calculation needs
to subtract the ELC size from the calculated HPA address.
Fixes: 8c520c5f1e ("cxl: Add extended linear cache address alias emission for cxl events")
Signed-off-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 7e091add9c ]
The sc_c field is currently not updated in the host response to the
controller challenge leading to failures while attempting secure
channel concatenation. Fix this by adding a new sc_c variable to the
dhchap queue context structure which is appropriately set during
negotiate and then used in the host response.
Fixes: e88a7595b5 ("nvme-tcp: request secure channel concatenation")
Signed-off-by: Martin George <marting@netapp.com>
Signed-off-by: Prashanth Adurthi <prashana@netapp.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 11f08c30a3 ]
Currently, if find_and_map_user_pages() takes a DMA xfer request from the
user with a length field set to 0, or in a rare case, the host receives
QAIC_TRANS_DMA_XFER_CONT from the device where resources->xferred_dma_size
is equal to the requested transaction size, the function will return 0
before allocating an sgt or setting the fields of the dma_xfer struct.
In that case, encode_addr_size_pairs() will try to access the sgt which
will lead to a general protection fault.
Return an EINVAL in case the user provides a zero-sized ALP, or the device
requests continuation after all of the bytes have been transferred.
Fixes: 96d3c1cade ("accel/qaic: Clean up integer overflow checking in map_user_pages()")
Signed-off-by: Youssef Samir <quic_yabdulra@quicinc.com>
Signed-off-by: Youssef Samir <youssef.abdulrahman@oss.qualcomm.com>
Reviewed-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com>
Reviewed-by: Carl Vanderlip <carl.vanderlip@oss.qualcomm.com>
Signed-off-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20251007122320.339654-1-youssef.abdulrahman@oss.qualcomm.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 17e3e88ed0 ]
The check for some lost idle pelt time should be always done when
pick_next_task_fair() fails to pick a task and not only when we call it
from the fair fast-path.
The case happens when the last running task on rq is a RT or DL task. When
the latter goes to sleep and the /Sum of util_sum of the rq is at the max
value, we don't account the lost of idle time whereas we should.
Fixes: 67692435c4 ("sched: Rework pick_next_task() slow-path")
Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit ee6e44dfe6 ]
IBM CI tool reported kernel warning[1] when running a CPU removal
operation through drmgr[2]. i.e "drmgr -c cpu -r -q 1"
WARNING: CPU: 0 PID: 0 at kernel/sched/cpudeadline.c:219 cpudl_set+0x58/0x170
NIP [c0000000002b6ed8] cpudl_set+0x58/0x170
LR [c0000000002b7cb8] dl_server_timer+0x168/0x2a0
Call Trace:
[c000000002c2f8c0] init_stack+0x78c0/0x8000 (unreliable)
[c0000000002b7cb8] dl_server_timer+0x168/0x2a0
[c00000000034df84] __hrtimer_run_queues+0x1a4/0x390
[c00000000034f624] hrtimer_interrupt+0x124/0x300
[c00000000002a230] timer_interrupt+0x140/0x320
Git bisects to: commit 4ae8d9aa9f ("sched/deadline: Fix dl_server getting stuck")
This happens since:
- dl_server hrtimer gets enqueued close to cpu offline, when
kthread_park enqueues a fair task.
- CPU goes offline and drmgr removes it from cpu_present_mask.
- hrtimer fires and warning is hit.
Fix it by stopping the dl_server before CPU is marked dead.
[1]: https://lore.kernel.org/all/8218e149-7718-4432-9312-f97297c352b9@linux.ibm.com/
[2]: https://github.com/ibm-power-utilities/powerpc-utils/tree/next/src/drmgr
[sshegde: wrote the changelog and tested it]
Fixes: 4ae8d9aa9f ("sched/deadline: Fix dl_server getting stuck")
Closes: https://lore.kernel.org/all/8218e149-7718-4432-9312-f97297c352b9@linux.ibm.com
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reported-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
Signed-off-by: Shrikanth Hegde <sshegde@linux.ibm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
Tested-by: Shrikanth Hegde <sshegde@linux.ibm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 8fe2cd8ec8 ]
The original implementation used level detection for the first interrupt
after device reset to avoid potential interrupt line noise and missed
interrupts during the initialization phase. However, this approach
introduced unintended side effects when tested with certain touch panels,
including:
- Delayed hardware interrupt response
- Multiple spurious interrupt triggers
Switching back to edge detection for the first interrupt resolves these
issues while maintaining reliable interrupt handling.
Extensive testing across multiple platforms with touch panels from
various vendors confirms this change introduces no regressions.
[jkosina@suse.com: properly capitalize shortlog]
Fixes: 9d8d51735a ("HID: intel-thc-hid: intel-quickspi: Add HIDSPI protocol implementation")
Tested-by: Rui Zhang <rui1.zhang@intel.com>
Signed-off-by: Even Xu <even.xu@intel.com>
Signed-off-by: Jiri Kosina <jkosina@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit a375246fcf ]
cxl EDAC calls cxl_feature_info() to get the feature information and
if the hardware has no Features support, cxlfs may be passed in as
NULL.
[ 51.957498] BUG: kernel NULL pointer dereference, address: 0000000000000008
[ 51.965571] #PF: supervisor read access in kernel mode
[ 51.971559] #PF: error_code(0x0000) - not-present page
[ 51.977542] PGD 17e4f6067 P4D 0
[ 51.981384] Oops: Oops: 0000 [#1] SMP NOPTI
[ 51.986300] CPU: 49 UID: 0 PID: 3782 Comm: systemd-udevd Not tainted 6.17.0dj
test+ #64 PREEMPT(voluntary)
[ 51.997355] Hardware name: <removed>
[ 52.009790] RIP: 0010:cxl_feature_info+0xa/0x80 [cxl_core]
Add a check for cxlfs before dereferencing it and return -EOPNOTSUPP if
there is no cxlfs created due to no hardware support.
Fixes: eb5dfcb9e3 ("cxl: Add support to handle user feature commands for set feature")
Reviewed-by: Davidlohr Bueso <dave@stgolabs.net>
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 6917112af2 ]
Remove extra multiplication.
CIK GPUs such as Hawaii appear to use PP_TABLE_V0 in which case
the shutdown temperature is hardcoded in smu7_init_dpm_defaults
and is already multiplied by 1000. The value was mistakenly
multiplied another time by smu7_get_thermal_temperature_range.
Fixes: 4ba082572a ("drm/amd/powerplay: export the thermal ranges of VI asics (V2)")
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/1676
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit ef38b4eab1 ]
These were never used and are duplicated with the
interface that is used. Maybe leftovers from a previous
revision of the patch that added them.
Fixes: 90c448fef3 ("drm/amdgpu: add new AMDGPU_INFO subquery for userq objects")
Reviewed-by: Prike Liang <Prike.Liang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit ff780f4f80 ]
When we backup ring contents to reemit after a queue reset,
we don't backup ring contents from the bad context. When
we signal the fences, we should set an error on those
fences as well.
v2: misc cleanups
v3: add locking for fence error, fix comment (Christian)
v4: fix wrap around, locking (Christian)
Fixes: 77cc0da39c ("drm/amdgpu: track ring state associated with a fence")
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 357d90be2c ]
Chips which use the IP discovery firmware loaded by the driver
reported incorrect harvesting information in the ip discovery
table in sysfs because the driver only uses the ip discovery
firmware for populating sysfs and not for direct parsing for the
driver itself as such, the fields that are used to print the
harvesting info in sysfs report incorrect data for some IPs. Populate
the relevant fields for this case as well.
Fixes: 514678da56 ("drm/amdgpu/discovery: fix fw based ip discovery")
Acked-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 9e6a5cf1a2 ]
For platforms without an IP discovery table.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Stable-dep-of: 357d90be2c ("drm/amdgpu: fix handling of harvesting for ip_discovery firmware")
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit e8529dbc75 ]
For chips that don't have IP discovery tables.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Stable-dep-of: 357d90be2c ("drm/amdgpu: fix handling of harvesting for ip_discovery firmware")
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 86af6b90e0 ]
intel_frontbuffer_get() is what locks out subsequent set_tiling
changes to the bo. Thus the fence vs. modifier check must be done
after intel_frontbuffer_get(), or else a concurrent set_tiling ioctl
might sneak in and change the fence after the check has been done.
Close the race again. See commit dd689287b9 ("drm/i915: Prevent
concurrent tiling/framebuffer modifications") for the previous
instance.
v2: Reorder intel_user_framebuffer_destroy() to match the unwind (Jani)
Cc: Jouni Högander <jouni.hogander@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Fixes: 10690b8a49 ("drm/i915/display: Add intel_fb_bo_framebuffer_fini")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20251003145734.7634-3-ville.syrjala@linux.intel.com
(cherry picked from commit 1d1e4ded216017f8febd91332ee337f0e0e79285)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 760039c95c ]
Currently xe's intel_frontbuffer implementation forgets to
hold a reference on the bo. This makes the entire thing
extremely fragile as the cleanup order now depends on bo
references held by other things
(namely intel_fb_bo_framebuffer_fini()).
Move the bo refcounting to intel_frontbuffer_{get,release}()
so that both i915 and xe do this the same way.
I first tried to fix this by having xe do the refcounting
from its intel_bo_set_frontbuffer() implementation
(which is what i915 does currently), but turns out xe's
drm_gem_object_free() can sleep and thus drm_gem_object_put()
isn't safe to call while we hold fb_tracking.lock.
Fixes: 10690b8a49 ("drm/i915/display: Add intel_fb_bo_framebuffer_fini")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20251003145734.7634-2-ville.syrjala@linux.intel.com
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
(cherry picked from commit eb4d490729a5fd8dc5a76d334f8d01fec7c14bbe)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 2b4eda7bf7 ]
Stress testing the audio jack hotplug handling on a few Steam Deck units
revealed that the debounce circuit is responsible for having a negative
impact on the detection reliability, e.g. in some cases the ejection
interrupt is not fired, while in other instances it goes into a kind of
invalid state and generates a flood of misleading interrupts.
Add new entries to the DMI table introduced via commit 1bc40efdaf
("ASoC: nau8821: Add DMI quirk mechanism for active-high jack-detect")
and extend the quirk logic to allow bypassing the debounce circuit used
for jack detection on Valve Steam Deck LCD and OLED models.
While at it, rename existing NAU8821_JD_ACTIVE_HIGH quirk bitfield to
NAU8821_QUIRK_JD_ACTIVE_HIGH. This should help improve code readability
by differentiating from similarly named register bits.
Fixes: aab1ad11d6 ("ASoC: nau8821: new driver")
Signed-off-by: Cristian Ciocaltea <cristian.ciocaltea@collabora.com>
Link: https://patch.msgid.link/20251003-nau8821-jdet-fixes-v1-4-f7b0e2543f09@collabora.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit a698679fe8 ]
The interrupt handler attempts to perform some IRQ status clear
operations *after* rather than *before* unmasking and enabling
interrupts. This is a rather fragile approach since it may generally
lead to missing IRQ requests or causing spurious interrupts.
Make use of the nau8821_irq_status_clear() helper instead of
manipulating the related register directly and ensure any interrupt
clearing is performed *after* the target interrupts are disabled/masked
and *before* proceeding with additional interrupt unmasking/enablement
operations.
This also implicitly drops the redundant clear operation of the ejection
IRQ in the interrupt handler, since nau8821_eject_jack() has been
already responsible for clearing all active interrupts.
Fixes: aab1ad11d6 ("ASoC: nau8821: new driver")
Fixes: 2551b6e899 ("ASoC: nau8821: Add headset button detection")
Signed-off-by: Cristian Ciocaltea <cristian.ciocaltea@collabora.com>
Link: https://patch.msgid.link/20251003-nau8821-jdet-fixes-v1-3-f7b0e2543f09@collabora.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 9273aa85b3 ]
Instead of adding yet another utility function for dealing with the
interrupt clearing register, generalize nau8821_int_status_clear_all()
by renaming it to nau8821_irq_status_clear(), whilst introducing a
second parameter to allow restricting the operation scope to a single
interrupt instead of the whole range of active IRQs.
While at it, also fix a spelling typo in the comment block.
Note this is mainly a prerequisite for subsequent patches aiming to
address some deficiencies in the implementation of the interrupt
handler. Thus the presence of the Fixes tag below is intentional, to
facilitate backporting.
Fixes: aab1ad11d6 ("ASoC: nau8821: new driver")
Signed-off-by: Cristian Ciocaltea <cristian.ciocaltea@collabora.com>
Link: https://patch.msgid.link/20251003-nau8821-jdet-fixes-v1-2-f7b0e2543f09@collabora.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 6e54919cb5 ]
The microphone detection work scheduled by a prior jack insertion
interrupt may still be in a pending state or under execution when a jack
ejection interrupt has been fired.
This might lead to a racing condition or nau8821_jdet_work() completing
after nau8821_eject_jack(), which will override the currently
disconnected state of the jack and incorrectly report the headphone or
the headset as being connected.
Cancel any pending jdet_work or wait for its execution to finish before
attempting to handle the ejection interrupt.
Proceed similarly before launching the eject handler as a consequence of
detecting an invalid insert interrupt.
Fixes: aab1ad11d6 ("ASoC: nau8821: new driver")
Signed-off-by: Cristian Ciocaltea <cristian.ciocaltea@collabora.com>
Link: https://patch.msgid.link/20251003-nau8821-jdet-fixes-v1-1-f7b0e2543f09@collabora.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit db74b04edc ]
There is now a new LT9211 rev. U5, which reports chip ID 0x18 0x01 0xe4 .
The previous LT9211 reported chip ID 0x18 0x01 0xe3 , which is what the
driver checks for right now. Since there is a possibility there will be
yet another revision of the LT9211 in the future, drop the last version
nibble check to allow all future revisions of the chip to work with this
driver.
This fix makes LT9211 rev. U5 work with this driver.
Fixes: 8ce4129e3d ("drm/bridge: lt9211: Add Lontium LT9211 bridge driver")
Signed-off-by: Marek Vasut <marek.vasut@mailbox.org>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20251011110017.12521-1-marek.vasut@mailbox.org
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 9e68bd803f ]
When adding a kprobe such as "p:probe/tcp_sendmsg _text+15392192",
arch_check_kprobe would start iterating all instructions starting from
_text until the probed address. Not only is this very inefficient, but
literal values in there (e.g. left by function patching) are
misinterpreted in a way that causes a desync.
Fix this by doing it like x86: start the iteration at the closest
preceding symbol instead of the given starting point.
Fixes: 87f48c7ccc ("riscv: kprobe: Fixup kernel panic when probing an illegal position")
Signed-off-by: Fabian Vogt <fvogt@suse.de>
Signed-off-by: Marvin Friedrich <marvin.friedrich@suse.com>
Acked-by: Guo Ren <guoren@kernel.org>
Link: https://lore.kernel.org/r/6191817.lOV4Wx5bFT@fvogt-thinkpad
Signed-off-by: Paul Walmsley <pjw@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit bb642e2d30 ]
For queue-depth I/O policy, this patch fixes unbalanced I/Os across
nvme multipaths.
Issue Description:
The RETRY disposition incorrectly increments ns->ctrl->nr_active
counter and reinitializes iostat start-time. In such cases nr_active
counter never goes back to zero until that path disconnects and
reconnects.
Such a path is not chosen for new I/Os if multiple RETRY cases on a given
a path cause its queue-depth counter to be artificially higher compared
to other paths. This leads to unbalanced I/Os across paths.
The patch skips incrementing nr_active if NVME_MPATH_CNT_ACTIVE is already
set. And it skips restarting io stats if NVME_MPATH_IO_STATS is already set.
base-commit: e989a3da2d371a4b6597ee8dee5c72e407b4db7a
Fixes: d4d957b53d ("nvme-multipath: support io stats on the mpath device")
Signed-off-by: Amit Chaudhary <achaudhary@purestorage.com>
Reviewed-by: Randy Jennings <randyj@purestorage.com>
Signed-off-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit e07e10ae83 ]
Currently the Panthor driver needs the GPU to be powered down
between suspend and resume. If this is not done, then the
MCU_CONTROL register will be preserved as AUTO, which again will
cause a premature FW boot on resume. The FW will go directly into
fatal state in this case.
This case needs to be handled as there is no guarantee that the
GPU will be powered down after the suspend callback on all platforms.
The fix is to call panthor_fw_stop() in "pre-reset" path to ensure
the MCU_CONTROL register is cleared (set DISABLE). This matches
well with the already existing call to panthor_fw_start() from the
"post-reset" path.
Signed-off-by: Ketil Johnsen <ketil.johnsen@arm.com>
Acked-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Steven Price <steven.price@arm.com>
Fixes: 2718d91816 ("drm/panthor: Add the FW logical block")
Signed-off-by: Steven Price <steven.price@arm.com>
Link: https://lore.kernel.org/r/20251008105112.4077015-1-ketil.johnsen@arm.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 8d93ff40d4 ]
dev->chipid is used in lan78xx_init_mac_address before it's initialized:
lan78xx_reset() {
lan78xx_init_mac_address()
lan78xx_read_eeprom()
lan78xx_read_raw_eeprom() <- dev->chipid is used here
dev->chipid = ... <- dev->chipid is initialized correctly here
}
Reorder initialization so that dev->chipid is set before calling
lan78xx_init_mac_address().
Fixes: a0db7d10b7 ("lan78xx: Add to handle mux control per chip id")
Signed-off-by: I Viswanath <viswanathiyyappan@gmail.com>
Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev>
Reviewed-by: Khalid Aziz <khalid@kernel.org>
Link: https://patch.msgid.link/20251013181648.35153-1-viswanathiyyappan@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 1a8fed52f7 ]
Bringing a linked netdevsim device down and then up causes communication
failure because both interfaces lack carrier. Basically a ifdown/ifup on
the interface make the link broken.
Commit 3762ec05a9 ("netdevsim: add NAPI support") added supported
for NAPI, calling netif_carrier_off() in nsim_stop(). This patch
re-enables the carrier symmetrically on nsim_open(), in case the device
is linked and the peer is up.
Signed-off-by: Breno Leitao <leitao@debian.org>
Fixes: 3762ec05a9 ("netdevsim: add NAPI support")
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/20251014-netdevsim_fix-v2-1-53b40590dae1@debian.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 7f846c65ca ]
With async crypto, we rely on tx_work to actually transmit records
once encryption completes. But while send() is running, both the
tx_lock and socket lock are held, so tx_work_handler cannot process
the queue of encrypted records, and simply reschedules itself. During
a large send(), this could last a long time, and use a lot of memory.
Transmit any pending encrypted records before restarting the main
loop of tls_sw_sendmsg_locked.
Fixes: a42055e8d2 ("net/tls: Add support for async encryption of records for performance")
Reported-by: Jann Horn <jannh@google.com>
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Link: https://patch.msgid.link/8396631478f70454b44afb98352237d33f48d34d.1760432043.git.sd@queasysnail.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b8a6ff84ab ]
Async decryption calls tls_strp_msg_hold to create a clone of the
input skb to hold references to the memory it uses. If we fail to
allocate that clone, proceeding with async decryption can lead to
various issues (UAF on the skb, writing into userspace memory after
the recv() call has returned).
In this case, wait for all pending decryption requests.
Fixes: 84c61fe1a7 ("tls: rx: do not use the standard strparser")
Reported-by: Jann Horn <jannh@google.com>
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Link: https://patch.msgid.link/b9fe61dcc07dab15da9b35cf4c7d86382a98caf2.1760432043.git.sd@queasysnail.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b6fe4c29bb ]
When userspace wants to send a non-DATA record (via the
TLS_SET_RECORD_TYPE cmsg), we need to send any pending data from a
previous MSG_MORE send() as a separate DATA record. If that DATA record
is encrypted asynchronously, tls_handle_open_record will return
-EINPROGRESS. This is currently treated as an error by
tls_process_cmsg, and it will skip setting record_type to the correct
value, but the caller (tls_sw_sendmsg_locked) handles that return
value correctly and proceeds with sending the new message with an
incorrect record_type (DATA instead of whatever was requested in the
cmsg).
Always set record_type before handling the open record. If
tls_handle_open_record returns an error, record_type will be
ignored. If it succeeds, whether with synchronous crypto (returning 0)
or asynchronous (returning -EINPROGRESS), the caller will proceed
correctly.
Fixes: a42055e8d2 ("net/tls: Add support for async encryption of records for performance")
Reported-by: Jann Horn <jannh@google.com>
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Link: https://patch.msgid.link/0457252e578a10a94e40c72ba6288b3a64f31662.1760432043.git.sd@queasysnail.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b014a4e066 ]
If we hit an error during the main loop of tls_sw_sendmsg_locked (eg
failed allocation), we jump to send_end and immediately
return. Previous iterations may have queued async encryption requests
that are still pending. We should wait for those before returning, as
we could otherwise be reading from memory that userspace believes
we're not using anymore, which would be a sort of use-after-free.
This is similar to what tls_sw_recvmsg already does: failures during
the main loop jump to the "wait for async" code, not straight to the
unlock/return.
Fixes: a42055e8d2 ("net/tls: Add support for async encryption of records for performance")
Reported-by: Jann Horn <jannh@google.com>
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Link: https://patch.msgid.link/c793efe9673b87f808d84fdefc0f732217030c52.1760432043.git.sd@queasysnail.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit ce5af41e32 ]
During tls_sw_sendmsg_locked, we pre-allocate the encrypted message
for the size we're expecting to send during the current iteration, but
we may end up sending less, for example when splicing: if we're
getting the data from small fragments of memory, we may fill up all
the slots in the skmsg with less data than expected.
In this case, we need to trim the encrypted message to only the length
we actually need, to avoid pushing uninitialized bytes down the
underlying TCP socket.
Fixes: fe1e81d4f7 ("tls/sw: Support MSG_SPLICE_PAGES")
Reported-by: Jann Horn <jannh@google.com>
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Link: https://patch.msgid.link/66a0ae99c9efc15f88e9e56c1f58f902f442ce86.1760432043.git.sd@queasysnail.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 88f170814f ]
Since commit 305853cce3 ("ksmbd: Fix race condition in RPC handle list
access"), ksmbd_session_rpc_method() attempts to lock sess->rpc_lock.
This causes hung connections / tasks when a client attempts to open
a named pipe. Using Samba's rpcclient tool:
$ rpcclient //192.168.1.254 -U user%password
$ rpcclient $> srvinfo
<connection hung here>
Kernel side:
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:kworker/0:0 state:D stack:0 pid:5021 tgid:5021 ppid:2 flags:0x00200000
Workqueue: ksmbd-io handle_ksmbd_work
Call trace:
__schedule from schedule+0x3c/0x58
schedule from schedule_preempt_disabled+0xc/0x10
schedule_preempt_disabled from rwsem_down_read_slowpath+0x1b0/0x1d8
rwsem_down_read_slowpath from down_read+0x28/0x30
down_read from ksmbd_session_rpc_method+0x18/0x3c
ksmbd_session_rpc_method from ksmbd_rpc_open+0x34/0x68
ksmbd_rpc_open from ksmbd_session_rpc_open+0x194/0x228
ksmbd_session_rpc_open from create_smb2_pipe+0x8c/0x2c8
create_smb2_pipe from smb2_open+0x10c/0x27ac
smb2_open from handle_ksmbd_work+0x238/0x3dc
handle_ksmbd_work from process_scheduled_works+0x160/0x25c
process_scheduled_works from worker_thread+0x16c/0x1e8
worker_thread from kthread+0xa8/0xb8
kthread from ret_from_fork+0x14/0x38
Exception stack(0x8529ffb0 to 0x8529fff8)
The task deadlocks because the lock is already held:
ksmbd_session_rpc_open
down_write(&sess->rpc_lock)
ksmbd_rpc_open
ksmbd_session_rpc_method
down_read(&sess->rpc_lock) <-- deadlock
Adjust ksmbd_session_rpc_method() callers to take the lock when necessary.
Fixes: 305853cce3 ("ksmbd: Fix race condition in RPC handle list access")
Signed-off-by: Marios Makassikis <mmakassikis@freebox.fr>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 7f0fddd817 ]
Since blamed commit, unregister_netdevice_many_notify() takes the netdev
mutex if the device needs it.
If the device list is too long, this will lock more device mutexes than
lockdep can handle:
unshare -n \
bash -c 'for i in $(seq 1 100);do ip link add foo$i type dummy;done'
BUG: MAX_LOCK_DEPTH too low!
turning off the locking correctness validator.
depth: 48 max: 48!
48 locks held by kworker/u16:1/69:
#0: ..148 ((wq_completion)netns){+.+.}-{0:0}, at: process_one_work
#1: ..d40 (net_cleanup_work){+.+.}-{0:0}, at: process_one_work
#2: ..bd0 (pernet_ops_rwsem){++++}-{4:4}, at: cleanup_net
#3: ..aa8 (rtnl_mutex){+.+.}-{4:4}, at: default_device_exit_batch
#4: ..cb0 (&dev_instance_lock_key#3){+.+.}-{4:4}, at: unregister_netdevice_many_notify
[..]
Add a helper to close and then unlock a list of net_devices.
Devices that are not up have to be skipped - netif_close_many always
removes them from the list without any other actions taken, so they'd
remain in locked state.
Close devices whenever we've used up half of the tracking slots or we
processed entire list without hitting the limit.
Fixes: 7e4d784f58 ("net: hold netdev instance lock during rtnetlink operations")
Signed-off-by: Florian Westphal <fw@strlen.de>
Link: https://patch.msgid.link/20251013185052.14021-1-fw@strlen.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 4f86eb0a38 ]
The jq command is used in vlan_bridge_binding.sh, if it is not supported,
the test will spam the following log.
# ./vlan_bridge_binding.sh: line 51: jq: command not found
# ./vlan_bridge_binding.sh: line 51: jq: command not found
# ./vlan_bridge_binding.sh: line 51: jq: command not found
# ./vlan_bridge_binding.sh: line 51: jq: command not found
# ./vlan_bridge_binding.sh: line 51: jq: command not found
# TEST: Test bridge_binding on->off when lower down [FAIL]
# Got operstate of , expected 0
The rtnetlink.sh has the same problem. It makes sense to check if jq is
installed before running these tests. After this patch, the
vlan_bridge_binding.sh skipped if jq is not supported:
# timeout set to 3600
# selftests: net: vlan_bridge_binding.sh
# TEST: jq not installed [SKIP]
Fixes: dca12e9ab7 ("selftests: net: Add a VLAN bridge binding selftest")
Fixes: 6a414fd77f ("selftests: rtnetlink: Add an address proto test")
Signed-off-by: Wang Liang <wangliang74@huawei.com>
Reviewed-by: Hangbin Liu <liuhangbin@gmail.com>
Link: https://patch.msgid.link/20251013080039.3035898-1-wangliang74@huawei.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 295ce1eb36 ]
Neal reported that using neper tcp_stream with TCP_TX_DELAY
set to 50ms would often lead to flows stuck in a small cwnd mode,
regardless of the congestion control.
While tcp_stream sets TCP_TX_DELAY too late after the connect(),
it highlighted two kernel bugs.
The following heuristic in tcp_tso_should_defer() seems wrong
for large RTT:
delta = tp->tcp_clock_cache - head->tstamp;
/* If next ACK is likely to come too late (half srtt), do not defer */
if ((s64)(delta - (u64)NSEC_PER_USEC * (tp->srtt_us >> 4)) < 0)
goto send_now;
If next ACK is expected to come in more than 1 ms, we should
not defer because we prefer a smooth ACK clocking.
While blamed commit was a step in the good direction, it was not
generic enough.
Another patch fixing TCP_TX_DELAY for established flows
will be proposed when net-next reopens.
Fixes: 50c8339e92 ("tcp: tso: restore IW10 after TSO autosizing")
Reported-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Neal Cardwell <ncardwell@google.com>
Tested-by: Neal Cardwell <ncardwell@google.com>
Link: https://patch.msgid.link/20251011115742.1245771-1-edumazet@google.com
[pabeni@redhat.com: fixed whitespace issue]
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 2616222e42 ]
During interface toggle operations (ifdown/ifup), the driver currently
resets the local helper variable 'phy_link' to -1. This causes the link
state machine to incorrectly interpret the state as a link change event,
resulting in spurious "Link is down" messages being logged when the
interface is brought back up.
Preserve the phy_link state across interface toggles to avoid treating
the -1 sentinel value as a legitimate link state transition.
Fixes: 88131a812b ("amd-xgbe: Perform phy connect/disconnect at dev open/stop")
Signed-off-by: Raju Rangoju <Raju.Rangoju@amd.com>
Reviewed-by: Dawid Osuchowski <dawid.osuchowski@linux.intel.com>
Link: https://patch.msgid.link/20251010065142.1189310-1-Raju.Rangoju@amd.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit a3f8c0a273 ]
When the driver requests Tx timestamp value, one of the first steps is
to clone SKB using skb_get. It increases the reference counter for that
SKB to prevent unexpected freeing by another component.
However, there may be a case where the index is requested, SKB is
assigned and never consumed by PTP flows - for example due to reset during
running PTP apps.
Add a check in release timestamping function to verify if the SKB
assigned to Tx timestamp latch was freed, and release remaining SKBs.
Fixes: 4901e83a94 ("idpf: add Tx timestamp capabilities negotiation")
Signed-off-by: Milena Olech <milena.olech@intel.com>
Signed-off-by: Anton Nadezhdin <anton.nadezhdin@intel.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Tested-by: Samuel Salin <Samuel.salin@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://patch.msgid.link/20251009-jk-iwl-net-2025-10-01-v3-1-ef32a425b92a@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 21f4d45eba ]
Similarly to ipv4 tunnel, ipv6 version updates dev->needed_headroom, too.
While ipv4 tunnel headroom adjustment growth was limited in
commit 5ae1e9922b ("net: ip_tunnel: prevent perpetual headroom growth"),
ipv6 tunnel yet increases the headroom without any ceiling.
Reflect ipv4 tunnel headroom adjustment limit on ipv6 version.
Credits to Francesco Ruggeri, who was originally debugging this issue
and wrote local Arista-specific patch and a reproducer.
Fixes: 8eb30be035 ("ipv6: Create ip6_tnl_xmit")
Cc: Florian Westphal <fw@strlen.de>
Cc: Francesco Ruggeri <fruggeri05@gmail.com>
Signed-off-by: Dmitry Safonov <dima@arista.com>
Link: https://patch.msgid.link/20251009-ip6_tunnel-headroom-v2-1-8e4dbd8f7e35@arista.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 70f92ab970 ]
After resume from S4 (hibernate), RTL8168H/RTL8111H truncates incoming
packets. Packet captures show messages like "IP truncated-ip - 146 bytes
missing!".
The issue is caused by RxConfig not being properly re-initialized after
resume. Re-initializing the RxConfig register before the chip
re-initialization sequence avoids the truncation and restores correct
packet reception.
This follows the same pattern as commit ef9da46dde ("r8169: fix data
corruption issue on RTL8402").
Fixes: 6e1d0b8988 ("r8169:add support for RTL8168H and RTL8107E")
Signed-off-by: Linmao Li <lilinmao@kylinos.cn>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Heiner Kallweit <hkallweit1@gmail.com>
Link: https://patch.msgid.link/20251009122549.3955845-1-lilinmao@kylinos.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit fcb8b32a68 ]
If the internal flash contains missing or corrupted configuration,
basic communication over the bus still functions, but the device
is not capable of normal operation (for example, using mailboxes).
This condition is indicated in the info register by the ready bit.
If this bit is cleared, the probe procedure times out while fetching
the device state.
Handle this case by checking the ready bit value in zl3073x_dev_start()
and skipping DPLL device and pin registration if it is cleared.
Do not report this condition as an error, allowing the devlink device
to be registered and enabling the user to flash the correct configuration.
Prior this patch:
[ 31.112299] zl3073x-i2c 1-0070: Failed to fetch input state: -ETIMEDOUT
[ 31.116332] zl3073x-i2c 1-0070: error -ETIMEDOUT: Failed to start device
[ 31.136881] zl3073x-i2c 1-0070: probe with driver zl3073x-i2c failed with error -110
After this patch:
[ 41.011438] zl3073x-i2c 1-0070: FW not fully ready - missing or corrupted config
Fixes: 75a71ecc24 ("dpll: zl3073x: Register DPLL devices and pins")
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20251008141445.841113-1-ivecera@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit ebb1031c51 ]
Refactor DPLL initialization and move DPLL (de)registration, monitoring
control, fetching device invariant parameters and phase offset
measurement block setup to separate functions.
Use these new functions during device probe and teardown functions and
during changes to the clock_id devlink parameter.
These functions will also be used in the next patch implementing devlink
flash, where this functionality is likewise required.
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Link: https://patch.msgid.link/20250909091532.11790-5-ivecera@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Stable-dep-of: fcb8b32a68 ("dpll: zl3073x: Handle missing or corrupted flash configuration")
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 93a27b5891 ]
Currently NETDEV_UNREGISTER event handler is not calling
j1939_cancel_active_session() and j1939_sk_queue_drop_all().
This will result in these calls being skipped when j1939_sk_release() is
called. And I guess that the reason syzbot is still reporting
unregister_netdevice: waiting for vcan0 to become free. Usage count = 2
is caused by lack of these calls.
Calling j1939_cancel_active_session(priv, sk) from j1939_sk_release() can
be covered by calling j1939_cancel_active_session(priv, NULL) from
j1939_netdev_notify().
Calling j1939_sk_queue_drop_all() from j1939_sk_release() can be covered
by calling j1939_sk_netdev_event_netdown() from j1939_netdev_notify().
Therefore, we can reuse j1939_cancel_active_session(priv, NULL) and
j1939_sk_netdev_event_netdown(priv) for NETDEV_UNREGISTER event handler.
Fixes: 7fcbe5b2c6 ("can: j1939: implement NETDEV_UNREGISTER notification handler")
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Tested-by: Oleksij Rempel <o.rempel@pengutronix.de>
Acked-by: Oleksij Rempel <o.rempel@pengutronix.de>
Link: https://patch.msgid.link/3ad3c7f8-5a74-4b07-a193-cb0725823558@I-love.SAKURA.ne.jp
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 65946eac6d ]
There is no error handling for `dma_map_single()` failures.
Add error handling by checking `dma_mapping_error()` and freeing
the `skb` using `dev_kfree_skb()` (process context) when it fails.
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Yeounsu Moon <yyyynoom@gmail.com>
Tested-on: D-Link DGE-550T Rev-A3
Suggested-by: Simon Horman <horms@kernel.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit a9e30a22d6 ]
A suspend/resume cycle on a down interface results in the interface
coming up in Error Active state. A suspend/resume cycle on an Up
interface will always result in Error Active state, regardless of the
actual CAN state.
During suspend, only set running interfaces to CAN_STATE_SLEEPING.
During resume only touch the CAN state of running interfaces. For
wakeup sources, set the CAN state depending on the Protocol Status
Regitser (PSR), for non wakeup source interfaces m_can_start() will do
the same.
Fixes: e0d1f4816f ("can: m_can: add Bosch M_CAN controller support")
Reviewed-by: Markus Schneider-Pargmann <msp@baylibre.com>
Link: https://patch.msgid.link/20250929-m_can-fix-state-handling-v4-4-682b49b49d9a@pengutronix.de
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 4942c42fe1 ]
In some SoCs (observed on the STM32MP15) the M_CAN IP core keeps the
CAN state and CAN error counters over an internal reset cycle. An
external reset is not always possible, due to the shared reset with
the other CAN core. This caused the core not always be in Error Active
state when bringing up the controller.
Instead of always setting the CAN state to Error Active in
m_can_chip_config(), fix this by reading and decoding the Protocol
Status Regitser (PSR) and set the CAN state accordingly.
Fixes: e0d1f4816f ("can: m_can: add Bosch M_CAN controller support")
Reviewed-by: Markus Schneider-Pargmann <msp@baylibre.com>
Link: https://patch.msgid.link/20250929-m_can-fix-state-handling-v4-3-682b49b49d9a@pengutronix.de
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 3d9db29b45 ]
The CAN Error State is determined by the receive and transmit error
counters. The CAN error counters decrease when reception/transmission
is successful, so that a status transition back to the Error Active
status is possible. This transition is not handled by
m_can_handle_state_errors().
Add the missing detection of the Error Active state to
m_can_handle_state_errors() and extend the handling of this state in
m_can_handle_state_change().
Fixes: e0d1f4816f ("can: m_can: add Bosch M_CAN controller support")
Fixes: cd0d83eab2 ("can: m_can: m_can_handle_state_change(): fix state change")
Reviewed-by: Markus Schneider-Pargmann <msp@baylibre.com>
Link: https://patch.msgid.link/20250929-m_can-fix-state-handling-v4-2-682b49b49d9a@pengutronix.de
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit ba569fb07a ]
Commit 227619c3ff ("can: m_can: move runtime PM enable/disable to
m_can_platform") moved the PM runtime enable from the m_can core
driver into the m_can_platform.
That patch forgot to move the pm_runtime_disable() to
m_can_plat_remove(), so that unloading the m_can_platform driver
causes an "Unbalanced pm_runtime_enable!" error message.
Add the missing pm_runtime_disable() to m_can_plat_remove() to fix the
problem.
Cc: Patrik Flykt <patrik.flykt@linux.intel.com>
Fixes: 227619c3ff ("can: m_can: move runtime PM enable/disable to m_can_platform")
Reviewed-by: Markus Schneider-Pargmann <msp@baylibre.com>
Link: https://patch.msgid.link/20250929-m_can-fix-state-handling-v4-1-682b49b49d9a@pengutronix.de
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit a779e27f24 ]
In be1e028302 ("coredump: don't pointlessly check and spew warnings")
we tried to fix input validation so it only happens during a write to
core_pattern. This would avoid needlessly logging a lot of warnings
during a read operation. However the logic accidently got inverted in
this commit. Fix it so the input validation only happens on write and is
skipped on read.
Fixes: be1e028302 ("coredump: don't pointlessly check and spew warnings")
Fixes: 16195d2c7d ("coredump: validate socket name as it is written")
Reviewed-by: Jan Kara <jack@suse.cz>
Reported-by: Yu Watanabe <watanabe.yu@gmail.com>
Signed-off-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 154d1e7ad9 ]
The commit 168316db3583("dax: assert that i_rwsem is held
exclusive for writes") added lock assertions to ensure proper
locking in DAX operations. However, these assertions trigger
false-positive lockdep warnings since read lock is unnecessary
on read-only filesystems(e.g., erofs).
This patch skips the read lock assertion for read-only filesystems,
eliminating the spurious warnings while maintaining the integrity
checks for writable filesystems.
Fixes: 168316db35 ("dax: assert that i_rwsem is held exclusive for writes")
Signed-off-by: Yuezhang Mo <Yuezhang.Mo@sony.com>
Reviewed-by: Friendy Su <friendy.su@sony.com>
Reviewed-by: Daniel Palmer <daniel.palmer@sony.com>
Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Signed-off-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit 46f781e0d1 upstream.
The sticky fingers quirk (MT_QUIRK_STICKY_FINGERS) was only considering
the case when slots were not released during the last report.
This can be problematic if the firmware forgets to release a finger
while others are still present.
This was observed on the Synaptics DLL0945 touchpad found on the Dell
XPS 9310 and the Dell Inspiron 5406.
Fixes: 4f4001bc76 ("HID: multitouch: fix rare Win 8 cases when the touch up event gets missing")
Cc: stable@vger.kernel.org
Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>
Signed-off-by: Jiri Kosina <jkosina@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 75a5b8d4dd ]
After an bind/unbind cycle, the ncm->notify_req is left stale. If a
subsequent bind fails, the unified error label attempts to free this
stale request, leading to a NULL pointer dereference when accessing
ep->ops->free_request.
Refactor the error handling in the bind path to use the __free()
automatic cleanup mechanism.
Unable to handle kernel NULL pointer dereference at virtual address 0000000000000020
Call trace:
usb_ep_free_request+0x2c/0xec
ncm_bind+0x39c/0x3dc
usb_add_function+0xcc/0x1f0
configfs_composite_bind+0x468/0x588
gadget_bind_driver+0x104/0x270
really_probe+0x190/0x374
__driver_probe_device+0xa0/0x12c
driver_probe_device+0x3c/0x218
__device_attach_driver+0x14c/0x188
bus_for_each_drv+0x10c/0x168
__device_attach+0xfc/0x198
device_initial_probe+0x14/0x24
bus_probe_device+0x94/0x11c
device_add+0x268/0x48c
usb_add_gadget+0x198/0x28c
dwc3_gadget_init+0x700/0x858
__dwc3_set_mode+0x3cc/0x664
process_scheduled_works+0x1d8/0x488
worker_thread+0x244/0x334
kthread+0x114/0x1bc
ret_from_fork+0x10/0x20
Fixes: 9f6ce4240a ("usb: gadget: f_ncm.c added")
Cc: stable@kernel.org
Signed-off-by: Kuen-Han Tsai <khtsai@google.com>
Link: https://lore.kernel.org/r/20250916-ready-v1-3-4997bf277548@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://lore.kernel.org/r/20250916-ready-v1-3-4997bf277548@google.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 47b2116e54 ]
After an bind/unbind cycle, the acm->notify_req is left stale. If a
subsequent bind fails, the unified error label attempts to free this
stale request, leading to a NULL pointer dereference when accessing
ep->ops->free_request.
Refactor the error handling in the bind path to use the __free()
automatic cleanup mechanism.
Unable to handle kernel NULL pointer dereference at virtual address 0000000000000020
Call trace:
usb_ep_free_request+0x2c/0xec
gs_free_req+0x30/0x44
acm_bind+0x1b8/0x1f4
usb_add_function+0xcc/0x1f0
configfs_composite_bind+0x468/0x588
gadget_bind_driver+0x104/0x270
really_probe+0x190/0x374
__driver_probe_device+0xa0/0x12c
driver_probe_device+0x3c/0x218
__device_attach_driver+0x14c/0x188
bus_for_each_drv+0x10c/0x168
__device_attach+0xfc/0x198
device_initial_probe+0x14/0x24
bus_probe_device+0x94/0x11c
device_add+0x268/0x48c
usb_add_gadget+0x198/0x28c
dwc3_gadget_init+0x700/0x858
__dwc3_set_mode+0x3cc/0x664
process_scheduled_works+0x1d8/0x488
worker_thread+0x244/0x334
kthread+0x114/0x1bc
ret_from_fork+0x10/0x20
Fixes: 1f1ba11b64 ("usb gadget: issue notifications from ACM function")
Cc: stable@kernel.org
Signed-off-by: Kuen-Han Tsai <khtsai@google.com>
Link: https://lore.kernel.org/r/20250916-ready-v1-4-4997bf277548@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://lore.kernel.org/r/20250916-ready-v1-4-4997bf277548@google.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit bfb1d99d96 ]
Gadget function drivers often have goto-based error handling in their
bind paths, which can be bug-prone. Refactoring these paths to use
__free() scope-based cleanup is desirable, but currently blocked.
The blocker is that usb_ep_free_request(ep, req) requires two
parameters, while the __free() mechanism can only pass a pointer to the
request itself.
Store an endpoint pointer in the struct usb_request. The pointer is
populated centrally in usb_ep_alloc_request() on every successful
allocation, making the request object self-contained.
Signed-off-by: Kuen-Han Tsai <khtsai@google.com>
Link: https://lore.kernel.org/r/20250916-ready-v1-1-4997bf277548@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://lore.kernel.org/r/20250916-ready-v1-1-4997bf277548@google.com
Stable-dep-of: 0822894143 ("usb: gadget: f_rndis: Refactor bind path to use __free()")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 0a6e9e098f ]
[Why]
commit 530694f54d ("drm/amdgpu: do not resume device in thaw for
normal hibernation") optimized the flow for systems that are going
into S4 where the power would be turned off. Basically the thaw()
callback wouldn't resume the device if the hibernation image was
successfully created since the system would be powered off.
This however isn't the correct flow for a system entering into
s0i3 after the hibernation image is created. Some of the amdgpu
callbacks have different behavior depending upon the intended
state of the suspend.
[How]
Use pm_hibernation_mode_is_suspend() as an input to decide whether
to run resume during thaw() callback.
Reported-by: Ionut Nechita <ionut_n2001@yahoo.com>
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4573
Tested-by: Ionut Nechita <ionut_n2001@yahoo.com>
Fixes: 530694f54d ("drm/amdgpu: do not resume device in thaw for normal hibernation")
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Tested-by: Kenneth Crudup <kenny@panix.com>
Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>
Cc: 6.17+ <stable@vger.kernel.org> # 6.17+: 495c8d3503: PM: hibernate: Add pm_hibernation_mode_is_suspend()
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 495c8d3503 ]
Some drivers have different flows for hibernation and suspend. If
the driver opportunistically will skip thaw() then it needs a hint
to know what is happening after the hibernate.
Introduce a new symbol pm_hibernation_mode_is_suspend() that drivers
can call to determine if suspending the system for this purpose.
Tested-by: Ionut Nechita <ionut_n2001@yahoo.com>
Tested-by: Kenneth Crudup <kenny@panix.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Stable-dep-of: 0a6e9e098f ("drm/amd: Fix hybrid sleep")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit fa4f4bae89 upstream.
Some file systems like FUSE-based ones or overlayfs may record the backing
file in struct vm_area_struct vm_file, instead of the user file that the
user mmapped.
That causes perf to misreport the device major/minor numbers of the file
system of the file, and the generation of the file, and potentially other
inode details. There is an existing helper file_user_inode() for that
situation.
Use file_user_inode() instead of file_inode() to get the inode for MMAP2
events.
Example:
Setup:
# cd /root
# mkdir test ; cd test ; mkdir lower upper work merged
# cp `which cat` lower
# mount -t overlay overlay -olowerdir=lower,upperdir=upper,workdir=work merged
# perf record -e cycles:u -- /root/test/merged/cat /proc/self/maps
...
55b2c91d0000-55b2c926b000 r-xp 00018000 00:1a 3419 /root/test/merged/cat
...
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.004 MB perf.data (5 samples) ]
#
# stat /root/test/merged/cat
File: /root/test/merged/cat
Size: 1127792 Blocks: 2208 IO Block: 4096 regular file
Device: 0,26 Inode: 3419 Links: 1
Access: (0755/-rwxr-xr-x) Uid: ( 0/ root) Gid: ( 0/ root)
Access: 2025-09-08 12:23:59.453309624 +0000
Modify: 2025-09-08 12:23:59.454309624 +0000
Change: 2025-09-08 12:23:59.454309624 +0000
Birth: 2025-09-08 12:23:59.453309624 +0000
Before:
Device reported 00:02 differs from stat output and /proc/self/maps
# perf script --show-mmap-events | grep /root/test/merged/cat
cat 377 [-01] 243.078558: PERF_RECORD_MMAP2 377/377: [0x55b2c91d0000(0x9b000) @ 0x18000 00:02 3419 2068525940]: r-xp /root/test/merged/cat
After:
Device reported 00:1a is the same as stat output and /proc/self/maps
# perf script --show-mmap-events | grep /root/test/merged/cat
cat 362 [-01] 127.755167: PERF_RECORD_MMAP2 362/362: [0x55ba6e781000(0x9b000) @ 0x18000 00:1a 3419 0]: r-xp /root/test/merged/cat
With respect to stable kernels, overlayfs mmap function ovl_mmap() was
added in v4.19 but file_user_inode() was not added until v6.8 and never
back-ported to stable kernels. FMODE_BACKING that it depends on was added
in v6.5. This issue has gone largely unnoticed, so back-porting before
v6.8 is probably not worth it, so put 6.8 as the stable kernel prerequisite
version, although in practice the next long term kernel is 6.12.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Amir Goldstein <amir73il@gmail.com>
Cc: stable@vger.kernel.org # 6.8
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8818f507a9 upstream.
Some file systems like FUSE-based ones or overlayfs may record the backing
file in struct vm_area_struct vm_file, instead of the user file that the
user mmapped.
Since commit def3ae83da ("fs: store real path instead of fake path in
backing file f_path"), file_path() no longer returns the user file path
when applied to a backing file. There is an existing helper
file_user_path() for that situation.
Use file_user_path() instead of file_path() to get the path for MMAP
and MMAP2 events.
Example:
Setup:
# cd /root
# mkdir test ; cd test ; mkdir lower upper work merged
# cp `which cat` lower
# mount -t overlay overlay -olowerdir=lower,upperdir=upper,workdir=work merged
# perf record -e intel_pt//u -- /root/test/merged/cat /proc/self/maps
...
55b0ba399000-55b0ba434000 r-xp 00018000 00:1a 3419 /root/test/merged/cat
...
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.060 MB perf.data ]
#
Before:
File name is wrong (/cat), so decoding fails:
# perf script --no-itrace --show-mmap-events
cat 367 [016] 100.491492: PERF_RECORD_MMAP2 367/367: [0x55b0ba399000(0x9b000) @ 0x18000 00:02 3419 489959280]: r-xp /cat
...
# perf script --itrace=e | wc -l
Warning:
19 instruction trace errors
19
#
After:
File name is correct (/root/test/merged/cat), so decoding is ok:
# perf script --no-itrace --show-mmap-events
cat 364 [016] 72.153006: PERF_RECORD_MMAP2 364/364: [0x55ce4003d000(0x9b000) @ 0x18000 00:02 3419 3132534314]: r-xp /root/test/merged/cat
# perf script --itrace=e
# perf script --itrace=e | wc -l
0
#
Fixes: def3ae83da ("fs: store real path instead of fake path in backing file f_path")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Amir Goldstein <amir73il@gmail.com>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ebfc8542ad upstream.
It was reported that Intel PT address filters do not work in Docker
containers. That relates to the use of overlayfs.
overlayfs records the backing file in struct vm_area_struct vm_file,
instead of the user file that the user mmapped. In order for an address
filter to match, it must compare to the user file inode. There is an
existing helper file_user_inode() for that situation.
Use file_user_inode() instead of file_inode() to get the inode for address
filter matching.
Example:
Setup:
# cd /root
# mkdir test ; cd test ; mkdir lower upper work merged
# cp `which cat` lower
# mount -t overlay overlay -olowerdir=lower,upperdir=upper,workdir=work merged
# perf record --buildid-mmap -e intel_pt//u --filter 'filter * @ /root/test/merged/cat' -- /root/test/merged/cat /proc/self/maps
...
55d61d246000-55d61d2e1000 r-xp 00018000 00:1a 3418 /root/test/merged/cat
...
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.015 MB perf.data ]
# perf buildid-cache --add /root/test/merged/cat
Before:
Address filter does not match so there are no control flow packets
# perf script --itrace=e
# perf script --itrace=b | wc -l
0
# perf script -D | grep 'TIP.PGE' | wc -l
0
#
After:
Address filter does match so there are control flow packets
# perf script --itrace=e
# perf script --itrace=b | wc -l
235
# perf script -D | grep 'TIP.PGE' | wc -l
57
#
With respect to stable kernels, overlayfs mmap function ovl_mmap() was
added in v4.19 but file_user_inode() was not added until v6.8 and never
back-ported to stable kernels. FMODE_BACKING that it depends on was added
in v6.5. This issue has gone largely unnoticed, so back-porting before
v6.8 is probably not worth it, so put 6.8 as the stable kernel prerequisite
version, although in practice the next long term kernel is 6.12.
Closes: https://lore.kernel.org/linux-perf-users/aBCwoq7w8ohBRQCh@fremen.lan
Reported-by: Edd Barrett <edd@theunixzoo.co.uk>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Amir Goldstein <amir73il@gmail.com>
Cc: stable@vger.kernel.org # 6.8
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6df8e84aa6 upstream.
The atomic variable vm_fault_info_updated is used to synchronize access to
adev->gmc.vm_fault_info between the interrupt handler and
get_vm_fault_info().
The default atomic functions like atomic_set() and atomic_read() do not
provide memory barriers. This allows for CPU instruction reordering,
meaning the memory accesses to vm_fault_info and the vm_fault_info_updated
flag are not guaranteed to occur in the intended order. This creates a
race condition that can lead to inconsistent or stale data being used.
The previous implementation, which used an explicit mb(), was incomplete
and inefficient. It failed to account for all potential CPU reorderings,
such as the access of vm_fault_info being reordered before the atomic_read
of the flag. This approach is also more verbose and less performant than
using the proper atomic functions with acquire/release semantics.
Fix this by switching to atomic_set_release() and atomic_read_acquire().
These functions provide the necessary acquire and release semantics,
which act as memory barriers to ensure the correct order of operations.
It is also more efficient and idiomatic than using explicit full memory
barriers.
Fixes: b97dfa27ef ("drm/amdgpu: save vm fault information for amdkfd")
Cc: stable@vger.kernel.org
Signed-off-by: Gui-Dong Han <hanguidong02@gmail.com>
Signed-off-by: Felix Kuehling <felix.kuehling@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5801e65206 upstream.
When adding dependencies with drm_sched_job_add_dependency(), that
function consumes the fence reference both on success and failure, so in
the latter case the dma_fence_put() on the error path (xarray failed to
expand) is a double free.
Interestingly this bug appears to have been present ever since
commit ebd5f74255 ("drm/sched: Add dependency tracking"), since the code
back then looked like this:
drm_sched_job_add_implicit_dependencies():
...
for (i = 0; i < fence_count; i++) {
ret = drm_sched_job_add_dependency(job, fences[i]);
if (ret)
break;
}
for (; i < fence_count; i++)
dma_fence_put(fences[i]);
Which means for the failing 'i' the dma_fence_put was already a double
free. Possibly there were no users at that time, or the test cases were
insufficient to hit it.
The bug was then only noticed and fixed after
commit 9c2ba26535 ("drm/scheduler: use new iterator in drm_sched_job_add_implicit_dependencies v2")
landed, with its fixup of
commit 4eaf02d607 ("drm/scheduler: fix drm_sched_job_add_implicit_dependencies").
At that point it was a slightly different flavour of a double free, which
commit 963d0b3569 ("drm/scheduler: fix drm_sched_job_add_implicit_dependencies harder")
noticed and attempted to fix.
But it only moved the double free from happening inside the
drm_sched_job_add_dependency(), when releasing the reference not yet
obtained, to the caller, when releasing the reference already released by
the former in the failure case.
As such it is not easy to identify the right target for the fixes tag so
lets keep it simple and just continue the chain.
While fixing we also improve the comment and explain the reason for taking
the reference and not dropping it.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Fixes: 963d0b3569 ("drm/scheduler: fix drm_sched_job_add_implicit_dependencies harder")
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Closes: https://lore.kernel.org/dri-devel/aNFbXq8OeYl3QSdm@stanley.mountain/
Cc: Christian König <christian.koenig@amd.com>
Cc: Rob Clark <robdclark@chromium.org>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Danilo Krummrich <dakr@kernel.org>
Cc: Philipp Stanner <phasta@kernel.org>
Cc: Christian König <ckoenig.leichtzumerken@gmail.com>
Cc: dri-devel@lists.freedesktop.org
Cc: stable@vger.kernel.org # v5.16+
Signed-off-by: Philipp Stanner <phasta@kernel.org>
Link: https://lore.kernel.org/r/20251015084015.6273-1-tvrtko.ursulin@igalia.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1cf11d80db upstream.
The __component_match_add function may assign the 'matchptr' pointer
the value ERR_PTR(-ENOMEM), which will subsequently be dereferenced.
The call stack leading to the error looks like this:
hda_component_manager_init
|-> component_match_add
|-> component_match_add_release
|-> __component_match_add ( ... ,**matchptr, ... )
|-> *matchptr = ERR_PTR(-ENOMEM); // assign
|-> component_master_add_with_match( ... match)
|-> component_match_realloc(match, match->num); // dereference
Add IS_ERR() check to prevent the crash.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Fixes: ae7abe36e3 ("ALSA: hda/realtek: Add CS35L41 support for Thinkpad laptops")
Cc: stable@vger.kernel.org
Signed-off-by: Denis Arefev <arefev@swemel.ru>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8527bbb339 upstream.
Return value of a function acpi_evaluate_dsm() is dereferenced without
checking for NULL, but it is usually checked for this function.
acpi_evaluate_dsm() may return NULL, when acpi_evaluate_object() returns
acpi_status other than ACPI_SUCCESS, so add a check to prevent the crach.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Fixes: 447106e92a ("ALSA: hda: cs35l41: Support mute notifications for CS35L41 HDA")
Cc: stable@vger.kernel.org
Signed-off-by: Denis Arefev <arefev@swemel.ru>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2e41e5a91a upstream.
In order to compare the resource against the HMAT memory target,
the resource needs to be memory type. Change the DEFINE_RES()
macro to DEFINE_RES_MEM() in order to set the correct resource type.
hmat_get_extended_linear_cache_size() uses resource_contains()
internally. This causes a regression for platforms with the
extended linear cache enabled as the comparison always fails and the
cache size is not set. User visible impact is that when 'cxl list' is
issued, a CXL region with extended linear cache support will only
report half the size of the actual size. And this also breaks MCE
reporting of the memory region due to incorrect offset calculation
for the memory.
[dj: Fixup commit log suggested by djbw]
[dj: Fixup stable address for cc]
Fixes: 12b3d697c8 ("cxl: Remove core/acpi.c and cxl core dependency on ACPI")
Cc: stable@vger.kernel.org
Reviewed-by: Gregory Price <gourry@gourry.net>
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6447b0e355 upstream.
Malicious SMB server can send invalid reply to FSCTL_DFS_GET_REFERRALS
- reply smaller than sizeof(struct get_dfs_referral_rsp)
- reply with number of referrals smaller than NumberOfReferrals in the
header
Processing of such replies will cause oob.
Return -EINVAL error on such replies to prevent oob-s.
Signed-off-by: Eugene Korenevsky <ekorenevsky@aliyun.com>
Cc: stable@vger.kernel.org
Suggested-by: Nathan Chancellor <nathan@kernel.org>
Acked-by: Paulo Alcantara (Red Hat) <pc@manguebit.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a5a51bf4e9 upstream.
Currently, when building a free space tree at populate_free_space_tree(),
if we are not using the block group tree feature, we always expect to find
block group items (either extent items or a block group item with key type
BTRFS_BLOCK_GROUP_ITEM_KEY) when we search the extent tree with
btrfs_search_slot_for_read(), so we assert that we found an item. However
this expectation is wrong since we can have a new block group created in
the current transaction which is still empty and for which we still have
not added the block group's item to the extent tree, in which case we do
not have any items in the extent tree associated to the block group.
The insertion of a new block group's block group item in the extent tree
happens at btrfs_create_pending_block_groups() when it calls the helper
insert_block_group_item(). This typically is done when a transaction
handle is released, committed or when running delayed refs (either as
part of a transaction commit or when serving tickets for space reservation
if we are low on free space).
So remove the assertion at populate_free_space_tree() even when the block
group tree feature is not enabled and update the comment to mention this
case.
Syzbot reported this with the following stack trace:
BTRFS info (device loop3 state M): rebuilding free space tree
assertion failed: ret == 0 :: 0, in fs/btrfs/free-space-tree.c:1115
------------[ cut here ]------------
kernel BUG at fs/btrfs/free-space-tree.c:1115!
Oops: invalid opcode: 0000 [#1] SMP KASAN PTI
CPU: 1 UID: 0 PID: 6352 Comm: syz.3.25 Not tainted syzkaller #0 PREEMPT(full)
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 08/18/2025
RIP: 0010:populate_free_space_tree+0x700/0x710 fs/btrfs/free-space-tree.c:1115
Code: ff ff e8 d3 (...)
RSP: 0018:ffffc9000430f780 EFLAGS: 00010246
RAX: 0000000000000043 RBX: ffff88805b709630 RCX: fea61d0e2e79d000
RDX: 0000000000000000 RSI: 0000000080000000 RDI: 0000000000000000
RBP: ffffc9000430f8b0 R08: ffffc9000430f4a7 R09: 1ffff92000861e94
R10: dffffc0000000000 R11: fffff52000861e95 R12: 0000000000000001
R13: 1ffff92000861f00 R14: dffffc0000000000 R15: 0000000000000000
FS: 00007f424d9fe6c0(0000) GS:ffff888125afc000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fd78ad212c0 CR3: 0000000076d68000 CR4: 00000000003526f0
Call Trace:
<TASK>
btrfs_rebuild_free_space_tree+0x1ba/0x6d0 fs/btrfs/free-space-tree.c:1364
btrfs_start_pre_rw_mount+0x128f/0x1bf0 fs/btrfs/disk-io.c:3062
btrfs_remount_rw fs/btrfs/super.c:1334 [inline]
btrfs_reconfigure+0xaed/0x2160 fs/btrfs/super.c:1559
reconfigure_super+0x227/0x890 fs/super.c:1076
do_remount fs/namespace.c:3279 [inline]
path_mount+0xd1a/0xfe0 fs/namespace.c:4027
do_mount fs/namespace.c:4048 [inline]
__do_sys_mount fs/namespace.c:4236 [inline]
__se_sys_mount+0x313/0x410 fs/namespace.c:4213
do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
do_syscall_64+0xfa/0xfa0 arch/x86/entry/syscall_64.c:94
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f424e39066a
Code: d8 64 89 02 (...)
RSP: 002b:00007f424d9fde68 EFLAGS: 00000246 ORIG_RAX: 00000000000000a5
RAX: ffffffffffffffda RBX: 00007f424d9fdef0 RCX: 00007f424e39066a
RDX: 0000200000000180 RSI: 0000200000000380 RDI: 0000000000000000
RBP: 0000200000000180 R08: 00007f424d9fdef0 R09: 0000000000000020
R10: 0000000000000020 R11: 0000000000000246 R12: 0000200000000380
R13: 00007f424d9fdeb0 R14: 0000000000000000 R15: 00002000000002c0
</TASK>
Modules linked in:
---[ end trace 0000000000000000 ]---
Reported-by: syzbot+884dc4621377ba579a6f@syzkaller.appspotmail.com
Link: https://lore.kernel.org/linux-btrfs/68dc3dab.a00a0220.102ee.004e.GAE@google.com/
Fixes: a5ed918285 ("Btrfs: implement the free space B-tree")
CC: <stable@vger.kernel.org> # 6.1.x: 1961d20f6f: btrfs: fix assertion when building free space tree
CC: <stable@vger.kernel.org> # 6.1.x
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit fec9b9d3ce upstream.
At the end of btrfs_load_block_group_zone_info() the first thing we do
is to ensure that if the mapping type is not a SINGLE one and there is
no RAID stripe tree, then we return early with an error.
Doing that, though, prevents the code from running the last calls from
this function which are about freeing memory allocated during its
run. Hence, in this case, instead of returning early, we set the ret
value and fall through the rest of the cleanup code.
Fixes: 5906333cc4 ("btrfs: zoned: don't skip block group profile checks on conventional zones")
CC: stable@vger.kernel.org # 6.8+
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Miquel Sabaté Solà <mssola@mssola.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8ab2fa6969 upstream.
The intent of btrfs_readahead_expand() was to expand to the length of
the current compressed extent being read. However, "ram_bytes" is *not*
that, in the case where a single physical compressed extent is used for
multiple file extents.
Consider this case with a large compressed extent C and then later two
non-compressed extents N1 and N2 written over C, leaving C1 and C2
pointing to offset/len pairs of C:
[ C ]
[ N1 ][ C1 ][ N2 ][ C2 ]
In such a case, ram_bytes for both C1 and C2 is the full uncompressed
length of C. So starting readahead in C1 will expand the readahead past
the end of C1, past N2, and into C2. This will then expand readahead
again, to C2_start + ram_bytes, way past EOF. First of all, this is
totally undesirable, we don't want to read the whole file in arbitrary
chunks of the large underlying extent if it happens to exist. Secondly,
it results in zeroing the range past the end of C2 up to ram_bytes. This
is particularly unpleasant with fs-verity as it can zero and set
uptodate pages in the verity virtual space past EOF. This incorrect
readahead behavior can lead to verity verification errors, if we iterate
in a way that happens to do the wrong readahead.
Fix this by using em->len for readahead expansion, not em->ram_bytes,
resulting in the expected behavior of stopping readahead at the extent
boundary.
Reported-by: Max Chernoff <git@maxchernoff.ca>
Link: https://bugzilla.redhat.com/show_bug.cgi?id=2399898
Fixes: 9e9ff875e4 ("btrfs: use readahead_expand() on compressed extents")
CC: stable@vger.kernel.org # 6.17
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Boris Burkov <boris@bur.io>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b7fdfd29a1 upstream.
[BUG]
With v6.17-rc kernels, btrfs will always set 'ssd' mount option even if
the block device is not a rotating one:
# cat /sys/block/sdd/queue/rotational
1
# cat /etc/fstab:
LABEL=DATA2 /data2 btrfs rw,relatime,space_cache=v2,subvolid=5,subvol=/,nofail,nosuid,nodev 0 0
# mount
[...]
/dev/sdd on /data2 type btrfs (rw,nosuid,nodev,relatime,ssd,space_cache=v2,subvolid=5,subvol=/)
[CAUSE]
The 'ssd' mount option is set by set_device_specific_options(), and it
expects that if there is any rotating device in the btrfs, it will set
fs_devices::rotating.
However after commit bddf57a707 ("btrfs: delay btrfs_open_devices()
until super block is created"), the device opening is delayed until the
super block is created.
But the timing of set_device_specific_options() is still left as is,
this makes the function be called without any device opened.
Since no device is opened, thus fs_devices::rotating will never be set,
making btrfs incorrectly set 'ssd' mount option.
[FIX]
Only call set_device_specific_options() after btrfs_open_devices().
Also only call set_device_specific_options() after a new mount, if we're
mounting a mounted btrfs, there is no need to set the device specific
mount options again.
Reported-by: HAN Yuwei <hrx@bupt.moe>
Link: https://lore.kernel.org/linux-btrfs/C8FF75669DFFC3C5+5f93bf8a-80a0-48a6-81bf-4ec890abc99a@bupt.moe/
Fixes: bddf57a707 ("btrfs: delay btrfs_open_devices() until super block is created")
CC: stable@vger.kernel.org # 6.17
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 53a4acbfc1 upstream.
On 'btrfs_ioctl_qgroup_assign' we first duplicate the argument as
provided by the user, which is kfree'd in the end. But this was not the
case when allocating memory for 'prealloc'. In this case, if it somehow
failed, then the previous code would go directly into calling
'mnt_drop_write_file', without freeing the string duplicated from the
user space.
Fixes: 4addc1ffd6 ("btrfs: qgroup: preallocate memory before adding a relation")
CC: stable@vger.kernel.org # 6.12+
Reviewed-by: Boris Burkov <boris@bur.io>
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Miquel Sabaté Solà <mssola@mssola.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7e5a5983ed upstream.
When starting relocation, at reloc_chunk_start(), if we happen to find
the flag BTRFS_FS_RELOC_RUNNING is already set we return an error
(-EINPROGRESS) to the callers, however the callers call reloc_chunk_end()
which will clear the flag BTRFS_FS_RELOC_RUNNING, which is wrong since
relocation was started by another task and still running.
Finding the BTRFS_FS_RELOC_RUNNING flag already set is an unexpected
scenario, but still our current behaviour is not correct.
Fix this by never calling reloc_chunk_end() if reloc_chunk_start() has
returned an error, which is what logically makes sense, since the general
widespread pattern is to have end functions called only if the counterpart
start functions succeeded. This requires changing reloc_chunk_start() to
clear BTRFS_FS_RELOC_RUNNING if there's a pending cancel request.
Fixes: 907d2710d7 ("btrfs: add cancellable chunk relocation support")
CC: stable@vger.kernel.org # 5.15+
Reviewed-by: Boris Burkov <boris@bur.io>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1d3ad18394 upstream.
syzbot reported a BUG_ON in ext4_es_cache_extent() when opening a verity
file on a corrupted ext4 filesystem mounted without a journal.
The issue is that the filesystem has an inode with both the INLINE_DATA
and EXTENTS flags set:
EXT4-fs error (device loop0): ext4_cache_extents:545: inode #15:
comm syz.0.17: corrupted extent tree: lblk 0 < prev 66
Investigation revealed that the inode has both flags set:
DEBUG: inode 15 - flag=1, i_inline_off=164, has_inline=1, extents_flag=1
This is an invalid combination since an inode should have either:
- INLINE_DATA: data stored directly in the inode
- EXTENTS: data stored in extent-mapped blocks
Having both flags causes ext4_has_inline_data() to return true, skipping
extent tree validation in __ext4_iget(). The unvalidated out-of-order
extents then trigger a BUG_ON in ext4_es_cache_extent() due to integer
underflow when calculating hole sizes.
Fix this by detecting this invalid flag combination early in ext4_iget()
and rejecting the corrupted inode.
Cc: stable@kernel.org
Reported-and-tested-by: syzbot+038b7bf43423e132b308@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=038b7bf43423e132b308
Suggested-by: Zhang Yi <yi.zhang@huawei.com>
Signed-off-by: Deepanshu Kartikey <kartikey406@gmail.com>
Reviewed-by: Zhang Yi <yi.zhang@huawei.com>
Message-ID: <20250930112810.315095-1-kartikey406@gmail.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 328a782cb1 upstream.
When freeing metadata blocks in nojournal mode, ext4_forget() calls
bforget() to clear the dirty flag on the buffer_head and remvoe
associated mappings. This is acceptable if the metadata has not yet
begun to be written back. However, if the write-back has already started
but is not yet completed, ext4_forget() will have no effect.
Subsequently, ext4_mb_clear_bb() will immediately return the block to
the mb allocator. This block can then be reallocated immediately,
potentially causing an data corruption issue.
Fix this by clearing the buffer's dirty flag and waiting for the ongoing
I/O to complete, ensuring that no further writes to stale data will
occur.
Fixes: 16e08b14a4 ("ext4: cleanup clean_bdev_aliases() calls")
Cc: stable@kernel.org
Reported-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Closes: https://lore.kernel.org/linux-ext4/a9417096-9549-4441-9878-b1955b899b4e@huaweicloud.com/
Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Message-ID: <20250916093337.3161016-3-yi.zhang@huaweicloud.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3c652c3a71 upstream.
When releasing file system metadata blocks in jbd2_journal_forget(), if
this buffer has not yet been checkpointed, it may have already been
written back, currently be in the process of being written back, or has
not yet written back. jbd2_journal_forget() calls
jbd2_journal_try_remove_checkpoint() to check the buffer's status and
add it to the current transaction if it has not been written back. This
buffer can only be reallocated after the transaction is committed.
jbd2_journal_try_remove_checkpoint() attempts to lock the buffer and
check its dirty status while holding the buffer lock. If the buffer has
already been written back, everything proceeds normally. However, there
are two issues. First, the function returns immediately if the buffer is
locked by the write-back process. It does not wait for the write-back to
complete. Consequently, until the current transaction is committed and
the block is reallocated, there is no guarantee that the I/O will
complete. This means that ongoing I/O could write stale metadata to the
newly allocated block, potentially corrupting data. Second, the function
unlocks the buffer as soon as it detects that the buffer is still dirty.
If a concurrent write-back occurs immediately after this unlocking and
before clear_buffer_dirty() is called in jbd2_journal_forget(), data
corruption can theoretically still occur.
Although these two issues are unlikely to occur in practice since the
undergoing metadata writeback I/O does not take this long to complete,
it's better to explicitly ensure that all ongoing I/O operations are
completed.
Fixes: 597599268e ("jbd2: discard dirty data when forgetting an un-journalled buffer")
Cc: stable@kernel.org
Suggested-by: Jan Kara <jack@suse.cz>
Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Message-ID: <20250916093337.3161016-2-yi.zhang@huaweicloud.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9d5c4f5c7a upstream.
Assuming the disk layout as below,
disk0: 0 --- 0x00035abfff
disk1: 0x00035ac000 --- 0x00037abfff
disk2: 0x00037ac000 --- 0x00037ebfff
and we want to read data from offset=13568 having len=128 across the block
devices, we can illustrate the block addresses like below.
0 .. 0x00037ac000 ------------------- 0x00037ebfff, 0x00037ec000 -------
| ^ ^ ^
| fofs 0 13568 13568+128
| ------------------------------------------------------
| LBA 0x37e8aa9 0x37ebfa9 0x37ec029
--- map 0x3caa9 0x3ffa9
In this example, we should give the relative map of the target block device
ranging from 0x3caa9 to 0x3ffa9 where the length should be calculated by
0x37ebfff + 1 - 0x37ebfa9.
In the below equation, however, map->m_pblk was supposed to be the original
address instead of the one from the target block address.
- map->m_len = min(map->m_len, dev->end_blk + 1 - map->m_pblk);
Cc: stable@vger.kernel.org
Fixes: 71f2c82062 ("f2fs: multidevice: support direct IO")
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0aa1b76fe1 upstream.
Another day, another syzkaller bug. KVM erroneously allows userspace to
pend vCPU events for a vCPU that hasn't been initialized yet, leading to
KVM interpreting a bunch of uninitialized garbage for routing /
injecting the exception.
In one case the injection code and the hyp disagree on whether the vCPU
has a 32bit EL1 and put the vCPU into an illegal mode for AArch64,
tripping the BUG() in exception_target_el() during the next injection:
kernel BUG at arch/arm64/kvm/inject_fault.c:40!
Internal error: Oops - BUG: 00000000f2000800 [#1] SMP
CPU: 3 UID: 0 PID: 318 Comm: repro Not tainted 6.17.0-rc4-00104-g10fd0285305d #6 PREEMPT
Hardware name: linux,dummy-virt (DT)
pstate: 21402009 (nzCv daif +PAN -UAO -TCO +DIT -SSBS BTYPE=--)
pc : exception_target_el+0x88/0x8c
lr : pend_serror_exception+0x18/0x13c
sp : ffff800082f03a10
x29: ffff800082f03a10 x28: ffff0000cb132280 x27: 0000000000000000
x26: 0000000000000000 x25: ffff0000c2a99c20 x24: 0000000000000000
x23: 0000000000008000 x22: 0000000000000002 x21: 0000000000000004
x20: 0000000000008000 x19: ffff0000c2a99c20 x18: 0000000000000000
x17: 0000000000000000 x16: 0000000000000000 x15: 00000000200000c0
x14: 0000000000000000 x13: 0000000000000000 x12: 0000000000000000
x11: 0000000000000000 x10: 0000000000000000 x9 : 0000000000000000
x8 : ffff800082f03af8 x7 : 0000000000000000 x6 : 0000000000000000
x5 : ffff800080f621f0 x4 : 0000000000000000 x3 : 0000000000000000
x2 : 000000000040009b x1 : 0000000000000003 x0 : ffff0000c2a99c20
Call trace:
exception_target_el+0x88/0x8c (P)
kvm_inject_serror_esr+0x40/0x3b4
__kvm_arm_vcpu_set_events+0xf0/0x100
kvm_arch_vcpu_ioctl+0x180/0x9d4
kvm_vcpu_ioctl+0x60c/0x9f4
__arm64_sys_ioctl+0xac/0x104
invoke_syscall+0x48/0x110
el0_svc_common.constprop.0+0x40/0xe0
do_el0_svc+0x1c/0x28
el0_svc+0x34/0xf0
el0t_64_sync_handler+0xa0/0xe4
el0t_64_sync+0x198/0x19c
Code: f946bc01 b4fffe61 9101e020 17fffff2 (d4210000)
Reject the ioctls outright as no sane VMM would call these before
KVM_ARM_VCPU_INIT anyway. Even if it did the exception would've been
thrown away by the eventual reset of the vCPU's state.
Cc: stable@vger.kernel.org # 6.17
Fixes: b7b27facc7 ("arm/arm64: KVM: Add KVM_GET/SET_VCPU_EVENTS")
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d5d790ba15 upstream.
The function lan78xx_write_raw_eeprom failed to properly propagate EEPROM
write timeout errors (-ETIMEDOUT). In the timeout fallthrough path, it first
attempted to restore the pin configuration for LED outputs and then
returned only the status of that restore operation, discarding the
original timeout error saved in ret.
As a result, callers could mistakenly treat EEPROM write operation as
successful even though the EEPROM write had actually timed out with no
or partial data write.
To fix this, handle errors in restoring the LED pin configuration separately.
If the restore succeeds, return any prior EEPROM write timeout error saved
in ret to the caller.
Suggested-by: Oleksij Rempel <o.rempel@pengutronix.de>
Fixes: 8b1b2ca83b ("net: usb: lan78xx: Improve error handling in EEPROM and OTP operations")
cc: stable@vger.kernel.org
Signed-off-by: Bhanu Seshu Kumar Valluri <bhanuseshukumar@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 75527d61d6 upstream.
rtl8152_driver_init() is missing the error handling.
When rtl8152_driver registration fails, rtl8152_cfgselector_driver
should be deregistered.
Fixes: ec51fbd1b8 ("r8152: add USB device driver for config selection")
Cc: stable@vger.kernel.org
Signed-off-by: Yi Cong <yicong@kylinos.cn>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20251011082415.580740-1-yicongsrfy@163.com
[pabeni@redhat.com: clarified the commit message]
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit be7cab44ed upstream.
io_create_region_mmap_safe() protects publishing of a region against
concurrent mmap calls, however we should also protect against it when
removing a region. There is a gap io_register_mem_region() where it
safely publishes a region, but then copy_to_user goes wrong and it
unsafely frees the region.
Cc: stable@vger.kernel.org
Fixes: 087f997870 ("io_uring/memmap: implement mmap for regions")
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 927069c4ac upstream.
This reverts commit 90bfb28d5f.
Kevin reports that this commit causes an issue for him with LVM
snapshots, most likely because of turning off NOWAIT support while a
snapshot is being created. This makes -EOPNOTSUPP bubble back through
the completion handler, where io_uring read/write handling should just
retry it.
Reinstate the previous check removed by the referenced commit.
Cc: stable@vger.kernel.org
Fixes: 90bfb28d5f ("io_uring/rw: drop -EOPNOTSUPP check in __io_complete_rw_common()")
Reported-by: Salvatore Bonaccorso <carnil@debian.org>
Reported-by: Kevin Lumik <kevin@xf.ee>
Link: https://lore.kernel.org/io-uring/cceb723c-051b-4de2-9a4c-4aa82e1619ee@kernel.dk/
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 86f54f9b6c upstream.
If obj_exts allocation failed, slab->obj_exts is set to OBJEXTS_ALLOC_FAIL,
But we do not clear it when freeing the slab. Since OBJEXTS_ALLOC_FAIL and
MEMCG_DATA_OBJEXTS currently share the same bit position, during the
release of the associated folio, a VM_BUG_ON_FOLIO() check in
folio_memcg_kmem() is triggered because the OBJEXTS_ALLOC_FAIL flag was
not cleared, causing it to be interpreted as a kmem folio (non-slab)
with MEMCG_OBJEXTS_DATA flag set, which is invalid because
MEMCG_OBJEXTS_DATA is supposed to be set only on slabs.
Another problem that predates sharing the OBJEXTS_ALLOC_FAIL and
MEMCG_DATA_OBJEXTS bits is that on configurations with
is_check_pages_enabled(), the non-cleared bit in page->memcg_data will
trigger a free_page_is_bad() failure "page still charged to cgroup"
When freeing a slab, we clear slab->obj_exts if the obj_ext array has
been successfully allocated. So let's clear it also when the allocation
has failed.
Fixes: 09c46563ff ("codetag: debug: introduce OBJEXTS_ALLOC_FAIL to mark failed slab_ext allocations")
Fixes: 7612833192 ("slab: Reuse first bit for OBJEXTS_ALLOC_FAIL")
Link: https://lore.kernel.org/all/20251015141642.700170-1-hao.ge@linux.dev/
Cc: <stable@vger.kernel.org>
Signed-off-by: Hao Ge <gehao@kylinos.cn>
Reviewed-by: Suren Baghdasaryan <surenb@google.com>
Reviewed-by: Harry Yoo <harry.yoo@oracle.com>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e6416c2dfe upstream.
The S5_RESET_STATUS register is parsed on boot and printed to kmsg.
However, this could sometimes be misleading and lead to users wasting a
lot of time on meaningless debugging for two reasons:
* Some bits are never cleared by hardware. It's the software's
responsibility to clear them as per the Processor Programming Reference
(see [1]).
* Some rare hardware-initiated platform resets do not update the
register at all.
In both cases, a previous reboot could leave its trace in the register,
resulting in users seeing unrelated reboot reasons while debugging random
reboots afterward.
Write the read value back to the register in order to clear all reason bits
since they are write-1-to-clear while the others must be preserved.
[1]: https://bugzilla.kernel.org/show_bug.cgi?id=206537#attach_303991
[ bp: Massage commit message. ]
Fixes: ab81310287 ("x86/CPU/AMD: Print the reason for the last reset")
Signed-off-by: Rong Zhang <i@rong.moe>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org>
Reviewed-by: Yazen Ghannam <yazen.ghannam@amd.com>
Cc: <stable@kernel.org>
Link: https://lore.kernel.org/all/20250913144245.23237-1-i@rong.moe/
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c2b77f4220 upstream.
Fix three refcount inconsistency issues related to `cifs_sb_tlink`.
Comments for `cifs_sb_tlink` state that `cifs_put_tlink()` needs to be
called after successful calls to `cifs_sb_tlink()`. Three calls fail to
update refcount accordingly, leading to possible resource leaks.
Fixes: 8ceb984379 ("CIFS: Move rename to ops struct")
Fixes: 2f1afe2599 ("cifs: Use smb 2 - 3 and cifsacl mount options getacl functions")
Fixes: 366ed846df ("cifs: Use smb 2 - 3 and cifsacl mount options setacl function")
Cc: stable@vger.kernel.org
Signed-off-by: Shuhao Fu <sfual@cse.ust.hk>
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 812258ff41 upstream.
The kernel uses the standard rustc targets for non-x86 targets, and out
of those only 64-bit arm's target has kcfi support enabled. For x86, the
custom 64-bit target enables kcfi.
The HAVE_CFI_ICALL_NORMALIZE_INTEGERS_RUSTC config option that allows
CFI_CLANG to be used in combination with RUST does not check whether the
rustc target supports kcfi. This breaks the build on riscv (and
presumably 32-bit arm) when CFI_CLANG and RUST are enabled at the same
time.
Ordinarily, a rustc-option check would be used to detect target support
but unfortunately rustc-option filters out the target for reasons given
in commit 46e24a545c ("rust: kasan/kbuild: fix missing flags on first
build"). As a result, if the host supports kcfi but the target does not,
e.g. when building for riscv on x86_64, the build would remain broken.
Instead, make HAVE_CFI_ICALL_NORMALIZE_INTEGERS_RUSTC depend on the only
two architectures where the target used supports it to fix the build.
CC: stable@vger.kernel.org
Fixes: ca627e6365 ("rust: cfi: add support for CFI_CLANG with Rust")
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
Acked-by: Miguel Ojeda <ojeda@kernel.org>
Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Link: https://lore.kernel.org/r/20250908-distill-lint-1ae78bcf777c@spud
Signed-off-by: Paul Walmsley <pjw@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a7075f501b upstream.
There was backward compatibility in the terms of mailbox API. Various
drivers from various OSes supporting 10G adapters from Intel portfolio
could easily negotiate mailbox API.
This convention has been broken since introducing API 1.4.
Commit 0062e7cc95 ("ixgbevf: add VF IPsec offload code") added support
for IPSec which is specific only for the kernel ixgbe driver. None of the
rest of the Intel 10G PF/VF drivers supports it. And actually lack of
support was not included in the IPSec implementation - there were no such
code paths. No possibility to negotiate support for the feature was
introduced along with introduction of the feature itself.
Commit 339f289641 ("ixgbevf: Add support for new mailbox communication
between PF and VF") increasing API version to 1.5 did the same - it
introduced code supported specifically by the PF ESX driver. It altered API
version for the VF driver in the same time not touching the version
defined for the PF ixgbe driver. It led to additional discrepancies,
as the code provided within API 1.6 cannot be supported for Linux ixgbe
driver as it causes crashes.
The issue was noticed some time ago and mitigated by Jake within the commit
d0725312ad ("ixgbevf: stop attempting IPSEC offload on Mailbox API 1.5").
As a result we have regression for IPsec support and after increasing API
to version 1.6 ixgbevf driver stopped to support ESX MBX.
To fix this mess add new mailbox op asking PF driver about supported
features. Basing on a response determine whether to set support for IPSec
and ESX-specific enhanced mailbox.
New mailbox op, for compatibility purposes, must be added within new API
revision, as API version of OOT PF & VF drivers is already increased to
1.6 and doesn't incorporate features negotiate op.
Features negotiation mechanism gives possibility to be extended with new
features when needed in the future.
Reported-by: Jacob Keller <jacob.e.keller@intel.com>
Closes: https://lore.kernel.org/intel-wired-lan/20241101-jk-ixgbevf-mailbox-v1-5-fixes-v1-0-f556dc9a66ed@intel.com/
Fixes: 0062e7cc95 ("ixgbevf: add VF IPsec offload code")
Fixes: 339f289641 ("ixgbevf: Add support for new mailbox communication between PF and VF")
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Cc: stable@vger.kernel.org
Signed-off-by: Jedrzej Jagielski <jedrzej.jagielski@intel.com>
Tested-by: Rafal Romanowski <rafal.romanowski@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://patch.msgid.link/20251009-jk-iwl-net-2025-10-01-v3-4-ef32a425b92a@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9f64b3cd05 upstream.
In normal operation, a registered exec queue is disabled and
deregistered through the GuC, and freed only after the GuC confirms
completion. However, if the driver is forced to unbind while the exec
queue is still running, the user may call exec_destroy() after the GuC
has already been stopped and CT communication disabled.
In this case, the driver cannot receive a response from the GuC,
preventing proper cleanup of exec queue resources. Fix this by directly
releasing the resources when GuC is not running.
Here is the failure dmesg log:
"
[ 468.089581] ---[ end trace 0000000000000000 ]---
[ 468.089608] pci 0000:03:00.0: [drm] *ERROR* GT0: GUC ID manager unclean (1/65535)
[ 468.090558] pci 0000:03:00.0: [drm] GT0: total 65535
[ 468.090562] pci 0000:03:00.0: [drm] GT0: used 1
[ 468.090564] pci 0000:03:00.0: [drm] GT0: range 1..1 (1)
[ 468.092716] ------------[ cut here ]------------
[ 468.092719] WARNING: CPU: 14 PID: 4775 at drivers/gpu/drm/xe/xe_ttm_vram_mgr.c:298 ttm_vram_mgr_fini+0xf8/0x130 [xe]
"
v2: use xe_uc_fw_is_running() instead of xe_guc_ct_enabled().
As CT may go down and come back during VF migration.
Fixes: dd08ebf6c3 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Cc: stable@vger.kernel.org
Cc: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Link: https://lore.kernel.org/r/20251010172529.2967639-2-shuicheng.lin@intel.com
(cherry picked from commit 9b42321a02c50a12b2beb6ae9469606257fbecea)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e9ad390a48 upstream.
The GIC CDEOI system instruction requires the Rt field to be set to 0b11111
otherwise the instruction behaviour becomes CONSTRAINED UNPREDICTABLE.
Currenly, its usage is encoded as a system register write, with a constant
0 value:
write_sysreg_s(0, GICV5_OP_GIC_CDEOI)
While compiling with GCC, the 0 constant value, through these asm
constraints and modifiers ('x' modifier and 'Z' constraint combo):
asm volatile(__msr_s(r, "%x0") : : "rZ" (__val));
forces the compiler to issue the XZR register for the MSR operation (ie
that corresponds to Rt == 0b11111) issuing the right instruction encoding.
Unfortunately LLVM does not yet understand that modifier/constraint
combo so it ends up issuing a different register from XZR for the MSR
source, which in turns means that it encodes the GIC CDEOI instruction
wrongly and the instruction behaviour becomes CONSTRAINED UNPREDICTABLE
that we must prevent.
Add a conditional to write_sysreg_s() macro that detects whether it
is passed a constant 0 value and issues an MSR write with XZR as source
register - explicitly doing what the asm modifier/constraint is meant to
achieve through constraints/modifiers, fixing the LLVM compilation issue.
Fixes: 7ec80fb3f0 ("irqchip/gic-v5: Add GICv5 PPI support")
Suggested-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
Acked-by: Marc Zyngier <maz@kernel.org>
Cc: stable@vger.kernel.org
Cc: Sascha Bischoff <sascha.bischoff@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 12d724f285 upstream.
Commit 6d4405b16d ("ata: libata-core: Cache the general purpose log
directory") introduced caching of a device general purpose log directory
to avoid repeated access to this log page during device scan. This
change also added a check on this log page to verify that the log page
version is 0x0001 as mandated by the ACS specifications.
And it turns out that some devices do not bother reporting this version,
instead reporting a version 0, resulting in error messages such as:
ata6.00: Invalid log directory version 0x0000
and to the device being marked as not supporting the general purpose log
directory log page.
Since before commit 6d4405b16d the log page version check did not
exist and things were still working correctly for these devices, relax
ata_read_log_directory() version check and only warn about the invalid
log page version number without disabling access to the log directory
page.
Fixes: 6d4405b16d ("ata: libata-core: Cache the general purpose log directory")
Cc: stable@vger.kernel.org
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=220635
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Signed-off-by: Niklas Cassel <cassel@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 56094ad3ea upstream.
When user calls open_by_handle_at() on some inode that is not cached, we
will create disconnected dentry for it. If such dentry is a directory,
exportfs_decode_fh_raw() will then try to connect this dentry to the
dentry tree through reconnect_path(). It may happen for various reasons
(such as corrupted fs or race with rename) that the call to
lookup_one_unlocked() in reconnect_one() will fail to find the dentry we
are trying to reconnect and instead create a new dentry under the
parent. Now this dentry will not be marked as disconnected although the
parent still may well be disconnected (at least in case this
inconsistency happened because the fs is corrupted and .. doesn't point
to the real parent directory). This creates inconsistency in
disconnected flags but AFAICS it was mostly harmless. At least until
commit f1ee616214 ("VFS: don't keep disconnected dentries on d_anon")
which removed adding of most disconnected dentries to sb->s_anon list.
Thus after this commit cleanup of disconnected dentries implicitely
relies on the fact that dput() will immediately reclaim such dentries.
However when some leaf dentry isn't marked as disconnected, as in the
scenario described above, the reclaim doesn't happen and the dentries
are "leaked". Memory reclaim can eventually reclaim them but otherwise
they stay in memory and if umount comes first, we hit infamous "Busy
inodes after unmount" bug. Make sure all dentries created under a
disconnected parent are marked as disconnected as well.
Reported-by: syzbot+1d79ebe5383fc016cf07@syzkaller.appspotmail.com
Fixes: f1ee616214 ("VFS: don't keep disconnected dentries on d_anon")
CC: stable@vger.kernel.org
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 00d95fcc4d upstream.
The ErrorString() and SafeString() docutils functions were helpers meant to
ease the handling of encodings during the Python 3 transition. There is no
real need for them after Python 3.6, and docutils 0.22 removes them,
breaking the docs build
Handle this by just injecting our own one-liner version of ErrorString(),
and removing the sole SafeString() call entirely.
Reported-by: Zhixu Liu <zhixu.liu@gmail.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Message-ID: <87ldmnv2pi.fsf@trenco.lwn.net>
[ Salvatore Bonaccorso: Backport to v6.17.y for context changes in
Documentation/sphinx/kernel_include.py with major refactorings for the v6.18
development cycle ]
Signed-off-by: Salvatore Bonaccorso <carnil@debian.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 6c7ca6a02f ]
When calling in listmount() mnt_ns_release() may be passed a NULL
pointer. Handle that case gracefully.
Signed-off-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 9a6ebbdbd4 ]
With lazytime mount option enabled we can be switching many dirty inodes
on cgroup exit to the parent cgroup. The numbers observed in practice
when systemd slice of a large cron job exits can easily reach hundreds
of thousands or millions. The logic in inode_do_switch_wbs() which sorts
the inode into appropriate place in b_dirty list of the target wb
however has linear complexity in the number of dirty inodes thus overall
time complexity of switching all the inodes is quadratic leading to
workers being pegged for hours consuming 100% of the CPU and switching
inodes to the parent wb.
Simple reproducer of the issue:
FILES=10000
# Filesystem mounted with lazytime mount option
MNT=/mnt/
echo "Creating files and switching timestamps"
for (( j = 0; j < 50; j ++ )); do
mkdir $MNT/dir$j
for (( i = 0; i < $FILES; i++ )); do
echo "foo" >$MNT/dir$j/file$i
done
touch -a -t 202501010000 $MNT/dir$j/file*
done
wait
echo "Syncing and flushing"
sync
echo 3 >/proc/sys/vm/drop_caches
echo "Reading all files from a cgroup"
mkdir /sys/fs/cgroup/unified/mycg1 || exit
echo $$ >/sys/fs/cgroup/unified/mycg1/cgroup.procs || exit
for (( j = 0; j < 50; j ++ )); do
cat /mnt/dir$j/file* >/dev/null &
done
wait
echo "Switching wbs"
# Now rmdir the cgroup after the script exits
We need to maintain b_dirty list ordering to keep writeback happy so
instead of sorting inode into appropriate place just append it at the
end of the list and clobber dirtied_time_when. This may result in inode
writeback starting later after cgroup switch however cgroup switches are
rare so it shouldn't matter much. Since the cgroup had write access to
the inode, there are no practical concerns of the possible DoS issues.
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 66c14dccd8 ]
process_inode_switch_wbs_work() can be switching over 100 inodes to a
different cgroup. Since switching an inode requires counting all dirty &
under-writeback pages in the address space of each inode, this can take
a significant amount of time. Add a possibility to reschedule after
processing each inode to avoid softlockups.
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 38f4885088 ]
Actual removal is done under the lock, but for checking if need to bother
the lockless RB_EMPTY_NODE() is safe - either that namespace had never
been added to mnt_ns_tree, in which case the the node will stay empty, or
whoever had allocated it has called mnt_ns_tree_add() and it has already
run to completion. After that point RB_EMPTY_NODE() will become false and
will remain false, no matter what we do with other nodes in the tree.
Reviewed-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 278033a225 ]
When CONFIG_TMPFS is enabled, the initial root filesystem is a tmpfs.
By default, a tmpfs mount is limited to using 50% of the available RAM
for its content. This can be problematic in memory-constrained
environments, particularly during a kdump capture.
In a kdump scenario, the capture kernel boots with a limited amount of
memory specified by the 'crashkernel' parameter. If the initramfs is
large, it may fail to unpack into the tmpfs rootfs due to insufficient
space. This is because to get X MB of usable space in tmpfs, 2*X MB of
memory must be available for the mount. This leads to an OOM failure
during the early boot process, preventing a successful crash dump.
This patch introduces a new kernel command-line parameter,
initramfs_options, which allows passing specific mount options directly
to the rootfs when it is first mounted. This gives users control over
the rootfs behavior.
For example, a user can now specify initramfs_options=size=75% to allow
the tmpfs to use up to 75% of the available memory. This can
significantly reduce the memory pressure for kdump.
Consider a practical example:
To unpack a 48MB initramfs, the tmpfs needs 48MB of usable space. With
the default 50% limit, this requires a memory pool of 96MB to be
available for the tmpfs mount. The total memory requirement is therefore
approximately: 16MB (vmlinuz) + 48MB (loaded initramfs) + 48MB (unpacked
kernel) + 96MB (for tmpfs) + 12MB (runtime overhead) ≈ 220MB.
By using initramfs_options=size=75%, the memory pool required for the
48MB tmpfs is reduced to 48MB / 0.75 = 64MB. This reduces the total
memory requirement by 32MB (96MB - 64MB), allowing the kdump to succeed
with a smaller crashkernel size, such as 192MB.
An alternative approach of reusing the existing rootflags parameter was
considered. However, a new, dedicated initramfs_options parameter was
chosen to avoid altering the current behavior of rootflags (which
applies to the final root filesystem) and to prevent any potential
regressions.
Also add documentation for the new kernel parameter "initramfs_options"
This approach is inspired by prior discussions and patches on the topic.
Ref: https://www.lightofdawn.org/blog/?viewDetailed=00128
Ref: https://landley.net/notes-2015.html#01-01-2015
Ref: https://lkml.org/lkml/2021/6/29/783
Ref: https://www.kernel.org/doc/html/latest/filesystems/ramfs-rootfs-initramfs.html#what-is-rootfs
Signed-off-by: Lichen Liu <lichliu@redhat.com>
Link: https://lore.kernel.org/20250815121459.3391223-1-lichliu@redhat.com
Tested-by: Rob Landley <rob@landley.net>
Signed-off-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f75e07bf52 ]
According to the PLIC specification[1], global interrupt sources are
assigned small unsigned integer identifiers beginning at the value 1.
An interrupt ID of 0 is reserved to mean "no interrupt".
The current plic_irq_resume() and plic_irq_suspend() functions incorrectly
start the loop from index 0, which accesses the register space for the
reserved interrupt ID 0.
Change the loop to start from index 1, skipping the reserved
interrupt ID 0 as per the PLIC specification.
This prevents potential undefined behavior when accessing the reserved
register space during suspend/resume cycles.
Fixes: e80f0b6a2c ("irqchip/irq-sifive-plic: Add syscore callbacks for hibernation")
Co-developed-by: Jia Wang <wangjia@ultrarisc.com>
Signed-off-by: Jia Wang <wangjia@ultrarisc.com>
Co-developed-by: Charles Mirabile <cmirabil@redhat.com>
Signed-off-by: Charles Mirabile <cmirabil@redhat.com>
Signed-off-by: Lucas Zampieri <lzampier@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://github.com/riscv/riscv-plic-spec/releases/tag/1.0.0
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit baf60d5cb8 ]
In certain circumstances, the ACPI handle of a data-only node may be
NULL, in which case it does not make sense to attempt to attach that
node to an ACPI namespace object, so update the code to avoid attempts
to do so.
This prevents confusing and unuseful error messages from being printed.
Also document the fact that the ACPI handle of a data-only node may be
NULL and when that happens in a code comment. In addition, make
acpi_add_nondev_subnodes() print a diagnostic message for each data-only
node with an unknown ACPI namespace scope.
Fixes: 1d52f10917 ("ACPI: property: Tie data nodes to acpi handles")
Cc: 6.0+ <stable@vger.kernel.org> # 6.0+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Tested-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 737c3a09dc ]
In some places in the ACPI device properties handling code, it is
unclear why the code is what it is. Some assumptions are not documented
and some pieces of code are based on knowledge that is not mentioned
anywhere.
Add code comments explaining these things.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Tested-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Stable-dep-of: baf60d5cb8 ("ACPI: property: Do not pass NULL handles to acpi_attach_data()")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit d06118fe9b ]
Data-only subnode links following the ACPI data subnode GUID in a _DSD
package are expected to point to named objects returning _DSD-equivalent
packages. If a reference to such an object is used in the target field
of any of those links, that object will be evaluated in place (as a
named object) and its return data will be embedded in the outer _DSD
package.
For this reason, it is not expected to see a subnode link with the
target field containing a local reference (that would mean pointing
to a device or another object that cannot be evaluated in place and
therefore cannot return a _DSD-equivalent package).
Accordingly, simplify the code parsing data-only subnode links to
simply print a message when it encounters a local reference in the
target field of one of those links.
Moreover, since acpi_nondev_subnode_data_ok() would only have one
caller after the change above, fold it into that caller.
Link: https://lore.kernel.org/linux-acpi/CAJZ5v0jVeSrDO6hrZhKgRZrH=FpGD4vNUjFD8hV9WwN9TLHjzQ@mail.gmail.com/
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Tested-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Stable-dep-of: baf60d5cb8 ("ACPI: property: Do not pass NULL handles to acpi_attach_data()")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 4d6fc29f36 ]
Patch series "mm/ksm: Fix incorrect accounting of KSM counters during
fork", v3.
The first patch in this series fixes the incorrect accounting of KSM
counters such as ksm_merging_pages, ksm_rmap_items, and the global
ksm_zero_pages during fork.
The following patch add a selftest to verify the ksm_merging_pages counter
was updated correctly during fork.
Test Results
============
Without the first patch
-----------------------
# [RUN] test_fork_ksm_merging_page_count
not ok 10 ksm_merging_page in child: 32
With the first patch
--------------------
# [RUN] test_fork_ksm_merging_page_count
ok 10 ksm_merging_pages is not inherited after fork
This patch (of 2):
Currently, the KSM-related counters in `mm_struct`, such as
`ksm_merging_pages`, `ksm_rmap_items`, and `ksm_zero_pages`, are inherited
by the child process during fork. This results in inconsistent
accounting.
When a process uses KSM, identical pages are merged and an rmap item is
created for each merged page. The `ksm_merging_pages` and
`ksm_rmap_items` counters are updated accordingly. However, after a fork,
these counters are copied to the child while the corresponding rmap items
are not. As a result, when the child later triggers an unmerge, there are
no rmap items present in the child, so the counters remain stale, leading
to incorrect accounting.
A similar issue exists with `ksm_zero_pages`, which maintains both a
global counter and a per-process counter. During fork, the per-process
counter is inherited by the child, but the global counter is not
incremented. Since the child also references zero pages, the global
counter should be updated as well. Otherwise, during zero-page unmerge,
both the global and per-process counters are decremented, causing the
global counter to become inconsistent.
To fix this, ksm_merging_pages and ksm_rmap_items are reset to 0 during
fork, and the global ksm_zero_pages counter is updated with the
per-process ksm_zero_pages value inherited by the child. This ensures
that KSM statistics remain accurate and reflect the activity of each
process correctly.
Link: https://lkml.kernel.org/r/cover.1758648700.git.donettom@linux.ibm.com
Link: https://lkml.kernel.org/r/7b9870eb67ccc0d79593940d9dbd4a0b39b5d396.1758648700.git.donettom@linux.ibm.com
Fixes: 7609385337 ("ksm: count ksm merging pages for each process")
Fixes: cb4df4cae4 ("ksm: count allocated ksm rmap_items for each process")
Fixes: e2942062e0 ("ksm: count all zero pages placed by KSM")
Signed-off-by: Donet Tom <donettom@linux.ibm.com>
Reviewed-by: Chengming Zhou <chengming.zhou@linux.dev>
Acked-by: David Hildenbrand <david@redhat.com>
Cc: Aboorva Devarajan <aboorvad@linux.ibm.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Donet Tom <donettom@linux.ibm.com>
Cc: "Ritesh Harjani (IBM)" <ritesh.list@gmail.com>
Cc: Wei Yang <richard.weiyang@gmail.com>
Cc: xu xin <xu.xin16@zte.com.cn>
Cc: <stable@vger.kernel.org> [6.6+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
[ replaced mm_flags_test() calls with test_bit() ]
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e2c69490dd upstream.
Prior to commit b52da4054e ("ipmi: Rework user message limit handling"),
i_ipmi_request() used to increase the user reference counter if the receive
message is provided by the caller of IPMI API functions. This is no longer
the case. However, ipmi_free_recv_msg() is still called and decreases the
reference counter. This results in the reference counter reaching zero,
the user data pointer is released, and all kinds of interesting crashes are
seen.
Fix the problem by increasing user reference counter if the receive message
has been provided by the caller.
Fixes: b52da4054e ("ipmi: Rework user message limit handling")
Reported-by: Eric Dumazet <edumazet@google.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Greg Thelen <gthelen@google.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Message-ID: <20251006201857.3433837-1-linux@roeck-us.net>
Signed-off-by: Corey Minyard <corey@minyard.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit feb8ae81b2 upstream.
Introduce acpi_gbl_use_global_lock, which allows to skip the Global Lock
initialization. This is useful for systems without Global Lock (such as
loong_arch), so as to avoid error messages during boot phase:
ACPI Error: Could not enable global_lock event (20240827/evxfevnt-182)
ACPI Error: No response from Global Lock hardware, disabling lock (20240827/evglock-59)
Link: https://github.com/acpica/acpica/commit/463cb0fe
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 57295e8354 upstream.
syzkaller found a path where ext4_xattr_inode_update_ref() reads an EA
inode refcount that is already <= 0 and then applies ref_change (often
-1). That lets the refcount underflow and we proceed with a bogus value,
triggering errors like:
EXT4-fs error: EA inode <n> ref underflow: ref_count=-1 ref_change=-1
EXT4-fs warning: ea_inode dec ref err=-117
Make the invariant explicit: if the current refcount is non-positive,
treat this as on-disk corruption, emit ext4_error_inode(), and fail the
operation with -EFSCORRUPTED instead of updating the refcount. Delete the
WARN_ONCE() as negative refcounts are now impossible; keep error reporting
in ext4_error_inode().
This prevents the underflow and the follow-on orphan/cleanup churn.
Reported-by: syzbot+0be4f339a8218d2a5bb1@syzkaller.appspotmail.com
Fixes: https://syzbot.org/bug?extid=0be4f339a8218d2a5bb1
Cc: stable@kernel.org
Co-developed-by: Albin Babu Varghese <albinbabuvarghese20@gmail.com>
Signed-off-by: Albin Babu Varghese <albinbabuvarghese20@gmail.com>
Signed-off-by: Ahmet Eray Karadag <eraykrdg1@gmail.com>
Message-ID: <20250920021342.45575-1-eraykrdg1@gmail.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 12e803c882 upstream.
During the movement of a written extent, mext_page_mkuptodate() is
called to read data in the range [from, to) into the page cache and to
update the corresponding buffers. Therefore, we should not wait on any
buffer whose start offset is >= 'to'. Otherwise, it will return -EIO and
fail the extents movement.
$ for i in `seq 3 -1 0`; \
do xfs_io -fs -c "pwrite -b 1024 $((i * 1024)) 1024" /mnt/foo; \
done
$ umount /mnt && mount /dev/pmem1s /mnt # drop cache
$ e4defrag /mnt/foo
e4defrag 1.47.0 (5-Feb-2023)
ext4 defragmentation for /mnt/foo
[1/1]/mnt/foo: 0% [ NG ]
Success: [0/1]
Cc: stable@kernel.org
Fixes: a40759fb16 ("ext4: remove array of buffer_heads from mext_page_mkuptodate()")
Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Message-ID: <20250912105841.1886799-1-yi.zhang@huaweicloud.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 46c22a8bb4 upstream.
Currently, our handling of metadata is _ambiguous_ in some scenarios,
that is, we end up returning unknown if the range only covers the
mapping partially.
For example, in the following case:
$ xfs_io -c fsmap -d
0: 254:16 [0..7]: static fs metadata 8
1: 254:16 [8..15]: special 102:1 8
2: 254:16 [16..5127]: special 102:2 5112
3: 254:16 [5128..5255]: special 102:3 128
4: 254:16 [5256..5383]: special 102:4 128
5: 254:16 [5384..70919]: inodes 65536
6: 254:16 [70920..70967]: unknown 48
...
$ xfs_io -c fsmap -d 24 33
0: 254:16 [24..39]: unknown 16 <--- incomplete reporting
$ xfs_io -c fsmap -d 24 33 (With patch)
0: 254:16 [16..5127]: special 102:2 5112
This is because earlier in ext4_getfsmap_meta_helper, we end up ignoring
any extent that starts before our queried range, but overlaps it. While
the man page [1] is a bit ambiguous on this, this fix makes the output
make more sense since we are anyways returning an "unknown" extent. This
is also consistent to how XFS does it:
$ xfs_io -c fsmap -d
...
6: 254:16 [104..127]: free space 24
7: 254:16 [128..191]: inodes 64
...
$ xfs_io -c fsmap -d 137 150
0: 254:16 [128..191]: inodes 64 <-- full extent returned
[1] https://man7.org/linux/man-pages/man2/ioctl_getfsmap.2.html
Reported-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Cc: stable@kernel.org
Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
Message-ID: <023f37e35ee280cd9baac0296cbadcbe10995cab.1757058211.git.ojaswin@linux.ibm.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9d80eaa1a1 upstream.
After running a stress test combined with fault injection,
we performed fsck -a followed by fsck -fn on the filesystem
image. During the second pass, fsck -fn reported:
Inode 131512, end of extent exceeds allowed value
(logical block 405, physical block 1180540, len 2)
This inode was not in the orphan list. Analysis revealed the
following call chain that leads to the inconsistency:
ext4_da_write_end()
//does not update i_disksize
ext4_punch_hole()
//truncate folio, keep size
ext4_page_mkwrite()
ext4_block_page_mkwrite()
ext4_block_write_begin()
ext4_get_block()
//insert written extent without update i_disksize
journal commit
echo 1 > /sys/block/xxx/device/delete
da-write path updates i_size but does not update i_disksize. Then
ext4_punch_hole truncates the da-folio yet still leaves i_disksize
unchanged(in the ext4_update_disksize_before_punch function, the
condition offset + len < size is met). Then ext4_page_mkwrite sees
ext4_nonda_switch return 1 and takes the nodioread_nolock path, the
folio about to be written has just been punched out, and it’s offset
sits beyond the current i_disksize. This may result in a written
extent being inserted, but again does not update i_disksize. If the
journal gets committed and then the block device is yanked, we might
run into this. It should be noted that replacing ext4_punch_hole with
ext4_zero_range in the call sequence may also trigger this issue, as
neither will update i_disksize under these circumstances.
To fix this, we can modify ext4_update_disksize_before_punch to
increase i_disksize to min(i_size, offset + len) when both i_size and
(offset + len) are greater than i_disksize.
Cc: stable@kernel.org
Signed-off-by: Yongjian Sun <sunyongjian1@huawei.com>
Reviewed-by: Zhang Yi <yi.zhang@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Baokun Li <libaokun1@huawei.com>
Message-ID: <20250911133024.1841027-1-sunyongjian@huaweicloud.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0a6ce20c15 upstream.
In principle orphan file can be arbitrarily large. However orphan replay
needs to traverse it all and we also pin all its buffers in memory. Thus
filesystems with absurdly large orphan files can lead to big amounts of
memory consumed. Limit orphan file size to a sane value and also use
kvmalloc() for allocating array of block descriptor structures to avoid
large order allocations for sane but large orphan files.
Reported-by: syzbot+0b92850d68d9b12934f5@syzkaller.appspotmail.com
Fixes: 02f310fcf4 ("ext4: Speedup ext4 orphan inode handling")
Cc: stable@kernel.org
Signed-off-by: Jan Kara <jack@suse.cz>
Message-ID: <20250909112206.10459-2-jack@suse.cz>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 963845748f upstream.
Commit bc264fea0f ("iomap: support incremental iomap_iter advances")
changed the error handling logic in iomap_iter(). Previously any error
from iomap_dio_bio_iter() got propagated to userspace, after this commit
if ->iomap_end returns error, it gets propagated to userspace instead of
an error from iomap_dio_bio_iter(). This results in unaligned writes to
ext4 to silently fallback to buffered IO instead of erroring out.
Now returning ENOTBLK for DIO writes from ext4_iomap_end() seems
unnecessary these days. It is enough to return ENOTBLK from
ext4_iomap_begin() when we don't support DIO write for that particular
file offset (due to hole).
Fixes: bc264fea0f ("iomap: support incremental iomap_iter advances")
Cc: stable@kernel.org
Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Message-ID: <20250901112739.32484-2-jack@suse.cz>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d8b90e6387 upstream.
The implicit __GFP_NOFAIL flag in ext4_sb_bread() was removed in commit
8a83ac5494 ("ext4: call bdev_getblk() from sb_getblk_gfp()"), meaning
the function can now fail under memory pressure.
Most callers of ext4_sb_bread() propagate the error to userspace and do not
remount the filesystem read-only. However, ext4_free_branches() handles
ext4_sb_bread() failure by remounting the filesystem read-only.
This implies that an ext3 filesystem (mounted via the ext4 driver) could be
forcibly remounted read-only due to a transient page allocation failure,
which is unacceptable.
To mitigate this, introduce a new helper function, ext4_sb_bread_nofail(),
which explicitly uses __GFP_NOFAIL, and use it in ext4_free_branches().
Fixes: 8a83ac5494 ("ext4: call bdev_getblk() from sb_getblk_gfp()")
Cc: stable@kernel.org
Signed-off-by: Baokun Li <libaokun1@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8172f57746 upstream.
Improve drain handling by ensuring the LAST flag is attached to final
capture buffer when drain response is received from the firmware.
Previously, the driver failed to attach the V4L2_BUF_FLAG_LAST flag when
a drain response was received from the firmware, relying on userspace to
mark the next queued buffer as LAST. This update fixes the issue by
checking the pending drain status, attaching the LAST flag to the
capture buffer received from the firmware (with EOS attached), and
returning it to the V4L2 layer correctly.
Fixes: d09100763b ("media: iris: add support for drain sequence")
Cc: stable@vger.kernel.org
Reviewed-by: Vikash Garodia <quic_vgarodia@quicinc.com>
Tested-by: Vikash Garodia <quic_vgarodia@quicinc.com> # X1E80100
Reviewed-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org>
Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8550-HDK
Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8650-HDK
Signed-off-by: Dikshita Agarwal <quic_dikshita@quicinc.com>
Tested-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org> # x1e80100-crd
Signed-off-by: Bryan O'Donoghue <bod@kernel.org>
Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9cae3619e4 upstream.
Currently, internal buffers are destroyed only if 'PENDING_RELEASE' flag
is set when a release response is received from the firmware, which is
incorrect. Internal buffers should always be destroyed when the firmware
explicitly releases it, regardless of whether the 'PENDING_RELEASE' flag
was set by the driver. This is specially important during force-stop
scenarios, where the firmware may release buffers without driver marking
them for release.
Fix this by removing the incorrect check and ensuring all buffers are
properly cleaned up.
Fixes: 73702f45db ("media: iris: allocate, initialize and queue internal buffers")
Cc: stable@vger.kernel.org
Reviewed-by: Vikash Garodia <quic_vgarodia@quicinc.com>
Tested-by: Vikash Garodia <quic_vgarodia@quicinc.com> # X1E80100
Reviewed-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org>
Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8550-HDK
Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8650-HDK
Signed-off-by: Dikshita Agarwal <quic_dikshita@quicinc.com>
Tested-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org> # x1e80100-crd
Signed-off-by: Bryan O'Donoghue <bod@kernel.org>
Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 65f72c6a8d upstream.
A client (e.g., GST for encoder) can initiate streaming on the capture
port before the output port, causing the instance state to transition to
OUTPUT_STREAMING. When streaming is subsequently started on the output
port, the instance state advances to STREAMING, and the substate should
transition to LOAD_RESOURCES.
Previously, the code blocked the substate transition to LOAD_RESOURCES
if the instance state was OUTPUT_STREAMING. This update modifies the
logic to permit the substate transition to LOAD_RESOURCES when the
instance state is OUTPUT_STREAMING, thereby supporting this client
streaming sequence.
Fixes: 547f7b8c50 ("media: iris: add check to allow sub states transitions")
Cc: stable@vger.kernel.org
Reviewed-by: Vikash Garodia <quic_vgarodia@quicinc.com>
Tested-by: Vikash Garodia <quic_vgarodia@quicinc.com> # X1E80100
Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8550-HDK
Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8650-HDK
Signed-off-by: Dikshita Agarwal <quic_dikshita@quicinc.com>
Tested-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org> # x1e80100-crd
Signed-off-by: Bryan O'Donoghue <bod@kernel.org>
Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4b67ef9b33 upstream.
The previous check to block capture port streaming before output port
was incorrect and caused some valid usecase to fail. While removing that
check allows capture port to enter streaming independently, it also
introduced firmware errors due to premature queuing of DPB buffers
before the firmware session was fully started which happens only when
streamon is called on output port.
Fix this by deferring DPB buffer queuing to the firmware until both
capture and output are streaming and state is 'STREAMING'.
Fixes: 11712ce70f ("media: iris: implement vb2 streaming ops")
Cc: stable@vger.kernel.org
Reviewed-by: Vikash Garodia <quic_vgarodia@quicinc.com>
Tested-by: Vikash Garodia <quic_vgarodia@quicinc.com> # X1E80100
Reviewed-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org>
Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8550-HDK
Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8650-HDK
Signed-off-by: Dikshita Agarwal <quic_dikshita@quicinc.com>
Tested-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org> # x1e80100-crd
Signed-off-by: Bryan O'Donoghue <bod@kernel.org>
Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 57429b0fdd upstream.
When we succeed loading the firmware, we don't want to hold on to the
firmware pointer anymore, since it won't be freed anywhere else. The same
applies for the mapped memory. Unmapping the memory is particularly
important since the memory will be protected after the Iris firmware is
started, so we need to make sure there will be no accidental access to this
region (even if just a speculative one from the CPU).
Almost the same firmware loading code also exists in venus/firmware.c,
there it is implemented correctly.
Fix this by dropping the early "return ret" and move the call of
qcom_scm_pas_auth_and_reset() out of iris_load_fw_to_memory(). We should
unmap the memory before bringing the firmware out of reset.
Cc: stable@vger.kernel.org
Fixes: d19b163356 ("media: iris: implement video firmware load/unload")
Signed-off-by: Stephan Gerhold <stephan.gerhold@linaro.org>
Reviewed-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org>
Reviewed-by: Dikshita Agarwal <quic_dikshita@quicinc.com>
Signed-off-by: Bryan O'Donoghue <bod@kernel.org>
Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2fbb823a07 upstream.
Driver implements different callbacks for the power off controller
(.power_off_controller):
- iris_vpu_power_off_controller,
- iris_vpu33_power_off_controller,
The generic wrapper for handling power off - iris_vpu_power_off() -
calls them via 'iris_platform_data->vpu_ops', so shall the cleanup code
in iris_vpu_power_on().
This makes also sense if looking at caller of iris_vpu_power_on(), which
unwinds also with the wrapper calling respective platfortm code (unwinds
with iris_vpu_power_off()).
Otherwise power off sequence on the newer VPU3.3 in error path is not
complete.
Fixes: c69df5de4a ("media: platform: qcom/iris: add power_off_controller to vpu_ops")
Cc: stable@vger.kernel.org
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Reviewed-by: Vikash Garodia <quic_vgarodia@quicinc.com>
Reviewed-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org>
Signed-off-by: Bryan O'Donoghue <bod@kernel.org>
Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a082e4b4d0 upstream.
When v3 NLM request finds a conflicting delegation, it triggers
a delegation recall and nfsd_open fails with EAGAIN. nfsd_open
then translates EAGAIN into nfserr_jukebox. In nlm_fopen, instead
of returning nlm_failed for when there is a conflicting delegation,
drop this NLM request so that the client retries. Once delegation
is recalled and if a local lock is claimed, a retry would lead to
nfsd returning a nlm_lck_blocked error or a successful nlm lock.
Fixes: d343fce148 ("[PATCH] knfsd: Allow lockd to drop replies as appropriate")
Cc: stable@vger.kernel.org # v6.6
Signed-off-by: Olga Kornievskaia <okorniev@redhat.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ab1c282c01 upstream.
Commit 5304877936 ("NFSD: Fix strncpy() fortify warning") replaced
strncpy(,, sizeof(..)) with strlcpy(,, sizeof(..) - 1), but strlcpy()
already guaranteed NUL-termination of the destination buffer and
subtracting one byte potentially truncated the source string.
The incorrect size was then carried over in commit 72f78ae00a ("NFSD:
move from strlcpy with unused retval to strscpy") when switching from
strlcpy() to strscpy().
Fix this off-by-one error by using the full size of the destination
buffer again.
Cc: stable@vger.kernel.org
Fixes: 5304877936 ("NFSD: Fix strncpy() fortify warning")
Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e4f574ca9c upstream.
A while back I had reported that an NFSv3 client could successfully
mount using '-o xprtsec=none' an export that had been exported with
'xprtsec=tls:mtls'. By "successfully" I mean that the mount command
would succeed and the mount would show up in /proc/mount. Attempting
to do anything futher with the mount would be met with NFS3ERR_ACCES.
This was fixed (albeit accidentally) by commit bb4f07f240 ("nfsd:
Fix NFSD_MAY_BYPASS_GSS and NFSD_MAY_BYPASS_GSS_ON_ROOT") and was
subsequently re-broken by commit 0813c5f012 ("nfsd: fix access
checking for NLM under XPRTSEC policies").
Transport Layer Security isn't an RPC security flavor or pseudo-flavor,
so we shouldn't be conflating them when determining whether the access
checks can be bypassed. Split check_nfsd_access() into two helpers, and
have __fh_verify() call the helpers directly since __fh_verify() has
logic that allows one or both of the checks to be skipped. All other
sites will continue to call check_nfsd_access().
Link: https://lore.kernel.org/linux-nfs/ZjO3Qwf_G87yNXb2@aion/
Fixes: 9280c57743 ("NFSD: Handle new xprtsec= export option")
Cc: stable@vger.kernel.org
Signed-off-by: Scott Mayhew <smayhew@redhat.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e18190b7e9 upstream.
damon_lru_sort_apply_parameters() allocates a new DAMON context, stages
user-specified DAMON parameters on it, and commits to running DAMON
context at once, using damon_commit_ctx(). The code is, however, directly
updating the monitoring attributes of the running context. And the
attributes are over-written by later damon_commit_ctx() call. This means
that the monitoring attributes parameters are not really working. Fix the
wrong use of the parameter context.
Link: https://lkml.kernel.org/r/20250916031549.115326-1-sj@kernel.org
Fixes: a309694364 ("mm/damon/lru_sort: use damon_commit_ctx()")
Signed-off-by: SeongJae Park <sj@kernel.org>
Reviewed-by: Joshua Hahn <joshua.hahnjy@gmail.com>
Cc: Joshua Hahn <joshua.hahnjy@gmail.com>
Cc: <stable@vger.kernel.org> [6.11+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b93af2cc8e upstream.
DAMON's virtual address space operation set implementation (vaddr) calls
pte_offset_map_lock() inside the page table walk callback function. This
is for reading and writing page table accessed bits. If
pte_offset_map_lock() fails, it retries by returning the page table walk
callback function with ACTION_AGAIN.
pte_offset_map_lock() can continuously fail if the target is a pmd
migration entry, though. Hence it could cause an infinite page table walk
if the migration cannot be done until the page table walk is finished.
This indeed caused a soft lockup when CPU hotplugging and DAMON were
running in parallel.
Avoid the infinite loop by simply not retrying the page table walk. DAMON
is promising only a best-effort accuracy, so missing access to such pages
is no problem.
Link: https://lkml.kernel.org/r/20250930004410.55228-1-sj@kernel.org
Fixes: 7780d04046 ("mm/pagewalkers: ACTION_AGAIN if pte_offset_map_lock() fails")
Signed-off-by: SeongJae Park <sj@kernel.org>
Reported-by: Xinyu Zheng <zhengxinyu6@huawei.com>
Closes: https://lore.kernel.org/20250918030029.2652607-1-zhengxinyu6@huawei.com
Acked-by: Hugh Dickins <hughd@google.com>
Cc: <stable@vger.kernel.org> [6.5+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6a204d4b14 upstream.
Commit 524c48072e ("mm/page_alloc: rename ALLOC_HIGH to
ALLOC_MIN_RESERVE") is the start of a series that explains how __GFP_HIGH,
which implies ALLOC_MIN_RESERVE, is going to be used instead of
__GFP_ATOMIC for high atomic reserves.
Commit eb2e2b425c ("mm/page_alloc: explicitly record high-order atomic
allocations in alloc_flags") introduced ALLOC_HIGHATOMIC for such
allocations of order higher than 0. It still used __GFP_ATOMIC, though.
Then, commit 1ebbb21811 ("mm/page_alloc: explicitly define how
__GFP_HIGH non-blocking allocations accesses reserves") just turned that
check for !__GFP_DIRECT_RECLAIM, ignoring that high atomic reserves were
expected to test for __GFP_HIGH.
This leads to high atomic reserves being added for high-order GFP_NOWAIT
allocations and others that clear __GFP_DIRECT_RECLAIM, which is
unexpected. Later, those reserves lead to 0-order allocations going to
the slow path and starting reclaim.
From /proc/pagetypeinfo, without the patch:
Node 0, zone DMA, type HighAtomic 0 0 0 0 0 0 0 0 0 0 0
Node 0, zone DMA32, type HighAtomic 1 8 10 9 7 3 0 0 0 0 0
Node 0, zone Normal, type HighAtomic 64 20 12 5 0 0 0 0 0 0 0
With the patch:
Node 0, zone DMA, type HighAtomic 0 0 0 0 0 0 0 0 0 0 0
Node 0, zone DMA32, type HighAtomic 0 0 0 0 0 0 0 0 0 0 0
Node 0, zone Normal, type HighAtomic 0 0 0 0 0 0 0 0 0 0 0
Link: https://lkml.kernel.org/r/20250814172245.1259625-1-cascardo@igalia.com
Fixes: 1ebbb21811 ("mm/page_alloc: explicitly define how __GFP_HIGH non-blocking allocations accesses reserves")
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@igalia.com>
Tested-by: Helen Koike <koike@igalia.com>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Tested-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: NeilBrown <neilb@suse.de>
Cc: Thierry Reding <thierry.reding@gmail.com>
Cc: Brendan Jackman <jackmanb@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Zi Yan <ziy@nvidia.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c24248ed78 upstream.
The value of skb_data->wait indicates whether skb is passed on to the
core mac80211 stack or released by the driver itself. Make sure that by
the time skb is added to txwd queue and becomes visible to the completing
side, it has already allocated and initialized TX wait related data (in
case it's needed).
This is found by code review and addresses a possible race scenario
described below:
Waiting thread Completing thread
rtw89_core_send_nullfunc()
rtw89_core_tx_write_link()
...
rtw89_pci_txwd_submit()
skb_data->wait = NULL
/* add skb to the queue */
skb_queue_tail(&txwd->queue, skb)
/* another thread (e.g. rtw89_ops_tx) performs TX kick off for the same queue */
rtw89_pci_napi_poll()
...
rtw89_pci_release_txwd_skb()
/* get skb from the queue */
skb_unlink(skb, &txwd->queue)
rtw89_pci_tx_status()
rtw89_core_tx_wait_complete()
/* use incorrect skb_data->wait */
rtw89_core_tx_kick_off_and_wait()
/* assign skb_data->wait but too late */
Found by Linux Verification Center (linuxtesting.org).
Fixes: 1ae5ca6152 ("wifi: rtw89: add function to wait for completion of TX skbs")
Cc: stable@vger.kernel.org
Signed-off-by: Fedor Pchelkin <pchelkin@ispras.ru>
Acked-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Link: https://patch.msgid.link/20250919210852.823912-3-pchelkin@ispras.ru
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4038016397 upstream.
When object extension vector allocation fails, we set slab->obj_exts to
OBJEXTS_ALLOC_FAIL to indicate the failure. Later, once the vector is
successfully allocated, we will use this flag to mark codetag references
stored in that vector as empty to avoid codetag warnings.
slab_obj_exts() used to retrieve the slab->obj_exts vector pointer checks
slab->obj_exts for being either NULL or a pointer with MEMCG_DATA_OBJEXTS
bit set. However it does not handle the case when slab->obj_exts equals
OBJEXTS_ALLOC_FAIL. Add the missing condition to avoid extra warning.
Fixes: 09c46563ff ("codetag: debug: introduce OBJEXTS_ALLOC_FAIL to mark failed slab_ext allocations")
Reported-by: Shakeel Butt <shakeel.butt@linux.dev>
Closes: https://lore.kernel.org/all/jftidhymri2af5u3xtcqry3cfu6aqzte3uzlznhlaylgrdztsi@5vpjnzpsemf5/
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Cc: stable@vger.kernel.org # v6.10+
Acked-by: Shakeel Butt <shakeel.butt@linux.dev>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit fa7a0a53ee upstream.
If the decompressor is compiled with clang this can lead to the following
warning:
In file included from arch/s390/boot/startup.c:4:
...
In file included from ./include/linux/pgtable.h:6:
./arch/s390/include/asm/pgtable.h:2065:48: warning: passing 'unsigned long *' to parameter of type
'long *' converts between pointers to integer types with different sign [-Wpointer-sign]
2065 | value = __atomic64_or_barrier(PGSTE_PCL_BIT, ptr);
Add -Wno-pointer-sign to the decompressor compile flags, like it is also
done for the kernel. This is similar to what was done for x86 to address
the same problem [1].
[1] commit dca5203e3f ("x86/boot: Add -Wno-pointer-sign to KBUILD_CFLAGS")
Cc: stable@vger.kernel.org
Reported-by: Gerd Bayer <gbayer@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8f4ed0ce48 upstream.
Currently, if CCW request creation fails with -EINVAL, the DASD driver
returns BLK_STS_IOERR to the block layer.
This can happen, for example, when a user-space application such as QEMU
passes a misaligned buffer, but the original cause of the error is
masked as a generic I/O error.
This patch changes the behavior so that -EINVAL is returned as
BLK_STS_INVAL, allowing user space to properly detect alignment issues
instead of interpreting them as I/O errors.
Reviewed-by: Stefan Haberland <sth@linux.ibm.com>
Cc: stable@vger.kernel.org #6.11+
Signed-off-by: Jaehoon Kim <jhkim@linux.ibm.com>
Signed-off-by: Stefan Haberland <sth@linux.ibm.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 130e6de621 upstream.
The block layer validates buffer alignment using the device's
dma_alignment value. If dma_alignment is smaller than
logical_block_size(bp_block) -1, misaligned buffer incorrectly pass
validation and propagate to the lower-level driver.
This patch adjusts dma_alignment to be at least logical_block_size -1,
ensuring that misalignment buffers are properly rejected at the block
layer and do not reach the DASD driver unnecessarily.
Fixes: 2a07bb64d8 ("s390/dasd: Remove DMA alignment")
Reviewed-by: Stefan Haberland <sth@linux.ibm.com>
Cc: stable@vger.kernel.org #6.11+
Signed-off-by: Jaehoon Kim <jhkim@linux.ibm.com>
Signed-off-by: Stefan Haberland <sth@linux.ibm.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f0edc8f113 upstream.
For the __xsch() inline assembly the conversion to flag output macros is
incomplete. Only the conditional shift of the return value was added, while
the required changes to the inline assembly itself are missing.
If compiled with GCC versions before 14.2 this leads to a double shift of
the cc output operand and therefore the returned value of __xsch() is
incorrectly always zero, instead of the expected condition code.
Fixes: e200565d43 ("s390/cio/ioasm: Convert to use flag output macros")
Cc: stable@vger.kernel.org
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Acked-by: Alexander Gordeev <agordeev@linux.ibm.com>
Reviewed-by: Juergen Christ <jchrist@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 008385efd0 upstream.
The previous commit adds an exception for the C-flag case. The
'mptcp_join.sh' selftest is extended to validate this case.
In this subtest, there is a typical CDN deployment with a client where
MPTCP endpoints have been 'automatically' configured:
- the server set net.mptcp.allow_join_initial_addr_port=0
- the client has multiple 'subflow' endpoints, and the default limits:
not accepting ADD_ADDRs.
Without the parent patch, the client is not able to establish new
subflows using its 'subflow' endpoints. The parent commit fixes that.
The 'Fixes' tag here below is the same as the one from the previous
commit: this patch here is not fixing anything wrong in the selftests,
but it validates the previous fix for an issue introduced by this commit
ID.
Fixes: df377be387 ("mptcp: add deny_join_id0 in mptcp_options_received")
Cc: stable@vger.kernel.org
Reviewed-by: Geliang Tang <geliang@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20250925-net-next-mptcp-c-flag-laminar-v1-2-ad126cc47c6b@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4b1ff850e0 upstream.
When servers set the C-flag in their MP_CAPABLE to tell clients not to
create subflows to the initial address and port, clients will likely not
use their other endpoints. That's because the in-kernel path-manager
uses the 'subflow' endpoints to create subflows only to the initial
address and port.
If the limits have not been modified to accept ADD_ADDR, the client
doesn't try to establish new subflows. If the limits accept ADD_ADDR,
the routing routes will be used to select the source IP.
The C-flag is typically set when the server is operating behind a legacy
Layer 4 load balancer, or using anycast IP address. Clients having their
different 'subflow' endpoints setup, don't end up creating multiple
subflows as expected, and causing some deployment issues.
A special case is then added here: when servers set the C-flag in the
MPC and directly sends an ADD_ADDR, this single ADD_ADDR is accepted.
The 'subflows' endpoints will then be used with this new remote IP and
port. This exception is only allowed when the ADD_ADDR is sent
immediately after the 3WHS, and makes the client switching to the 'fully
established' mode. After that, 'select_local_address()' will not be able
to find any subflows, because 'id_avail_bitmap' will be filled in
mptcp_pm_create_subflow_or_signal_addr(), when switching to 'fully
established' mode.
Fixes: df377be387 ("mptcp: add deny_join_id0 in mptcp_options_received")
Cc: stable@vger.kernel.org
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/536
Reviewed-by: Geliang Tang <geliang@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20250925-net-next-mptcp-c-flag-laminar-v1-1-ad126cc47c6b@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 27b1fd6201 upstream.
Filter out the register forms of 0F 01 when determining whether or not to
emulate in response to a potential UMIP violation #GP, as SGDT and SIDT only
accept memory operands. The register variants of 0F 01 are used to encode
instructions for things like VMX and SGX, i.e. not checking the Mod field
would cause the kernel to incorrectly emulate on #GP, e.g. due to a CPL
violation on VMLAUNCH.
Fixes: 1e5db22369 ("x86/umip: Add emulation code for UMIP instructions")
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 32278c6779 upstream.
When checking for a potential UMIP violation on #GP, verify the decoder found
at least two opcode bytes to avoid false positives when the kernel encounters
an unknown instruction that starts with 0f. Because the array of opcode.bytes
is zero-initialized by insn_init(), peeking at bytes[1] will misinterpret
garbage as a potential SLDT or STR instruction, and can incorrectly trigger
emulation.
E.g. if a VPALIGNR instruction
62 83 c5 05 0f 08 ff vpalignr xmm17{k5},xmm23,XMMWORD PTR [r8],0xff
hits a #GP, the kernel emulates it as STR and squashes the #GP (and corrupts
the userspace code stream).
Arguably the check should look for exactly two bytes, but no three byte
opcodes use '0f 00 xx' or '0f 01 xx' as an escape, i.e. it should be
impossible to get a false positive if the first two opcode bytes match '0f 00'
or '0f 01'. Go with a more conservative check with respect to the existing
code to minimize the chances of breaking userspace, e.g. due to decoder
weirdness.
Analyzed by Nick Bray <ncbray@google.com>.
Fixes: 1e5db22369 ("x86/umip: Add emulation code for UMIP instructions")
Reported-by: Dan Snyder <dansnyder@google.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3da01ffe1a upstream.
The FRED specification has been changed in v9.0 to state that there
is no need for FRED event handlers to begin with ENDBR64, because
in the presence of supervisor indirect branch tracking, FRED event
delivery does not enter the WAIT_FOR_ENDBRANCH state.
As a result, remove ENDBR64 from FRED entry points.
Then add ANNOTATE_NOENDBR to indicate that FRED entry points will
never be used for indirect calls to suppress an objtool warning.
This change implies that any indirect CALL/JMP to FRED entry points
causes #CP in the presence of supervisor indirect branch tracking.
Credit goes to Jennifer Miller <jmill@asu.edu> and other contributors
from Arizona State University whose research shows that placing ENDBR
at entry points has negative value thus led to this change.
Note: This is obviously an incompatible change to the FRED
architecture. But, it's OK because there no FRED systems out in the
wild today. All production hardware and late pre-production hardware
will follow the FRED v9 spec and be compatible with this approach.
[ dhansen: add note to changelog about incompatibility ]
Fixes: 14619d912b ("x86/fred: FRED entry/exit and dispatch code")
Signed-off-by: Xin Li (Intel) <xin@zytor.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: H. Peter Anvin (Intel) <hpa@zytor.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Link: https://lore.kernel.org/linux-hardening/Z60NwR4w%2F28Z7XUa@ubun/
Cc:stable@vger.kernel.org
Link: https://lore.kernel.org/all/20250716063320.1337818-1-xin%40zytor.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit cd32a0c0dc upstream.
When we're removing rmap records for crosslinked blocks, use deferred
intent items so that we can try to free/unmap as many of the old data
structure's blocks as we can in the same transaction as the commit.
Cc: <stable@vger.kernel.org> # v6.6
Fixes: 1c7ce115e5 ("xfs: reap large AG metadata extents when possible")
Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1ad55767e7 upstream.
cqspi_read_setup() and cqspi_write_setup() program the address width as
the last step in the setup. This is likely to be immediately followed by
a DAC region read/write. On TI K3 SoCs the DAC region is on a different
endpoint from the register region. This means that the order of the two
operations is not guaranteed, and they might be reordered at the
interconnect level. It is possible that the DAC read/write goes through
before the address width update goes through. In this situation if the
previous command used a different address width the OSPI command is sent
with the wrong number of address bytes, resulting in an invalid command
and undefined behavior.
Read back the size register to make sure the write gets flushed before
accessing the DAC region.
Fixes: 1406234105 ("mtd: spi-nor: Add driver for Cadence Quad SPI Flash Controller")
CC: stable@vger.kernel.org
Reviewed-by: Pratyush Yadav <pratyush@kernel.org>
Signed-off-by: Pratyush Yadav <pratyush@kernel.org>
Signed-off-by: Santhosh Kumar K <s-k6@ti.com>
Message-ID: <20250905185958.3575037-3-s-k6@ti.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 29e0b471cc upstream.
cqspi_indirect_read_execute() and cqspi_indirect_write_execute() first
set the enable bit on APB region and then start reading/writing to the
AHB region. On TI K3 SoCs these regions lie on different endpoints. This
means that the order of the two operations is not guaranteed, and they
might be reordered at the interconnect level.
It is possible for the AHB write to be executed before the APB write to
enable the indirect controller, causing the transaction to be invalid
and the write erroring out. Read back the APB region write before
accessing the AHB region to make sure the write got flushed and the race
condition is eliminated.
Fixes: 1406234105 ("mtd: spi-nor: Add driver for Cadence Quad SPI Flash Controller")
CC: stable@vger.kernel.org
Reviewed-by: Pratyush Yadav <pratyush@kernel.org>
Signed-off-by: Pratyush Yadav <pratyush@kernel.org>
Signed-off-by: Santhosh Kumar K <s-k6@ti.com>
Message-ID: <20250905185958.3575037-2-s-k6@ti.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 42f9c66a6d upstream.
Tegra already defines all BARs except BAR0 as BAR_RESERVED. This is
sufficient for pci-epf-test to not allocate backing memory and to not call
set_bar() for those BARs. However, marking a BAR as BAR_RESERVED does not
mean that the BAR gets disabled.
The host side driver, pci_endpoint_test, simply does an ioremap for all
enabled BARs and will run tests against all enabled BARs, so it will run
tests against the BARs marked as BAR_RESERVED.
After running the BAR tests (which will write to all enabled BARs), the
inbound address translation is broken. This is because the tegra controller
exposes the ATU Port Logic Structure in BAR4, so when BAR4 is written, the
inbound address translation settings get overwritten.
To avoid this, implement the dw_pcie_ep_ops .init() callback and start off
by disabling all BARs (pci-epf-test will later enable/configure BARs that
are not defined as BAR_RESERVED).
This matches the behavior of other PCIe endpoint drivers: dra7xx, imx6,
layerscape-ep, artpec6, dw-rockchip, qcom-ep, rcar-gen4, and uniphier-ep.
With this, the PCI endpoint kselftest test case CONSECUTIVE_BAR_TEST (which
was specifically made to detect address translation issues) passes.
Fixes: c57247f940 ("PCI: tegra: Add support for PCIe endpoint mode in Tegra194")
Signed-off-by: Niklas Cassel <cassel@kernel.org>
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: stable@vger.kernel.org
Link: https://patch.msgid.link/20250922140822.519796-7-cassel@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f8c9ad46b0 upstream.
The return value from tegra_bpmp_transfer() indicates the success or
failure of the IPC transaction with BPMP. If the transaction succeeded, we
also need to check the actual command's result code.
If we don't have error handling for tegra_bpmp_transfer(), we will set the
pcie->ep_state to EP_STATE_ENABLED even when the tegra_bpmp_transfer()
command fails. Thus, the pcie->ep_state will get out of sync with reality,
and any further PERST# assert + deassert will be a no-op and will not
trigger the hardware initialization sequence.
This is because pex_ep_event_pex_rst_deassert() checks the current
pcie->ep_state, and does nothing if the current state is already
EP_STATE_ENABLED.
Thus, it is important to have error handling for tegra_bpmp_transfer(),
such that the pcie->ep_state can not get out of sync with reality, so that
we will try to initialize the hardware not only during the first PERST#
assert + deassert, but also during any succeeding PERST# assert + deassert.
One example where this fix is needed is when using a rock5b as host.
During the initial PERST# assert + deassert (triggered by the bootloader on
the rock5b) pex_ep_event_pex_rst_deassert() will get called, but for some
unknown reason, the tegra_bpmp_transfer() call to initialize the PHY fails.
Once Linux has been loaded on the rock5b, the PCIe driver will once again
assert + deassert PERST#. However, without tegra_bpmp_transfer() error
handling, this second PERST# assert + deassert will not trigger the
hardware initialization sequence.
With tegra_bpmp_transfer() error handling, the second PERST# assert +
deassert will once again trigger the hardware to be initialized and this
time the tegra_bpmp_transfer() succeeds.
Fixes: c57247f940 ("PCI: tegra: Add support for PCIe endpoint mode in Tegra194")
Signed-off-by: Vidya Sagar <vidyas@nvidia.com>
[cassel: improve commit log]
Signed-off-by: Niklas Cassel <cassel@kernel.org>
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Jon Hunter <jonathanh@nvidia.com>
Acked-by: Thierry Reding <treding@nvidia.com>
Cc: stable@vger.kernel.org
Link: https://patch.msgid.link/20250922140822.519796-8-cassel@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b640d42a6a upstream.
The pci_epc_raise_irq() supplies a MSI or MSI-X interrupt number in range
(1-N), as per the pci_epc_raise_irq() kdoc, where N is 32 for MSI.
But tegra_pcie_ep_raise_msi_irq() incorrectly uses the interrupt number as
the MSI vector. This causes wrong MSI vector to be triggered, leading to
the failure of PCI endpoint Kselftest MSI_TEST test case.
To fix this issue, convert the interrupt number to MSI vector.
Fixes: c57247f940 ("PCI: tegra: Add support for PCIe endpoint mode in Tegra194")
Signed-off-by: Niklas Cassel <cassel@kernel.org>
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Cc: stable@vger.kernel.org
Link: https://patch.msgid.link/20250922140822.519796-6-cassel@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0a8f173d9d upstream.
The pmsr_lock spinlock used to be necessary to synchronize access to the
PMSR register, because that access could have been triggered from either
config space access in rcar_pcie_config_access() or an exception handler
rcar_pcie_aarch32_abort_handler().
The rcar_pcie_aarch32_abort_handler() case is no longer applicable since
commit 6e36203bc1 ("PCI: rcar: Use PCI_SET_ERROR_RESPONSE after read
which triggered an exception"), which performs more accurate, controlled
invocation of the exception, and a fixup.
This leaves rcar_pcie_config_access() as the only call site from which
rcar_pcie_wakeup() is called. The rcar_pcie_config_access() can only be
called from the controller struct pci_ops .read and .write callbacks,
and those are serialized in drivers/pci/access.c using raw spinlock
'pci_lock' . It should be noted that CONFIG_PCI_LOCKLESS_CONFIG is never
set on this platform.
Since the 'pci_lock' is a raw spinlock , and the 'pmsr_lock' is not a
raw spinlock, this constellation triggers 'BUG: Invalid wait context'
with CONFIG_PROVE_RAW_LOCK_NESTING=y .
Remove the pmsr_lock to fix the locking.
Fixes: a115b1bd3a ("PCI: rcar: Add L1 link state fix into data abort hook")
Reported-by: Duy Nguyen <duy.nguyen.rh@renesas.com>
Reported-by: Thuan Nguyen <thuan.nguyen-hong@banvien.com.vn>
Signed-off-by: Marek Vasut <marek.vasut+renesas@mailbox.org>
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Cc: stable@vger.kernel.org
Link: https://patch.msgid.link/20250909162707.13927-1-marek.vasut+renesas@mailbox.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f842d3313b upstream.
The Cadence PCIe Controller integrated in the TI K3 SoCs supports both
Root-Complex and Endpoint modes of operation. The Glue Layer allows
"strapping" the Mode of operation of the Controller, the Link Speed
and the Link Width. This is enabled by programming the "PCIEn_CTRL"
register (n corresponds to the PCIe instance) within the CTRL_MMR
memory-mapped register space. The "reset-values" of the registers are
also different depending on the mode of operation.
Since the PCIe Controller latches onto the "reset-values" immediately
after being powered on, if the Glue Layer configuration is not done while
the PCIe Controller is off, it will result in the PCIe Controller latching
onto the wrong "reset-values". In practice, this will show up as a wrong
representation of the PCIe Controller's capability structures in the PCIe
Configuration Space. Some such capabilities which are supported by the PCIe
Controller in the Root-Complex mode but are incorrectly latched onto as
being unsupported are:
- Link Bandwidth Notification
- Alternate Routing ID (ARI) Forwarding Support
- Next capability offset within Advanced Error Reporting (AER) capability
Fix this by powering off the PCIe Controller before programming the "strap"
settings and powering it on after that. The runtime PM APIs namely
pm_runtime_put_sync() and pm_runtime_get_sync() will decrement and
increment the usage counter respectively, causing GENPD to power off and
power on the PCIe Controller.
Fixes: f3e25911a4 ("PCI: j721e: Add TI J721E PCIe driver")
Signed-off-by: Siddharth Vadapalli <s-vadapalli@ti.com>
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Cc: stable@vger.kernel.org
Link: https://patch.msgid.link/20250908120828.1471776-1-s-vadapalli@ti.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9a7f144e18 upstream.
Commit a2790bf81f ("PCI: j721e: Add support to build as a loadable
module") added support to build the driver as a loadable module. However,
it did not add MODULE_DEVICE_TABLE() which is required for autoloading the
driver based on device table when it is built as a loadable module.
Fix it by adding MODULE_DEVICE_TABLE.
Fixes: a2790bf81f ("PCI: j721e: Add support to build as a loadable module")
Signed-off-by: Siddharth Vadapalli <s-vadapalli@ti.com>
[mani: reworded description]
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Cc: stable@vger.kernel.org
Link: https://patch.msgid.link/20250901120359.3410774-1-s-vadapalli@ti.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 31af09b3ea upstream.
Since 96336ec702 ("PCI: Perform reset_resource() and build fail list in
sync") the failed list is always built and returned to let the caller
decide what to do with the failures. The caller may want to retry resource
fitting and assignment and before that can happen, the resources should be
restored to their original state (a reset effectively clears the struct
resource), which requires returning them to the failed list so the original
state remains stored in the associated struct pci_dev_resource.
Resource resizing is different from the ordinary resource fitting and
assignment in that it only considers part of the resources. This means
failures for other resource types are not relevant at all and should be
ignored. As resize doesn't unassign such unrelated resources, those
resources ending up in the failed list implies assignment of that
resource must have failed before resize too. The check in
pci_reassign_bridge_resources() to decide if the whole assignment is
successful, however, is based on list emptiness which will cause false
negatives when the failed list has resources with an unrelated type.
If the failed list is not empty, call pci_required_resource_failed() and
extend it to be able to filter on specific resource types too (if
provided).
Calling pci_required_resource_failed() at this point is slightly
problematic because the resource itself is reset when the failed list
is constructed in __assign_resources_sorted(). As a result,
pci_resource_is_optional() does not have access to the original
resource flags. This could be worked around by restoring and
re-resetting the resource around the call to pci_resource_is_optional(),
however, it shouldn't cause issue as resource resizing is meant for
64-bit prefetchable resources according to Christian König (see the
Link which unfortunately doesn't point directly to Christian's reply
because lore didn't store that email at all).
Fixes: 96336ec702 ("PCI: Perform reset_resource() and build fail list in sync")
Link: https://lore.kernel.org/all/c5d1b5d8-8669-5572-75a7-0b480f581ac1@linux.intel.com/
Reported-by: D Scott Phillips <scott@os.amperecomputing.com>
Closes: https://lore.kernel.org/all/86plf0lgit.fsf@scott-ph-mail.amperecomputing.com/
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: D Scott Phillips <scott@os.amperecomputing.com>
Reviewed-by: D Scott Phillips <scott@os.amperecomputing.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: stable@vger.kernel.org # v6.15+
Link: https://patch.msgid.link/20250822123359.16305-4-ilpo.jarvinen@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1cbc5e25fb upstream.
Upon failure to recover from a PCIe error through AER, DPC or EDR, a
uevent is sent to inform user space about disconnection of the bridge
whose subordinate devices failed to recover.
However the bridge itself is not disconnected. Instead, a uevent should
be sent for each of the subordinate devices.
Only if the "bridge" happens to be a Root Complex Event Collector or
Integrated Endpoint does it make sense to send a uevent for it (because
there are no subordinate devices).
Right now if there is a mix of subordinate devices with and without
pci_error_handlers, a BEGIN_RECOVERY event is sent for those with
pci_error_handlers but no FAILED_RECOVERY event is ever sent for them
afterwards. Fix it.
Fixes: 856e1eb9bd ("PCI/AER: Add uevents in AER and EEH error/resume")
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: stable@vger.kernel.org # v4.16+
Link: https://patch.msgid.link/68fc527a380821b5d861dd554d2ce42cb739591c.1755008151.git.lukas@wunner.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 05703271c3 upstream.
Before disabling SR-IOV via config space accesses to the parent PF,
sriov_disable() first removes the PCI devices representing the VFs.
Since commit 9d16947b75 ("PCI: Add global pci_lock_rescan_remove()")
such removal operations are serialized against concurrent remove and
rescan using the pci_rescan_remove_lock. No such locking was ever added
in sriov_disable() however. In particular when commit 18f9e9d150
("PCI/IOV: Factor out sriov_add_vfs()") factored out the PCI device
removal into sriov_del_vfs() there was still no locking around the
pci_iov_remove_virtfn() calls.
On s390 the lack of serialization in sriov_disable() may cause double
remove and list corruption with the below (amended) trace being observed:
PSW: 0704c00180000000 0000000c914e4b38 (klist_put+56)
GPRS: 000003800313fb48 0000000000000000 0000000100000001 0000000000000001
00000000f9b520a8 0000000000000000 0000000000002fbd 00000000f4cc9480
0000000000000001 0000000000000000 0000000000000000 0000000180692828
00000000818e8000 000003800313fe2c 000003800313fb20 000003800313fad8
#0 [3800313fb20] device_del at c9158ad5c
#1 [3800313fb88] pci_remove_bus_device at c915105ba
#2 [3800313fbd0] pci_iov_remove_virtfn at c9152f198
#3 [3800313fc28] zpci_iov_remove_virtfn at c90fb67c0
#4 [3800313fc60] zpci_bus_remove_device at c90fb6104
#5 [3800313fca0] __zpci_event_availability at c90fb3dca
#6 [3800313fd08] chsc_process_sei_nt0 at c918fe4a2
#7 [3800313fd60] crw_collect_info at c91905822
#8 [3800313fe10] kthread at c90feb390
#9 [3800313fe68] __ret_from_fork at c90f6aa64
#10 [3800313fe98] ret_from_fork at c9194f3f2.
This is because in addition to sriov_disable() removing the VFs, the
platform also generates hot-unplug events for the VFs. This being the
reverse operation to the hotplug events generated by sriov_enable() and
handled via pdev->no_vf_scan. And while the event processing takes
pci_rescan_remove_lock and checks whether the struct pci_dev still exists,
the lack of synchronization makes this checking racy.
Other races may also be possible of course though given that this lack of
locking persisted so long observable races seem very rare. Even on s390 the
list corruption was only observed with certain devices since the platform
events are only triggered by config accesses after the removal, so as long
as the removal finished synchronously they would not race. Either way the
locking is missing so fix this by adding it to the sriov_del_vfs() helper.
Just like PCI rescan-remove, locking is also missing in sriov_add_vfs()
including for the error case where pci_stop_and_remove_bus_device() is
called without the PCI rescan-remove lock being held. Even in the non-error
case, adding new PCI devices and buses should be serialized via the PCI
rescan-remove lock. Add the necessary locking.
Fixes: 18f9e9d150 ("PCI/IOV: Factor out sriov_add_vfs()")
Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Benjamin Block <bblock@linux.ibm.com>
Reviewed-by: Farhan Ali <alifm@linux.ibm.com>
Reviewed-by: Julian Ruess <julianr@linux.ibm.com>
Cc: stable@vger.kernel.org
Link: https://patch.msgid.link/20250826-pci_fix_sriov_disable-v1-1-2d0bc938f2a3@linux.ibm.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 48991e4935 upstream.
The "max_link_width", "current_link_speed", "current_link_width",
"secondary_bus_number", and "subordinate_bus_number" sysfs files all access
config registers, but they don't check the runtime PM state. If the device
is in D3cold or a parent bridge is suspended, we may see -EINVAL, bogus
values, or worse, depending on implementation details.
Wrap these access in pci_config_pm_runtime_{get,put}() like most of the
rest of the similar sysfs attributes.
Notably, "max_link_speed" does not access config registers; it returns a
cached value since d2bd39c045 ("PCI: Store all PCIe Supported Link
Speeds").
Fixes: 56c1af4606 ("PCI: Add sysfs max_link_speed/width, current_link_speed/width, etc")
Signed-off-by: Brian Norris <briannorris@google.com>
Signed-off-by: Brian Norris <briannorris@chromium.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: stable@vger.kernel.org
Link: https://patch.msgid.link/20250924095711.v2.1.Ibb5b6ca1e2c059e04ec53140cd98a44f2684c668@changeid
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 98a4f5b735 upstream.
When PCIe has been set up by the bootloader, the ecam_size field in the
E_ECAM_CONTROL register already contains a value.
The driver previously programmed it to 0xc (for 16 busses; 16 MB), but
bumped to 0x10 (for 256 busses; 256 MB) by the commit 2fccd11518 ("PCI:
xilinx-nwl: Modify ECAM size to enable support for 256 buses").
Regardless of what the bootloader has programmed, the driver ORs in a
new maximal value without doing a proper RMW sequence. This can lead to
problems.
For example, if the bootloader programs in 0xc and the driver uses 0x10,
the ORed result is 0x1c, which is beyond the ecam_max_size limit of 0x10
(from E_ECAM_CAPABILITIES).
Avoid the problems by doing a proper RMW.
Fixes: 2fccd11518 ("PCI: xilinx-nwl: Modify ECAM size to enable support for 256 buses")
Signed-off-by: Jani Nurminen <jani.nurminen@windriver.com>
[mani: added stable tag]
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: stable@vger.kernel.org
Link: https://patch.msgid.link/e83a2af2-af0b-4670-bcf5-ad408571c2b0@windriver.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a001cd248a upstream.
Add "extern" to the glibc-defined weak rseq symbols to convert the rseq
selftest's usage from weak symbol definitions to weak symbol _references_.
Effectively re-defining the glibc symbols wreaks havoc when building with
-fno-common, e.g. generates segfaults when running multi-threaded programs,
as dynamically linked applications end up with multiple versions of the
symbols.
Building with -fcommon, which until recently has the been the default for
GCC and clang, papers over the bug by allowing the linker to resolve the
weak/tentative definition to glibc's "real" definition.
Note, the symbol itself (or rather its address), not the value of the
symbol, is set to 0/NULL for unresolved weak symbol references, as the
symbol doesn't exist and thus can't have a value. Check for a NULL rseq
size pointer to handle the scenario where the test is statically linked
against a libc that doesn't support rseq in any capacity.
Fixes: 3bcbc20942 ("selftests/rseq: Play nice with binaries statically linked against glibc 2.35+")
Reported-by: Thomas Gleixner <tglx@linutronix.de>
Suggested-by: Florian Weimer <fweimer@redhat.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: stable@vger.kernel.org
Closes: https://lore.kernel.org/all/87frdoybk4.ffs@tglx
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 795cda8338 upstream.
As described in the old comment dating back to
commit 6610e0893b ("RTC: Rework RTC code to use timerqueue for events")
from 2010, we have been living with a race window when setting alarm
with an expiry in the near future (i.e. next second).
With 1 second resolution, it can happen that the second ticks after the
check for the timer having expired, but before the alarm is actually set.
When this happen, no alarm IRQ is generated, at least not with some RTC
chips (isl12022 is an example of this).
With UIE RTC timer being implemented on top of alarm irq, being re-armed
every second, UIE will occasionally fail to work, as an alarm irq lost
due to this race will stop the re-arming loop.
For now, I have limited the additional expiry check to only be done for
alarms set to next seconds. I expect it should be good enough, although I
don't know if we can now for sure that systems with loads could end up
causing the same problems for alarms set 2 seconds or even longer in the
future.
I haven't been able to reproduce the problem with this check in place.
Cc: stable@vger.kernel.org
Signed-off-by: Esben Haabendal <esben@geanix.com>
Link: https://lore.kernel.org/r/20250516-rtc-uie-irq-fixes-v2-1-3de8e530a39e@geanix.com
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6744085079 upstream.
The of_platform_populate() call at the end of the function has a
possible failure path, causing a resource leak.
Replace of_iomap() with devm_platform_ioremap_resource() to ensure
automatic cleanup of srom->reg_base.
This issue was detected by smatch static analysis:
drivers/memory/samsung/exynos-srom.c:155 exynos_srom_probe()warn:
'srom->reg_base' from of_iomap() not released on lines: 155.
Fixes: 8ac2266d88 ("memory: samsung: exynos-srom: Add support for bank configuration")
Cc: stable@vger.kernel.org
Signed-off-by: Zhen Ni <zhen.ni@easystack.cn>
Link: https://lore.kernel.org/r/20250806025538.306593-1-zhen.ni@easystack.cn
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit fef12d9f5b upstream.
For multiple block read, the current implementation, transfer packet
includes cmd53 + cmd53 response + block nums*(1byte token +
block length bytes payload + 2bytes CRC + 1byte transfer), the last
1byte transfer of every block is not needed, so remove it.
Why doesn't multiple block read need CRC ack?
For read operation, host side get the payload and CRC value, then
will only check the CRC value to confirm if the data is correct or
not, but not send CRC ack to card. If the data is correct, save it,
or discard it and retransmit if data is error, so the last 1byte
transfer of every block make no sense.
What's the side effect of this 1byte transfer?
As the SPI is full duplex, if add this redundant 1byte transfer, SDIO
card side take it as the token of next block, then all the next sub
blocks sequence distort.
Signed-off-by: Rex Chen <rex.chen_1@nxp.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20250728082230.1037917-3-rex.chen_1@nxp.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1001cc1171 upstream.
Commit f04ced6d54 ("mtd: nand: raw: gpmi: improve power management
handling") moved all clock handling into PM callbacks. With CONFIG_PM
disabled, those callbacks are missing, leaving the driver unusable.
Add clock init/teardown for !CONFIG_PM builds to restore basic operation.
Keeping the driver working without requiring CONFIG_PM is preferred over
adding a Kconfig dependency.
Fixes: f04ced6d54 ("mtd: nand: raw: gpmi: improve power management handling")
Signed-off-by: Maarten Zanders <maarten@zanders.be>
Cc: stable@vger.kernel.org
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 07ca98f906 upstream.
Turned out certain clearly invalid values passed in xdp_desc from
userspace can pass xp_{,un}aligned_validate_desc() and then lead
to UBs or just invalid frames to be queued for xmit.
desc->len close to ``U32_MAX`` with a non-zero pool->tx_metadata_len
can cause positive integer overflow and wraparound, the same way low
enough desc->addr with a non-zero pool->tx_metadata_len can cause
negative integer overflow. Both scenarios can then pass the
validation successfully.
This doesn't happen with valid XSk applications, but can be used
to perform attacks.
Always promote desc->len to ``u64`` first to exclude positive
overflows of it. Use explicit check_{add,sub}_overflow() when
validating desc->addr (which is ``u64`` already).
bloat-o-meter reports a little growth of the code size:
add/remove: 0/0 grow/shrink: 2/1 up/down: 60/-16 (44)
Function old new delta
xskq_cons_peek_desc 299 330 +31
xsk_tx_peek_release_desc_batch 973 1002 +29
xsk_generic_xmit 3148 3132 -16
but hopefully this doesn't hurt the performance much.
Fixes: 341ac980ea ("xsk: Support tx_metadata_len")
Cc: stable@vger.kernel.org # 6.8+
Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Reviewed-by: Jason Xing <kerneljasonxing@gmail.com>
Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Link: https://lore.kernel.org/r/20251008165659.4141318-1-aleksander.lobakin@intel.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 302c04110f upstream.
Once of_device_register() failed, we should call put_device() to
decrement reference count for cleanup. Or it could cause memory leak.
So fix this by calling put_device(), then the name can be freed in
kobject_cleanup().
Calling path: of_device_register() -> of_device_add() -> device_add().
As comment of device_add() says, 'if device_add() succeeds, you should
call device_del() when you want to get rid of it. If device_add() has
not succeeded, use only put_device() to drop the reference count'.
Found by code review.
Cc: stable@vger.kernel.org
Fixes: cf44bbc26c ("[SPARC]: Beginnings of generic of_device framework.")
Signed-off-by: Ma Ke <make24@iscas.ac.cn>
Reviewed-by: Andreas Larsson <andreas@gaisler.com>
Signed-off-by: Andreas Larsson <andreas@gaisler.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6fd44a481b upstream.
An attempt to exercise sparc hugetlb code in a sun4u-based guest
running under qemu results in the guest hanging due to being stuck
in a trap loop. This is due to invalid hugetlb TTEs being installed
that do not have the expected _PAGE_PMD_HUGE and page size bits set.
Although the breakage has gone apparently unnoticed for several years,
fix it now so there is the option to exercise sparc hugetlb code under
qemu. This can be useful because sun4v support in qemu does not support
linux guests currently and sun4v-based hardware resources may not be
readily available.
Fix tested with a 6.15.2 and 6.16-rc6 kernels by running libhugetlbfs
tests on a qemu guest running Debian 13.
Fixes: c7d9f77d33 ("sparc64: Multi-page size support")
Cc: stable@vger.kernel.org
Signed-off-by: Anthony Yznaga <anthony.yznaga@oracle.com>
Tested-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Reviewed-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Reviewed-by: Andreas Larsson <andreas@gaisler.com>
Link: https://lore.kernel.org/r/20250716012446.10357-1-anthony.yznaga@oracle.com
Signed-off-by: Andreas Larsson <andreas@gaisler.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit aa12118dbc upstream.
Test generic/637 spotted a problem with create of a new file in a
cached directory (by the same client) could cause cases where the
new file does not show up properly in ls on that client until the
lease times out.
Fixes: 037e1bae58 ("smb: client: use ParentLeaseKey in cifs_do_create")
Cc: stable@vger.kernel.org
Signed-off-by: Bharath SM <bharathsm@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b5f717b31b upstream.
A build warning was triggered due to excessive stack usage in
sd_revalidate_disk():
drivers/scsi/sd.c: In function ‘sd_revalidate_disk.isra’:
drivers/scsi/sd.c:3824:1: warning: the frame size of 1160 bytes is larger than 1024 bytes [-Wframe-larger-than=]
This is caused by a large local struct queue_limits (~400B) allocated on
the stack. Replacing it with a heap allocation using kmalloc()
significantly reduces frame usage. Kernel stack is limited (~8 KB), and
allocating large structs on the stack is discouraged. As the function
already performs heap allocations (e.g. for buffer), this change fits
well.
Fixes: 804e498e04 ("sd: convert to the atomic queue limits API")
Cc: stable@vger.kernel.org
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Abinash Singh <abinashsinghlalotra@gmail.com>
Link: https://lore.kernel.org/r/20250825183940.13211-2-abinashsinghlalotra@gmail.com
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b81296591c upstream.
Replace kmalloc() followed by copy_from_user() with memdup_user() to fix
a memory leak that occurs when copy_from_user(buff[sg_used],,) fails and
the 'cleanup1:' path does not free the memory for 'buff[sg_used]'. Using
memdup_user() avoids this by freeing the memory internally.
Since memdup_user() already allocates memory, use kzalloc() in the else
branch instead of manually zeroing 'buff[sg_used]' using memset(0).
Cc: stable@vger.kernel.org
Fixes: edd163687e ("[SCSI] hpsa: add driver for HP Smart Array controllers.")
Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev>
Acked-by: Don Brace <don.brace@microchip.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8fd5485fb4 upstream.
When a CPU chooses to call push_dl_task and picks a task to push to
another CPU's runqueue then it will call find_lock_later_rq method
which would take a double lock on both CPUs' runqueues. If one of the
locks aren't readily available, it may lead to dropping the current
runqueue lock and reacquiring both the locks at once. During this window
it is possible that the task is already migrated and is running on some
other CPU. These cases are already handled. However, if the task is
migrated and has already been executed and another CPU is now trying to
wake it up (ttwu) such that it is queued again on the runqeue
(on_rq is 1) and also if the task was run by the same CPU, then the
current checks will pass even though the task was migrated out and is no
longer in the pushable tasks list.
Please go through the original rt change for more details on the issue.
To fix this, after the lock is obtained inside the find_lock_later_rq,
it ensures that the task is still at the head of pushable tasks list.
Also removed some checks that are no longer needed with the addition of
this new check.
However, the new check of pushable tasks list only applies when
find_lock_later_rq is called by push_dl_task. For the other caller i.e.
dl_task_offline_migration, existing checks are used.
Signed-off-by: Harshit Agarwal <harshit@nutanix.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Juri Lelli <juri.lelli@redhat.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20250408045021.3283624-1-harshit@nutanix.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5d09ee1bec upstream.
This reverts commit c608966f3f.
This patch has a subtle bug that can cause the IPMI driver to go into an
infinite loop if the BMC misbehaves in a certain way. Apparently
certain BMCs do misbehave this way because several reports have come in
recently about this.
Signed-off-by: Corey Minyard <corey@minyard.net>
Tested-by: Eric Hagberg <ehagberg@janestreet.com>
Cc: <stable@vger.kernel.org> # 6.2
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ee6cd8f3e2 upstream.
CHARGE_CONTROL_LIMIT is a wrong property to report charge current limit,
because `CHARGE_*` attributes represents capacity, not current. The
correct attribute to report and set charge current limit is
CONSTANT_CHARGE_CURRENT.
Rename CHARGE_CONTROL_LIMIT to CONSTANT_CHARGE_CURRENT.
Cc: stable@vger.kernel.org
Fixes: 715ecbc10d ("power: supply: max77976: add Maxim MAX77976 charger driver")
Signed-off-by: Dzmitry Sankouski <dsankouski@gmail.com>
Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6f4c6f9ed4 upstream.
Commit 12ffc3b151 ("PM: Restrict swap use to later in the
suspend sequence") caused hibernation_platform_enter() to call
pm_restore_gfp_mask() via dpm_resume_end(), so when power_down()
returns after aborting hibernation_platform_enter(), it needs
to match the pm_restore_gfp_mask() call in hibernate() that will
occur subsequently.
Address this by adding a pm_restrict_gfp_mask() call to the relevant
error path in power_down().
Fixes: 12ffc3b151 ("PM: Restrict swap use to later in the suspend sequence")
Cc: 6.16+ <stable@vger.kernel.org> # 6.16+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit eeaed48980 upstream.
On the TUXEDO InfinityBook S Gen8, a Samsung 990 Evo NVMe leads to
a high power consumption in s2idle sleep (3.5 watts).
This patch applies 'Force No Simple Suspend' quirk to achieve a sleep with
a lower power consumption, typically around 1 watts.
Signed-off-by: Georg Gottleuber <ggo@tuxedocomputers.com>
Signed-off-by: Werner Sembach <wse@tuxedocomputers.com>
Cc: stable@vger.kernel.org
Signed-off-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8ec5a066f8 upstream.
Similar in nature to ab10727660. glibc-2.42
drops the legacy termio struct, but the ioctls.h header still defines some
TC* constants in terms of termio (via sizeof). Hardcode the values instead.
This fixes building Python for example, which falls over like:
./Modules/termios.c:1119:16: error: invalid application of 'sizeof' to incomplete type 'struct termio'
Link: https://bugs.gentoo.org/961769
Link: https://bugs.gentoo.org/962600
Co-authored-by: Stian Halseth <stian@itx.no>
Cc: stable@vger.kernel.org
Signed-off-by: Sam James <sam@gentoo.org>
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 501302d5ce upstream.
When seq_nr wraps around, the next reorder job with seq 0 is hashed to
the first CPU in padata_do_serial(). Correspondingly, need reset pd->cpu
to the first one when pd->processed wraps around. Otherwise, if the
number of used CPUs is not a power of 2, padata_find_next() will be
checking a wrong list, hence deadlock.
Fixes: 6fc4dbcf02 ("padata: Replace delayed timer with immediate workqueue in padata_reorder")
Cc: <stable@vger.kernel.org>
Signed-off-by: Xiao Liang <shaw.leon@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a8de554774 upstream.
In of_unittest_pci_node_verify(), when the add parameter is false,
device_find_any_child() obtains a reference to a child device. This
function implicitly calls get_device() to increment the device's
reference count before returning the pointer. However, the caller
fails to properly release this reference by calling put_device(),
leading to a device reference count leak. Add put_device() in the else
branch immediately after child_dev is no longer needed.
As the comment of device_find_any_child states: "NOTE: you will need
to drop the reference with put_device() after use".
Found by code review.
Cc: stable@vger.kernel.org
Fixes: 26409dd045 ("of: unittest: Add pci_dt_testdrv pci driver")
Signed-off-by: Ma Ke <make24@iscas.ac.cn>
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 22f166218f upstream.
If bio is split by internal handling like chunksize or badblocks, the
corresponding trace_block_split() is missing, resulting in blktrace
inability to catch BIO split events and making it harder to analyze the
BIO sequence.
Cc: stable@vger.kernel.org
Fixes: 4b1faf9316 ("block: Kill bio_pair_split()")
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f322a97aeb upstream.
kho_fill_kimage() only checks for KHO being enabled before filling in the
FDT to the image. KHO being enabled does not mean that the kernel has
data to hand over. That happens when KHO is finalized.
When a kexec is done with KHO enabled but not finalized, the FDT page is
allocated but not initialized. FDT initialization happens after finalize.
This means the KHO segment is filled in but the FDT contains garbage
data.
This leads to the below error messages in the next kernel:
[ 0.000000] KHO: setup: handover FDT (0x10116b000) is invalid: -9
[ 0.000000] KHO: disabling KHO revival: -22
There is no problem in practice, and the next kernel boots and works fine.
But this still leads to misleading error messages and garbage being
handed over.
Only fill in KHO segment when KHO is finalized. When KHO is not enabled,
the debugfs interface is not created and there is no way to finalize it
anyway. So the check for kho_enable is not needed, and kho_out.finalize
alone is enough.
Link: https://lkml.kernel.org/r/20250918170617.91413-1-pratyush@kernel.org
Fixes: 3bdecc3c93 ("kexec: add KHO support to kexec file loads")
Signed-off-by: Pratyush Yadav <pratyush@kernel.org>
Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Cc: Alexander Graf <graf@amazon.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Changyuan Lyu <changyuanl@google.com>
Cc: Jason Gunthorpe <jgg@nvidia.com>
Cc: Pasha Tatashin <pasha.tatashin@soleen.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit eed0e3d305 upstream.
To prevent timing attacks, HMAC value comparison needs to be constant
time. Replace the memcmp() with the correct function, crypto_memneq().
[For the Fixes commit I used the commit that introduced the memcmp().
It predates the introduction of crypto_memneq(), but it was still a bug
at the time even though a helper function didn't exist yet.]
Fixes: d00a1c72f7 ("keys: add new trusted key-type")
Cc: stable@vger.kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a15f37a401 upstream.
The usage of task_lock(tsk->group_leader) in sys_prlimit64()->do_prlimit()
path is very broken.
sys_prlimit64() does get_task_struct(tsk) but this only protects task_struct
itself. If tsk != current and tsk is not a leader, this process can exit/exec
and task_lock(tsk->group_leader) may use the already freed task_struct.
Another problem is that sys_prlimit64() can race with mt-exec which changes
->group_leader. In this case do_prlimit() may take the wrong lock, or (worse)
->group_leader may change between task_lock() and task_unlock().
Change sys_prlimit64() to take tasklist_lock when necessary. This is not
nice, but I don't see a better fix for -stable.
Link: https://lkml.kernel.org/r/20250915120917.GA27702@redhat.com
Fixes: 18c91bb2d8 ("prlimit: do not grab the tasklist_lock")
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Jiri Slaby <jirislaby@kernel.org>
Cc: Mateusz Guzik <mjguzik@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8fd8ea2869 upstream.
Dan Carpenter got a Smatch warning:
drivers/char/ipmi/ipmi_msghandler.c:5265 ipmi_free_recv_msg()
warn: sleeping in atomic context
due to the recent rework of the IPMI driver's locking. I didn't realize
vfree could block. But there is an easy solution to this, now that
almost everything in the message handler runs in thread context.
I wanted to spend the time earlier to see if seq_lock could be converted
from a spinlock to a mutex, but I wanted the previous changes to go in
and soak before I did that. So I went ahead and did the analysis and
converting should work. And solve this problem.
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Closes: https://lore.kernel.org/r/202503240244.LR7pOwyr-lkp@intel.com/
Fixes: 3be997d5a6 ("ipmi:msghandler: Remove srcu from the ipmi user structure")
Cc: <stable@vger.kernel.org> # 6.16
Signed-off-by: Corey Minyard <corey@minyard.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5ef7e24c74 upstream.
The specification, Section 7.10, "Software Steps to Drain Page Requests &
Responses," requires software to submit an Invalidation Wait Descriptor
(inv_wait_dsc) with the Page-request Drain (PD=1) flag set, along with
the Invalidation Wait Completion Status Write flag (SW=1). It then waits
for the Invalidation Wait Descriptor's completion.
However, the PD field in the Invalidation Wait Descriptor is optional, as
stated in Section 6.5.2.9, "Invalidation Wait Descriptor":
"Page-request Drain (PD): Remapping hardware implementations reporting
Page-request draining as not supported (PDS = 0 in ECAP_REG) treat this
field as reserved."
This implies that if the IOMMU doesn't support the PDS capability, software
can't drain page requests and group responses as expected.
Do not enable PCI/PRI if the IOMMU doesn't support PDS.
Reported-by: Joel Granados <joel.granados@kernel.org>
Closes: https://lore.kernel.org/r/20250909-jag-pds-v1-1-ad8cba0e494e@kernel.org
Fixes: 66ac4db36f ("iommu/vt-d: Add page request draining support")
Cc: stable@vger.kernel.org
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20250915062946.120196-1-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0792c1984a upstream.
Rework the power management in inv_icm42600_core_probe() to use
devm_pm_runtime_set_active_enabled(), which simplifies the runtime PM
setup by handling activation and enabling in one step.
Remove the separate inv_icm42600_disable_pm callback, as it's no longer
needed with the devm-managed approach.
Using devm_pm_runtime_enable() also fixes the missing disable of
autosuspend.
Update inv_icm42600_disable_vddio_reg() to only disable the regulator if
the device is not suspended i.e. powered-down, preventing unbalanced
disables.
Also remove redundant error msg on regulator_disable(), the regulator
framework already emits an error message when regulator_disable() fails.
This simplifies the PM setup and avoids manipulating the usage counter
unnecessarily.
Fixes: 31c24c1e93 ("iio: imu: inv_icm42600: add core of new inv_icm42600 driver")
Signed-off-by: Sean Nyekjaer <sean@geanix.com>
Link: https://patch.msgid.link/20250901-icm42pmreg-v3-1-ef1336246960@geanix.com
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit feb500c7ae upstream.
To convert level-triggered alarms into edge-triggered IIO events, alarms
are masked when they are triggered. To ensure we catch subsequent
alarms, we then periodically poll to see if the alarm is still active.
If it isn't, we unmask it. Active but masked alarms are stored in
current_masked_alarm.
If an active alarm is disabled, it will remain set in
current_masked_alarm until ams_unmask_worker clears it. If the alarm is
re-enabled before ams_unmask_worker runs, then it will never be cleared
from current_masked_alarm. This will prevent the alarm event from being
pushed even if the alarm is still active.
Fix this by recalculating current_masked_alarm immediately when enabling
or disabling alarms.
Fixes: d5c70627a7 ("iio: adc: Add Xilinx AMS driver")
Signed-off-by: Sean Anderson <sean.anderson@linux.dev>
Reviewed-by: O'Griofa, Conall <conall.ogriofa@amd.com>
Tested-by: Erim, Salih <Salih.Erim@amd.com>
Acked-by: Erim, Salih <Salih.Erim@amd.com>
Link: https://patch.msgid.link/20250715002847.2035228-1-sean.anderson@linux.dev
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 33d7ecbf69 upstream.
The ADF4350/1 features a programmable dual-modulus prescaler of 4/5 or 8/9.
When set to 4/5, the maximum RF frequency allowed is 3 GHz.
Therefore, when operating the ADF4351 above 3 GHz, this must be set to 8/9.
In this context not the RF output frequency is meant
- it's the VCO frequency.
Therefore move the prescaler selection after we derived the VCO frequency
from the desired RF output frequency.
This BUG may have caused PLL lock instabilities when operating the VCO at
the very high range close to 4.4 GHz.
Fixes: e31166f0fd ("iio: frequency: New driver for Analog Devices ADF4350/ADF4351 Wideband Synthesizers")
Signed-off-by: Michael Hennerich <michael.hennerich@analog.com>
Signed-off-by: Nuno Sá <nuno.sa@analog.com>
Reviewed-by: Andy Shevchenko <andy@kernel.org>
Link: https://patch.msgid.link/20250829-adf4350-fix-v2-1-0bf543ba797d@analog.com
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3c63ba1c43 upstream.
There are two problems with the chip configuration in this driver:
- First, is that writing 12 bytes (ARRAY_SIZE(regs)) would anyhow
lead to a config overflow due to HW auto increment implementation
in the chip.
- Second, the i2c_smbus_write_block_data write ends up in writing
unexpected value to the channel_dis register, this is because
the smbus size that is 0x03 in this case gets written to the
register. The PAC1931/2/3/4 data sheet does not really specify
that block write is indeed supported.
This problem is probably not visible on PAC1934 version where all
channels are used as the chip is properly configured by luck,
but in our case whenusing PAC1931 this leads to nonfunctional device.
Fixes: 0fb528c825 (iio: adc: adding support for PAC193x)
Suggested-by: Rene Straub <mailto:rene.straub@belden.com>
Signed-off-by: Aleksandar Gerasimovski <aleksandar.gerasimovski@belden.com>
Reviewed-by: Marius Cristea <marius.cristea@microchip.com>
Link: https://patch.msgid.link/20250811130904.2481790-1-aleksandar.gerasimovski@belden.com
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9395b3c412 upstream.
Commit 3a379bbcea ("i3c: Add core I3C infrastructure") set the default
adapter timeout for I2C transfers as 1000 (ms). However that parameter
is defined in jiffies not in milliseconds.
With mipi-i3c-hci driver this wasn't visible until commit c0a90eb55a
("i3c: mipi-i3c-hci: use adapter timeout value for I2C transfers").
Fix this by setting the default timeout as HZ (CONFIG_HZ) not 1000.
Fixes: 1b84691e78 ("i3c: dw: use adapter timeout value for I2C transfers")
Fixes: be27ed6728 ("i3c: master: cdns: use adapter timeout value for I2C transfers")
Fixes: c0a90eb55a ("i3c: mipi-i3c-hci: use adapter timeout value for I2C transfers")
Fixes: a747e01ada ("i3c: master: svc: use adapter timeout value for I2C transfers")
Fixes: d028219a9f ("i3c: master: Add basic driver for the Renesas I3C controller")
Fixes: 3a379bbcea ("i3c: Add core I3C infrastructure")
Cc: stable@vger.kernel.org # 6.17
Signed-off-by: Jarkko Nikula <jarkko.nikula@linux.intel.com>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Reviewed-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Link: https://lore.kernel.org/r/20250905100320.954536-1-jarkko.nikula@linux.intel.com
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit bc06114363 upstream.
mpfs_gpio_direction_output() actually sets the line to input mode.
Use the correct register settings for output mode so that this function
actually works as intended.
This was a copy-paste mistake made when converting to regmap during the
driver submission process. It went unnoticed because my test for output
mode is toggling LEDs on an Icicle kit which functions with the
incorrect code. The internal reporter has yet to test the patch, but on
their system the incorrect setting may be the reason for failures to
drive the GPIO lines on the BeagleV-fire board.
CC: stable@vger.kernel.org
Fixes: a987b78f36 ("gpio: mpfs: add polarfire soc gpio support")
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://lore.kernel.org/r/20250925-boogieman-carrot-82989ff75d10@spud
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 26e5c67deb upstream.
I observed a hang when running generic/323 against a fuseblk server.
This test opens a file, initiates a lot of AIO writes to that file
descriptor, and closes the file descriptor before the writes complete.
Unsurprisingly, the AIO exerciser threads are mostly stuck waiting for
responses from the fuseblk server:
# cat /proc/372265/task/372313/stack
[<0>] request_wait_answer+0x1fe/0x2a0 [fuse]
[<0>] __fuse_simple_request+0xd3/0x2b0 [fuse]
[<0>] fuse_do_getattr+0xfc/0x1f0 [fuse]
[<0>] fuse_file_read_iter+0xbe/0x1c0 [fuse]
[<0>] aio_read+0x130/0x1e0
[<0>] io_submit_one+0x542/0x860
[<0>] __x64_sys_io_submit+0x98/0x1a0
[<0>] do_syscall_64+0x37/0xf0
[<0>] entry_SYSCALL_64_after_hwframe+0x4b/0x53
But the /weird/ part is that the fuseblk server threads are waiting for
responses from itself:
# cat /proc/372210/task/372232/stack
[<0>] request_wait_answer+0x1fe/0x2a0 [fuse]
[<0>] __fuse_simple_request+0xd3/0x2b0 [fuse]
[<0>] fuse_file_put+0x9a/0xd0 [fuse]
[<0>] fuse_release+0x36/0x50 [fuse]
[<0>] __fput+0xec/0x2b0
[<0>] task_work_run+0x55/0x90
[<0>] syscall_exit_to_user_mode+0xe9/0x100
[<0>] do_syscall_64+0x43/0xf0
[<0>] entry_SYSCALL_64_after_hwframe+0x4b/0x53
The fuseblk server is fuse2fs so there's nothing all that exciting in
the server itself. So why is the fuse server calling fuse_file_put?
The commit message for the fstest sheds some light on that:
"By closing the file descriptor before calling io_destroy, you pretty
much guarantee that the last put on the ioctx will be done in interrupt
context (during I/O completion).
Aha. AIO fgets a new struct file from the fd when it queues the ioctx.
The completion of the FUSE_WRITE command from userspace causes the fuse
server to call the AIO completion function. The completion puts the
struct file, queuing a delayed fput to the fuse server task. When the
fuse server task returns to userspace, it has to run the delayed fput,
which in the case of a fuseblk server, it does synchronously.
Sending the FUSE_RELEASE command sychronously from fuse server threads
is a bad idea because a client program can initiate enough simultaneous
AIOs such that all the fuse server threads end up in delayed_fput, and
now there aren't any threads left to handle the queued fuse commands.
Fix this by only using asynchronous fputs when closing files, and leave
a comment explaining why.
Cc: stable@vger.kernel.org # v2.6.38
Fixes: 5a18ec176c ("fuse: fix hang of single threaded fuseblk filesystem")
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0b563aad1c upstream.
In case of FUSE_NOTIFY_RESEND and FUSE_NOTIFY_INC_EPOCH fuse_copy_finish()
isn't called.
Fix by always calling fuse_copy_finish() after fuse_notify(). It's a no-op
if called a second time.
Fixes: 760eac73f9 ("fuse: Introduce a new notification type for resend pending requests")
Fixes: 2396356a94 ("fuse: add more control over cache invalidation behaviour")
Cc: <stable@vger.kernel.org> # v6.9
Reviewed-by: Joanne Koong <joannelkoong@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 72b7ceca85 upstream.
There is a kernel panic due to WARN_ONCE when panic_on_warn is set.
This issue occurs when writeback is triggered due to sync call for an
opened file(ie, writeback reason is WB_REASON_SYNC). When f2fs balance
is needed at sync path, flush for quota_release_work is triggered.
By default quota_release_work is queued to "events_unbound" queue which
does not have WQ_MEM_RECLAIM flag. During f2fs balance "writeback"
workqueue tries to flush quota_release_work causing kernel panic due to
MEM_RECLAIM flag mismatch errors.
This patch creates dedicated workqueue with WQ_MEM_RECLAIM flag
for work quota_release_work.
------------[ cut here ]------------
WARNING: CPU: 4 PID: 14867 at kernel/workqueue.c:3721 check_flush_dependency+0x13c/0x148
Call trace:
check_flush_dependency+0x13c/0x148
__flush_work+0xd0/0x398
flush_delayed_work+0x44/0x5c
dquot_writeback_dquots+0x54/0x318
f2fs_do_quota_sync+0xb8/0x1a8
f2fs_write_checkpoint+0x3cc/0x99c
f2fs_gc+0x190/0x750
f2fs_balance_fs+0x110/0x168
f2fs_write_single_data_page+0x474/0x7dc
f2fs_write_data_pages+0x7d0/0xd0c
do_writepages+0xe0/0x2f4
__writeback_single_inode+0x44/0x4ac
writeback_sb_inodes+0x30c/0x538
wb_writeback+0xf4/0x440
wb_workfn+0x128/0x5d4
process_scheduled_works+0x1c4/0x45c
worker_thread+0x32c/0x3e8
kthread+0x11c/0x1b0
ret_from_fork+0x10/0x20
Kernel panic - not syncing: kernel: panic_on_warn set ...
Fixes: ac6f420291 ("quota: flush quota_release_work upon quota writeback")
CC: stable@vger.kernel.org
Signed-off-by: Shashank A P <shashank.ap@samsung.com>
Link: https://patch.msgid.link/20250901092905.2115-1-shashank.ap@samsung.com
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 15df28699b upstream.
A regression was reported to me recently whereby /dev/fb0 had disappeared
from a PowerBook G3 Series "Wallstreet". The problem shows up when the
"video=ofonly" parameter is passed to the kernel, which is what the
bootloader does when "no video driver" is selected. The cause of the
problem is the "offb" string comparison, which got mangled when it got
refactored. Fix it.
Cc: stable@vger.kernel.org
Fixes: 93604a5ade ("fbdev: Handle video= parameter in video/cmdline.c")
Reported-and-tested-by: Stan Johnson <userm57@yahoo.com>
Signed-off-by: Finn Thain <fthain@linux-m68k.org>
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0c43094f8c upstream.
The ready event list of an epoll object is protected by read-write
semaphore:
- The consumer (waiter) acquires the write lock and takes items.
- the producer (waker) takes the read lock and adds items.
The point of this design is enabling epoll to scale well with large number
of producers, as multiple producers can hold the read lock at the same
time.
Unfortunately, this implementation may cause scheduling priority inversion
problem. Suppose the consumer has higher scheduling priority than the
producer. The consumer needs to acquire the write lock, but may be blocked
by the producer holding the read lock. Since read-write semaphore does not
support priority-boosting for the readers (even with CONFIG_PREEMPT_RT=y),
we have a case of priority inversion: a higher priority consumer is blocked
by a lower priority producer. This problem was reported in [1].
Furthermore, this could also cause stall problem, as described in [2].
Fix this problem by replacing rwlock with spinlock.
This reduces the event bandwidth, as the producers now have to contend with
each other for the spinlock. According to the benchmark from
https://github.com/rouming/test-tools/blob/master/stress-epoll.c:
On 12 x86 CPUs:
Before After Diff
threads events/ms events/ms
8 7162 4956 -31%
16 8733 5383 -38%
32 7968 5572 -30%
64 10652 5739 -46%
128 11236 5931 -47%
On 4 riscv CPUs:
Before After Diff
threads events/ms events/ms
8 2958 2833 -4%
16 3323 3097 -7%
32 3451 3240 -6%
64 3554 3178 -11%
128 3601 3235 -10%
Although the numbers look bad, it should be noted that this benchmark
creates multiple threads who do nothing except constantly generating new
epoll events, thus contention on the spinlock is high. For real workload,
the event rate is likely much lower, and the performance drop is not as
bad.
Using another benchmark (perf bench epoll wait) where spinlock contention
is lower, improvement is even observed on x86:
On 12 x86 CPUs:
Before: Averaged 110279 operations/sec (+- 1.09%), total secs = 8
After: Averaged 114577 operations/sec (+- 2.25%), total secs = 8
On 4 riscv CPUs:
Before: Averaged 175767 operations/sec (+- 0.62%), total secs = 8
After: Averaged 167396 operations/sec (+- 0.23%), total secs = 8
In conclusion, no one is likely to be upset over this change. After all,
spinlock was used originally for years, and the commit which converted to
rwlock didn't mention a real workload, just that the benchmark numbers are
nice.
This patch is not exactly the revert of commit a218cc4914 ("epoll: use
rwlock in order to reduce ep_poll_callback() contention"), because git
revert conflicts in some places which are not obvious on the resolution.
This patch is intended to be backported, therefore go with the obvious
approach:
- Replace rwlock_t with spinlock_t one to one
- Delete list_add_tail_lockless() and chain_epi_lockless(). These were
introduced to allow producers to concurrently add items to the list.
But now that spinlock no longer allows producers to touch the event
list concurrently, these two functions are not necessary anymore.
Fixes: a218cc4914 ("epoll: use rwlock in order to reduce ep_poll_callback() contention")
Signed-off-by: Nam Cao <namcao@linutronix.de>
Link: https://lore.kernel.org/ec92458ea357ec503c737ead0f10b2c6e4c37d47.1752581388.git.namcao@linutronix.de
Tested-by: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: stable@vger.kernel.org
Reported-by: Frederic Weisbecker <frederic@kernel.org>
Closes: https://lore.kernel.org/linux-rt-users/20210825132754.GA895675@lothringen/ [1]
Reported-by: Valentin Schneider <vschneid@redhat.com>
Closes: https://lore.kernel.org/linux-rt-users/xhsmhttqvnall.mognet@vschneid.remote.csb/ [2]
Signed-off-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 69e5d50fcf upstream.
The cpufreq_cpu_put() call in update_qos_request() takes place too early
because the latter subsequently calls freq_qos_update_request() that
indirectly accesses the policy object in question through the QoS request
object passed to it.
Fortunately, update_qos_request() is called under intel_pstate_driver_lock,
so this issue does not matter for changing the intel_pstate operation
mode, but it theoretically can cause a crash to occur on CPU device hot
removal (which currently can only happen in virt, but it is formally
supported nevertheless).
Address this issue by modifying update_qos_request() to drop the
reference to the policy later.
Fixes: da5c504c7a ("cpufreq: intel_pstate: Implement QoS supported freq constraints")
Cc: 5.4+ <stable@vger.kernel.org> # 5.4+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Zihuan Zhang <zhangzihuan@kylinos.cn>
Link: https://patch.msgid.link/2255671.irdbgypaU6@rafael.j.wysocki
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f965d111e6 upstream.
If cppc_get_transition_latency() returns CPUFREQ_ETERNAL to indicate a
failure to retrieve the transition latency value from the platform
firmware, the CPPC cpufreq driver will use that value (converted to
microseconds) as the policy transition delay, but it is way too large
for any practical use.
Address this by making the driver use the cpufreq's default
transition latency value (in microseconds) as the transition delay
if CPUFREQ_ETERNAL is returned by cppc_get_transition_latency().
Fixes: d4f3388afd ("cpufreq / CPPC: Set platform specific transition_delay_us")
Cc: 5.19+ <stable@vger.kernel.org> # 5.19
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org>
Reviewed-by: Jie Zhan <zhanjie9@hisilicon.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Qais Yousef <qyousef@layalina.io>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 04ff48239f upstream.
With the introduction of clone3 in commit 7f192e3cd3 ("fork: add
clone3") the effective bit width of clone_flags on all architectures was
increased from 32-bit to 64-bit. However, the signature of the copy_*
helper functions (e.g., copy_sighand) used by copy_process was not
adapted.
As such, they truncate the flags on any 32-bit architectures that
supports clone3 (arc, arm, csky, m68k, microblaze, mips32, openrisc,
parisc32, powerpc32, riscv32, x86-32 and xtensa).
For copy_sighand with CLONE_CLEAR_SIGHAND being an actual u64
constant, this triggers an observable bug in kernel selftest
clone3_clear_sighand:
if (clone_flags & CLONE_CLEAR_SIGHAND)
in function copy_sighand within fork.c will always fail given:
unsigned long /* == uint32_t */ clone_flags
#define CLONE_CLEAR_SIGHAND 0x100000000ULL
This commit fixes the bug by always passing clone_flags to copy_sighand
via their declared u64 type, invariant of architecture-dependent integer
sizes.
Fixes: b612e5df45 ("clone3: add CLONE_CLEAR_SIGHAND")
Cc: stable@vger.kernel.org # linux-5.5+
Signed-off-by: Simon Schuster <schuster.simon@siemens-energy.com>
Link: https://lore.kernel.org/20250901-nios2-implement-clone3-v2-1-53fcf5577d57@siemens-energy.com
Acked-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f5225a34bd upstream.
The mhi_ep_read_channel function incorrectly assumes the End of Transfer
(EOT) bit is present for each packet in a chained transactions, causing
it to advance mhi_chan->rd_offset beyond wr_offset during host-to-device
transfers when EOT has not yet arrived. This leads to access of unmapped
host memory, causing IOMMU faults and processing of stale TREs.
Modify the loop condition to ensure mhi_queue is not empty, allowing the
function to process only valid TREs up to the current write pointer to
prevent premature reads and ensure safe traversal of chained TREs.
Due to this change, buf_left needs to be removed from the while loop
condition to avoid exiting prematurely before reading the ring completely,
and also remove write_offset since it will always be zero because the new
cache buffer is allocated every time.
Fixes: 5301258899 ("bus: mhi: ep: Add support for reading from the host")
Co-developed-by: Akhil Vinod <akhil.vinod@oss.qualcomm.com>
Signed-off-by: Akhil Vinod <akhil.vinod@oss.qualcomm.com>
Signed-off-by: Sumit Kumar <sumit.kumar@oss.qualcomm.com>
[mani: reworded description slightly]
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@oss.qualcomm.com>
Reviewed-by: Krishna Chaitanya Chundru <krishna.chundru@oss.qualcomm.com>
Cc: stable@vger.kernel.org
Link: https://patch.msgid.link/20250910-final_chained-v3-1-ec77c9d88ace@oss.qualcomm.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit dff4f9ff5d upstream.
The function btrfs_encode_fh() does not properly account for the three
cases it handles.
Before writing to the file handle (fh), the function only returns to the
user BTRFS_FID_SIZE_NON_CONNECTABLE (5 dwords, 20 bytes) or
BTRFS_FID_SIZE_CONNECTABLE (8 dwords, 32 bytes).
However, when a parent exists and the root ID of the parent and the
inode are different, the function writes BTRFS_FID_SIZE_CONNECTABLE_ROOT
(10 dwords, 40 bytes).
If *max_len is not large enough, this write goes out of bounds because
BTRFS_FID_SIZE_CONNECTABLE_ROOT is greater than
BTRFS_FID_SIZE_CONNECTABLE originally returned.
This results in an 8-byte out-of-bounds write at
fid->parent_root_objectid = parent_root_id.
A previous attempt to fix this issue was made but was lost.
https://lore.kernel.org/all/4CADAEEC020000780001B32C@vpn.id2.novell.com/
Although this issue does not seem to be easily triggerable, it is a
potential memory corruption bug that should be fixed. This patch
resolves the issue by ensuring the function returns the appropriate size
for all three cases and validates that *max_len is large enough before
writing any data.
Fixes: be6e8dc0ba ("NFS support for btrfs - v3")
CC: stable@vger.kernel.org # 3.0+
Signed-off-by: Anderson Nascimento <anderson@allelesecurity.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ddbfac1528 upstream.
The point of isolating code that uses kernel mode FPU in separate
compilation units is to ensure that even implicit uses of, e.g., SIMD
registers for spilling occur only in a context where this is permitted,
i.e., from inside a kernel_fpu_begin/end block.
This is important on arm64, which uses -mgeneral-regs-only to build all
kernel code, with the exception of such compilation units where FP or
SIMD registers are expected to be used. Given that the compiler may
invent uses of FP/SIMD anywhere in such a unit, none of its code may be
accessible from outside a kernel_fpu_begin/end block.
This means that all callers into such compilation units must use the
DC_FP start/end macros, which must not occur there themselves. For
robustness, all functions with external linkage that reside there should
call dc_assert_fp_enabled() to assert that the FPU context was set up
correctly.
Fix this for the DCN35, DCN351 and DCN36 implementations.
Cc: Austin Zheng <austin.zheng@amd.com>
Cc: Jun Lei <jun.lei@amd.com>
Cc: Harry Wentland <harry.wentland@amd.com>
Cc: Leo Li <sunpeng.li@amd.com>
Cc: Rodrigo Siqueira <siqueira@igalia.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: amd-gfx@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2d1684a077 upstream.
Currently this is hidden behind perfmon_capable() since this is
technically an info leak, given that this is a system wide metric.
However the granularity reported here is always PAGE_SIZE aligned, which
matches what the core kernel is already willing to expose to userspace
if querying how many free RAM pages there are on the system, and that
doesn't need any special privileges. In addition other drm drivers seem
happy to expose this.
The motivation here if with oneAPI where they want to use the system
wide 'used' reporting here, so not the per-client fdinfo stats. This has
also come up with some perf overlay applications wanting this
information.
Fixes: 1105ac15d2 ("drm/xe/uapi: restrict system wide accounting")
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Joshua Santosh <joshua.santosh.ranjan@intel.com>
Cc: José Roberto de Souza <jose.souza@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: <stable@vger.kernel.org> # v6.8+
Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://lore.kernel.org/r/20250919122052.420979-2-matthew.auld@intel.com
(cherry picked from commit 4d0b035fd6dae8ee48e9c928b10f14877e595356)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e4bea91958 upstream.
In `nouveau_bo_move_prep`, if `nouveau_mem_map` fails, an error code
should be returned. Currently, it returns zero even if vmm addr is not
correctly mapped.
Cc: stable@vger.kernel.org
Reviewed-by: Petr Vorel <pvorel@suse.cz>
Signed-off-by: Shuhao Fu <sfual@cse.ust.hk>
Fixes: 9ce523cc3b ("drm/nouveau: separate buffer object backing memory from nvkm structures")
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d83f1d19c8 upstream.
Remove fixed PPI lane count setup. The R-Car DSI host is capable
of operating in 1..4 DSI lane mode. Remove the hard-coded 4-lane
configuration from PPI register settings and instead configure
the PPI lane count according to lane count information already
obtained by this driver instance.
Configure TXSETR register to match PPI lane count. The R-Car V4H
Reference Manual R19UH0186EJ0121 Rev.1.21 section 67.2.2.3 Tx Set
Register (TXSETR), field LANECNT description indicates that the
TXSETR register LANECNT bitfield lane count must be configured
such, that it matches lane count configuration in PPISETR register
DLEN bitfield. Make sure the LANECNT and DLEN bitfields are
configured to match.
Fixes: 155358310f ("drm: rcar-du: Add R-Car DSI driver")
Cc: stable@vger.kernel.org
Signed-off-by: Marek Vasut <marek.vasut+renesas@mailbox.org>
Reviewed-by: Tomi Valkeinen <tomi.valkeinen+renesas@ideasonboard.com>
Link: https://lore.kernel.org/r/20250813210840.97621-1-marek.vasut+renesas@mailbox.org
Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f248d5d515 upstream.
Since the PDC resides out of the GPU subsystem and cannot be reset in
case it enters bad state, utmost care must be taken to trigger the PDC
wake/sleep routines in the correct order.
The PDC wake sequence can be exercised only after a PDC sleep sequence.
Additionally, GMU firmware should initialize a few registers before the
KMD can trigger a PDC sleep sequence. So PDC sleep can't be done if the
GMU firmware has not initialized. Track these dependencies using a new
status variable and trigger PDC sleep/wake sequences appropriately.
Cc: stable@vger.kernel.org
Fixes: 4b565ca5a2 ("drm/msm: Add A6XX device support")
Signed-off-by: Akhil P Oommen <akhilpo@oss.qualcomm.com>
Patchwork: https://patchwork.freedesktop.org/patch/673362/
Signed-off-by: Rob Clark <robin.clark@oss.qualcomm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e1361a4f1b upstream.
Condition guards are found to be redundant, as the call flow is properly
managed now, as also observed in the Exynos5433 DECON driver. Since
state checking is no longer necessary, remove it.
This also fixes an issue which prevented decon_commit() from
decon_atomic_enable() due to an incorrect state change setting.
Fixes: 96976c3d9a ("drm/exynos: Add DECON driver")
Cc: stable@vger.kernel.org
Suggested-by: Inki Dae <inki.dae@samsung.com>
Signed-off-by: Kaustabh Chakraborty <kauschluss@disroot.org>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4f4098c57e upstream.
When cdev_device_add() failed, calling put_device() to explicitly
release dev->lirc_dev. Otherwise, it could cause the fault of the
reference count.
Found by code review.
Cc: stable@vger.kernel.org
Fixes: a6ddd4fecb ("media: lirc: remove last remnants of lirc kapi")
Signed-off-by: Ma Ke <make24@iscas.ac.cn>
Signed-off-by: Sean Young <sean@mess.org>
Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3e743cd0a7 upstream.
We don't use OF ports and remote-endpoints to connect the CSI2RX bridge
and this device in the device tree, thus it is wrong to use
v4l2_create_fwnode_links_to_pad() to create the media graph link between
the two.
It works out on accident, as neither the source nor the sink implement
the .get_fwnode_pad() callback, and the framework helper falls back on
using the first source and sink pads to create the link between them.
Instead, manually create the media link from the first source pad of the
bridge to the first sink pad of the J721E CSI2RX.
Fixes: b4a3d877dc ("media: ti: Add CSI2RX support for J721E")
Cc: stable@vger.kernel.org
Reviewed-by: Devarsh Thakkar <devarsht@ti.com>
Tested-by: Yemike Abhilash Chandra <y-abhilashchandra@ti.com> (on SK-AM68)
Signed-off-by: Jai Luthra <jai.luthra@ideasonboard.com>
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4bd8a61476 upstream.
The vivid driver supports the <Vendor Command With ID> message,
but if the Vendor ID of the received message didn't match the Vendor ID
of the CEC Adapter, then it ignores it (good) and returns 0 (bad).
It should return -ENOMSG to indicate that other followers should be
asked to handle it. Return code 0 means that the driver handled it,
which is wrong in this case.
As a result, userspace followers never get the chance to process such a
message.
Refactor the code a bit to have the function return -ENOMSG at the end,
drop the default case, and ensure that the message handlers return 0.
That way 0 is only returned if the message is actually handled in the
vivid_received() function.
Fixes: 812765cd69 ("media: vivid: add <Vendor Command With ID> support")
Cc: stable@vger.kernel.org
Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 93f213b444 upstream.
When starting venus with the "no_tz" code path, IRIS2 needs the same
boot/reset sequence as IRIS2_1. This is because most of the registers were
moved to the "wrapper_tz_base", which is already defined for both IRIS2 and
IRIS2_1 inside core.c. Add IRIS2 to the checks inside firmware.c as well to
make sure that it uses the correct reset sequence.
Both IRIS2 and IRIS2_1 are HFI v6 variants, so the correct sequence was
used before commit c38610f898 ("media: venus: firmware: Sanitize
per-VPU-version").
Fixes: c38610f898 ("media: venus: firmware: Sanitize per-VPU-version")
Cc: stable@vger.kernel.org
Signed-off-by: Stephan Gerhold <stephan.gerhold@linaro.org>
Reviewed-by: Vikash Garodia <quic_vgarodia@quicinc.com>
Reviewed-by: Dikshita Agarwal <quic_dikshita@quicinc.com>
Reviewed-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
[bod: Fixed commit log IRIS -> IRIS2]
Signed-off-by: Bryan O'Donoghue <bod@kernel.org>
Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 895d3b4b58 upstream.
The PM usage counter of isys was bumped up when start camera stream
(opening firmware) but it was not dropped after stream stop(closing
firmware), it forbids system fail to suspend due to the wrong PM state
of ISYS. This patch drop the PM usage counter in firmware close to fix
it.
Cc: Stable@vger.kernel.org
Fixes: a516d36bdc ("media: staging/ipu7: add IPU7 input system device driver")
Signed-off-by: Bingbu Cao <bingbu.cao@intel.com>
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7fa37ba25a upstream.
The s5p_mfc_cmd_args structure in the v6 driver is never used, not
initialized to anything other than zero, but as of clang-21 this
causes a warning:
drivers/media/platform/samsung/s5p-mfc/s5p_mfc_cmd_v6.c:45:7: error: variable 'h2r_args' is uninitialized when passed as a const pointer argument here [-Werror,-Wuninitialized-const-pointer]
45 | &h2r_args);
| ^~~~~~~~
Just remove this for simplicity. Since the function is also called
through a callback, this does require adding a trivial wrapper with
the correct prototype.
Fixes: f96f3cfa0b ("[media] s5p-mfc: Update MFC v4l2 driver to support MFC6.x")
Cc: stable@vger.kernel.org
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit bbcc6d16de upstream.
Commit 4a81656c8e ("arm64: dts: mediatek: mt8188: Address binding
warnings for MDP3 nodes") caused a regression on the MDP functionality
when it removed the MT8195 compatibles from the MDP3 nodes, since the
MT8188 compatible was not yet listed as a possible MDP component
compatible in mdp_comp_dt_ids. This resulted in an empty output
bitstream when using the MDP from userspace, as well as the following
errors:
mtk-mdp3 14001000.dma-controller: Uninit component inner id 4
mtk-mdp3 14001000.dma-controller: mdp_path_ctx_init error 0
mtk-mdp3 14001000.dma-controller: CMDQ sendtask failed: -22
Add the missing compatible to the array to restore functionality.
Fixes: 4a81656c8e ("arm64: dts: mediatek: mt8188: Address binding warnings for MDP3 nodes")
Cc: stable@vger.kernel.org
Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>
Signed-off-by: Nicolas Dufresne <nicolas.dufresne@collabora.com>
Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1069a4fe63 upstream.
The DMA map functions can fail and should be tested for errors.
If the mapping fails, free blanking_ptr and set it to 0. As 0 is a
valid DMA address, use blanking_ptr to test if the DMA address
is set.
Fixes: 1a0adaf37c ("V4L/DVB (5345): ivtv driver for Conexant cx23416/cx23415 MPEG encoder/decoder")
Cc: stable@vger.kernel.org
Signed-off-by: Thomas Fourier <fourier.thomas@gmail.com>
Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit bacd713145 upstream.
Change "ret" from unsigned int to int type in mt9v111_calc_frame_rate()
to store negative error codes or zero returned by __mt9v111_hw_reset()
and other functions.
Storing the negative error codes in unsigned type, doesn't cause an issue
at runtime but it's ugly as pants.
No effect on runtime.
Signed-off-by: Qianfeng Rong <rongqianfeng@vivo.com>
Fixes: aab7ed1c39 ("media: i2c: Add driver for Aptina MT9V111")
Cc: stable@vger.kernel.org
Reviewed-by: Jacopo Mondi <jacopo.mondi@ideasonboard.com>
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 075710b670 upstream.
The mediabus code is device dependent, but the probe() function
thought that device_get_match_data() would return the code directly,
when in fact it returned a pointer to a struct mt9p031_model_info.
As a result, the initial mbus code was garbage.
Tested with a BeagleBoard xM and a Leopard Imaging LI-5M03 sensor board.
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org>
Tested-by: Hans Verkuil <hverkuil+cisco@kernel.org>
Fixes: a80b1bbff8 ("media: mt9p031: Refactor format handling for different sensor models")
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d5d12cc03e upstream.
Delete the external-module style Makefile commands. They are not needed
for in-tree modules.
This is the only Makefile in the kernel tree (aside from tools/ and
samples/) that uses this Makefile style.
The same files are built with or without this patch.
Fixes: 056f2821b6 ("media: cec: extron-da-hd-4k-plus: add the Extron DA HD 4K Plus CEC driver")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: stable@vger.kernel.org
Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3fcc8e1469 upstream.
VIRQs come in 3 flavors, per-VPU, per-domain, and global, and the VIRQs
are tracked in per-cpu virq_to_irq arrays.
Per-domain and global VIRQs must be bound on CPU 0, and
bind_virq_to_irq() sets the per_cpu virq_to_irq at registration time
Later, the interrupt can migrate, and info->cpu is updated. When
calling __unbind_from_irq(), the per-cpu virq_to_irq is cleared for a
different cpu. If bind_virq_to_irq() is called again with CPU 0, the
stale irq is returned. There won't be any irq_info for the irq, so
things break.
Make xen_rebind_evtchn_to_cpu() update the per_cpu virq_to_irq mappings
to keep them update to date with the current cpu. This ensures the
correct virq_to_irq is cleared in __unbind_from_irq().
Fixes: e46cdb66c8 ("xen: event channels")
Cc: stable@vger.kernel.org
Signed-off-by: Jason Andryuk <jason.andryuk@amd.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Message-ID: <20250828003604.8949-4-jason.andryuk@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 07ce121d93 upstream.
Change find_virq() to return -EEXIST when a VIRQ is bound to a
different CPU than the one passed in. With that, remove the BUG_ON()
from bind_virq_to_irq() to propogate the error upwards.
Some VIRQs are per-cpu, but others are per-domain or global. Those must
be bound to CPU0 and can then migrate elsewhere. The lookup for
per-domain and global will probably fail when migrated off CPU 0,
especially when the current CPU is tracked. This now returns -EEXIST
instead of BUG_ON().
A second call to bind a per-domain or global VIRQ is not expected, but
make it non-fatal to avoid trying to look up the irq, since we don't
know which per_cpu(virq_to_irq) it will be in.
Cc: stable@vger.kernel.org
Signed-off-by: Jason Andryuk <jason.andryuk@amd.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Message-ID: <20250828003604.8949-3-jason.andryuk@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f770c3d858 upstream.
The device power management API has the following asymmetry:
* dpm_suspend_start() does not clean up on failure
(it requires a call to dpm_resume_end())
* dpm_suspend_end() does clean up on failure
(it does not require a call to dpm_resume_start())
The asymmetry was introduced by commit d8f3de0d24 ("Suspend-related
patches for 2.6.27") in June 2008: It removed a call to device_resume()
from device_suspend() (which was later renamed to dpm_suspend_start()).
When Xen began using the device power management API in May 2008 with
commit 0e91398f2a ("xen: implement save/restore"), the asymmetry did
not yet exist. But since it was introduced, a call to dpm_resume_end()
is missing in the error path of dpm_suspend_start(). Fix it.
Fixes: d8f3de0d24 ("Suspend-related patches for 2.6.27")
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Cc: stable@vger.kernel.org # v2.6.27
Reviewed-by: "Rafael J. Wysocki (Intel)" <rafael@kernel.org>
Signed-off-by: Juergen Gross <jgross@suse.com>
Message-ID: <22453676d1ddcebbe81641bb68ddf587fee7e21e.1756990799.git.lukas@wunner.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 08df2d7dd4 upstream.
rc is overwritten by the evtchn_status hypercall in each iteration, so
the return value will be whatever the last iteration is. This could
incorrectly return success even if the event channel was not found.
Change to an explicit -ENOENT for an un-found virq and return 0 on a
successful match.
Fixes: 62cc5fc7b2 ("xen/pv-on-hvm kexec: rebind virqs to existing eventchannel ports")
Cc: stable@vger.kernel.org
Signed-off-by: Jason Andryuk <jason.andryuk@amd.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Message-ID: <20250828003604.8949-2-jason.andryuk@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 29da8c823a upstream.
Prior to running an SEV-ES guest, set TSC_AUX in the host save area to the
current value in hardware, as tracked by the user return infrastructure,
instead of always loading the host's desired value for the CPU. If the
pCPU is also running a non-SEV-ES vCPU, loading the host's value on #VMEXIT
could clobber the other vCPU's value, e.g. if the SEV-ES vCPU preempted
the non-SEV-ES vCPU, in which case KVM expects the other vCPU's TSC_AUX
value to be resident in hardware.
Note, unlike TDX, which blindly _zeroes_ TSC_AUX on TD-Exit, SEV-ES CPUs
can load an arbitrary value. Stuff the current value in the host save
area instead of refreshing the user return cache so that KVM doesn't need
to track whether or not the vCPU actually enterred the guest and thus
loaded TSC_AUX from the host save area.
Opportunistically tag tsc_aux_uret_slot as read-only after init to guard
against unexpected modifications, and to make it obvious that using the
variable in sev_es_prepare_switch_to_guest() is safe.
Fixes: 916e3e5f26 ("KVM: SVM: Do not use user return MSR support for virtualized TSC_AUX")
Cc: stable@vger.kernel.org
Suggested-by: Lai Jiangshan <jiangshan.ljs@antgroup.com>
Signed-off-by: Hou Wenlong <houwenlong.hwl@antgroup.com>
[sean: handle the SEV-ES case in sev_es_prepare_switch_to_guest()]
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Link: https://lore.kernel.org/r/20250923153738.1875174-3-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0dccbc75e1 upstream.
When running as an SNP or TDX guest under KVM, force the legacy PCI hole,
i.e. memory between Top of Lower Usable DRAM and 4GiB, to be mapped as UC
via a forced variable MTRR range.
In most KVM-based setups, legacy devices such as the HPET and TPM are
enumerated via ACPI. ACPI enumeration includes a Memory32Fixed entry, and
optionally a SystemMemory descriptor for an OperationRegion, e.g. if the
device needs to be accessed via a Control Method.
If a SystemMemory entry is present, then the kernel's ACPI driver will
auto-ioremap the region so that it can be accessed at will. However, the
ACPI spec doesn't provide a way to enumerate the memory type of
SystemMemory regions, i.e. there's no way to tell software that a region
must be mapped as UC vs. WB, etc. As a result, Linux's ACPI driver always
maps SystemMemory regions using ioremap_cache(), i.e. as WB on x86.
The dedicated device drivers however, e.g. the HPET driver and TPM driver,
want to map their associated memory as UC or WC, as accessing PCI devices
using WB is unsupported.
On bare metal and non-CoCO, the conflicting requirements "work" as firmware
configures the PCI hole (and other device memory) to be UC in the MTRRs.
So even though the ACPI mappings request WB, they are forced to UC- in the
kernel's tracking due to the kernel properly handling the MTRR overrides,
and thus are compatible with the drivers' requested WC/UC-.
With force WB MTRRs on SNP and TDX guests, the ACPI mappings get their
requested WB if the ACPI mappings are established before the dedicated
driver code attempts to initialize the device. E.g. if acpi_init()
runs before the corresponding device driver is probed, ACPI's WB mapping
will "win", and result in the driver's ioremap() failing because the
existing WB mapping isn't compatible with the requested WC/UC-.
E.g. when a TPM is emulated by the hypervisor (ignoring the security
implications of relying on what is allegedly an untrusted entity to store
measurements), the TPM driver will request UC and fail:
[ 1.730459] ioremap error for 0xfed40000-0xfed45000, requested 0x2, got 0x0
[ 1.732780] tpm_tis MSFT0101:00: probe with driver tpm_tis failed with error -12
Note, the '0x2' and '0x0' values refer to "enum page_cache_mode", not x86's
memtypes (which frustratingly are an almost pure inversion; 2 == WB, 0 == UC).
E.g. tracing mapping requests for TPM TIS yields:
Mapping TPM TIS with req_type = 0
WARNING: CPU: 22 PID: 1 at arch/x86/mm/pat/memtype.c:530 memtype_reserve+0x2ab/0x460
Modules linked in:
CPU: 22 UID: 0 PID: 1 Comm: swapper/0 Tainted: G W 6.16.0-rc7+ #2 VOLUNTARY
Tainted: [W]=WARN
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 05/29/2025
RIP: 0010:memtype_reserve+0x2ab/0x460
__ioremap_caller+0x16d/0x3d0
ioremap_cache+0x17/0x30
x86_acpi_os_ioremap+0xe/0x20
acpi_os_map_iomem+0x1f3/0x240
acpi_os_map_memory+0xe/0x20
acpi_ex_system_memory_space_handler+0x273/0x440
acpi_ev_address_space_dispatch+0x176/0x4c0
acpi_ex_access_region+0x2ad/0x530
acpi_ex_field_datum_io+0xa2/0x4f0
acpi_ex_extract_from_field+0x296/0x3e0
acpi_ex_read_data_from_field+0xd1/0x460
acpi_ex_resolve_node_to_value+0x2ee/0x530
acpi_ex_resolve_to_value+0x1f2/0x540
acpi_ds_evaluate_name_path+0x11b/0x190
acpi_ds_exec_end_op+0x456/0x960
acpi_ps_parse_loop+0x27a/0xa50
acpi_ps_parse_aml+0x226/0x600
acpi_ps_execute_method+0x172/0x3e0
acpi_ns_evaluate+0x175/0x5f0
acpi_evaluate_object+0x213/0x490
acpi_evaluate_integer+0x6d/0x140
acpi_bus_get_status+0x93/0x150
acpi_add_single_object+0x43a/0x7c0
acpi_bus_check_add+0x149/0x3a0
acpi_bus_check_add_1+0x16/0x30
acpi_ns_walk_namespace+0x22c/0x360
acpi_walk_namespace+0x15c/0x170
acpi_bus_scan+0x1dd/0x200
acpi_scan_init+0xe5/0x2b0
acpi_init+0x264/0x5b0
do_one_initcall+0x5a/0x310
kernel_init_freeable+0x34f/0x4f0
kernel_init+0x1b/0x200
ret_from_fork+0x186/0x1b0
ret_from_fork_asm+0x1a/0x30
</TASK>
The above traces are from a Google-VMM based VM, but the same behavior
happens with a QEMU based VM that is modified to add a SystemMemory range
for the TPM TIS address space.
The only reason this doesn't cause problems for HPET, which appears to
require a SystemMemory region, is because HPET gets special treatment via
x86_init.timers.timer_init(), and so gets a chance to create its UC-
mapping before acpi_init() clobbers things. Disabling the early call to
hpet_time_init() yields the same behavior for HPET:
[ 0.318264] ioremap error for 0xfed00000-0xfed01000, requested 0x2, got 0x0
Hack around the ACPI gap by forcing the legacy PCI hole to UC when
overriding the (virtual) MTRRs for CoCo guest, so that ioremap handling
of MTRRs naturally kicks in and forces the ACPI mappings to be UC.
Note, the requested/mapped memtype doesn't actually matter in terms of
accessing the device. In practically every setup, legacy PCI devices are
emulated by the hypervisor, and accesses are intercepted and handled as
emulated MMIO, i.e. never access physical memory and thus don't have an
effective memtype.
Even in a theoretical setup where such devices are passed through by the
host, i.e. point at real MMIO memory, it is KVM's (as the hypervisor)
responsibility to force the memory to be WC/UC, e.g. via EPT memtype
under TDX or real hardware MTRRs under SNP. Not doing so cannot work,
and the hypervisor is highly motivated to do the right thing as letting
the guest access hardware MMIO with WB would likely result in a variety
of fatal #MCs.
In other words, forcing the range to be UC is all about coercing the
kernel's tracking into thinking that it has established UC mappings, so
that the ioremap code doesn't reject mappings from e.g. the TPM driver and
thus prevent the driver from loading and the device from functioning.
Note #2, relying on guest firmware to handle this scenario, e.g. by setting
virtual MTRRs and then consuming them in Linux, is not a viable option, as
the virtual MTRR state is managed by the untrusted hypervisor, and because
OVMF at least has stopped programming virtual MTRRs when running as a TDX
guest.
Link: https://lore.kernel.org/all/8137d98e-8825-415b-9282-1d2a115bb51a@linux.intel.com
Fixes: 8e690b817e ("x86/kvm: Override default caching mode for SEV-SNP and TDX")
Cc: stable@vger.kernel.org
Cc: Peter Gonda <pgonda@google.com>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Jürgen Groß <jgross@suse.com>
Cc: Korakit Seemakhupt <korakit@google.com>
Cc: Jianxiong Gao <jxgao@google.com>
Cc: Nikolay Borisov <nik.borisov@suse.com>
Suggested-by: Binbin Wu <binbin.wu@linux.intel.com>
Reviewed-by: Binbin Wu <binbin.wu@linux.intel.com>
Tested-by: Korakit Seemakhupt <korakit@google.com>
Link: https://lore.kernel.org/r/20250828005249.39339-1-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5f9466b50c upstream.
The user_mem_abort() function acquires a page reference via
__kvm_faultin_pfn() early in its execution. However, the subsequent
checks for mismatched attributes between stage 1 and stage 2 mappings
would return an error code directly, bypassing the corresponding page
release.
Fix this by storing the error and releasing the unused page before
returning the error.
Fixes: 6d674e28f6 ("KVM: arm/arm64: Properly handle faulting of device mappings")
Fixes: 2a8dfab266 ("KVM: arm64: Block cacheable PFNMAP mapping")
Signed-off-by: Fuad Tabba <tabba@google.com>
Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5deafa27d9 upstream.
KVM run fails when guests with 'cmm' cpu feature and host are
under memory pressure and use swap heavily. This is because
npages becomes ENOMEN (out of memory) in hva_to_pfn_slow()
which inturn propagates as EFAULT to qemu. Clearing the page
table entry when discarding an address that maps to a swap
entry resolves the issue.
Fixes: 200197908d ("KVM: s390: Refactor and split some gmap helpers")
Cc: stable@vger.kernel.org
Suggested-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Signed-off-by: Gautam Gala <ggala@linux.ibm.com>
Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b3fe1c83a5 upstream.
CMN S3's DTM offset is different between r0px and r1p0, and it
turns out this was not a error in the earlier documentation, but
does actually exist in the design. Lovely.
Cc: stable@vger.kernel.org
Fixes: 0dc2f4963f ("perf/arm-cmn: Support CMN S3")
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 572ce54639 upstream.
The quirk version range is typically a string constant and must not be
modified (e.g. as it may be stored in read-only memory). Attempting
to do so can trigger faults such as:
| Unable to handle kernel write to read-only memory at virtual
| address ffffc036d998a947
Update the range parsing so that it operates on a copy of the version
range string, and mark all the quirk strings as const to reduce the
risk of introducing similar future issues.
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=220437
Fixes: 487c407d57 ("firmware: arm_scmi: Add common framework to handle firmware quirks")
Cc: stable@vger.kernel.org # 6.16
Cc: Cristian Marussi <cristian.marussi@arm.com>
Reported-by: Jan Palus <jpalus@fastmail.com>
Signed-off-by: Johan Hovold <johan@kernel.org>
Message-Id: <20250829132152.28218-1-johan@kernel.org>
[sudeep.holla: minor commit message rewording; switch to cleanup helpers]
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8a6506e1ba upstream.
There is an issue possible where TI AM33xx SoCs do not boot properly after
a reset if EMU0/EMU1 pins were used as GPIO and have been driving low level
actively prior to reset [1].
"Advisory 1.0.36 EMU0 and EMU1: Terminals Must be Pulled High Before
ICEPick Samples
The state of the EMU[1:0] terminals are latched during reset to determine
ICEPick boot mode. For normal device operation, these terminals must be
pulled up to a valid high logic level ( > VIH min) before ICEPick samples
the state of these terminals, which occurs
[five CLK_M_OSC clock cycles - 10 ns] after the falling edge of WARMRSTn.
Many applications may not require the secondary GPIO function of the
EMU[1:0] terminals. In this case, they would only be connected to pull-up
resistors, which ensures they are always high when ICEPick samples.
However, some applications may need to use these terminals as GPIO where
they could be driven low before reset is asserted. This usage of the
EMU[1:0] terminals may require special attention to ensure the terminals
are allowed to return to a valid high-logic level before ICEPick samples
the state of these terminals.
When any device reset is asserted, the pin mux mode of EMU[1:0] terminals
configured to operate as GPIO (mode 7) will change back to EMU input
(mode 0) on the falling edge of WARMRSTn. This only provides a short period
of time for the terminals to return high if driven low before reset is
asserted...
If the EMU[1:0] terminals are configured to operate as GPIO, the product
should be designed such these terminals can be pulled to a valid high-logic
level within 190 ns after the falling edge of WARMRSTn."
We've noticed this problem with custom am335x hardware in combination with
recently implemented cold reset method
(commit 6521f6a195 ("ARM: AM33xx: PRM: Implement REBOOT_COLD")).
It looks like the problem can affect other HW, for instance AM335x
Chiliboard, because the latter has LEDs on GPIO3_7/GPIO3_8 as well.
One option would be to check if the pins are in GPIO mode and either switch
to output active high, or switch to input and poll until the external
pull-ups have brought the pins to the desired high state. But fighting
with GPIO driver for these pins is probably not the most straight forward
approch in a reboot handler.
Fortunately we can easily control pinmuxing here and rely on the external
pull-ups. TI recommends 4k7 external pull up resistors [2] and even with
quite conservative estimation for pin capacity (1 uF should never happen)
the required delay shall not exceed 5ms.
[1] Link: https://www.ti.com/lit/pdf/sprz360
[2] Link: https://e2e.ti.com/support/processors-group/processors/f/processors-forum/866346/am3352-emu-1-0-questions
Cc: stable@vger.kernel.org
Signed-off-by: Alexander Sverdlin <alexander.sverdlin@siemens.com>
Link: https://lore.kernel.org/r/20250717152708.487891-1-alexander.sverdlin@siemens.com
Signed-off-by: Kevin Hilman <khilman@baylibre.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f620d66af3 upstream.
Commit 68d54ceeec ("arm64: mte: Allow PTRACE_PEEKMTETAGS access to the
zero page") attempted to fix ptrace() reading of tags from the zero page
by marking it as PG_mte_tagged during cpu_enable_mte(). The same commit
also changed the ptrace() tag access permission check to the VM_MTE vma
flag while turning the page flag test into a WARN_ON_ONCE().
Attempting to set the PG_mte_tagged flag early with
CONFIG_DEFERRED_STRUCT_PAGE_INIT enabled may either hang (after commit
d77e59a8fc "arm64: mte: Lock a page for MTE tag initialisation") or
have the flags cleared later during page_alloc_init_late(). In addition,
pages_identical() -> memcmp_pages() will reject any comparison with the
zero page as it is marked as tagged.
Partially revert the above commit to avoid setting PG_mte_tagged on the
zero page. Update the __access_remote_tags() warning on untagged pages
to ignore the zero page since it is known to have the tags initialised.
Note that all user mapping of the zero page are marked as pte_special().
The arm64 set_pte_at() will not call mte_sync_tags() on such pages, so
PG_mte_tagged will remain cleared.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Fixes: 68d54ceeec ("arm64: mte: Allow PTRACE_PEEKMTETAGS access to the zero page")
Reported-by: Gergely Kovacs <Gergely.Kovacs2@arm.com>
Cc: stable@vger.kernel.org # 5.10.x
Cc: Will Deacon <will@kernel.org>
Cc: David Hildenbrand <david@redhat.com>
Cc: Lance Yang <lance.yang@linux.dev>
Acked-by: Lance Yang <lance.yang@linux.dev>
Reviewed-by: David Hildenbrand <david@redhat.com>
Tested-by: Lance Yang <lance.yang@linux.dev>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f73c82c855 upstream.
On most MSM8939 devices, the bootloader already initializes the display to
show the boot splash screen. In this situation, MDSS is already configured
and left running when starting Linux. To avoid side effects from the
bootloader configuration, the MDSS reset can be specified in the device
tree to start again with a clean hardware state.
The reset for MDSS is currently missing in msm8939.dtsi, which causes
errors when the MDSS driver tries to re-initialize the registers:
dsi_err_worker: status=6
dsi_err_worker: status=6
dsi_err_worker: status=6
...
It turns out that we have always indirectly worked around this by building
the MDSS driver as a module. Before v6.17, the power domain was temporarily
turned off until the module was loaded, long enough to clear the register
contents. In v6.17, power domains are not turned off during boot until
sync_state() happens, so this is no longer working. Even before v6.17 this
resulted in broken behavior, but notably only when the MDSS driver was
built-in instead of a module.
Cc: stable@vger.kernel.org
Fixes: 61550c6c15 ("arm64: dts: qcom: Add msm8939 SoC")
Signed-off-by: Stephan Gerhold <stephan.gerhold@linaro.org>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20250915-msm8916-resets-v1-2-a5c705df0c45@linaro.org
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 99b78773c2 upstream.
On most MSM8916 devices (aside from the DragonBoard 410c), the bootloader
already initializes the display to show the boot splash screen. In this
situation, MDSS is already configured and left running when starting Linux.
To avoid side effects from the bootloader configuration, the MDSS reset can
be specified in the device tree to start again with a clean hardware state.
The reset for MDSS is currently missing in msm8916.dtsi, which causes
errors when the MDSS driver tries to re-initialize the registers:
dsi_err_worker: status=6
dsi_err_worker: status=6
dsi_err_worker: status=6
...
It turns out that we have always indirectly worked around this by building
the MDSS driver as a module. Before v6.17, the power domain was temporarily
turned off until the module was loaded, long enough to clear the register
contents. In v6.17, power domains are not turned off during boot until
sync_state() happens, so this is no longer working. Even before v6.17 this
resulted in broken behavior, but notably only when the MDSS driver was
built-in instead of a module.
Cc: stable@vger.kernel.org
Fixes: 305410ffd1 ("arm64: dts: msm8916: Add display support")
Signed-off-by: Stephan Gerhold <stephan.gerhold@linaro.org>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20250915-msm8916-resets-v1-1-a5c705df0c45@linaro.org
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 399dbcadc0 upstream.
There is no synchronization between different code paths in the ACPI
battery driver that update its sysfs interface or its power supply
class device interface. In some cases this results to functional
failures due to race conditions.
One example of this is when two ACPI notifications:
- ACPI_BATTERY_NOTIFY_STATUS (0x80)
- ACPI_BATTERY_NOTIFY_INFO (0x81)
are triggered (by the platform firmware) in a row with a little delay
in between after removing and reinserting a laptop battery. Both
notifications cause acpi_battery_update() to be called and if the delay
between them is sufficiently small, sysfs_add_battery() can be re-entered
before battery->bat is set which leads to a duplicate sysfs entry error:
sysfs: cannot create duplicate filename '/devices/LNXSYSTM:00/LNXSYBUS:00/PNP0C0A:00/power_supply/BAT1'
CPU: 1 UID: 0 PID: 185 Comm: kworker/1:4 Kdump: loaded Not tainted 6.12.38+deb13-amd64 #1 Debian 6.12.38-1
Hardware name: Gateway NV44 /SJV40-MV , BIOS V1.3121 04/08/2009
Workqueue: kacpi_notify acpi_os_execute_deferred
Call Trace:
<TASK>
dump_stack_lvl+0x5d/0x80
sysfs_warn_dup.cold+0x17/0x23
sysfs_create_dir_ns+0xce/0xe0
kobject_add_internal+0xba/0x250
kobject_add+0x96/0xc0
? get_device_parent+0xde/0x1e0
device_add+0xe2/0x870
__power_supply_register.part.0+0x20f/0x3f0
? wake_up_q+0x4e/0x90
sysfs_add_battery+0xa4/0x1d0 [battery]
acpi_battery_update+0x19e/0x290 [battery]
acpi_battery_notify+0x50/0x120 [battery]
acpi_ev_notify_dispatch+0x49/0x70
acpi_os_execute_deferred+0x1a/0x30
process_one_work+0x177/0x330
worker_thread+0x251/0x390
? __pfx_worker_thread+0x10/0x10
kthread+0xd2/0x100
? __pfx_kthread+0x10/0x10
ret_from_fork+0x34/0x50
? __pfx_kthread+0x10/0x10
ret_from_fork_asm+0x1a/0x30
</TASK>
kobject: kobject_add_internal failed for BAT1 with -EEXIST, don't try to register things with the same name in the same directory.
There are also other scenarios in which analogous issues may occur.
Address this by using a common lock in all of the code paths leading
to updates of driver interfaces: ACPI Notify () handler, system resume
callback and post-resume notification, device addition and removal.
This new lock replaces sysfs_lock that has been used only in
sysfs_remove_battery() which now is going to be always called under
the new lock, so it doesn't need any internal locking any more.
Fixes: 1066625155 ("ACPI: battery: Install Notify() handler directly")
Closes: https://lore.kernel.org/linux-acpi/20250910142653.313360-1-luogf2025@163.com/
Reported-by: GuangFei Luo <luogf2025@163.com>
Tested-by: GuangFei Luo <luogf2025@163.com>
Cc: 6.6+ <stable@vger.kernel.org> # 6.6+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 496f9372ea upstream.
In the ACPI debugger interface, the helper functions for read and write
operations use "int" as the length parameter data type. When a large
"size_t count" is passed from the file operations, this cast to "int"
results in truncation and a negative value due to signed integer
representation.
Logically, this negative number propagates to the min() calculation,
where it is selected over the positive buffer space value, leading to
unexpected behavior. Subsequently, when this negative value is used in
copy_to_user() or copy_from_user(), it is interpreted as a large positive
value due to the unsigned nature of the size parameter in these functions,
causing the copy operations to attempt handling sizes far beyond the
intended buffer limits.
Address the issue by:
- Changing the length parameters in acpi_aml_read_user() and
acpi_aml_write_user() from "int" to "size_t", aligning with the
expected unsigned size semantics.
- Updating return types and local variables in acpi_aml_read() and
acpi_aml_write() to "ssize_t" for consistency with kernel file
operation conventions.
- Using "size_t" for the "n" variable to ensure calculations remain
unsigned.
- Using min_t() for circ_count_to_end() and circ_space_to_end() to
ensure type-safe comparisons and prevent integer overflow.
Signed-off-by: Amir Mohammad Jahangirzad <a.jahangirzad@gmail.com>
Link: https://patch.msgid.link/20250923013113.20615-1-a.jahangirzad@gmail.com
[ rjw: Changelog tweaks, local variable definitions ordering adjustments ]
Fixes: 8cfb0cdf07 ("ACPI / debugger: Add IO interface to access debugger functionalities")
Cc: 4.5+ <stable@vger.kernel.org> # 4.5+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d0759b1098 upstream.
The ACPI handle passed to acpi_extract_properties() as the first
argument represents the ACPI namespace scope in which to look for
objects returning buffers associated with buffer properties.
For _DSD objects located immediately under ACPI devices, this handle is
the same as the handle of the device object holding the _DSD, but for
data-only subnodes it is not so.
First of all, data-only subnodes are represented by objects that
cannot hold other objects in their scopes (like control methods).
Therefore a data-only subnode handle cannot be used for completing
relative pathname segments, so the current code in
in acpi_nondev_subnode_extract() passing a data-only subnode handle
to acpi_extract_properties() is invalid.
Moreover, a data-only subnode of device A may be represented by an
object located in the scope of device B (which kind of makes sense,
for instance, if A is a B's child). In that case, the scope in
question would be the one of device B. In other words, the scope
mentioned above is the same as the scope used for subnode object
lookup in acpi_nondev_subnode_extract().
Accordingly, rearrange that function to use the same scope for the
extraction of properties and subnode object lookup.
Fixes: 103e10c69c ("ACPI: property: Add support for parsing buffer property UUID")
Cc: 6.0+ <stable@vger.kernel.org> # 6.0+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Tested-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 9338d660b7 ]
When building s390 defconfig with binutils older than 2.32, there are
several warnings during the final linking stage:
s390-linux-ld: .tmp_vmlinux1: warning: allocated section `.got.plt' not in segment
s390-linux-ld: .tmp_vmlinux2: warning: allocated section `.got.plt' not in segment
s390-linux-ld: vmlinux.unstripped: warning: allocated section `.got.plt' not in segment
s390-linux-objcopy: vmlinux: warning: allocated section `.got.plt' not in segment
s390-linux-objcopy: st7afZyb: warning: allocated section `.got.plt' not in segment
binutils commit afca762f598 ("S/390: Improve partial relro support for
64 bit") [1] in 2.32 changed where .got.plt is emitted, avoiding the
warning.
The :NONE in the .vmlinux.info output section description changes the
segment for subsequent allocated sections. Move .vmlinux.info right
above the discards section to place all other sections in the previously
defined segment, .data.
Fixes: 30226853d6 ("s390: vmlinux.lds.S: explicitly handle '.got' and '.plt' sections")
Link: https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=afca762f598d453c563f244cd3777715b1a0cb72 [1]
Acked-by: Alexander Gordeev <agordeev@linux.ibm.com>
Acked-by: Alexey Gladkov <legion@kernel.org>
Acked-by: Nicolas Schier <nsc@kernel.org>
Link: https://patch.msgid.link/20251008-kbuild-fix-modinfo-regressions-v1-3-9fc776c5887c@kernel.org
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 4f375ade6a ]
When unpinning a BPF hash table (htab or htab_lru) that contains internal
structures (timer, workqueue, or task_work) in its values, a BUG warning
is triggered:
BUG: sleeping function called from invalid context at kernel/bpf/hashtab.c:244
in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 14, name: ksoftirqd/0
...
The issue arises from the interaction between BPF object unpinning and
RCU callback mechanisms:
1. BPF object unpinning uses ->free_inode() which schedules cleanup via
call_rcu(), deferring the actual freeing to an RCU callback that
executes within the RCU_SOFTIRQ context.
2. During cleanup of hash tables containing internal structures,
htab_map_free_internal_structs() is invoked, which includes
cond_resched() or cond_resched_rcu() calls to yield the CPU during
potentially long operations.
However, cond_resched() or cond_resched_rcu() cannot be safely called from
atomic RCU softirq context, leading to the BUG warning when attempting
to reschedule.
Fix this by changing from ->free_inode() to ->destroy_inode() and rename
bpf_free_inode() to bpf_destroy_inode() for BPF objects (prog, map, link).
This allows direct inode freeing without RCU callback scheduling,
avoiding the invalid context warning.
Reported-by: Le Chen <tom2cat@sjtu.edu.cn>
Closes: https://lore.kernel.org/all/1444123482.1827743.1750996347470.JavaMail.zimbra@sjtu.edu.cn/
Fixes: 68134668c1 ("bpf: Add map side support for bpf timers.")
Suggested-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: KaFai Wan <kafai.wan@linux.dev>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/r/20251008102628.808045-2-kafai.wan@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b5f8aa8d4b ]
The slimbus regmap passed to the GPIO driver down from MFD does not use
fast_io. This means a mutex is used for locking and thus this GPIO chip
must not be used in atomic context. Change the can_sleep switch in
struct gpio_chip to true.
Fixes: 59c3246834 ("gpio: wcd934x: Add support to wcd934x gpio controller")
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 8a81236f2c ]
The tpm_tis_write8() call specifies arguments in wrong order. Should be
(data, addr, value) not (data, value, addr). The initial correct order
was changed during the major refactoring when the code was split.
Fixes: 41a5e1cf1f ("tpm/tpm_tis: Split tpm_tis driver into a core and TCG TIS compliant phy")
Signed-off-by: Gunnar Kudrjavets <gunnarku@amazon.com>
Reviewed-by: Justinien Bouron <jbouron@amazon.com>
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 057ac50638 ]
EA $LXMOD is required for WSL non-symlink reparse points.
Fixes: ef86ab131d ("cifs: Fix querying of WSL CHR and BLK reparse points over SMB1")
Signed-off-by: Pali Rohár <pali@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b95cd1bdf5 ]
Don't reuse open handle when changing timestamps to prevent the server
from disabling automatic timestamp updates as per MS-FSA 2.1.4.17.
---8<---
import os
import time
filename = '/mnt/foo'
def print_stat(prefix):
st = os.stat(filename)
print(prefix, ': ', time.ctime(st.st_atime), time.ctime(st.st_ctime))
fd = os.open(filename, os.O_CREAT|os.O_TRUNC|os.O_WRONLY, 0o644)
print_stat('old')
os.utime(fd, None)
time.sleep(2)
os.write(fd, b'foo')
os.close(fd)
time.sleep(2)
print_stat('new')
---8<---
Before patch:
$ mount.cifs //srv/share /mnt -o ...
$ python3 run.py
old : Fri Oct 3 14:01:21 2025 Fri Oct 3 14:01:21 2025
new : Fri Oct 3 14:01:21 2025 Fri Oct 3 14:01:21 2025
After patch:
$ mount.cifs //srv/share /mnt -o ...
$ python3 run.py
old : Fri Oct 3 17:03:34 2025 Fri Oct 3 17:03:34 2025
new : Fri Oct 3 17:03:36 2025 Fri Oct 3 17:03:36 2025
Fixes: b6f2a0f89d ("cifs: for compound requests, use open handle if possible")
Signed-off-by: Paulo Alcantara (Red Hat) <pc@manguebit.org>
Cc: Frank Sorenson <sorenson@redhat.com>
Reviewed-by: David Howells <dhowells@redhat.com>
Cc: linux-cifs@vger.kernel.org
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 0cc380d0e1 ]
The return value of copy_to_iter() function will never be negative,
it is the number of bytes copied, or zero if nothing was copied.
Update the check to treat 0 as an error, and return -1 in that case.
Fixes: d08089f649 ("cifs: Change the I/O paths to use an iterator rather than a page list")
Acked-by: Tom Talpey <tom@talpey.com>
Reviewed-by: David Howells <dhowells@redhat.com>
Signed-off-by: Fushuai Wang <wangfushuai@baidu.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 6bb73db694 ]
Move the ssize check to the start in essiv_aead_crypt so that
it's also checked for decryption and in-place encryption.
Reported-by: Muhammad Alifa Ramdhan <ramdhan@starlabs.sg>
Fixes: be1eb7f78a ("crypto: essiv - create wrapper template for ESSIV generation")
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit e9a9dcb4cc ]
Don't forget to adjust the source offset in io_copy_page(), otherwise
it'll be copying into the same location in some cases for highmem
setups.
Fixes: e67645bb7f ("io_uring/zcrx: prepare fallback for larger pages")
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit e84945bdc6 ]
Jakub reported this self test flaking occasionally (fails, but passes on
re-run) on debug kernels.
This is because the test checks for elapsed time to determine if both
connections were established in parallel.
Rework this to no longer depend on timing.
Use busywait helper to check that both sockets have moved to established
state and then query the conntrack engine for the two entries.
Reported-by: Jakub Kicinski <kuba@kernel.org>
Closes: https://lore.kernel.org/netfilter-devel/20250926163318.40d1a502@kernel.org/
Fixes: 117e149e26 ("selftests: netfilter: test nat source port clash resolution interaction with tcp early demux")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit a126ab6b26 ]
Jakub reports spurious failure of nft_fib.sh test.
This is caused by a subtle bug inherited when i moved faulty ping
from one test case to another.
nft_fib.sh not only checks that the fib expression matched, it also
records the number of matches and then validates we have the expected
count. When I did this it was under the assumption that we would
have 0 to n matching packets. In case of the failure, the entry has
n+1 matching packets.
This happens because ping_unreachable helper uses "ping -c 1 -w 1",
instead of the intended "-W". -w alters the meaning of -c (count),
namely, its then treated as number of wanted *replies* instead of
"number of packets to send".
So, in some cases, ping -c 1 -w 1 ends up sending two packets which then
makes the test fail due to the higher-than-expected packet count.
Fix the actual bug (s/-w/-W) and also change the error handling:
1. Show the number of expected packets in the error message
2. Always try to delete the key from the set.
Else, later test that makes sure we don't have unexpected keys
in there will always fail as well.
Reported-by: Jakub Kicinski <kuba@kernel.org>
Closes: https://lore.kernel.org/netfilter-devel/20250927090709.0b3cd783@kernel.org/
Fixes: 98287045c9 ("selftests: netfilter: move fib vrf test to nft_fib.sh")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f359b809d5 ]
Referencing a synproxy stateful object from OUTPUT hook causes kernel
crash due to infinite recursive calls:
BUG: TASK stack guard page was hit at 000000008bda5b8c (stack is 000000003ab1c4a5..00000000494d8b12)
[...]
Call Trace:
__find_rr_leaf+0x99/0x230
fib6_table_lookup+0x13b/0x2d0
ip6_pol_route+0xa4/0x400
fib6_rule_lookup+0x156/0x240
ip6_route_output_flags+0xc6/0x150
__nf_ip6_route+0x23/0x50
synproxy_send_tcp_ipv6+0x106/0x200
synproxy_send_client_synack_ipv6+0x1aa/0x1f0
nft_synproxy_do_eval+0x263/0x310
nft_do_chain+0x5a8/0x5f0 [nf_tables
nft_do_chain_inet+0x98/0x110
nf_hook_slow+0x43/0xc0
__ip6_local_out+0xf0/0x170
ip6_local_out+0x17/0x70
synproxy_send_tcp_ipv6+0x1a2/0x200
synproxy_send_client_synack_ipv6+0x1aa/0x1f0
[...]
Implement objref and objrefmap expression validate functions.
Currently, only NFT_OBJECT_SYNPROXY object type requires validation.
This will also handle a jump to a chain using a synproxy object from the
OUTPUT hook.
Now when trying to reference a synproxy object in the OUTPUT hook, nft
will produce the following error:
synproxy_crash.nft: Error: Could not process rule: Operation not supported
synproxy name mysynproxy
^^^^^^^^^^^^^^^^^^^^^^^^
Fixes: ee394f96ad ("netfilter: nft_synproxy: add synproxy stateful object support")
Reported-by: Georg Pfuetzenreuter <georg.pfuetzenreuter@suse.com>
Closes: https://bugzilla.suse.com/1250237
Signed-off-by: Fernando Fernandez Mancera <fmancera@suse.de>
Reviewed-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 229c586b5e ]
Commit afddce13ce ("crypto: api - Add reqsize to crypto_alg")
introduced cra_reqsize field in crypto_alg struct to replace type
specific reqsize fields. It looks like this was introduced specifically
for ahash and acomp from the commit description as subsequent commits
add necessary changes in these alg frameworks.
However, this is being recommended for use in all crypto algs [1]
instead of setting reqsize using crypto_*_set_reqsize(). Using
cra_reqsize in skcipher algorithms, hence, causes memory
corruptions and crashes as the underlying functions in the algorithm
framework have not been updated to set the reqsize properly from
cra_reqsize. [2]
Add proper set_reqsize calls in the skcipher init function to
properly initialize reqsize for these algorithms in the framework.
[1]: https://lore.kernel.org/linux-crypto/aCL8BxpHr5OpT04k@gondor.apana.org.au/
[2]: https://gist.github.com/Pratham-T/24247446f1faf4b7843e4014d5089f6b
Fixes: afddce13ce ("crypto: api - Add reqsize to crypto_alg")
Fixes: 52f641bc63 ("crypto: ti - Add driver for DTHE V2 AES Engine (ECB, CBC)")
Signed-off-by: T Pratham <t-pratham@ti.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 2c95a756e0 ]
The TPS23881 improves on the TPS23880 with current sense resistors reduced
from 255 mOhm to 200 mOhm. This has a direct impact on the scaling of the
current measurement. However, the latest TPS23881 data sheet from May 2023
still shows the scaling of the TPS23880 model.
Fixes: 7f076ce3f1 ("net: pse-pd: tps23881: Add support for power limit and measurement features")
Signed-off-by: Thomas Wismer <thomas.wismer@scs.ch>
Acked-by: Kory Maincent <kory.maincent@bootlin.com>
Link: https://patch.msgid.link/20251006204029.7169-2-thomas@wismer.xyz
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 58e6fc2fb9 ]
kfd_lookup_process_by_pid hold the kfd process reference to ensure it
doesn't get destroyed while sending the segfault event to user space.
Calling kfd_lookup_process_by_pid as function parameter leaks the kfd
process refcount and miss the NULL pointer check if app process is
already destroyed.
Fixes: 2d274bf709 ("amd/amdkfd: Trigger segfault for early userptr unmmapping")
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 0e190a0446 ]
Scaling doesn't work on DCE6 at the moment, the current
register programming produces incorrect output when using
fractional scaling (between 100-200%) on resolutions higher
than 1080p.
Disable it until we figure out how to program it properly.
Fixes: 7c15fd86aa ("drm/amd/display: dc/dce: add initial DCE6 support (v10)")
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit a7dc87f344 ]
SCL_SCALER_ENABLE can be used to enable/disable the scaler
on DCE6. Program it to 0 when scaling isn't used, 1 when used.
Additionally, clear some other registers when scaling is
disabled and program the SCL_UPDATE register as recommended.
This fixes visible glitches for users whose BIOS sets up a
mode with scaling at boot, which DC was unable to clean up.
Fixes: b70aaf5586 ("drm/amd/display: dce_transform: add DCE6 specific macros,functions")
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 3f39f56520 ]
pm_runtime_get_sync() and pm_runtime_put_autosuspend() were previously
called in cmdq_mbox_send_data(), which is under a spinlock in msg_submit()
(mailbox.c). This caused lockdep warnings such as "sleeping function
called from invalid context" when running with lockdebug enabled.
The BUG report:
BUG: sleeping function called from invalid context at drivers/base/power/runtime.c:1164
in_atomic(): 1, irqs_disabled(): 128, non_block: 0, pid: 3616, name: kworker/u17:3
preempt_count: 1, expected: 0
RCU nest depth: 0, expected: 0
INFO: lockdep is turned off.
irq event stamp: 0
CPU: 1 PID: 3616 Comm: kworker/u17:3 Not tainted 6.1.87-lockdep-14133-g26e933aca785 #1
Hardware name: Google Ciri sku0/unprovisioned board (DT)
Workqueue: imgsys_runner imgsys_runner_func
Call trace:
dump_backtrace+0x100/0x120
show_stack+0x20/0x2c
dump_stack_lvl+0x84/0xb4
dump_stack+0x18/0x48
__might_resched+0x354/0x4c0
__might_sleep+0x98/0xe4
__pm_runtime_resume+0x70/0x124
cmdq_mbox_send_data+0xe4/0xb1c
msg_submit+0x194/0x2dc
mbox_send_message+0x190/0x330
imgsys_cmdq_sendtask+0x1618/0x2224
imgsys_runner_func+0xac/0x11c
process_one_work+0x638/0xf84
worker_thread+0x808/0xcd0
kthread+0x24c/0x324
ret_from_fork+0x10/0x20
Additionally, pm_runtime_put_autosuspend() should be invoked from the
GCE IRQ handler to ensure the hardware has actually completed its work.
To resolve these issues, remove the pm_runtime calls from
cmdq_mbox_send_data() and delegate power management responsibilities
to the client driver.
Fixes: 8afe816b0c ("mailbox: mtk-cmdq-mailbox: Implement Runtime PM with autosuspend")
Signed-off-by: Jason-JH Lin <jason-jh.lin@mediatek.com>
Signed-off-by: Jassi Brar <jassisinghbrar@gmail.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 22239eb258 ]
When configuring IPsec packet offload in tunnel mode, the driver tries
to create tunnel reformat objects unconditionally. This is incorrect,
because tunnel mode is only permitted under specific encapsulation
settings, and that decision is already made when the flow table is
created.
The offending commit attempted to block this case in the state add
path, but the check there happens too late and does not prevent the
reformat from being configured.
Fix by taking short reservations for both the eswitch mode and the
encap at the start of state setup. This preserves the block ordering
(mode --> encap) used later: the mode is blocked during RX/TX get, and
the encap is blocked during flow-table creation. This lets us fail
early if either reservation cannot be obtained, it means a mode
transition is underway or a conflicting configuration already owns
encap. If both succeed, the flow-table path later takes the ownership
and the reservations are released on exit.
Fixes: 146c196b60 ("net/mlx5e: Create IPsec table with tunnel support only when encap is disabled")
Signed-off-by: Carolina Jubran <cjubran@nvidia.com>
Reviewed-by: Jianbo Liu <jianbol@nvidia.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/1759652999-858513-3-git-send-email-tariqt@nvidia.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 7593439c13 ]
When creating IPsec flow tables with tunnel mode enabled, the driver
uses mlx5_eswitch_block_encap() to prevent tunnel encapsulation
conflicts across different domains (NIC_RX/NIC_TX and FDB), since the
firmware doesn’t allow both at the same time.
Currently, the driver attempts to reserve tunnel mode unconditionally
for both NIC and FDB IPsec tables. This can lead to conflicting tunnel
mode setups, for example, if a flow table was created in the FDB
domain with tunnel offload enabled, and we later try to create another
one in the NIC, or vice versa.
To resolve this, adjust the blocking logic so that tunnel mode is only
reserved by NIC flows. This ensures that tunnel offload is exclusively
used in either the NIC or the FDB, and avoids unintended offload
conflicts.
Fixes: 1762f132d5 ("net/mlx5e: Support IPsec packet offload for RX in switchdev mode")
Fixes: c6c2bf5db4 ("net/mlx5e: Support IPsec packet offload for TX in switchdev mode")
Signed-off-by: Carolina Jubran <cjubran@nvidia.com>
Reviewed-by: Jianbo Liu <jianbol@nvidia.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/1759652999-858513-2-git-send-email-tariqt@nvidia.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit c9d1b0b542 ]
The sparx5 driver programs UC/MC/BC flooding in sparx5_update_fwd() by
unconditionally applying bridge_fwd_mask to all flood PGIDs. Any bridge
topology change that triggers sparx5_update_fwd() (for example enslaving
another port) therefore reinstalls flooding in hardware for already
bridged ports, regardless of their per-port flood flags.
This results in clobbering of the flood masks, and desynchronization
between software and hardware: the bridge still reports “flood off” for
the port, but hardware has flooding enabled due to unconditional PGID
reprogramming.
Steps to reproduce:
$ ip link add br0 type bridge
$ ip link set br0 up
$ ip link set eth0 master br0
$ ip link set eth0 up
$ bridge link set dev eth0 flood off
$ ip link set eth1 master br0
$ ip link set eth1 up
At this point, flooding is silently re-enabled for eth0. Software still
shows “flood off” for eth0, but hardware has flooding enabled.
To fix this, flooding is now set explicitly during bridge join/leave,
through sparx5_port_attr_bridge_flags():
On bridge join, UC/MC/BC flooding is enabled by default.
On bridge leave, UC/MC/BC flooding is disabled.
sparx5_update_fwd() no longer touches the flood PGIDs, clobbering
the flood masks, and desynchronizing software and hardware.
Initialization of the flooding PGIDs have been moved to
sparx5_start(). This is required as flooding PGIDs defaults to
0x3fffffff in hardware and the initialization was previously handled
in sparx5_update_fwd(), which was removed.
With this change, user-configured flooding flags persist across bridge
updates and are no longer overridden by sparx5_update_fwd().
Fixes: d6fce51419 ("net: sparx5: add switching support")
Signed-off-by: Daniel Machon <daniel.machon@microchip.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20251003-fix-flood-fwd-v1-1-48eb478b2904@microchip.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 4dc8b26a3a ]
When accessing an MDIO register using single-byte smbus accesses, we have to
perform 2 consecutive operations targeting the same address,
first accessing the MSB then the LSB of the 16 bit register:
read_1_byte(addr); <- returns MSB of register at address 'addr'
read_1_byte(addr); <- returns LSB
Some PHY devices present in SFP such as the Broadcom 5461 don't like
seeing foreign i2c transactions in-between these 2 smbus accesses, and
will return the MSB a second time when trying to read the LSB :
read_1_byte(addr); <- returns MSB
i2c_transaction_for_other_device_on_the_bus();
read_1_byte(addr); <- returns MSB again
Given the already fragile nature of accessing PHYs/SFPs with single-byte
smbus accesses, it's safe to say that this Broadcom PHY may not be the
only one acting like this.
Let's therefore hold the i2c bus lock while performing our smbus
transactions to avoid interleaved accesses.
Fixes: d4bd3aca33 ("net: mdio: mdio-i2c: Add support for single-byte SMBus operations")
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Reviewed-by: Kory Maincent <kory.maincent@bootlin.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/20251003070311.861135-1-maxime.chevallier@bootlin.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 23f3770e1a ]
Cilium has a BPF egress gateway feature which forces outgoing K8s Pod
traffic to pass through dedicated egress gateways which then SNAT the
traffic in order to interact with stable IPs outside the cluster.
The traffic is directed to the gateway via vxlan tunnel in collect md
mode. A recent BPF change utilized the bpf_redirect_neigh() helper to
forward packets after the arrival and decap on vxlan, which turned out
over time that the kmalloc-256 slab usage in kernel was ever-increasing.
The issue was that vxlan allocates the metadata_dst object and attaches
it through a fake dst entry to the skb. The latter was never released
though given bpf_redirect_neigh() was merely setting the new dst entry
via skb_dst_set() without dropping an existing one first.
Fixes: b4ab314149 ("bpf: Add redirect_neigh helper as redirect drop-in")
Reported-by: Yusuke Suzuki <yusuke.suzuki@isovalent.com>
Reported-by: Julian Wiedmann <jwi@isovalent.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Martin KaFai Lau <martin.lau@kernel.org>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Jordan Rife <jrife@google.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Jordan Rife <jrife@google.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://lore.kernel.org/r/20251003073418.291171-1-daniel@iogearbox.net
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit bb160e791a ]
The driver incorrectly determines SGI vs SPI interrupts by checking IRQ
number < 16, which fails with dynamic IRQ allocation. During unbind,
this causes improper SGI cleanup leading to kernel crash.
Add explicit irq_type field to pdata for reliable identification of SGI
interrupts (type-2) and only clean up SGI resources when appropriate.
Fixes: 6ffb163534 ("mailbox: zynqmp: handle SGI for shared IPI")
Signed-off-by: Harini T <harini.t@amd.com>
Reviewed-by: Peng Fan <peng.fan@nxp.com>
Signed-off-by: Jassi Brar <jassisinghbrar@gmail.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 0aead8197f ]
The cleanup loop was starting at the wrong array index, causing
out-of-bounds access.
Start the loop at the correct index for zero-indexed arrays to prevent
accessing memory beyond the allocated array bounds.
Fixes: 4981b82ba2 ("mailbox: ZynqMP IPI mailbox controller")
Signed-off-by: Harini T <harini.t@amd.com>
Reviewed-by: Peng Fan <peng.fan@nxp.com>
Signed-off-by: Jassi Brar <jassisinghbrar@gmail.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 019e3f4550 ]
The ipi_mbox->dev.parent check is unreliable proxy for registration
status as it fails to protect against probe failures that occur after
the parent is assigned but before device_register() completes.
device_is_registered() is the canonical and robust method to verify the
registration status.
Remove ipi_mbox->dev.parent check in zynqmp_ipi_free_mboxes().
Fixes: 4981b82ba2 ("mailbox: ZynqMP IPI mailbox controller")
Signed-off-by: Harini T <harini.t@amd.com>
Reviewed-by: Peng Fan <peng.fan@nxp.com>
Signed-off-by: Jassi Brar <jassisinghbrar@gmail.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 341867f730 ]
The controller is registered using the device-managed function
'devm_mbox_controller_register()'. As documented in mailbox.c, this
ensures the devres framework automatically calls
mbox_controller_unregister() when device_unregister() is invoked, making
the explicit call unnecessary.
Remove redundant mbox_controller_unregister() call as
device_unregister() handles controller cleanup.
Fixes: 4981b82ba2 ("mailbox: ZynqMP IPI mailbox controller")
Signed-off-by: Harini T <harini.t@amd.com>
Reviewed-by: Peng Fan <peng.fan@nxp.com>
Signed-off-by: Jassi Brar <jassisinghbrar@gmail.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f3b601f900 ]
Since commit 22f72088ff ("tools headers: Update the syscall table with
the kernel sources") the arm64 syscall header is generated at build
time. Later, commit bfb713ea53 ("perf tools: Fix arm64 build by
generating unistd_64.h") added a dependency to libperf to guarantee that
this header was created before building libperf or perf itself.
However, libjvmti also requires this header but does not depend on
libperf, leading to build failures such as:
In file included from /usr/include/sys/syscall.h:24,
from /usr/include/syscall.h:1,
from jvmti/jvmti_agent.c:36:
tools/arch/arm64/include/uapi/asm/unistd.h:2:10: fatal error: asm/unistd_64.h: No such file or directory
2 | #include <asm/unistd_64.h>
Fix this by ensuring that libperf is built before libjvmti, so that
unistd_64.h is always available.
Fixes: 22f72088ff ("tools headers: Update the syscall table with the kernel sources")
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Vincent Minet <v.minet@criteo.com>
Link: https://lore.kernel.org/r/20250922053702.2688374-1-v.minet@criteo.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 21b29e74ff ]
Some applications (like selftests/net/tcp_mmap.c) call SO_RCVLOWAT
on their listener, before accept().
This has an unfortunate effect on wscale selection in
tcp_select_initial_window() during 3WHS.
For instance, tcp_mmap was negotiating wscale 4, regardless
of tcp_rmem[2] and sysctl_rmem_max.
Do not change tp->window_clamp if it is zero
or bigger than our computed value.
Zero value is special, it allows tcp_select_initial_window()
to enable autotuning.
Note that SO_RCVLOWAT use on listener is probably not wise,
because tp->scaling_ratio has a default value, possibly wrong.
Fixes: d1361840f8 ("tcp: fix SO_RCVLOWAT and RCVBUF autotuning")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com>
Reviewed-by: Neal Cardwell <ncardwell@google.com>
Link: https://patch.msgid.link/20251003184119.2526655-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 2db687f346 ]
When ice_adapter_new() fails, the reserved XArray entry created by
xa_insert() is not released. This causes subsequent insertions at
the same index to return -EBUSY, potentially leading to
NULL pointer dereferences.
Reorder the operations as suggested by Przemek Kitszel:
1. Check if adapter already exists (xa_load)
2. Reserve the XArray slot (xa_reserve)
3. Allocate the adapter (ice_adapter_new)
4. Store the adapter (xa_store)
Fixes: 0f0023c649 ("ice: do not init struct ice_adapter more times than needed")
Suggested-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Suggested-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Haotian Zhang <vulab@iscas.ac.cn>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://patch.msgid.link/20251001115336.1707-1-vulab@iscas.ac.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 7fc25c5a5a ]
Fix functions that return undefined values. These issues were caught by
running clang using LLVM=1 option.
Clang warnings are as follows:
ovpn-cli.c:1587:6: warning: variable 'ret' is used uninitialized whenever 'if' condition is true [-Wsometimes-uninitialized]
1587 | if (!sock) {
| ^~~~~
ovpn-cli.c:1635:9: note: uninitialized use occurs here
1635 | return ret;
| ^~~
ovpn-cli.c:1587:2: note: remove the 'if' if its condition is always false
1587 | if (!sock) {
| ^~~~~~~~~~~~
1588 | fprintf(stderr, "cannot allocate netlink socket\n");
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1589 | goto err_free;
| ~~~~~~~~~~~~~~
1590 | }
| ~
ovpn-cli.c:1584:15: note: initialize the variable 'ret' to silence this warning
1584 | int mcid, ret;
| ^
| = 0
ovpn-cli.c:2107:7: warning: variable 'ret' is used uninitialized whenever switch case is taken [-Wsometimes-uninitialized]
2107 | case CMD_INVALID:
| ^~~~~~~~~~~
ovpn-cli.c:2111:9: note: uninitialized use occurs here
2111 | return ret;
| ^~~
ovpn-cli.c:1939:12: note: initialize the variable 'ret' to silence this warning
1939 | int n, ret;
| ^
|
Fixes: 959bc330a4 ("testing/selftests: add test tool and scripts for ovpn module")
Signed-off-by: Sidharth Seela <sidharthseela@gmail.com>
Link: https://patch.msgid.link/20251001123107.96244-2-sidharthseela@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit bc9ea78707 ]
The origin code calls cancel_delayed_work() in ocelot_stats_deinit()
to cancel the cyclic delayed work item ocelot->stats_work. However,
cancel_delayed_work() may fail to cancel the work item if it is already
executing. While destroy_workqueue() does wait for all pending work items
in the work queue to complete before destroying the work queue, it cannot
prevent the delayed work item from being rescheduled within the
ocelot_check_stats_work() function. This limitation exists because the
delayed work item is only enqueued into the work queue after its timer
expires. Before the timer expiration, destroy_workqueue() has no visibility
of this pending work item. Once the work queue appears empty,
destroy_workqueue() proceeds with destruction. When the timer eventually
expires, the delayed work item gets queued again, leading to the following
warning:
workqueue: cannot queue ocelot_check_stats_work on wq ocelot-switch-stats
WARNING: CPU: 2 PID: 0 at kernel/workqueue.c:2255 __queue_work+0x875/0xaf0
...
RIP: 0010:__queue_work+0x875/0xaf0
...
RSP: 0018:ffff88806d108b10 EFLAGS: 00010086
RAX: 0000000000000000 RBX: 0000000000000101 RCX: 0000000000000027
RDX: 0000000000000027 RSI: 0000000000000004 RDI: ffff88806d123e88
RBP: ffffffff813c3170 R08: 0000000000000000 R09: ffffed100da247d2
R10: ffffed100da247d1 R11: ffff88806d123e8b R12: ffff88800c00f000
R13: ffff88800d7285c0 R14: ffff88806d0a5580 R15: ffff88800d7285a0
FS: 0000000000000000(0000) GS:ffff8880e5725000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fe18e45ea10 CR3: 0000000005e6c000 CR4: 00000000000006f0
Call Trace:
<IRQ>
? kasan_report+0xc6/0xf0
? __pfx_delayed_work_timer_fn+0x10/0x10
? __pfx_delayed_work_timer_fn+0x10/0x10
call_timer_fn+0x25/0x1c0
__run_timer_base.part.0+0x3be/0x8c0
? __pfx_delayed_work_timer_fn+0x10/0x10
? rcu_sched_clock_irq+0xb06/0x27d0
? __pfx___run_timer_base.part.0+0x10/0x10
? try_to_wake_up+0xb15/0x1960
? _raw_spin_lock_irq+0x80/0xe0
? __pfx__raw_spin_lock_irq+0x10/0x10
tmigr_handle_remote_up+0x603/0x7e0
? __pfx_tmigr_handle_remote_up+0x10/0x10
? sched_balance_trigger+0x1c0/0x9f0
? sched_tick+0x221/0x5a0
? _raw_spin_lock_irq+0x80/0xe0
? __pfx__raw_spin_lock_irq+0x10/0x10
? tick_nohz_handler+0x339/0x440
? __pfx_tmigr_handle_remote_up+0x10/0x10
__walk_groups.isra.0+0x42/0x150
tmigr_handle_remote+0x1f4/0x2e0
? __pfx_tmigr_handle_remote+0x10/0x10
? ktime_get+0x60/0x140
? lapic_next_event+0x11/0x20
? clockevents_program_event+0x1d4/0x2a0
? hrtimer_interrupt+0x322/0x780
handle_softirqs+0x16a/0x550
irq_exit_rcu+0xaf/0xe0
sysvec_apic_timer_interrupt+0x70/0x80
</IRQ>
...
The following diagram reveals the cause of the above warning:
CPU 0 (remove) | CPU 1 (delayed work callback)
mscc_ocelot_remove() |
ocelot_deinit() | ocelot_check_stats_work()
ocelot_stats_deinit() |
cancel_delayed_work()| ...
| queue_delayed_work()
destroy_workqueue() | (wait a time)
| __queue_work() //UAF
The above scenario actually constitutes a UAF vulnerability.
The ocelot_stats_deinit() is only invoked when initialization
failure or resource destruction, so we must ensure that any
delayed work items cannot be rescheduled.
Replace cancel_delayed_work() with disable_delayed_work_sync()
to guarantee proper cancellation of the delayed work item and
ensure completion of any currently executing work before the
workqueue is deallocated.
A deadlock concern was considered: ocelot_stats_deinit() is called
in a process context and is not holding any locks that the delayed
work item might also need. Therefore, the use of the _sync() variant
is safe here.
This bug was identified through static analysis. To reproduce the
issue and validate the fix, I simulated ocelot-switch device by
writing a kernel module and prepared the necessary resources for
the virtual ocelot-switch device's probe process. Then, removing
the virtual device will trigger the mscc_ocelot_remove() function,
which in turn destroys the workqueue.
Fixes: a556c76adc ("net: mscc: Add initial Ocelot switch support")
Signed-off-by: Duoming Zhou <duoming@zju.edu.cn>
Link: https://patch.msgid.link/20251001011149.55073-1-duoming@zju.edu.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 5ac2c02790 ]
Check that the resource which is converted to a surface exists before
trying to use the cursor snooper on it.
vmw_cmd_res_check allows explicit invalid (SVGA3D_INVALID_ID) identifiers
because some svga commands accept SVGA3D_INVALID_ID to mean "no surface",
unfortunately functions that accept the actual surfaces as objects might
(and in case of the cursor snooper, do not) be able to handle null
objects. Make sure that we validate not only the identifier (via the
vmw_cmd_res_check) but also check that the actual resource exists before
trying to do something with it.
Fixes unchecked null-ptr reference in the snooping code.
Signed-off-by: Zack Rusin <zack.rusin@broadcom.com>
Fixes: c0951b797e ("drm/vmwgfx: Refactor resource management")
Reported-by: Kuzey Arda Bulut <kuzeyardabulut@gmail.com>
Cc: Broadcom internal kernel review list <bcm-kernel-feedback-list@broadcom.com>
Cc: dri-devel@lists.freedesktop.org
Reviewed-by: Ian Forbes <ian.forbes@broadcom.com>
Link: https://lore.kernel.org/r/20250917153655.1968583-1-zack.rusin@broadcom.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 9daa5a8795 ]
Starting with 'commit 2297791c92 ("s390/cio: dont unregister
subchannel from child-drivers")', cio no longer unregisters
subchannels when the attached device is invalid or unavailable.
As an unintended side-effect, the cio_ignore purge function no longer
removes subchannels for devices on the cio_ignore list if no CCW device
is attached. This situation occurs when a CCW device is non-operational
or unavailable
To ensure the same outcome of the purge function as when the
current cio_ignore list had been active during boot, update the purge
function to remove I/O subchannels without working CCW devices if the
associated device number is found on the cio_ignore list.
Fixes: 2297791c92 ("s390/cio: dont unregister subchannel from child-drivers")
Suggested-by: Peter Oberparleiter <oberpar@linux.ibm.com>
Reviewed-by: Peter Oberparleiter <oberpar@linux.ibm.com>
Signed-off-by: Vineeth Vijayan <vneethv@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 08fdfd260e ]
In xe_hw_engine_group_get_mode(), a write lock is acquired before
calling switch_mode(), which in turn invokes
xe_hw_engine_group_suspend_faulting_lr_jobs().
On failure inside xe_hw_engine_group_suspend_faulting_lr_jobs(),
the write lock is released there, and then again in
xe_hw_engine_group_get_mode(), leading to a double release.
Fix this by keeping both acquire and release operation in
xe_hw_engine_group_get_mode().
Fixes: 770bd1d341 ("drm/xe/hw_engine_group: Ensure safe transition between execution modes")
Cc: Francois Dugast <francois.dugast@intel.com>
Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com>
Reviewed-by: Francois Dugast <francois.dugast@intel.com>
Link: https://lore.kernel.org/r/20250925023145.1203004-2-shuicheng.lin@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
(cherry picked from commit 662d98b8b373007fa1b08ba93fee11f6fd3e387c)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 49bdb63ff6 ]
Syzbot reported read of uninitialized variable BUG with following call stack.
lan78xx 8-1:1.0 (unnamed net_device) (uninitialized): EEPROM read operation timeout
=====================================================
BUG: KMSAN: uninit-value in lan78xx_read_eeprom drivers/net/usb/lan78xx.c:1095 [inline]
BUG: KMSAN: uninit-value in lan78xx_init_mac_address drivers/net/usb/lan78xx.c:1937 [inline]
BUG: KMSAN: uninit-value in lan78xx_reset+0x999/0x2cd0 drivers/net/usb/lan78xx.c:3241
lan78xx_read_eeprom drivers/net/usb/lan78xx.c:1095 [inline]
lan78xx_init_mac_address drivers/net/usb/lan78xx.c:1937 [inline]
lan78xx_reset+0x999/0x2cd0 drivers/net/usb/lan78xx.c:3241
lan78xx_bind+0x711/0x1690 drivers/net/usb/lan78xx.c:3766
lan78xx_probe+0x225c/0x3310 drivers/net/usb/lan78xx.c:4707
Local variable sig.i.i created at:
lan78xx_read_eeprom drivers/net/usb/lan78xx.c:1092 [inline]
lan78xx_init_mac_address drivers/net/usb/lan78xx.c:1937 [inline]
lan78xx_reset+0x77e/0x2cd0 drivers/net/usb/lan78xx.c:3241
lan78xx_bind+0x711/0x1690 drivers/net/usb/lan78xx.c:3766
The function lan78xx_read_raw_eeprom failed to properly propagate EEPROM
read timeout errors (-ETIMEDOUT). In the fallthrough path, it first
attempted to restore the pin configuration for LED outputs and then
returned only the status of that restore operation, discarding the
original timeout error.
As a result, callers could mistakenly treat the data buffer as valid
even though the EEPROM read had actually timed out with no data or partial
data.
To fix this, handle errors in restoring the LED pin configuration separately.
If the restore succeeds, return any prior EEPROM timeout error correctly
to the caller.
Reported-by: syzbot+62ec8226f01cb4ca19d9@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=62ec8226f01cb4ca19d9
Fixes: 8b1b2ca83b ("net: usb: lan78xx: Improve error handling in EEPROM and OTP operations")
Signed-off-by: Bhanu Seshu Kumar Valluri <bhanuseshukumar@gmail.com>
Reviewed-by: Oleksij Rempel <o.rempel@pengutronix.de>
Link: https://patch.msgid.link/20250930084902.19062-1-bhanuseshukumar@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit aaab61de1f ]
It is allowed to mix Link and Host DMA channels in a way that their index
is different. In this case we would read the LLP from a channel which is
not used or used for other operation.
Such case can be reproduced on cAVS2.5 or ACE1 platforms with soundwire
configuration:
playback to SDW would take Host channel 0 (stream_tag 1) and no Link DMA
used
Second playback to HDMI (HDA) would use Host channel 1 (stream_tag 2) and
Link channel 0 (stream_tag 1).
In this case reading the LLP from channel 2 is incorrect as that is not the
Link channel used for the HDMI playback.
To correct this, we should look up the BE and get the channel used on the
Link side.
Fixes: 67b182bea0 ("ASoC: SOF: Intel: hda: Implement get_stream_position (Linear Link Position)")
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@linux.intel.com>
Reviewed-by: Kai Vehmanen <kai.vehmanen@linux.intel.com>
Reviewed-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
Link: https://patch.msgid.link/20251002074719.2084-6-peter.ujfalusi@linux.intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 98662be7ef ]
Init acpi_gbl_use_global_lock to false, in order to void error messages
during boot phase:
ACPI Error: Could not enable GlobalLock event (20240827/evxfevnt-182)
ACPI Error: No response from Global Lock hardware, disabling lock (20240827/evglock-59)
Fixes: 628c3bb40e ("LoongArch: Add boot and setup routines")
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 19baac378a ]
Commit b15212824a ("LoongArch: Make LTO case independent in Makefile")
moves "KBUILD_LDFLAGS += -mllvm --loongarch-annotate-tablejump" out of
CONFIG_CC_HAS_ANNOTATE_TABLEJUMP, which breaks the build for LLVM-18, as
'--loongarch-annotate-tablejump' is unimplemented there:
ld.lld: error: -mllvm: ld.lld: Unknown command line argument '--loongarch-annotate-tablejump'.
Call ld-option to detect '--loongarch-annotate-tablejump' before use, so
as to fix the build error.
Fixes: b15212824a ("LoongArch: Make LTO case independent in Makefile")
Reported-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Tested-by: Nathan Chancellor <nathan@kernel.org> # build
Suggested-by: WANG Rui <wangrui@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit abb2a55722 ]
Currently, when compiling with GCC, there is no "break 7" instruction
for zero division due to using the option -mno-check-zero-division, but
the compiler still generates "break 0" instruction for zero division.
Here is a simple example:
$ cat test.c
int div(int a)
{
return a / 0;
}
$ gcc -O2 -S test.c -o test.s
GCC generates "break 0" on LoongArch and "ud2" on x86, objtool decodes
"ud2" as INSN_BUG for x86, so decode "break 0" as INSN_BUG can fix the
objtool warnings for LoongArch, but this is not the intention.
When decoding "break 0" as INSN_TRAP in the previous commit, the aim is
to handle "break 0" as a trap. The generated "break 0" for zero division
by GCC is not proper, it should generate a break instruction with proper
bug type, so add the GCC option -fno-isolate-erroneous-paths-dereference
to avoid generating the unexpected "break 0" instruction for now.
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/r/202509200413.7uihAxJ5-lkp@intel.com/
Fixes: baad7830ee ("objtool/LoongArch: Mark types based on break immediate code")
Suggested-by: WANG Rui <wangrui@loongson.cn>
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b91917c0c6 ]
Don't open evsels on all CPUs, open them just on the CPUs they
support. This avoids opening say an e-core event on a p-core and
getting a failure - achieve this by getting rid of the "all_cpu_map".
In install_pe functions don't use the cpu_map_idx as a CPU number,
translate the cpu_map_idx, which is a dense index into the cpu_map
skipping holes at the beginning, to a proper CPU number.
Before:
```
$ perf stat --bpf-counters -a -e cycles,instructions -- sleep 1
Performance counter stats for 'system wide':
<not supported> cpu_atom/cycles/
566,270,672 cpu_core/cycles/
<not supported> cpu_atom/instructions/
572,792,836 cpu_core/instructions/ # 1.01 insn per cycle
1.001595384 seconds time elapsed
```
After:
```
$ perf stat --bpf-counters -a -e cycles,instructions -- sleep 1
Performance counter stats for 'system wide':
443,299,201 cpu_atom/cycles/
1,233,919,737 cpu_core/cycles/
213,634,112 cpu_atom/instructions/ # 0.48 insn per cycle
2,758,965,527 cpu_core/instructions/ # 2.24 insn per cycle
1.001699485 seconds time elapsed
```
Fixes: 7fac83aaf2 ("perf stat: Introduce 'bperf' to share hardware PMCs with BPF")
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: bpf@vger.kernel.org
Cc: Gabriele Monaco <gmonaco@redhat.com>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Tengda Wu <wutengda@huaweicloud.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 0ebac01a00 ]
Check for NEED_RESCHED_LAZY, not just NEED_RESCHED, prior to transferring
control to a guest. Failure to check for lazy resched can unnecessarily
delay rescheduling until the next tick when using a lazy preemption model.
Note, ideally both the checking and processing of TIF bits would be handled
in common code, to avoid having to keep three separate paths synchronized,
but defer such cleanups to the future to keep the fix as standalone as
possible.
Cc: Nuno Das Neves <nunodasneves@linux.microsoft.com>
Cc: Mukesh R <mrathor@linux.microsoft.com>
Fixes: 621191d709 ("Drivers: hv: Introduce mshv_root module to expose /dev/mshv to VMMs")
Signed-off-by: Sean Christopherson <seanjc@google.com>
Tested-by: Nuno Das Neves <nunodasneves@linux.microsoft.com>
Reviewed-by: Nuno Das Neves <nunodasneves@linux.microsoft.com>
Signed-off-by: Wei Liu <wei.liu@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit bb7663dec6 ]
Call sysfs_update_group() after reading the device descriptor to ensure
the HID sysfs attributes are visible when the feature is supported.
Fixes: ae7795a8c2 ("scsi: ufs: core: Add HID support")
Signed-off-by: Daniel Lee <chullee@google.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 60cd16a3b7 ]
During the detaching of Marvell's SAS/SATA controller, the original code
calls cancel_delayed_work() in mvs_free() to cancel the delayed work
item mwq->work_q. However, if mwq->work_q is already running, the
cancel_delayed_work() may fail to cancel it. This can lead to
use-after-free scenarios where mvs_free() frees the mvs_info while
mvs_work_queue() is still executing and attempts to access the
already-freed mvs_info.
A typical race condition is illustrated below:
CPU 0 (remove) | CPU 1 (delayed work callback)
mvs_pci_remove() |
mvs_free() | mvs_work_queue()
cancel_delayed_work() |
kfree(mvi) |
| mvi-> // UAF
Replace cancel_delayed_work() with cancel_delayed_work_sync() to ensure
that the delayed work item is properly canceled and any executing
delayed work item completes before the mvs_info is deallocated.
This bug was found by static analysis.
Fixes: 20b09c2992 ("[SCSI] mvsas: add support for 94xx; layout change; bug fixes")
Signed-off-by: Duoming Zhou <duoming@zju.edu.cn>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 0b1bb980fd ]
The original commit set all cores in a cluster to a shared policy, but
did not update set_target to apply a frequency change to all cores for
the policy. This caused most cores to remain stuck at their boot
frequency.
Fixes: be4ae8c194 ("cpufreq: tegra186: Share policy per cluster")
Signed-off-by: Aaron Kling <webgeek1234@gmail.com>
Reviewed-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 632d31067b ]
Device links with DL_FLAG_SYNC_STATE_ONLY should not affect system
suspend and resume, and functions like device_reorder_to_tail() and
device_link_add() don't try to reorder the consumers with that flag.
However, dpm_wait_for_consumers() and dpm_wait_for_suppliers() don't
check thas flag before triggering dpm_wait(), leading to potential hang
during suspend/resume.
This can be reproduced on MT8186 Corsola Chromebook with devicetree like:
usb-a-connector {
compatible = "usb-a-connector";
port {
usb_a_con: endpoint {
remote-endpoint = <&usb_hs>;
};
};
};
usb_host {
compatible = "mediatek,mt8186-xhci", "mediatek,mtk-xhci";
port {
usb_hs: endpoint {
remote-endpoint = <&usb_a_con>;
};
};
};
In this case, the two nodes form a cycle and a SYNC_STATE_ONLY devlink
between usb_host (supplier) and usb-a-connector (consumer) is created.
Address this by exporting device_link_flag_is_sync_state_only() and
making dpm_wait_for_consumers() and dpm_wait_for_suppliers() use it
when deciding if dpm_wait() should be called.
Fixes: 05ef983e0d ("driver core: Add device link support for SYNC_STATE_ONLY flag")
Signed-off-by: Pin-yen Lin <treapking@chromium.org>
Link: https://patch.msgid.link/20250926102320.4053167-1-treapking@chromium.org
[ rjw: Subject and changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 3ce3f56999 ]
Add separate macros for walking links to suppliers and consumers of a
device to help device links users to avoid exposing the internals of
struct dev_links_info in their code and possible coding mistakes related
to that.
Accordingly, use the new macros to replace open-coded device links list
walks in the core power management code.
No intentional functional impact.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://patch.msgid.link/1944671.tdWV9SEqCh@rafael.j.wysocki
Stable-dep-of: 632d31067b ("PM: sleep: Do not wait on SYNC_STATE_ONLY device links")
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit fdd9ae23bb ]
Since SRCU is used for the protection of device link lists, the loops
over device link lists in multiple places in drivers/base/power/main.c
and in pm_runtime_get_suppliers() should be annotated as _srcu rather
than as _rcu which is the case currently.
Change the annotations accordingly.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://patch.msgid.link/2393512.ElGaqSPkdT@rafael.j.wysocki
Stable-dep-of: 632d31067b ("PM: sleep: Do not wait on SYNC_STATE_ONLY device links")
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit fd2f74f8f3 ]
When using bpf_program__attach_kprobe_multi_opts on ARM64 to hook a BPF program
that contains the bpf_get_stackid function, the BPF program fails
to obtain the stack trace and returns -EFAULT.
This is because ftrace_partial_regs omits the configuration of the pstate register,
leaving pstate at the default value of 0. When get_perf_callchain executes,
it uses user_mode(regs) to determine whether it is in kernel mode.
This leads to a misjudgment that the code is in user mode,
so perf_callchain_kernel is not executed and the function returns directly.
As a result, trace->nr becomes 0, and finally -EFAULT is returned.
Therefore, the assignment of the pstate register is added here.
Fixes: b9b55c8912 ("tracing: Add ftrace_partial_regs() for converting ftrace_regs to pt_regs")
Closes: https://lore.kernel.org/bpf/20250919071902.554223-1-yangfeng59949@163.com/
Signed-off-by: Feng Yang <yangfeng@kylinos.cn>
Tested-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b40b1ba37a ]
When updating the local timestamps from CB_GETATTR, the updated values
are not being properly vetted.
Compare the update times vs. the saved times in the delegation rather
than the current times in the inode. Also, ensure that the ctime is
properly vetted vs. its original value.
Fixes: 6ae30d6eb2 ("nfsd: add support for delegated timestamps")
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 3952f1cbcb ]
SETATTRs containing delegated timestamp updates are currently not being
vetted properly. Since we no longer need to compare the timestamps vs.
the current timestamps, move the vetting of delegated timestamps wholly
into nfsd.
Rename the set_cb_time() helper to nfsd4_vet_deleg_time(), and make it
non-static. Add a new vet_deleg_attrs() helper that is called from
nfsd4_setattr that uses nfsd4_vet_deleg_time() to properly validate the
all the timestamps. If the validation indicates that the update should
be skipped, unset the appropriate flags in ia_valid.
Fixes: 7e13f4f8d2 ("nfsd: handle delegated timestamps in SETATTR")
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 7663e963a5 ]
As Trond points out [1], the "original time" mentioned in RFC 9754
refers to the timestamps on the files at the time that the delegation
was granted, and not the current timestamp of the file on the server.
Store the current timestamps for the file in the nfs4_delegation when
granting one. Add STATX_ATIME and STATX_MTIME to the request mask in
nfs4_delegation_stat(). When granting OPEN_DELEGATE_READ_ATTRS_DELEG, do
a nfs4_delegation_stat() and save the correct atime. If the stat() fails
for any reason, fall back to granting a normal read deleg.
[1]: https://lore.kernel.org/linux-nfs/47a4e40310e797f21b5137e847b06bb203d99e66.camel@kernel.org/
Fixes: 7e13f4f8d2 ("nfsd: handle delegated timestamps in SETATTR")
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit c066ff58e5 ]
Ensure that notify_change() doesn't clobber a delegated ctime update
with current_time() by setting ATTR_CTIME_SET for those updates.
Don't bother setting the timestamps in cb_getattr_update_times() in the
non-delegated case. notify_change() will do that itself.
Fixes: 7e13f4f8d2 ("nfsd: handle delegated timestamps in SETATTR")
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit afc5b36e29 ]
When ATTR_ATIME_SET and ATTR_MTIME_SET are set in the ia_valid mask, the
notify_change() logic takes that to mean that the request should set
those values explicitly, and not override them with "now".
With the advent of delegated timestamps, similar functionality is needed
for the ctime. Add a ATTR_CTIME_SET flag, and use that to indicate that
the ctime should be accepted as-is. Also, clean up the if statements to
eliminate the extra negatives.
In setattr_copy() and setattr_copy_mgtime() use inode_set_ctime_deleg()
when ATTR_CTIME_SET is set, instead of basing the decision on ATTR_DELEG.
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Stable-dep-of: c066ff58e5 ("nfsd: use ATTR_CTIME_SET for delegated ctime updates")
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 5affb498e7 ]
If the only flag left is ATTR_DELEG, then there are no changes to be
made.
Fixes: 7e13f4f8d2 ("nfsd: handle delegated timestamps in SETATTR")
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 49ef649110 ]
struct tegra_bpmp::clocks is a pointer to a dynamically allocated array
of pointers to 'struct tegra_bpmp_clk'.
But the size of the allocated area is calculated like it is an array
containing actual 'struct tegra_bpmp_clk' objects - it's not true, there
are just pointers.
Found by Linux Verification Center (linuxtesting.org) with Svace static
analysis tool.
Fixes: 2db12b15c6 ("clk: tegra: Register clocks from root to leaf")
Signed-off-by: Fedor Pchelkin <pchelkin@ispras.ru>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 1624dead9a ]
The conditional check for the PLL0 multiplier 'm' used a logical AND
instead of OR, making the range check ineffective. This patch replaces
&& with || to correctly reject invalid values of 'm' that are either
less than or equal to 0 or greater than LPC18XX_PLL0_MSEL_MAX.
This ensures proper bounds checking during clk rate setting and rounding.
Fixes: b04e0b8fd5 ("clk: add lpc18xx cgu clk driver")
Signed-off-by: Alok Tiwari <alok.a.tiwari@oracle.com>
[sboyd@kernel.org: 'm' is unsigned so remove < condition]
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b46a3d323a ]
The round_rate() clk ops is deprecated, so migrate this driver from
round_rate() to determine_rate() using the Coccinelle semantic patch
on the cover letter of this series.
Signed-off-by: Brian Masney <bmasney@redhat.com>
Stable-dep-of: 1624dead9a ("clk: nxp: Fix pll0 rate check condition in LPC18xx CGU driver")
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 5e121370a7 ]
The `flags` in |struct mtk_mux| are core clk flags, not mux clk flags.
Passing one to the other is wrong.
Since there aren't any actual users adding CLK_MUX_* flags, just drop it
for now.
Fixes: b05ea33143 ("clk: mediatek: clk-mux: Add .determine_rate() callback")
Signed-off-by: Chen-Yu Tsai <wenst@chromium.org>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 6c4c26b624 ]
The infrastructure gate for the HDMI specific crystal needs the
top_hdmi_xtal clock to be configured in order to ungate the 26m
clock to the HDMI IP, and it wouldn't work without.
Reparent the infra_ao_hdmi_26m clock to top_hdmi_xtal to fix that.
Fixes: e2edf59dec ("clk: mediatek: Add MT8195 infrastructure clock support")
Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit edaeb4bcf1 ]
The detection of uncore_imc may happen for free running PMUs and the
clockticks event may be present on uncore_clock. Rewrite the test to
detect duplicated/deduplicated events from perf list, not hardcoded to
uncore_imc.
If perf stat fails then assume it is permissions and skip the test.
Committer testing:
Before:
root@x1:~# perf test -vv uniquifyi
96: perf stat events uniquifying:
--- start ---
test child forked, pid 220851
stat event uniquifying test
grep: Unmatched [, [^, [:, [., or [=
Event is not uniquified [Failed]
perf stat -e clockticks -A -o /tmp/__perf_test.stat_output.X7ChD -- true
# started on Fri Sep 19 16:48:38 2025
Performance counter stats for 'system wide':
CPU0 2,310,956 uncore_clock/clockticks/
0.001746771 seconds time elapsed
---- end(-1) ----
96: perf stat events uniquifying : FAILED!
root@x1:~#
After:
root@x1:~# perf test -vv uniquifyi
96: perf stat events uniquifying:
--- start ---
test child forked, pid 222366
Uniquification of PMU sysfs events test
Testing event uncore_imc_free_running/data_read/ is uniquified to uncore_imc_free_running_0/data_read/
Testing event uncore_imc_free_running/data_total/ is uniquified to uncore_imc_free_running_0/data_total/
Testing event uncore_imc_free_running/data_write/ is uniquified to uncore_imc_free_running_0/data_write/
Testing event uncore_imc_free_running/data_read/ is uniquified to uncore_imc_free_running_1/data_read/
Testing event uncore_imc_free_running/data_total/ is uniquified to uncore_imc_free_running_1/data_total/
Testing event uncore_imc_free_running/data_write/ is uniquified to uncore_imc_free_running_1/data_write/
---- end(0) ----
96: perf stat events uniquifying : Ok
root@x1:~#
Fixes: 070b315333 ("perf test: Restrict uniquifying test to machines with 'uncore_imc'")
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.ibm.com>
Cc: Chun-Tse Shao <ctshao@google.com>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 693101792e ]
The PMU name is appearing twice in:
```
$ perf stat -e uncore_imc_free_running/data_total/ -A true
Performance counter stats for 'system wide':
CPU0 1.57 MiB uncore_imc_free_running_0/uncore_imc_free_running,data_total/
CPU0 1.58 MiB uncore_imc_free_running_1/uncore_imc_free_running,data_total/
0.000892376 seconds time elapsed
```
Use the pmu_name_len_no_suffix to avoid this problem.
Committer testing:
After this patch:
root@x1:~# perf stat -e uncore_imc_free_running/data_total/ -A true
Performance counter stats for 'system wide':
CPU0 1.69 MiB uncore_imc_free_running_0/data_total/
CPU0 1.68 MiB uncore_imc_free_running_1/data_total/
0.002141605 seconds time elapsed
root@x1:~#
Fixes: 7d45f402d3 ("perf evlist: Make uniquifying counter names consistent")
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.ibm.com>
Cc: Chun-Tse Shao <ctshao@google.com>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 48918cacef ]
The test starts a workload and then opens events. If the events fail
to open, for example because of perf_event_paranoid, the gopipe of the
workload is leaked and the file descriptor leak check fails when the
test exits. To avoid this cancel the workload when opening the events
fails.
Before:
```
$ perf test -vv 7
7: PERF_RECORD_* events & perf_sample fields:
--- start ---
test child forked, pid 1189568
Using CPUID GenuineIntel-6-B7-1
------------------------------------------------------------
perf_event_attr:
type 0 (PERF_TYPE_HARDWARE)
config 0xa00000000 (cpu_atom/PERF_COUNT_HW_CPU_CYCLES/)
disabled 1
------------------------------------------------------------
sys_perf_event_open: pid 0 cpu -1 group_fd -1 flags 0x8
sys_perf_event_open failed, error -13
------------------------------------------------------------
perf_event_attr:
type 0 (PERF_TYPE_HARDWARE)
config 0xa00000000 (cpu_atom/PERF_COUNT_HW_CPU_CYCLES/)
disabled 1
exclude_kernel 1
------------------------------------------------------------
sys_perf_event_open: pid 0 cpu -1 group_fd -1 flags 0x8 = 3
------------------------------------------------------------
perf_event_attr:
type 0 (PERF_TYPE_HARDWARE)
config 0x400000000 (cpu_core/PERF_COUNT_HW_CPU_CYCLES/)
disabled 1
------------------------------------------------------------
sys_perf_event_open: pid 0 cpu -1 group_fd -1 flags 0x8
sys_perf_event_open failed, error -13
------------------------------------------------------------
perf_event_attr:
type 0 (PERF_TYPE_HARDWARE)
config 0x400000000 (cpu_core/PERF_COUNT_HW_CPU_CYCLES/)
disabled 1
exclude_kernel 1
------------------------------------------------------------
sys_perf_event_open: pid 0 cpu -1 group_fd -1 flags 0x8 = 3
Attempt to add: software/cpu-clock/
..after resolving event: software/config=0/
cpu-clock -> software/cpu-clock/
------------------------------------------------------------
perf_event_attr:
type 1 (PERF_TYPE_SOFTWARE)
size 136
config 0x9 (PERF_COUNT_SW_DUMMY)
sample_type IP|TID|TIME|CPU
read_format ID|LOST
disabled 1
inherit 1
mmap 1
comm 1
enable_on_exec 1
task 1
sample_id_all 1
mmap2 1
comm_exec 1
ksymbol 1
bpf_event 1
{ wakeup_events, wakeup_watermark } 1
------------------------------------------------------------
sys_perf_event_open: pid 1189569 cpu 0 group_fd -1 flags 0x8
sys_perf_event_open failed, error -13
perf_evlist__open: Permission denied
---- end(-2) ----
Leak of file descriptor 6 that opened: 'pipe:[14200347]'
---- unexpected signal (6) ----
iFailed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
#0 0x565358f6666e in child_test_sig_handler builtin-test.c:311
#1 0x7f29ce849df0 in __restore_rt libc_sigaction.c:0
#2 0x7f29ce89e95c in __pthread_kill_implementation pthread_kill.c:44
#3 0x7f29ce849cc2 in raise raise.c:27
#4 0x7f29ce8324ac in abort abort.c:81
#5 0x565358f662d4 in check_leaks builtin-test.c:226
#6 0x565358f6682e in run_test_child builtin-test.c:344
#7 0x565358ef7121 in start_command run-command.c:128
#8 0x565358f67273 in start_test builtin-test.c:545
#9 0x565358f6771d in __cmd_test builtin-test.c:647
#10 0x565358f682bd in cmd_test builtin-test.c:849
#11 0x565358ee5ded in run_builtin perf.c:349
#12 0x565358ee6085 in handle_internal_command perf.c:401
#13 0x565358ee61de in run_argv perf.c:448
#14 0x565358ee6527 in main perf.c:555
#15 0x7f29ce833ca8 in __libc_start_call_main libc_start_call_main.h:74
#16 0x7f29ce833d65 in __libc_start_main@@GLIBC_2.34 libc-start.c:128
#17 0x565358e391c1 in _start perf[851c1]
7: PERF_RECORD_* events & perf_sample fields : FAILED!
```
After:
```
$ perf test 7
7: PERF_RECORD_* events & perf_sample fields : Skip (permissions)
```
Fixes: 16d00fee70 ("perf tests: Move test__PERF_RECORD into separate object")
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.ibm.com>
Cc: Chun-Tse Shao <ctshao@google.com>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 48314d20fe ]
When not running as root and with higher perf event paranoia values
the perf record LBR tests could fail rather than skipping the
problematic tests.
Add the sensitivity to the test and confirm it passes with paranoia
values from -1 to 2.
Committer testing:
Testing with '$ perf test -vv lbr', i.e. as non root, and then comparing
the output shows the mentioned errors before this patch:
acme@x1:~$ grep -m1 "model name" /proc/cpuinfo
model name : 13th Gen Intel(R) Core(TM) i7-1365U
acme@x1:~$
Before:
132: perf record LBR tests : Skip
After:
132: perf record LBR tests : Ok
Fixes: 32559b99e0 ("perf test: Add set of perf record LBR tests")
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Chun-Tse Shao <ctshao@google.com>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 47b13635da ]
determine_rate() is expected to return an error code, or 0 on success.
clk_sam9x5_peripheral_determine_rate() has a branch that returns the
parent rate on a certain case. This is the behavior of round_rate(),
so let's go ahead and fix this by setting req->rate.
Fixes: b4c115c761 ("clk: at91: clk-peripheral: add support for changeable parent rate")
Reviewed-by: Alexander Sverdlin <alexander.sverdlin@gmail.com>
Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Signed-off-by: Brian Masney <bmasney@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit c51a37ffea ]
The dpu0_pixelclk and dpu1_pixelclk gates were incorrectly parented to
the video_pll_clk.
According to the TH1520 TRM, the "dpu0_pixelclk" should be sourced from
"DPU0 PLL DIV CLK". In this driver, "DPU0 PLL DIV CLK" corresponds to
the `dpu0_clk` clock, which is a divider whose parent is the
`dpu0_pll_clk`.
This patch corrects the clock hierarchy by reparenting `dpu0_pixelclk`
to `dpu0_clk`. By symmetry, `dpu1_pixelclk` is also reparented to its
correct source, `dpu1_clk`.
Fixes: 50d4b157fa ("clk: thead: Add clock support for VO subsystem in T-HEAD TH1520 SoC")
Reported-by: Icenowy Zheng <uwu@icenowy.me>
Signed-off-by: Michal Wilczynski <m.wilczynski@samsung.com>
[Icenowy: add Drew's R-b and rebased atop ccu_gate refactor]
Reviewed-by: Drew Fustini <fustini@kernel.org>
Signed-off-by: Icenowy Zheng <uwu@icenowy.me>
Signed-off-by: Drew Fustini <fustini@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 9e99b992c8 ]
The padctrl0 clock seems to be a child of the perisys_apb4_hclk clock,
gating the later makes padctrl0 registers stuck too.
Fix this relationship.
Fixes: ae81b69fd2 ("clk: thead: Add support for T-Head TH1520 AP_SUBSYS clocks")
Signed-off-by: Icenowy Zheng <uwu@icenowy.me>
Reviewed-by: Drew Fustini <fustini@kernel.org>
Reviewed-by: Troy Mitchell <troy.mitchell@linux.dev>
Signed-off-by: Drew Fustini <fustini@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit aaa75cbd5d ]
Similar to previous situation of mux clocks, the gate clocks of
clk-th1520-ap drivers are also using a helper that creates a temporary
struct clk_hw and abandons the struct clk_hw in struct ccu_common, which
prevents clock gates to be clock parents.
Do the similar refactor of dropping struct ccu_common and directly use
struct clk_gate here.
This patch mimics the refactor done on struct ccu_mux in 54edba916e
("clk: thead: th1520-ap: Describe mux clocks with clk_mux").
Fixes: ae81b69fd2 ("clk: thead: Add support for T-Head TH1520 AP_SUBSYS clocks")
Signed-off-by: Icenowy Zheng <uwu@icenowy.me>
Reviewed-by: Drew Fustini <fustini@kernel.org>
Signed-off-by: Drew Fustini <fustini@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit c123519bff ]
There are very rare randconfig builds that turn on this driver but
don't already select CONFIG_AUXILIARY_BUS from another driver, and
this results in a build failure:
arm-linux-gnueabi-ld: drivers/clk/clk-npcm8xx.o: in function `npcm8xx_clock_driver_init':
clk-npcm8xx.c:(.init.text+0x18): undefined reference to `__auxiliary_driver_register'
Select the bus here, as all other clk drivers using it do.
Fixes: e0b255df02 ("clk: npcm8xx: add clock controller")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20250807072241.4190376-1-arnd@kernel.org
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit 8327bd4fcb upstream.
With `CONFIG_TRACE_MMIO_ACCESS=y`, the `{read,write}{b,w,l,q}{_relaxed}()`
mmio accessors unconditionally call `log_{post_}{read,write}_mmio()`
helpers, which in turn call the ftrace ops for `rwmmio` trace events
This adds a performance penalty per mmio accessor call, even when
`rwmmio` events are disabled at runtime (~80% overhead on local
measurement).
Guard these with `tracepoint_enabled()`.
Signed-off-by: Varad Gautam <varadgautam@google.com>
Fixes: 210031971c ("asm-generic/io: Add logging support for MMIO accessors")
Cc: stable@vger.kernel.org
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 898374fdd7 upstream.
When a listener is added, a part of creation of transport also registers
program/port with rpcbind. However, when the listener is removed,
while transport goes away, rpcbind still has the entry for that
port/type.
When deleting the transport, unregister with rpcbind when appropriate.
---v2 created a new xpt_flag XPT_RPCB_UNREG to mark TCP and UDP
transport and at xprt destroy send rpcbind unregister if flag set.
Suggested-by: Chuck Lever <chuck.lever@oracle.com>
Fixes: d093c90892 ("nfsd: fix management of listener transports")
Cc: stable@vger.kernel.org
Signed-off-by: Olga Kornievskaia <okorniev@redhat.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f97aef092e upstream.
Commit a755d0e2d4 ("cpufreq: Honour transition_latency over
transition_delay_us") caused platforms where cpuinfo.transition_latency
is CPUFREQ_ETERNAL to get a very large transition latency whereas
previously it had been capped at 10 ms (and later at 2 ms).
This led to a user-observable regression between 6.6 and 6.12 as
described by Shawn:
"The dbs sampling_rate was 10000 us on 6.6 and suddently becomes
6442450 us (4294967295 / 1000 * 1.5) on 6.12 for these platforms
because the default transition delay was dropped [...].
It slows down dbs governor's reacting to CPU loading change
dramatically. Also, as transition_delay_us is used by schedutil
governor as rate_limit_us, it shows a negative impact on device
idle power consumption, because the device gets slightly less time
in the lowest OPP."
Evidently, the expectation of the drivers using CPUFREQ_ETERNAL as
cpuinfo.transition_latency was that it would be capped by the core,
but they may as well return a default transition latency value instead
of CPUFREQ_ETERNAL and the core need not do anything with it.
Accordingly, introduce CPUFREQ_DEFAULT_TRANSITION_LATENCY_NS and make
all of the drivers in question use it instead of CPUFREQ_ETERNAL. Also
update the related Rust binding.
Fixes: a755d0e2d4 ("cpufreq: Honour transition_latency over transition_delay_us")
Closes: https://lore.kernel.org/linux-pm/20250922125929.453444-1-shawnguo2@yeah.net/
Reported-by: Shawn Guo <shawnguo@kernel.org>
Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org>
Reviewed-by: Jie Zhan <zhanjie9@hisilicon.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Cc: 6.6+ <stable@vger.kernel.org> # 6.6+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://patch.msgid.link/2264949.irdbgypaU6@rafael.j.wysocki
[ rjw: Fix typo in new symbol name, drop redundant type cast from Rust binding ]
Tested-by: Shawn Guo <shawnguo@kernel.org> # with cpufreq-dt driver
Reviewed-by: Qais Yousef <qyousef@layalina.io>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit fed7eaa4f0 upstream.
APIs based on __pm_runtime_idle() (pm_runtime_idle(), pm_request_idle())
do not return 1 when already suspended. They return -EAGAIN. This is
already covered in the docs, so the entry for "1" is redundant and
conflicting.
(pm_runtime_put() and pm_runtime_put_sync() were previously incorrect,
but that's fixed in "PM: runtime: pm_runtime_put{,_sync}() returns 1
when already suspended", to ensure consistency with APIs like
pm_runtime_put_autosuspend().)
RPM_GET_PUT APIs based on __pm_runtime_suspend() do return 1 when
already suspended, but the language is a little unclear -- it's not
really an "error", so it seems better to list as a clarification before
the 0/success case. Additionally, they only actually return 1 when the
refcount makes it to 0; if the usage counter is still non-zero, we
return 0.
pm_runtime_put(), etc., also don't appear at first like they can ever
see "-EAGAIN: Runtime PM usage_count non-zero", because in non-racy
conditions, pm_runtime_put() would drop its reference count, see it's
non-zero, and return early (in __pm_runtime_idle()). However, it's
possible to race with another actor that increments the usage_count
afterward, since rpm_idle() is protected by a separate lock; in such a
case, we may see -EAGAIN.
Because this case is only seen in the presence of concurrent actors, it
makes sense to clarify that this is when "usage_count **became**
non-zero", by way of some racing actor.
Lastly, pm_runtime_put_sync_suspend() duplicated some -EAGAIN language.
Fix that.
Fixes: 271ff96d60 ("PM: runtime: Document return values of suspend-related API functions")
Link: https://lore.kernel.org/linux-pm/aJ5pkEJuixTaybV4@google.com/
Signed-off-by: Brian Norris <briannorris@chromium.org>
Reviewed-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Cc: 6.17+ <stable@vger.kernel.org> # 6.17+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 95920c2ed0 upstream.
Helge reported that the introduction of PP_MAGIC_MASK let to crashes on
boot on his 32-bit parisc machine. The cause of this is the mask is set
too wide, so the page_pool_page_is_pp() incurs false positives which
crashes the machine.
Just disabling the check in page_pool_is_pp() will lead to the page_pool
code itself malfunctioning; so instead of doing this, this patch changes
the define for PP_DMA_INDEX_BITS to avoid mistaking arbitrary kernel
pointers for page_pool-tagged pages.
The fix relies on the kernel pointers that alias with the pp_magic field
always being above PAGE_OFFSET. With this assumption, we can use the
lowest bit of the value of PAGE_OFFSET as the upper bound of the
PP_DMA_INDEX_MASK, which should avoid the false positives.
Because we cannot rely on PAGE_OFFSET always being a compile-time
constant, nor on it always being >0, we fall back to disabling the
dma_index storage when there are not enough bits available. This leaves
us in the situation we were in before the patch in the Fixes tag, but
only on a subset of architecture configurations. This seems to be the
best we can do until the transition to page types in complete for
page_pool pages.
v2:
- Make sure there's at least 8 bits available and that the PAGE_OFFSET
bit calculation doesn't wrap
Link: https://lore.kernel.org/all/aMNJMFa5fDalFmtn@p100/
Fixes: ee62ce7a1d ("page_pool: Track DMA-mapped pages and unmap them when destroying the pool")
Cc: stable@vger.kernel.org # 6.15+
Tested-by: Helge Deller <deller@gmx.de>
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Reviewed-by: Mina Almasry <almasrymina@google.com>
Tested-by: Helge Deller <deller@gmx.de>
Link: https://patch.msgid.link/20250930114331.675412-1-toke@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c1f86d0ac3 upstream.
Massage listmount() and make sure we don't call path_put() under the
namespace semaphore. If we put the last reference we're fscked.
Fixes: b4c2bea8ce ("add listmount(2) syscall")
Cc: stable@vger.kernel.org # v6.8+
Signed-off-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e8c84e2082 upstream.
Massage statmount() and make sure we don't call path_put() under the
namespace semaphore. If we put the last reference we're fscked.
Fixes: 46eae99ef7 ("add statmount(2) syscall")
Cc: stable@vger.kernel.org # v6.8+
Signed-off-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6eb350a223 upstream.
rseq_need_restart() reads and clears task::rseq_event_mask with preemption
disabled to guard against the scheduler.
But membarrier() uses an IPI and sets the PREEMPT bit in the event mask
from the IPI, which leaves that RMW operation unprotected.
Use guard(irq) if CONFIG_MEMBARRIER is enabled to fix that.
Fixes: 2a36ab717e ("rseq/membarrier: Add MEMBARRIER_CMD_PRIVATE_EXPEDITED_RSEQ")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Boqun Feng <boqun.feng@gmail.com>
Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5973a62efa upstream.
Since the referenced fixes commit, the kernel's .text section is only
mapped starting from _stext; the region [_text, _stext) is omitted. As a
result, other vmalloc/vmap allocations may use the virtual addresses
nominally in the range [_text, _stext). This address reuse confuses
multiple things:
1. crash_prepare_elf64_headers() sets up a segment in /proc/vmcore
mapping the entire range [_text, _end) to
[__pa_symbol(_text), __pa_symbol(_end)). Reading an address in
[_text, _stext) from /proc/vmcore therefore gives the incorrect
result.
2. Tools doing symbolization (either by reading /proc/kallsyms or based
on the vmlinux ELF file) will incorrectly identify vmalloc/vmap
allocations in [_text, _stext) as kernel symbols.
In practice, both of these issues affect the drgn debugger.
Specifically, there were cases where the vmap IRQ stacks for some CPUs
were allocated in [_text, _stext). As a result, drgn could not get the
stack trace for a crash in an IRQ handler because the core dump
contained invalid data for the IRQ stack address. The stack addresses
were also symbolized as being in the _text symbol.
Fix this by bringing back the mapping of [_text, _stext), but now make
it non-executable and read-only. This prevents other allocations from
using it while still achieving the original goal of not mapping
unpredictable data as executable. Other than the changed protection,
this is effectively a revert of the fixes commit.
Fixes: e2a073dde9 ("arm64: omit [_text, _stext) from permanent kernel mapping")
Cc: stable@vger.kernel.org
Signed-off-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7b26da4074 upstream.
[BUG]
With my local branch to enable bs > ps support for btrfs, sometimes I
hit the following ASSERT() inside submit_one_sector():
ASSERT(block_start != EXTENT_MAP_HOLE);
Please note that it's not yet possible to hit this ASSERT() in the wild
yet, as it requires btrfs bs > ps support, which is not even in the
development branch.
But on the other hand, there is also a very low chance to hit above
ASSERT() with bs < ps cases, so this is an existing bug affect not only
the incoming bs > ps support but also the existing bs < ps support.
[CAUSE]
Firstly that ASSERT() means we're trying to submit a dirty block but
without a real extent map nor ordered extent map backing it.
Furthermore with extra debugging, the folio triggering such ASSERT() is
always larger than the fs block size in my bs > ps case.
(8K block size, 4K page size)
After some more debugging, the ASSERT() is trigger by the following
sequence:
extent_writepage()
| We got a 32K folio (4 fs blocks) at file offset 0, and the fs block
| size is 8K, page size is 4K.
| And there is another 8K folio at file offset 32K, which is also
| dirty.
| So the filemap layout looks like the following:
|
| "||" is the filio boundary in the filemap.
| "//| is the dirty range.
|
| 0 8K 16K 24K 32K 40K
| |////////| |//////////////////////||////////|
|
|- writepage_delalloc()
| |- find_lock_delalloc_range() for [0, 8K)
| | Now range [0, 8K) is properly locked.
| |
| |- find_lock_delalloc_range() for [16K, 40K)
| | |- btrfs_find_delalloc_range() returned range [16K, 40K)
| | |- lock_delalloc_folios() locked folio 0 successfully
| | |
| | | The filemap range [32K, 40K) got dropped from filemap.
| | |
| | |- lock_delalloc_folios() failed with -EAGAIN on folio 32K
| | | As the folio at 32K is dropped.
| | |
| | |- loops = 1;
| | |- max_bytes = PAGE_SIZE;
| | |- goto again;
| | | This will re-do the lookup for dirty delalloc ranges.
| | |
| | |- btrfs_find_delalloc_range() called with @max_bytes == 4K
| | | This is smaller than block size, so
| | | btrfs_find_delalloc_range() is unable to return any range.
| | \- return false;
| |
| \- Now only range [0, 8K) has an OE for it, but for dirty range
| [16K, 32K) it's dirty without an OE.
| This breaks the assumption that writepage_delalloc() will find
| and lock all dirty ranges inside the folio.
|
|- extent_writepage_io()
|- submit_one_sector() for [0, 8K)
| Succeeded
|
|- submit_one_sector() for [16K, 24K)
Triggering the ASSERT(), as there is no OE, and the original
extent map is a hole.
Please note that, this also exposed the same problem for bs < ps
support. E.g. with 64K page size and 4K block size.
If we failed to lock a folio, and falls back into the "loops = 1;"
branch, we will re-do the search using 64K as max_bytes.
Which may fail again to lock the next folio, and exit early without
handling all dirty blocks inside the folio.
[FIX]
Instead of using the fixed size PAGE_SIZE as @max_bytes, use
@sectorsize, so that we are ensured to find and lock any remaining
blocks inside the folio.
And since we're here, add an extra ASSERT() to
before calling btrfs_find_delalloc_range() to make sure the @max_bytes is
at least no smaller than a block to avoid false negative.
Cc: stable@vger.kernel.org # 5.15+
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 72d271a7ba upstream.
Userspace generally expects APIs that return -EMSGSIZE to allow for them
to adjust their buffer size and retry the operation. However, the
fscontext log would previously clear the message even in the -EMSGSIZE
case.
Given that it is very cheap for us to check whether the buffer is too
small before we remove the message from the ring buffer, let's just do
that instead. While we're at it, refactor some fscontext_read() into a
separate helper to make the ring buffer logic a bit easier to read.
Fixes: 007ec26cdc ("vfs: Implement logging through fs_context")
Cc: David Howells <dhowells@redhat.com>
Cc: stable@vger.kernel.org # v5.2+
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
Link: https://lore.kernel.org/20250807-fscontext-log-cleanups-v3-1-8d91d6242dc3@cyphar.com
Signed-off-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0ba7a254af upstream.
hba->pm_qos_mutex is used very early as a part of ufshcd_init(), so it
need to be initialized before that call. This fixes the following
warning:
------------[ cut here ]------------
DEBUG_LOCKS_WARN_ON(lock->magic != lock)
WARNING: kernel/locking/mutex.c:577 at __mutex_lock+0x268/0x894, CPU#4: kworker/u32:4/72
Modules linked in:
CPU: 4 UID: 0 PID: 72 Comm: kworker/u32:4 Not tainted 6.17.0-rc7-next-20250926+ #11223 PREEMPT
Hardware name: Qualcomm Technologies, Inc. Robotics RB5 (DT)
Workqueue: events_unbound deferred_probe_work_func
pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : __mutex_lock+0x268/0x894
lr : __mutex_lock+0x268/0x894
...
Call trace:
__mutex_lock+0x268/0x894 (P)
mutex_lock_nested+0x24/0x30
ufshcd_pm_qos_update+0x30/0x78
ufshcd_setup_clocks+0x2d4/0x3c4
ufshcd_init+0x234/0x126c
ufshcd_pltfrm_init+0x62c/0x82c
ufs_qcom_probe+0x20/0x58
platform_probe+0x5c/0xac
really_probe+0xbc/0x298
__driver_probe_device+0x78/0x12c
driver_probe_device+0x40/0x164
__device_attach_driver+0xb8/0x138
bus_for_each_drv+0x80/0xdc
__device_attach+0xa8/0x1b0
device_initial_probe+0x14/0x20
bus_probe_device+0xb0/0xb4
deferred_probe_work_func+0x8c/0xc8
process_one_work+0x208/0x60c
worker_thread+0x244/0x388
kthread+0x150/0x228
ret_from_fork+0x10/0x20
irq event stamp: 57267
hardirqs last enabled at (57267): [<ffffd761485e868c>] _raw_spin_unlock_irqrestore+0x74/0x78
hardirqs last disabled at (57266): [<ffffd76147b13c44>] clk_enable_lock+0x7c/0xf0
softirqs last enabled at (56270): [<ffffd7614734446c>] handle_softirqs+0x4c4/0x4dc
softirqs last disabled at (56265): [<ffffd76147290690>] __do_softirq+0x14/0x20
---[ end trace 0000000000000000 ]---
Fixes: 79dde5f7dc ("scsi: ufs: core: Fix data race in CPU latency PM QoS request handling")
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Message-Id: <20250929112730.3782765-1-m.szyprowski@samsung.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e9c206324e upstream.
The cdnsp-pci driver uses pcim_enable_device() to enable a PCI device,
which means the device will be automatically disabled on driver detach
through the managed device framework. The manual pci_disable_device()
call in the error path is therefore redundant.
Found via static anlaysis and this is similar to commit 99ca0b57e4
("thermal: intel: int340x: processor: Fix warning during module unload").
Fixes: 3d82904559 ("usb: cdnsp: cdns3 Add main part of Cadence USBSSP DRD Driver")
Cc: stable@vger.kernel.org
Signed-off-by: Miaoqian Lin <linmq006@gmail.com>
Link: https://lore.kernel.org/r/20250903141613.2535472-1-linmq006@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit be5ae730ff upstream.
Right now the interrupt handler first reads all updated status registers
and only then clears the interrupts. It's possible that a duplicate
interrupt for a changed register or plug state comes in after the
interrupts have been processed but before they have been cleared:
* plug is inserted, TPS_REG_INT_PLUG_EVENT is set
* TPS_REG_INT_EVENT1 is read
* tps6598x_handle_plug_event() has run and registered the plug
* plug is removed again, TPS_REG_INT_PLUG_EVENT is set (again)
* TPS_REG_INT_CLEAR1 is written, TPS_REG_INT_PLUG_EVENT is cleared
We then have no plug connected and no pending interrupt but the tipd
core still thinks there is a plug. It's possible to trigger this with
e.g. a slightly broken Type-C to USB A converter.
Fix this by first clearing the interrupts and only then reading the
updated registers.
Fixes: 45188f27b3 ("usb: typec: tipd: Add support for Apple CD321X")
Fixes: 0a4c005bd1 ("usb: typec: driver for TI TPS6598x USB Power Delivery controllers")
Cc: stable@kernel.org
Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Reviewed-by: Neal Gompa <neal@gompa.dev>
Signed-off-by: Sven Peter <sven@kernel.org>
Link: https://lore.kernel.org/r/20250914-apple-usb3-tipd-v1-1-4e99c8649024@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3d3c4cd5c6 upstream.
Prevent USB runtime PM (autosuspend) for AX88772* in bind.
usbnet enables runtime PM (autosuspend) by default, so disabling it via
the usb_driver flag is ineffective. On AX88772B, autosuspend shows no
measurable power saving with current driver (no link partner, admin
up/down). The ~0.453 W -> ~0.248 W drop on v6.1 comes from phylib powering
the PHY off on admin-down, not from USB autosuspend.
The real hazard is that with runtime PM enabled, ndo_open() (under RTNL)
may synchronously trigger autoresume (usb_autopm_get_interface()) into
asix_resume() while the USB PM lock is held. Resume paths then invoke
phylink/phylib and MDIO, which also expect RTNL, leading to possible
deadlocks or PM lock vs MDIO wake issues.
To avoid this, keep the device runtime-PM active by taking a usage
reference in ax88772_bind() and dropping it in unbind(). A non-zero PM
usage count blocks runtime suspend regardless of userspace policy
(.../power/control - pm_runtime_allow/forbid), making this approach
robust against sysfs overrides.
Holding a runtime-PM usage ref does not affect system-wide suspend;
system sleep/resume callbacks continue to run as before.
Fixes: 4a2c7217cd ("net: usb: asix: ax88772: manage PHY PM from MAC")
Reported-by: Hubert Wiśniewski <hubert.wisniewski.25632@gmail.com>
Closes: https://lore.kernel.org/all/DCGHG5UJT9G3.2K1GHFZ3H87T0@gmail.com
Tested-by: Hubert Wiśniewski <hubert.wisniewski.25632@gmail.com>
Reported-by: Marek Szyprowski <m.szyprowski@samsung.com>
Closes: https://lore.kernel.org/all/b5ea8296-f981-445d-a09a-2f389d7f6fdd@samsung.com
Cc: stable@vger.kernel.org
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Link: https://patch.msgid.link/20251005081203.3067982-1-o.rempel@pengutronix.de
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c04db81cd0 upstream.
A buffer overflow vulnerability exists in the USB 9pfs transport layer
where inconsistent size validation between packet header parsing and
actual data copying allows a malicious USB host to overflow heap buffers.
The issue occurs because:
- usb9pfs_rx_header() validates only the declared size in packet header
- usb9pfs_rx_complete() uses req->actual (actual received bytes) for
memcpy
This allows an attacker to craft packets with small declared size
(bypassing validation) but large actual payload (triggering overflow
in memcpy).
Add validation in usb9pfs_rx_complete() to ensure req->actual does not
exceed the buffer capacity before copying data.
Reported-by: Yuhao Jiang <danisjiang@gmail.com>
Closes: https://lkml.kernel.org/r/20250616132539.63434-1-danisjiang@gmail.com
Fixes: a3be076dc1 ("net/9p/usbg: Add new usb gadget function transport")
Cc: stable@vger.kernel.org
Message-ID: <20250622-9p-usb_overflow-v3-1-ab172691b946@codewreck.org>
Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4002ee98c0 upstream.
While the API contract in docs doesn't specify it explicitly, the
generic implementation of the get_function_name() callback from struct
pinmux_ops - pinmux_generic_get_function_name() - can fail and return
NULL. This is already checked in pinmux_check_ops() so add a similar
check in pinmux_func_name_to_selector() instead of passing the returned
pointer right down to strcmp() where the NULL can get dereferenced. This
is normal operation when adding new pinfunctions.
Cc: stable@vger.kernel.org
Tested-by: Neil Armstrong <neil.armstrong@linaro.org>
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 67600ccfc4 upstream.
The original code relies on cancel_delayed_work() in tb_dp_dprx_stop(),
which does not ensure that the delayed work item tunnel->dprx_work has
fully completed if it was already running. This leads to use-after-free
scenarios where tb_tunnel is deallocated by tb_tunnel_put(), while
tunnel->dprx_work remains active and attempts to dereference tb_tunnel
in tb_dp_dprx_work().
A typical race condition is illustrated below:
CPU 0 | CPU 1
tb_dp_tunnel_active() |
tb_deactivate_and_free_tunnel()| tb_dp_dprx_start()
tb_tunnel_deactivate() | queue_delayed_work()
tb_dp_activate() |
tb_dp_dprx_stop() | tb_dp_dprx_work() //delayed worker
cancel_delayed_work() |
tb_tunnel_put(tunnel); |
| tunnel = container_of(...); //UAF
| tunnel-> //UAF
Replacing cancel_delayed_work() with cancel_delayed_work_sync() is
not feasible as it would introduce a deadlock: both tb_dp_dprx_work()
and the cleanup path acquire tb->lock, and cancel_delayed_work_sync()
would wait indefinitely for the work item that cannot proceed.
Instead, implement proper reference counting:
- If cancel_delayed_work() returns true (work is pending), we release
the reference in the stop function.
- If it returns false (work is executing or already completed), the
reference is released in delayed work function itself.
This ensures the tb_tunnel remains valid during work item execution
while preventing memory leaks.
This bug was found by static analysis.
Fixes: d6d458d42e ("thunderbolt: Handle DisplayPort tunnel activation asynchronously")
Cc: stable@vger.kernel.org
Signed-off-by: Duoming Zhou <duoming@zju.edu.cn>
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 85afa9ea12 upstream.
The fields dma_chan_tx and dma_chan_rx of the struct pci_epf_test can be
NULL even after EPF initialization. Then it is prudent to check that
they have non-NULL values before releasing the channels. Add the checks
in pci_epf_test_clean_dma_chan().
Without the checks, NULL pointer dereferences happen and they can lead
to a kernel panic in some cases:
Unable to handle kernel NULL pointer dereference at virtual address 0000000000000050
Call trace:
dma_release_channel+0x2c/0x120 (P)
pci_epf_test_epc_deinit+0x94/0xc0 [pci_epf_test]
pci_epc_deinit_notify+0x74/0xc0
tegra_pcie_ep_pex_rst_irq+0x250/0x5d8
irq_thread_fn+0x34/0xb8
irq_thread+0x18c/0x2e8
kthread+0x14c/0x210
ret_from_fork+0x10/0x20
Fixes: 8353813c88 ("PCI: endpoint: Enable DMA tests for endpoints with DMA capabilities")
Fixes: 5ebf3fc59b ("PCI: endpoint: functions/pci-epf-test: Add DMA support to transfer data")
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
[mani: trimmed the stack trace]
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Cc: stable@vger.kernel.org
Link: https://patch.msgid.link/20250916025756.34807-1-shinichiro.kawasaki@wdc.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit deb2f22838 upstream.
When platform firmware supplies error information to the OS, e.g., via the
ACPI APEI GHES mechanism, it may identify an error source device that
doesn't advertise an AER Capability and therefore dev->aer_info, which
contains AER stats and ratelimiting data, is NULL.
pci_dev_aer_stats_incr() already checks dev->aer_info for NULL, but
aer_ratelimit() did not, leading to NULL pointer dereferences like this one
from the URL below:
{1}[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 0
{1}[Hardware Error]: event severity: corrected
{1}[Hardware Error]: device_id: 0000:00:00.0
{1}[Hardware Error]: vendor_id: 0x8086, device_id: 0x2020
{1}[Hardware Error]: aer_cor_status: 0x00001000, aer_cor_mask: 0x00002000
BUG: kernel NULL pointer dereference, address: 0000000000000264
RIP: 0010:___ratelimit+0xc/0x1b0
pci_print_aer+0x141/0x360
aer_recover_work_func+0xb5/0x130
[8086:2020] is an Intel "Sky Lake-E DMI3 Registers" device that claims to
be a Root Port but does not advertise an AER Capability.
Add a NULL check in aer_ratelimit() to avoid the NULL pointer dereference.
Note that this also prevents ratelimiting these events from GHES.
Fixes: a57f2bfb4a ("PCI/AER: Ratelimit correctable and non-fatal error logging")
Link: https://lore.kernel.org/r/buduna6darbvwfg3aogl5kimyxkggu3n4romnmq6sozut6axeu@clnx7sfsy457/
Signed-off-by: Breno Leitao <leitao@debian.org>
[bhelgaas: add crash details to commit log]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Cc: stable@vger.kernel.org
Link: https://patch.msgid.link/20250929-aer_crash_2-v1-1-68ec4f81c356@debian.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6df164e29b upstream.
In xdr_stream_decode_opaque_auth(), zero-length checksum.len causes
checksum.data to be set to NULL. This triggers a NPD when accessing
checksum.data in gss_krb5_verify_mic_v2(). This patch ensures that
the value of checksum.len is not less than XDR_UNIT.
Fixes: 0653028e8f ("SUNRPC: Convert gss_verify_header() to use xdr_stream")
Cc: stable@kernel.org
Signed-off-by: Lei Lu <llfamsec@gmail.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d3366a0477 upstream.
Struct ff_effect_compat is embedded twice inside
uinput_ff_upload_compat, contains internal padding. In particular, there
is a hole after struct ff_replay to satisfy alignment requirements for
the following union member. Without clearing the structure,
copy_to_user() may leak stack data to userspace.
Initialize ff_up_compat to zero before filling valid fields.
Fixes: 2d56f3a32c ("Input: refactor evdev 32bit compat to be shareable with uinput")
Cc: stable@vger.kernel.org
Signed-off-by: Zhen Ni <zhen.ni@easystack.cn>
Link: https://lore.kernel.org/r/20250928063737.74590-1-zhen.ni@easystack.cn
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a9e6aa9949 upstream.
devm_kcalloc() may fail. ndtest_probe() allocates three DMA address
arrays (dcr_dma, label_dma, dimm_dma) and later unconditionally uses
them in ndtest_nvdimm_init(), which can lead to a NULL pointer
dereference under low-memory conditions.
Check all three allocations and return -ENOMEM if any allocation fails,
jumping to the common error path. Do not emit an extra error message
since the allocator already warns on allocation failure.
Fixes: 9399ab61ad ("ndtest: Add dimms to the two buses")
Cc: stable@vger.kernel.org
Signed-off-by: Guangshuo Li <lgs201920130244@gmail.com>
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0910dd7c9a upstream.
Skip the WRMSR and HLT fastpaths in SVM's VM-Exit handler if the next RIP
isn't valid, e.g. because KVM is running with nrips=false. SVM must
decode and emulate to skip the instruction if the CPU doesn't provide the
next RIP, and getting the instruction bytes to decode requires reading
guest memory. Reading guest memory through the emulator can fault, i.e.
can sleep, which is disallowed since the fastpath handlers run with IRQs
disabled.
BUG: sleeping function called from invalid context at ./include/linux/uaccess.h:106
in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 32611, name: qemu
preempt_count: 1, expected: 0
INFO: lockdep is turned off.
irq event stamp: 30580
hardirqs last enabled at (30579): [<ffffffffc08b2527>] vcpu_run+0x1787/0x1db0 [kvm]
hardirqs last disabled at (30580): [<ffffffffb4f62e32>] __schedule+0x1e2/0xed0
softirqs last enabled at (30570): [<ffffffffb4247a64>] fpu_swap_kvm_fpstate+0x44/0x210
softirqs last disabled at (30568): [<ffffffffb4247a64>] fpu_swap_kvm_fpstate+0x44/0x210
CPU: 298 UID: 0 PID: 32611 Comm: qemu Tainted: G U 6.16.0-smp--e6c618b51cfe-sleep #782 NONE
Tainted: [U]=USER
Hardware name: Google Astoria-Turin/astoria, BIOS 0.20241223.2-0 01/17/2025
Call Trace:
<TASK>
dump_stack_lvl+0x7d/0xb0
__might_resched+0x271/0x290
__might_fault+0x28/0x80
kvm_vcpu_read_guest_page+0x8d/0xc0 [kvm]
kvm_fetch_guest_virt+0x92/0xc0 [kvm]
__do_insn_fetch_bytes+0xf3/0x1e0 [kvm]
x86_decode_insn+0xd1/0x1010 [kvm]
x86_emulate_instruction+0x105/0x810 [kvm]
__svm_skip_emulated_instruction+0xc4/0x140 [kvm_amd]
handle_fastpath_invd+0xc4/0x1a0 [kvm]
vcpu_run+0x11a1/0x1db0 [kvm]
kvm_arch_vcpu_ioctl_run+0x5cc/0x730 [kvm]
kvm_vcpu_ioctl+0x578/0x6a0 [kvm]
__se_sys_ioctl+0x6d/0xb0
do_syscall_64+0x8a/0x2c0
entry_SYSCALL_64_after_hwframe+0x4b/0x53
RIP: 0033:0x7f479d57a94b
</TASK>
Note, this is essentially a reapply of commit 5c30e8101e ("KVM: SVM:
Skip WRMSR fastpath on VM-Exit if next RIP isn't valid"), but with
different justification (KVM now grabs SRCU when skipping the instruction
for other reasons).
Fixes: b439eb8ab5 ("Revert "KVM: SVM: Skip WRMSR fastpath on VM-Exit if next RIP isn't valid"")
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20250805190526.1453366-2-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit acf943e976 upstream.
When orphan file feature is enabled, inode can be tracked as orphan
either in the standard orphan list or in the orphan file. The first can
be tested by checking ei->i_orphan list head, the second is recorded by
EXT4_STATE_ORPHAN_FILE inode state flag. There are several places where
we want to check whether inode is tracked as orphan and only some of
them properly check for both possibilities. Luckily the consequences are
mostly minor, the worst that can happen is that we track an inode as
orphan although we don't need to and e2fsck then complains (resulting in
occasional ext4/307 xfstest failures). Fix the problem by introducing a
helper for checking whether an inode is tracked as orphan and use it in
appropriate places.
Fixes: 4a79a98c7b ("ext4: Improve scalability of ext4 orphan file handling")
Cc: stable@kernel.org
Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Zhang Yi <yi.zhang@huawei.com>
Message-ID: <20250925123038.20264-2-jack@suse.cz>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 88daf2f448 upstream.
If client doesn't negotiate with SMB3.1.1 POSIX Extensions,
then proper error code won't be returned due to overwriting.
Return error immediately.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Fixes: e2f34481b2 ("cifsd: add server-side procedures for SMB3")
Cc: stable@vger.kernel.org
Signed-off-by: Matvey Kovalev <matvey.kovalev@ispras.ru>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 305853cce3 upstream.
The 'sess->rpc_handle_list' XArray manages RPC handles within a ksmbd
session. Access to this list is intended to be protected by
'sess->rpc_lock' (an rw_semaphore). However, the locking implementation was
flawed, leading to potential race conditions.
In ksmbd_session_rpc_open(), the code incorrectly acquired only a read lock
before calling xa_store() and xa_erase(). Since these operations modify
the XArray structure, a write lock is required to ensure exclusive access
and prevent data corruption from concurrent modifications.
Furthermore, ksmbd_session_rpc_method() accessed the list using xa_load()
without holding any lock at all. This could lead to reading inconsistent
data or a potential use-after-free if an entry is concurrently removed and
the pointer is dereferenced.
Fix these issues by:
1. Using down_write() and up_write() in ksmbd_session_rpc_open()
to ensure exclusive access during XArray modification, and ensuring
the lock is correctly released on error paths.
2. Adding down_read() and up_read() in ksmbd_session_rpc_method()
to safely protect the lookup.
Fixes: a1f46c99d9 ("ksmbd: fix use-after-free in ksmbd_session_rpc_open")
Fixes: b685757c7b ("ksmbd: Implements sess->rpc_handle_list as xarray")
Cc: stable@vger.kernel.org
Signed-off-by: Yunseong Kim <ysk@kzalloc.com>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f04aad36a0 upstream.
syzkaller discovered the following crash: (kernel BUG)
[ 44.607039] ------------[ cut here ]------------
[ 44.607422] kernel BUG at mm/userfaultfd.c:2067!
[ 44.608148] Oops: invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN NOPTI
[ 44.608814] CPU: 1 UID: 0 PID: 2475 Comm: reproducer Not tainted 6.16.0-rc6 #1 PREEMPT(none)
[ 44.609635] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014
[ 44.610695] RIP: 0010:userfaultfd_release_all+0x3a8/0x460
<snip other registers, drop unreliable trace>
[ 44.617726] Call Trace:
[ 44.617926] <TASK>
[ 44.619284] userfaultfd_release+0xef/0x1b0
[ 44.620976] __fput+0x3f9/0xb60
[ 44.621240] fput_close_sync+0x110/0x210
[ 44.622222] __x64_sys_close+0x8f/0x120
[ 44.622530] do_syscall_64+0x5b/0x2f0
[ 44.622840] entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ 44.623244] RIP: 0033:0x7f365bb3f227
Kernel panics because it detects UFFD inconsistency during
userfaultfd_release_all(). Specifically, a VMA which has a valid pointer
to vma->vm_userfaultfd_ctx, but no UFFD flags in vma->vm_flags.
The inconsistency is caused in ksm_madvise(): when user calls madvise()
with MADV_UNMEARGEABLE on a VMA that is registered for UFFD in MINOR mode,
it accidentally clears all flags stored in the upper 32 bits of
vma->vm_flags.
Assuming x86_64 kernel build, unsigned long is 64-bit and unsigned int and
int are 32-bit wide. This setup causes the following mishap during the &=
~VM_MERGEABLE assignment.
VM_MERGEABLE is a 32-bit constant of type unsigned int, 0x8000'0000.
After ~ is applied, it becomes 0x7fff'ffff unsigned int, which is then
promoted to unsigned long before the & operation. This promotion fills
upper 32 bits with leading 0s, as we're doing unsigned conversion (and
even for a signed conversion, this wouldn't help as the leading bit is 0).
& operation thus ends up AND-ing vm_flags with 0x0000'0000'7fff'ffff
instead of intended 0xffff'ffff'7fff'ffff and hence accidentally clears
the upper 32-bits of its value.
Fix it by changing `VM_MERGEABLE` constant to unsigned long, using the
BIT() macro.
Note: other VM_* flags are not affected: This only happens to the
VM_MERGEABLE flag, as the other VM_* flags are all constants of type int
and after ~ operation, they end up with leading 1 and are thus converted
to unsigned long with leading 1s.
Note 2:
After commit 31defc3b01 ("userfaultfd: remove (VM_)BUG_ON()s"), this is
no longer a kernel BUG, but a WARNING at the same place:
[ 45.595973] WARNING: CPU: 1 PID: 2474 at mm/userfaultfd.c:2067
but the root-cause (flag-drop) remains the same.
[akpm@linux-foundation.org: rust bindgen wasn't able to handle BIT(), from Miguel]
Link: https://lore.kernel.org/oe-kbuild-all/202510030449.VfSaAjvd-lkp@intel.com/
Link: https://lkml.kernel.org/r/20251001090353.57523-2-acsjakub@amazon.de
Fixes: 7677f7fd8b ("userfaultfd: add minor fault registration mode")
Signed-off-by: Jakub Acs <acsjakub@amazon.de>
Signed-off-by: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
Acked-by: David Hildenbrand <david@redhat.com>
Acked-by: SeongJae Park <sj@kernel.org>
Tested-by: Alice Ryhl <aliceryhl@google.com>
Tested-by: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
Cc: Xu Xin <xu.xin16@zte.com.cn>
Cc: Chengming Zhou <chengming.zhou@linux.dev>
Cc: Peter Xu <peterx@redhat.com>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3d770bd11b upstream.
The current implementation of bpf_arch_text_poke() requires 5 nops
at patch site which is not applicable for kernel/module functions.
Because LoongArch reserves ONLY 2 nops at the function entry. With
CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS=y, this can be done by ftrace
instead.
See the following commit for details:
* commit b91e014f07 ("bpf: Make BPF trampoline use register_ftrace_direct() API")
* commit 9cdc3b6a29 ("LoongArch: ftrace: Add direct call support")
Cc: stable@vger.kernel.org
Tested-by: Vincent Li <vincent.mc.li@gmail.com>
Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ea645cfd3d upstream.
When attach fentry/fexit BPF programs, __arch_prepare_bpf_trampoline()
is called twice with different `struct bpf_tramp_image *im`:
bpf_trampoline_update()
-> arch_bpf_trampoline_size()
-> __arch_prepare_bpf_trampoline()
-> arch_prepare_bpf_trampoline()
-> __arch_prepare_bpf_trampoline()
Use move_imm() will emit unstable instruction sequences, so let's use
move_addr() instead to prevent subtle bugs.
(I observed this while debugging other issues with printk.)
Cc: stable@vger.kernel.org
Tested-by: Vincent Li <vincent.mc.li@gmail.com>
Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c8168b4faf upstream.
Automatically disable kaslr when the kernel loads from kexec_file.
kexec_file loads the secondary kernel image to a non-linked address,
inherently providing KASLR-like randomization.
However, on LoongArch where System RAM may be non-contiguous, enabling
KASLR for the second kernel may relocate it to an invalid memory region
and cause a boot failure. Thus, we disable KASLR when "kexec_file" is
detected in the command line.
To ensure compatibility with older kernels loaded via kexec_file, this
patch should be backported to stable branches.
Cc: stable@vger.kernel.org
Signed-off-by: Youling Tang <tangyouling@kylinos.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8d33a030c5 upstream.
There is a race condition between dm device suspend and table load that
can lead to null pointer dereference. The issue occurs when suspend is
invoked before table load completes:
BUG: kernel NULL pointer dereference, address: 0000000000000054
Oops: 0000 [#1] PREEMPT SMP PTI
CPU: 6 PID: 6798 Comm: dmsetup Not tainted 6.6.0-g7e52f5f0ca9b #62
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.1-2.fc37 04/01/2014
RIP: 0010:blk_mq_wait_quiesce_done+0x0/0x50
Call Trace:
<TASK>
blk_mq_quiesce_queue+0x2c/0x50
dm_stop_queue+0xd/0x20
__dm_suspend+0x130/0x330
dm_suspend+0x11a/0x180
dev_suspend+0x27e/0x560
ctl_ioctl+0x4cf/0x850
dm_ctl_ioctl+0xd/0x20
vfs_ioctl+0x1d/0x50
__se_sys_ioctl+0x9b/0xc0
__x64_sys_ioctl+0x19/0x30
x64_sys_call+0x2c4a/0x4620
do_syscall_64+0x9e/0x1b0
The issue can be triggered as below:
T1 T2
dm_suspend table_load
__dm_suspend dm_setup_md_queue
dm_mq_init_request_queue
blk_mq_init_allocated_queue
=> q->mq_ops = set->ops; (1)
dm_stop_queue / dm_wait_for_completion
=> q->tag_set NULL pointer! (2)
=> q->tag_set = set; (3)
Fix this by checking if a valid table (map) exists before performing
request-based suspend and waiting for target I/O. When map is NULL,
skip these table-dependent suspend steps.
Even when map is NULL, no I/O can reach any target because there is
no table loaded; I/O submitted in this state will fail early in the
DM layer. Skipping the table-dependent suspend logic in this case
is safe and avoids NULL pointer dereferences.
Fixes: c4576aed8d ("dm: fix request-based dm's use of dm_wait_for_completion")
Cc: stable@vger.kernel.org
Signed-off-by: Zheng Qixing <zhengqixing@huawei.com>
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7f597c2cdb upstream.
When suspend and load run concurrently, before q->mq_ops is set in
blk_mq_init_allocated_queue(), __dm_suspend() skip dm_stop_queue(). As a
result, the queue's quiesce depth is not incremented.
Later, once table load has finished and __dm_resume() runs, which triggers
q->quiesce_depth ==0 warning in blk_mq_unquiesce_queue():
Call Trace:
<TASK>
dm_start_queue+0x16/0x20 [dm_mod]
__dm_resume+0xac/0xb0 [dm_mod]
dm_resume+0x12d/0x150 [dm_mod]
do_resume+0x2c2/0x420 [dm_mod]
dev_suspend+0x30/0x130 [dm_mod]
ctl_ioctl+0x402/0x570 [dm_mod]
dm_ctl_ioctl+0x23/0x30 [dm_mod]
Fix this by explicitly tracking whether the request queue was
stopped in __dm_suspend() via a new DMF_QUEUE_STOPPED flag.
Only call dm_start_queue() in __dm_resume() if the queue was
actually stopped.
Fixes: e70feb8b3e ("blk-mq: support concurrent queue quiesce/unquiesce")
Cc: stable@vger.kernel.org
Signed-off-by: Zheng Qixing <zhengqixing@huawei.com>
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 64cf7d058a upstream.
It was reported that using __copy_from_user_inatomic() can actually
schedule. Which is bad when preemption is disabled. Even though there's
logic to check in_atomic() is set, but this is a nop when the kernel is
configured with PREEMPT_NONE. This is due to page faulting and the code
could schedule with preemption disabled.
Link: https://lore.kernel.org/all/20250819105152.2766363-1-luogengkun@huaweicloud.com/
The solution was to change the __copy_from_user_inatomic() to
copy_from_user_nofault(). But then it was reported that this caused a
regression in Android. There's several applications writing into
trace_marker() in Android, but now instead of showing the expected data,
it is showing:
tracing_mark_write: <faulted>
After reverting the conversion to copy_from_user_nofault(), Android was
able to get the data again.
Writes to the trace_marker is a way to efficiently and quickly enter data
into the Linux tracing buffer. It takes no locks and was designed to be as
non-intrusive as possible. This means it cannot allocate memory, and must
use pre-allocated data.
A method that is actively being worked on to have faultable system call
tracepoints read user space data is to allocate per CPU buffers, and use
them in the callback. The method uses a technique similar to seqcount.
That is something like this:
preempt_disable();
cpu = smp_processor_id();
buffer = this_cpu_ptr(&pre_allocated_cpu_buffers, cpu);
do {
cnt = nr_context_switches_cpu(cpu);
migrate_disable();
preempt_enable();
ret = copy_from_user(buffer, ptr, size);
preempt_disable();
migrate_enable();
} while (!ret && cnt != nr_context_switches_cpu(cpu));
if (!ret)
ring_buffer_write(buffer);
preempt_enable();
It's a little more involved than that, but the above is the basic logic.
The idea is to acquire the current CPU buffer, disable migration, and then
enable preemption. At this moment, it can safely use copy_from_user().
After reading the data from user space, it disables preemption again. It
then checks to see if there was any new scheduling on this CPU. If there
was, it must assume that the buffer was corrupted by another task. If
there wasn't, then the buffer is still valid as only tasks in preemptable
context can write to this buffer and only those that are running on the
CPU.
By using this method, where trace_marker open allocates the per CPU
buffers, trace_marker writes can access user space and even fault it in,
without having to allocate or take any locks of its own.
Cc: stable@vger.kernel.org
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Luo Gengkun <luogengkun@huaweicloud.com>
Cc: Wattson CI <wattson-external@google.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/20251008124510.6dba541a@gandalf.local.home
Fixes: 3d62ab32df ("tracing: Fix tracing_marker may trigger page fault during preempt_disable")
Reported-by: Runping Lai <runpinglai@google.com>
Tested-by: Runping Lai <runpinglai@google.com>
Closes: https://lore.kernel.org/linux-trace-kernel/20251007003417.3470979-2-runpinglai@google.com/
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 61e19cd2e5 upstream.
When s_start() fails to allocate memory for set_event_iter, it returns NULL
before acquiring event_mutex. However, the corresponding s_stop() function
always tries to unlock the mutex, causing a lock imbalance warning:
WARNING: bad unlock balance detected!
6.17.0-rc7-00175-g2b2e0c04f78c #7 Not tainted
-------------------------------------
syz.0.85611/376514 is trying to release lock (event_mutex) at:
[<ffffffff8dafc7a4>] traverse.part.0.constprop.0+0x2c4/0x650 fs/seq_file.c:131
but there are no more locks to release!
The issue was introduced by commit b355247df1 ("tracing: Cache ':mod:'
events for modules not loaded yet") which added the kzalloc() allocation before
the mutex lock, creating a path where s_start() could return without locking
the mutex while s_stop() would still try to unlock it.
Fix this by unconditionally acquiring the mutex immediately after allocation,
regardless of whether the allocation succeeded.
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/20250929113238.3722055-1-sashal@kernel.org
Fixes: b355247df1 ("tracing: Cache ":mod:" events for modules not loaded yet")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 64e0d839c5 upstream.
Testing has shown that reading multiple registers at once (for 10-bit
ADC values) does not work. Set the use_single_read regmap_config flag
to make regmap split these for us.
This should fix temperature opregion accesses done by
drivers/acpi/pmic/intel_pmic_chtdc_ti.c and is also necessary for
the upcoming drivers for the ADC and battery MFD cells.
Fixes: 6bac0606fd ("mfd: Add support for Cherry Trail Dollar Cove TI PMIC")
Cc: stable@vger.kernel.org
Reviewed-by: Andy Shevchenko <andy@kernel.org>
Signed-off-by: Hans de Goede <hansg@kernel.org>
Link: https://lore.kernel.org/r/20250804133240.312383-1-hansg@kernel.org
Signed-off-by: Lee Jones <lee@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9c328f5474 upstream.
Syzbot reported an uninitialized value bug in nci_init_req, which was
introduced by commit 5aca7966d2 ("Merge tag
'perf-tools-fixes-for-v6.17-2025-09-16' of
git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools").
This bug arises due to very limited and poor input validation
that was done at nic_valid_size(). This validation only
validates the skb->len (directly reflects size provided at the
userspace interface) with the length provided in the buffer
itself (interpreted as NCI_HEADER). This leads to the processing
of memory content at the address assuming the correct layout
per what opcode requires there. This leads to the accesses to
buffer of `skb_buff->data` which is not assigned anything yet.
Following the same silent drop of packets of invalid sizes at
`nic_valid_size()`, add validation of the data in the respective
handlers and return error values in case of failure. Release
the skb if error values are returned from handlers in
`nci_nft_packet` and effectively do a silent drop
Possible TODO: because we silently drop the packets, the
call to `nci_request` will be waiting for completion of request
and will face timeouts. These timeouts can get excessively logged
in the dmesg. A proper handling of them may require to export
`nci_request_cancel` (or propagate error handling from the
nft packets handlers).
Reported-by: syzbot+740e04c2a93467a0f8c8@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=740e04c2a93467a0f8c8
Fixes: 6a2968aaf5 ("NFC: basic NCI protocol implementation")
Tested-by: syzbot+740e04c2a93467a0f8c8@syzkaller.appspotmail.com
Cc: stable@vger.kernel.org
Signed-off-by: Deepak Sharma <deepak.sharma.472935@gmail.com>
Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev>
Link: https://patch.msgid.link/20250925132846.213425-1-deepak.sharma.472935@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3bd5e45c2c upstream.
When parsing Allocation Extent Descriptor, lengthAllocDescs comes from
on-disk data and must be validated against the block size. Crafted or
corrupted images may set lengthAllocDescs so that the total descriptor
length (sizeof(allocExtDesc) + lengthAllocDescs) exceeds the buffer,
leading udf_update_tag() to call crc_itu_t() on out-of-bounds memory and
trigger a KASAN use-after-free read.
BUG: KASAN: use-after-free in crc_itu_t+0x1d5/0x2b0 lib/crc-itu-t.c:60
Read of size 1 at addr ffff888041e7d000 by task syz-executor317/5309
CPU: 0 UID: 0 PID: 5309 Comm: syz-executor317 Not tainted 6.12.0-rc4-syzkaller-00261-g850925a8133c #0
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:94 [inline]
dump_stack_lvl+0x241/0x360 lib/dump_stack.c:120
print_address_description mm/kasan/report.c:377 [inline]
print_report+0x169/0x550 mm/kasan/report.c:488
kasan_report+0x143/0x180 mm/kasan/report.c:601
crc_itu_t+0x1d5/0x2b0 lib/crc-itu-t.c:60
udf_update_tag+0x70/0x6a0 fs/udf/misc.c:261
udf_write_aext+0x4d8/0x7b0 fs/udf/inode.c:2179
extent_trunc+0x2f7/0x4a0 fs/udf/truncate.c:46
udf_truncate_tail_extent+0x527/0x7e0 fs/udf/truncate.c:106
udf_release_file+0xc1/0x120 fs/udf/file.c:185
__fput+0x23f/0x880 fs/file_table.c:431
task_work_run+0x24f/0x310 kernel/task_work.c:239
exit_task_work include/linux/task_work.h:43 [inline]
do_exit+0xa2f/0x28e0 kernel/exit.c:939
do_group_exit+0x207/0x2c0 kernel/exit.c:1088
__do_sys_exit_group kernel/exit.c:1099 [inline]
__se_sys_exit_group kernel/exit.c:1097 [inline]
__x64_sys_exit_group+0x3f/0x40 kernel/exit.c:1097
x64_sys_call+0x2634/0x2640 arch/x86/include/generated/asm/syscalls_64.h:232
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x77/0x7f
</TASK>
Validate the computed total length against epos->bh->b_size.
Found by Linux Verification Center (linuxtesting.org) with Syzkaller.
Reported-by: syzbot+8743fca924afed42f93e@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=8743fca924afed42f93e
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Cc: stable@vger.kernel.org
Signed-off-by: Larshin Sergey <Sergey.Larshin@kaspersky.com>
Link: https://patch.msgid.link/20250922131358.745579-1-Sergey.Larshin@kaspersky.com
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit bace10b596 upstream.
Assumption that chain DMA module starts the link DMA when 1ms of
data is available from host is not correct. Instead the firmware
chain DMA module fills the link DMA with initial buffer of zeroes
and the host and link DMAs are started at the same time.
This results in a small error in delay calculation. This can become a
more severe problem if host DMA has delays that exceed 1ms. This results
in negative delay to be calculated and bogus values reported to
applications. This can confuse some applications like
alsa_conformance_test.
Fix the issue by correctly calculating the firmware chain DMA
preamble size and initializing the start offset to this value.
Cc: stable@vger.kernel.org
Fixes: a1d203d390 ("ASoC: SOF: ipc4-pcm: Enable delay reporting for ChainDMA streams")
Signed-off-by: Kai Vehmanen <kai.vehmanen@linux.intel.com>
Reviewed-by: Péter Ujfalusi <peter.ujfalusi@linux.intel.com>
Reviewed-by: Bard Liao <yung-chuan.liao@linux.intel.com>
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@linux.intel.com>
Link: https://patch.msgid.link/20251002074719.2084-3-peter.ujfalusi@linux.intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4e65bda827 upstream.
wcd934x_codec_parse_data() contains a device reference count leak in
of_slim_get_device() where device_find_child() increases the reference
count of the device but this reference is not properly decreased in
the success path. Add put_device() in wcd934x_codec_parse_data() and
add devm_add_action_or_reset() in the probe function, which ensures
that the reference count of the device is correctly managed.
Memory leak in regmap_init_slimbus() as the allocated regmap is not
released when the device is removed. Using devm_regmap_init_slimbus()
instead of regmap_init_slimbus() to ensure automatic regmap cleanup on
device removal.
Calling path: of_slim_get_device() -> of_find_slim_device() ->
device_find_child(). As comment of device_find_child() says, 'NOTE:
you will need to drop the reference with put_device() after use.'.
Found by code review.
Cc: stable@vger.kernel.org
Fixes: a61f3b4f47 ("ASoC: wcd934x: add support to wcd9340/wcd9341 codec")
Signed-off-by: Ma Ke <make24@iscas.ac.cn>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
Link: https://patch.msgid.link/20250923065212.26660-1-make24@iscas.ac.cn
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 09cfd3c52e upstream.
It's reported that sometimes a zcrx request can receive more than was
requested. It's caused by io_zcrx_recv_skb() adjusting desc->count for
all received buffers including frag lists, but then doing recursive
calls to process frag list skbs, which leads to desc->count double
accounting and underflow.
Reported-and-tested-by: Matthias Jasny <matthiasjasny@gmail.com>
Fixes: 6699ec9a23 ("io_uring/zcrx: add a read limit to recvzc requests")
Cc: stable@vger.kernel.org
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b15b7d2a1b upstream.
Remove the logic to set interrupt mask by default in uio_hv_generic
driver as the interrupt mask value is supposed to be controlled
completely by the user space. If the mask bit gets changed
by the driver, concurrently with user mode operating on the ring,
the mask bit may be set when it is supposed to be clear, and the
user-mode driver will miss an interrupt which will cause a hang.
For eg- when the driver sets inbound ring buffer interrupt mask to 1,
the host does not interrupt the guest on the UIO VMBus channel.
However, setting the mask does not prevent the host from putting a
message in the inbound ring buffer. So let’s assume that happens,
the host puts a message into the ring buffer but does not interrupt.
Subsequently, the user space code in the guest sets the inbound ring
buffer interrupt mask to 0, saying “Hey, I’m ready for interrupts”.
User space code then calls pread() to wait for an interrupt.
Then one of two things happens:
* The host never sends another message. So the pread() waits forever.
* The host does send another message. But because there’s already a
message in the ring buffer, it doesn’t generate an interrupt.
This is the correct behavior, because the host should only send an
interrupt when the inbound ring buffer transitions from empty to
not-empty. Adding an additional message to a ring buffer that is not
empty is not supposed to generate an interrupt on the guest.
Since the guest is waiting in pread() and not removing messages from
the ring buffer, the pread() waits forever.
This could be easily reproduced in hv_fcopy_uio_daemon if we delay
setting interrupt mask to 0.
Similarly if hv_uio_channel_cb() sets the interrupt_mask to 1,
there’s a race condition. Once user space empties the inbound ring
buffer, but before user space sets interrupt_mask to 0, the host could
put another message in the ring buffer but it wouldn’t interrupt.
Then the next pread() would hang.
Fix these by removing all instances where interrupt_mask is changed,
while keeping the one in set_event() unchanged to enable userspace
control the interrupt mask by writing 0/1 to /dev/uioX.
Fixes: 95096f2fbd ("uio-hv-generic: new userspace i/o driver for VMBus")
Suggested-by: John Starks <jostarks@microsoft.com>
Signed-off-by: Naman Jain <namjain@linux.microsoft.com>
Cc: stable@vger.kernel.org
Reviewed-by: Michael Kelley <mhklinux@outlook.com>
Reviewed-by: Long Li <longli@microsoft.com>
Reviewed-by: Tianyu Lan <tiala@microsoft.com>
Tested-by: Tianyu Lan <tiala@microsoft.com>
Link: https://lore.kernel.org/r/20250828044200.492030-1-namjain@linux.microsoft.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 74058c0a9f upstream.
Syzkaller reports a "KMSAN: uninit-value in squashfs_get_parent" bug.
This is caused by open_by_handle_at() being called with a file handle
containing an invalid parent inode number. In particular the inode number
is that of a symbolic link, rather than a directory.
Squashfs_get_parent() gets called with that symbolic link inode, and
accesses the parent member field.
unsigned int parent_ino = squashfs_i(inode)->parent;
Because non-directory inodes in Squashfs do not have a parent value, this
is uninitialised, and this causes an uninitialised value access.
The fix is to initialise parent with the invalid inode 0, which will cause
an EINVAL error to be returned.
Regular inodes used to share the parent field with the block_list_start
field. This is removed in this commit to enable the parent field to
contain the invalid inode number 0.
Link: https://lkml.kernel.org/r/20250918233308.293861-1-phillip@squashfs.org.uk
Fixes: 122601408d ("Squashfs: export operations")
Signed-off-by: Phillip Lougher <phillip@squashfs.org.uk>
Reported-by: syzbot+157bdef5cf596ad0da2c@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/68cc2431.050a0220.139b6.0001.GAE@google.com/
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4bddf4587c upstream.
After reading all the feedback, right now disabling the TPM2_TCG_HMAC
is the right call.
Other views discussed:
A. Having a kernel command-line parameter or refining the feature
otherwise. This goes to the area of improvements. E.g., one
example is my own idea where the null key specific code would be
replaced with a persistent handle parameter (which can be
*unambigously* defined as part of attestation process when
done correctly).
B. Removing the code. I don't buy this because that is same as saying
that HMAC encryption cannot work at all (if really nitpicking) in
any form. Also I disagree on the view that the feature could not
be refined to something more reasoable.
Also, both A and B are worst options in terms of backporting.
Thuss, this is the best possible choice.
Cc: stable@vger.kernel.or # v6.10+
Fixes: d2add27cf2 ("tpm: Add NULL primary creation")
Suggested-by: Chris Fenner <cfenn@google.com>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 55c0ced59f ]
When verifying BPF programs, the check_alu_op() function validates
instructions with ALU operations. The 'offset' field in these
instructions is a signed 16-bit integer.
The existing check 'insn->off > 1' was intended to ensure the offset is
either 0, or 1 for BPF_MOD/BPF_DIV. However, because 'insn->off' is
signed, this check incorrectly accepts all negative values (e.g., -1).
This commit tightens the validation by changing the condition to
'(insn->off != 0 && insn->off != 1)'. This ensures that any value
other than the explicitly permitted 0 and 1 is rejected, hardening the
verifier against malformed BPF programs.
Co-developed-by: Shenghao Yuan <shenghaoyuan0928@163.com>
Signed-off-by: Shenghao Yuan <shenghaoyuan0928@163.com>
Co-developed-by: Tianci Cao <ziye@zju.edu.cn>
Signed-off-by: Tianci Cao <ziye@zju.edu.cn>
Signed-off-by: Yazhou Tang <tangyazhou518@outlook.com>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Fixes: ec0e2da95f ("bpf: Support new signed div/mod instructions.")
Link: https://lore.kernel.org/r/tencent_70D024BAE70A0A309A4781694C7B764B0608@qq.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 0c342bfc99 ]
We will segfault once we call realloc in bpf_get_addrs due to
wrong size argument.
Fixes: 6302bdeb91 ("selftests/bpf: Add a kprobe_multi subtest to use addrs instead of syms")
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 92e9f4faff ]
The bitmap allocated with bitmap_zalloc() in otx2_probe() was not
released in otx2_remove(). Unbinding and rebinding the driver therefore
triggers a kmemleak warning:
unreferenced object (size 8):
backtrace:
bitmap_zalloc
otx2_probe
Call bitmap_free() in the remove path to fix the leak.
Fixes: efabce2901 ("octeontx2-pf: AF_XDP zero copy receive support")
Signed-off-by: Bo Sun <bo@mboxify.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit cd9ea7da41 ]
The bitmap allocated with bitmap_zalloc() in otx2vf_probe() was not
released in otx2vf_remove(). Unbinding and rebinding the driver therefore
triggers a kmemleak warning:
unreferenced object (size 8):
backtrace:
bitmap_zalloc
otx2vf_probe
Call bitmap_free() in the remove path to fix the leak.
Fixes: efabce2901 ("octeontx2-pf: AF_XDP zero copy receive support")
Signed-off-by: Bo Sun <bo@mboxify.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 25ba2b84c3 ]
Add nfsd_file_dio_alignment and use it to avoid issuing misaligned IO
using O_DIRECT. Any misaligned DIO falls back to using buffered IO.
Because misaligned DIO is now handled safely, remove the nfs modparam
'localio_O_DIRECT_semantics' that was added to require users opt-in to
the requirement that all O_DIRECT be properly DIO-aligned.
Also, introduce nfs_iov_iter_aligned_bvec() which is a variant of
iov_iter_aligned_bvec() that also verifies the offset associated with
an iov_iter is DIO-aligned. NOTE: in a parallel effort,
iov_iter_aligned_bvec() is being removed along with
iov_iter_is_aligned().
Lastly, add pr_info_ratelimited if underlying filesystem returns
-EINVAL because it was made to try O_DIRECT for IO that is not
DIO-aligned (shouldn't happen, so its best to be louder if it does).
Fixes: 3feec68563 ("nfs/localio: add direct IO enablement with sync and async IO support")
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit d11f6cd1bb ]
Use STATX_DIOALIGN and STATX_DIO_READ_ALIGN to get DIO alignment
attributes from the underlying filesystem and store them in the
associated nfsd_file. This is done when the nfsd_file is first
opened for each regular file.
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: NeilBrown <neil@brown.name>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Acked-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
Stable-dep-of: 25ba2b84c3 ("nfs/localio: avoid issuing misaligned IO using O_DIRECT")
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 6f5dacf88a ]
This reverts commit ceddedc969.
Commit in question breaks the mapping of PGs to pools for some SKUs.
Specifically multi-host NICs seem to be shipped with a custom buffer
configuration which maps the lossy PG to pool 4. But the bad commit
overrides this with pool 0 which does not have sufficient buffer space
reserved. Resulting in ~40% packet loss. The commit also breaks BMC /
OOB connection completely (100% packet loss).
Revert, similarly to commit 3fbfe251cc ("Revert "net/mlx5e: Update and
set Xon/Xoff upon port speed set""). The breakage is exactly the same,
the only difference is that quoted commit would break the NIC immediately
on boot, and the currently reverted commit only when MTU is changed.
Note: "good" kernels do not restore the configuration, so downgrade isn't
enough to recover machines. A NIC power cycle seems to be necessary to
return to a healthy state (or overriding the relevant registers using
a custom patch).
Fixes: ceddedc969 ("net/mlx5e: Update and set Xon/Xoff upon MTU set")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20250929181529.1848157-1-kuba@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 2a918911ed ]
Since the bus ops were retired the iommu subsystem changed to using fwspec
to match the iommu driver to the iommu device. If a device has a NULL
fwspec then it is matched to the first iommu driver with a NULL fwspec,
effectively disabling support for systems with more than one non-fwspec
iommu driver.
Thus, if the iommufd selfest are run in an x86 system that registers a
non-fwspec iommu driver they fail to bind their mock devices to the mock
iommu driver.
Fix this by allocating a software fwnode for mock iommu driver's
iommu_device, and set it to the device which mock iommu driver created.
This is done by adding a new helper iommu_mock_device_add() which abuses
the internals of the fwspec system to establish a fwspec before the device
is added and is careful not to leak it. A matching dummy fwspec is
automatically added to the mock iommu driver.
Test by "make -C toosl/testing/selftests TARGETS=iommu run_tests":
PASSED: 229 / 229 tests passed.
In addition, this issue is also can be found on amd platform, and
also tested on a amd machine.
Link: https://patch.msgid.link/r/20250925054730.3877-1-kanie@linux.alibaba.com
Fixes: 17de3f5fdd ("iommu: Retire bus ops")
Signed-off-by: Guixin Liu <kanie@linux.alibaba.com>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Tested-by: Qinyun Tan <qinyuntan@linux.alibaba.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 2aff4420ef ]
Software can only initialize the PIR and CIR of the command BD ring after
a FLR, and these two registers can only be set to 0. But the reset values
of these two registers are 0, so software does not need to update them.
If there is no a FLR and PIR and CIR are not 0, resetting them to 0 or
other values by software will cause the command BD ring to work
abnormally. This is because of an internal context in the ring prefetch
logic that will retain the state from the first incarnation of the ring
and continue prefetching from the stale location when the ring is
reinitialized. The internal context can only be reset by the FLR.
In addition, there is a logic error in the implementation, next_to_clean
indicates the software CIR and next_to_use indicates the software PIR.
But the current driver uses next_to_clean to set PIR and use next_to_use
to set CIR. This does not cause a problem in actual use, because the
current command BD ring is only initialized after FLR, and the initial
values of next_to_use and next_to_clean are both 0.
Therefore, this patch removes the initialization of PIR and CIR. Instead,
next_to_use and next_to_clean are initialized by reading the values of
PIR and CIR.
Fixes: 4701073c3d ("net: enetc: add initial netc-lib driver to support NTMP")
Signed-off-by: Wei Fang <wei.fang@nxp.com>
Link: https://patch.msgid.link/20250926013954.2003456-1-wei.fang@nxp.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 5b66169f6b ]
The active-backup bonding mode supports XFRM ESP offload. However, when
a bond is added using command like `ip link add bond0 type bond mode 1
miimon 100`, the `ethtool -k` command shows that the XFRM ESP offload is
disabled. This occurs because, in bond_newlink(), we change bond link
first and register bond device later. So the XFRM feature update in
bond_option_mode_set() is not called as the bond device is not yet
registered, leading to the offload feature not being set successfully.
To resolve this issue, we can modify the code order in bond_newlink() to
ensure that the bond device is registered first before changing the bond
link parameters. This change will allow the XFRM ESP offload feature to be
correctly enabled.
Fixes: 007ab53455 ("bonding: fix feature flag setting at init time")
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Link: https://patch.msgid.link/20250925023304.472186-1-liuhangbin@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 5cfbe7ebfa ]
Add sync reset timeout to stop poll_sync_reset in case there was no
reset done or abort event within timeout. Otherwise poll sync reset will
just continue and in case of fw fatal error no health reporting will be
done.
Fixes: 38b9f903f2 ("net/mlx5: Handle sync reset request event")
Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Shay Drori <shayd@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 79a0e32b32 ]
The reclaim_pages_cmd() function sends a command to the firmware to
reclaim pages if the command interface is active.
A race condition can occur if the command interface goes down (e.g., due
to a PCI error) while the mlx5_cmd_do() call is in flight. In this
case, mlx5_cmd_do() will return an error. The original code would
propagate this error immediately, bypassing the software-based page
reclamation logic that is supposed to run when the command interface is
down.
Fix this by checking whether mlx5_cmd_do() returns -ENXIO, which mark
that command interface is down. If this is the case, fall through to
the software reclamation path. If the command failed for any another
reason, or finished successfully, return as before.
Fixes: b898ce7bcc ("net/mlx5: cmdif, Avoid skipping reclaim pages if FW is not accessible")
Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b1f0349bd6 ]
Stop polling on firmware response to command in polling mode if the
command interface got down. This situation can occur, for example, if a
firmware fatal error is detected during polling.
This change halts the polling process when the command interface goes
down, preventing unnecessary waits.
Fixes: b898ce7bcc ("net/mlx5: cmdif, Avoid skipping reclaim pages if FW is not accessible")
Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Shay Drori <shayd@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 8169a6011c ]
The driver did not handle failure of `netdev_alloc_skb_ip_align()`.
If the allocation failed, dereferencing `skb->protocol` could lead to
a NULL pointer dereference.
This patch tries to allocate `skb`. If the allocation fails, it falls
back to the normal path.
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Suggested-by: Jakub Kicinski <kuba@kernel.org>
Tested-on: D-Link DGE-550T Rev-A3
Signed-off-by: Yeounsu Moon <yyyynoom@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/20250928190124.1156-1-yyyynoom@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f017156aea ]
In EC2 instances where the RSS hash key is not configurable, ethtool
shows bogus RSS hash key since ena_get_rxfh_key_size() unconditionally
returns ENA_HASH_KEY_SIZE.
Commit 6a4f7dc82d ("net: ena: rss: do not allocate key when not
supported") added proper handling for devices that don't support RSS
hash key configuration, but ena_get_rxfh_key_size() has been unchanged.
When the RSS hash key is not configurable, return 0 instead of
ENA_HASH_KEY_SIZE to clarify getting the value is not supported.
Tested on m5 instance families.
Without patch:
# ethtool -x ens5 | grep -A 1 "RSS hash key"
RSS hash key:
00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00
With patch:
# ethtool -x ens5 | grep -A 1 "RSS hash key"
RSS hash key:
Operation not supported
Fixes: 6a4f7dc82d ("net: ena: rss: do not allocate key when not supported")
Signed-off-by: Kohei Enju <enjuk@amazon.com>
Link: https://patch.msgid.link/20250929050247.51680-1-enjuk@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 8425161ac1 ]
The nfp_net_get_rxfh_key_size() function returns -EOPNOTSUPP when
devices don't support RSS, and callers treat the negative value as a
large positive value since the return type is u32.
Return 0 when devices don't support RSS, aligning with the ethtool
interface .get_rxfh_key_size() that requires returning 0 in such cases.
Fixes: 9ff304bfaf ("nfp: add support for reporting CRC32 hash function")
Signed-off-by: Kohei Enju <enjuk@amazon.com>
Link: https://patch.msgid.link/20250929054230.68120-1-enjuk@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f017c1f768 ]
Some applications are stuck to the 20th century and still use
small SO_RCVBUF values.
After the blamed commit, we can drop packets especially
when using LRO/hw-gro enabled NIC and small MSS (1500) values.
LRO/hw-gro NIC pack multiple segments into pages, allowing
tp->scaling_ratio to be set to a high value.
Whenever the receive queue gets full, we can receive a small packet
filling RWIN, but with a high skb->truesize, because most NIC use 4K page
plus sk_buff metadata even when receiving less than 1500 bytes of payload.
Even if we refine how tp->scaling_ratio is estimated,
we could have an issue at the start of the flow, because
the first round of packets (IW10) will be sent based on
the initial tp->scaling_ratio (1/2)
Relax tcp_can_ingest() to use skb->len instead of skb->truesize,
allowing the peer to use final RWIN, assuming a 'perfect'
scaling_ratio of 1.
Fixes: 1d2fbaad7c ("tcp: stronger sk_rcvbuf checks")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20250927092827.2707901-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 8ed4728eb9 ]
In case of a jump to the err label due to atmel_nand_create() or
atmel_nand_controller_add_nand() failure, the reference to nand_np
need to be released
Use for_each_child_of_node_scoped() to fix the issue.
Fixes: f88fc122cc ("mtd: nand: Cleanup/rework the atmel_nand driver")
Signed-off-by: Erick Karanja <karanja99erick@gmail.com>
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 612b1dfeb4 ]
Fix division by zero in ks_sa_rng_init caused by missing clock
pointer initialization. The clk_get_rate() call is performed on
an uninitialized clk pointer, resulting in division by zero when
calculating delay values.
Add clock initialization code before using the clock.
Fixes: 6d01d8511d ("hwrng: ks-sa - Add minimum sleep time before ready-polling")
Signed-off-by: Nishanth Menon <nm@ti.com>
drivers/char/hw_random/ks-sa-rng.c | 7 +++++++
1 file changed, 7 insertions(+)
Reviewed-by: Alexander Sverdlin <alexander.sverdlin@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 5851afffe2 ]
Fix the X.509 Basic Constraints CA flag parsing to correctly handle
the ASN.1 DER encoded structure. The parser was incorrectly treating
the length field as the boolean value.
Per RFC 5280 section 4.1, X.509 certificates must use ASN.1 DER encoding.
According to ITU-T X.690, a DER-encoded BOOLEAN is represented as:
Tag (0x01), Length (0x01), Value (0x00 for FALSE, 0xFF for TRUE)
The basicConstraints extension with CA:TRUE is encoded as:
SEQUENCE (0x30) | Length | BOOLEAN (0x01) | Length (0x01) | Value (0xFF)
^-- v[2] ^-- v[3] ^-- v[4]
The parser was checking v[3] (the length field, always 0x01) instead
of v[4] (the actual boolean value, 0xFF for TRUE in DER encoding).
Also handle the case where the extension is an empty SEQUENCE (30 00),
which is valid for CA:FALSE when the default value is omitted as
required by DER encoding rules (X.690 section 11.5).
Per ITU-T X.690-0207:
- Section 11.5: Default values must be omitted in DER
- Section 11.1: DER requires TRUE to be encoded as 0xFF
Link: https://datatracker.ietf.org/doc/html/rfc5280
Link: https://www.itu.int/ITU-T/studygroups/com17/languages/X.690-0207.pdf
Fixes: 30eae2b037 ("KEYS: X.509: Parse Basic Constraints for CA")
Signed-off-by: Fan Wu <wufan@kernel.org>
Reviewed-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 03ddb4ac25 ]
When creating an advertisement for BIG the address shall not be
non-resolvable since in case of acting as BASS/Broadcast Assistant the
address must be the same as the connection in order to use the PAST
method and even when PAST/BASS are not in the picture a Periodic
Advertisement can still be synchronized thus the same argument as to
connectable advertisements still stand.
Fixes: eca0ae4aea ("Bluetooth: Add initial implementation of BIS connections")
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 5bf863f4c5 ]
For ISO_CONT RX, the data from skb is copied to conn->rx_skb, but the
skb is leaked.
Free skb after copying its data.
Fixes: ccf74f2390 ("Bluetooth: Add BTPROTO_ISO socket type")
Signed-off-by: Pauli Virtanen <pav@iki.fi>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 6ba85da580 ]
If iso_conn is freed when RX is incomplete, free any leftover skb piece.
Fixes: dc26097bdb ("Bluetooth: ISO: Use kref to track lifetime of iso_conn")
Signed-off-by: Pauli Virtanen <pav@iki.fi>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 9950f095d6 ]
This attempt to fix similar issue to sco_conn_free where if the
conn->sk is not set to NULL may lead to UAF on iso_conn_free.
Fixes: ccf74f2390 ("Bluetooth: Add BTPROTO_ISO socket type")
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 79e562a52a ]
The debug UUID was only getting set if MGMT_OP_READ_EXP_FEATURES_INFO
was not called with a specific index which breaks the likes of
bluetoothd since it only invokes MGMT_OP_READ_EXP_FEATURES_INFO when an
adapter is plugged, so instead of depending hdev not to be set just
enable the UUID on any index like it was done with iso_sock_uuid.
Fixes: e625e50cee ("Bluetooth: Introduce debug feature when dynamic debug is disabled")
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 58fddb364d ]
As device coredumps are not HCI traces, maintain the device coredump at
the driver level and eliminate the dependency on hdev_devcd*()
Signed-off-by: Kiran K <kiran.k@intel.com>
Fixes: 07e6bddb54 ("Bluetooth: btintel_pcie: Add support for device coredump")
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 78d901897b ]
Move from 2*NUM_QUEUES dma_alloc_coherent() for DMA descriptor rings to
2 calls overall.
Issue is with how all queues share the same register for configuring the
upper 32-bits of Tx/Rx descriptor rings. Taking Tx, notice how TBQPH
does *not* depend on the queue index:
#define GEM_TBQP(hw_q) (0x0440 + ((hw_q) << 2))
#define GEM_TBQPH(hw_q) (0x04C8)
queue_writel(queue, TBQP, lower_32_bits(queue->tx_ring_dma));
#ifdef CONFIG_ARCH_DMA_ADDR_T_64BIT
if (bp->hw_dma_cap & HW_DMA_CAP_64B)
queue_writel(queue, TBQPH, upper_32_bits(queue->tx_ring_dma));
#endif
To maximise our chances of getting valid DMA addresses, we do a single
dma_alloc_coherent() across queues. This improves the odds because
alloc_pages() guarantees natural alignment. Other codepaths (IOMMU or
dev/arch dma_map_ops) don't give high enough guarantees
(even page-aligned isn't enough).
Two consideration:
- dma_alloc_coherent() gives us page alignment. Here we remove this
constraint meaning each queue's ring won't be page-aligned anymore.
- This can save some tiny amounts of memory. Fewer allocations means
(1) less overhead (constant cost per alloc) and (2) less wasted bytes
due to alignment constraints.
Example for (2): 4 queues, default ring size (512), 64-bit DMA
descriptors, 16K pages:
- Before: 8 allocs of 8K, each rounded to 16K => 64K wasted.
- After: 2 allocs of 32K => 0K wasted.
Fixes: 02c958dd34 ("net/macb: add TX multiqueue support for gem")
Reviewed-by: Sean Anderson <sean.anderson@linux.dev>
Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Tested-by: Nicolas Ferre <nicolas.ferre@microchip.com> # on sam9x75
Signed-off-by: Théo Lebrun <theo.lebrun@bootlin.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250923-macb-fixes-v6-4-772d655cdeb6@bootlin.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 92d4256faf ]
The tx/rx ring size calculation is somewhat complex and partially hidden
behind a macro. Move that out of the {RX,TX}_RING_BYTES() macros and
macb_{alloc,free}_consistent() functions into neat separate functions.
In macb_free_consistent(), we drop the size variable and directly call
the size helpers in the arguments list. In macb_alloc_consistent(), we
keep the size variable that is used by netdev_dbg() calls.
Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Signed-off-by: Théo Lebrun <theo.lebrun@bootlin.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250923-macb-fixes-v6-3-772d655cdeb6@bootlin.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Stable-dep-of: 78d901897b ("net: macb: single dma_alloc_coherent() for DMA descriptors")
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit fca3dc859b ]
The MACB driver acts as if TBQPH/RBQPH are configurable on a per queue
basis; this is a lie. A single register configures the upper 32 bits of
each DMA descriptor buffers for all queues.
Concrete actions:
- Drop GEM_TBQPH/GEM_RBQPH macros which have a queue index argument.
Only use MACB_TBQPH/MACB_RBQPH constants.
- Drop struct macb_queue->TBQPH/RBQPH fields.
- In macb_init_buffers(): do a single write to TBQPH and RBQPH for all
queues instead of a write per queue.
- In macb_tx_error_task(): drop the write to TBQPH.
- In macb_alloc_consistent(): if allocations give different upper
32-bits, fail. Previously, it would have lead to silent memory
corruption as queues would have used the upper 32 bits of the alloc
from queue 0 and their own low 32 bits.
- In macb_suspend(): if we use the tie off descriptor for suspend, do
the write once for all queues instead of once per queue.
Fixes: fff8019a08 ("net: macb: Add 64 bit addressing support for GEM")
Fixes: ae1f2a56d2 ("net: macb: Added support for many RX queues")
Reviewed-by: Sean Anderson <sean.anderson@linux.dev>
Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Signed-off-by: Théo Lebrun <theo.lebrun@bootlin.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250923-macb-fixes-v6-2-772d655cdeb6@bootlin.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 958baf5eae ]
syzbot reported WARNING in rtl8150_start_xmit/usb_submit_urb.
This is the sequence of events that leads to the warning:
rtl8150_start_xmit() {
netif_stop_queue();
usb_submit_urb(dev->tx_urb);
}
rtl8150_set_multicast() {
netif_stop_queue();
netif_wake_queue(); <-- wakes up TX queue before URB is done
}
rtl8150_start_xmit() {
netif_stop_queue();
usb_submit_urb(dev->tx_urb); <-- double submission
}
rtl8150_set_multicast being the ndo_set_rx_mode callback should not be
calling netif_stop_queue and notif_start_queue as these handle
TX queue synchronization.
The net core function dev_set_rx_mode handles the synchronization
for rtl8150_set_multicast making it safe to remove these locks.
Reported-and-tested-by: syzbot+78cae3f37c62ad092caa@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=78cae3f37c62ad092caa
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Tested-by: Michal Pecio <michal.pecio@gmail.com>
Signed-off-by: I Viswanath <viswanathiyyappan@gmail.com>
Link: https://patch.msgid.link/20250924134350.264597-1-viswanathiyyappan@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit fdd0fe94d6 ]
In siw_post_send(), any immediate error encountered during processing of
the work request list must be reported to the caller, even if previous
work requests in that list were just accepted and added to the send queue.
Not reporting those errors confuses the caller, which would wait
indefinitely for the failing and potentially subsequently aborted work
requests completion.
This fixes a case where immediate errors were overwritten by subsequent
code in siw_post_send().
Fixes: 303ae1cdfd ("rdma/siw: application interface")
Link: https://patch.msgid.link/r/20250923144536.103825-1-bernard.metzler@linux.dev
Suggested-by: Stefan Metzmacher <metze@samba.org>
Signed-off-by: Bernard Metzler <bernard.metzler@linux.dev>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 1d235d8494 ]
Fix to avoid the usage of the `res` variable uninitialized in the
following macro expansions.
It solves the following warning:
In function ‘iommufd_viommu_vdevice_alloc’,
inlined from ‘wrapper_iommufd_viommu_vdevice_alloc’ at iommufd.c:2889:1:
../kselftest_harness.h:760:12: warning: ‘ret’ may be used uninitialized [-Wmaybe-uninitialized]
760 | if (!(__exp _t __seen)) { \
| ^
../kselftest_harness.h:513:9: note: in expansion of macro ‘__EXPECT’
513 | __EXPECT(expected, #expected, seen, #seen, ==, 1)
| ^~~~~~~~
iommufd_utils.h:1057:9: note: in expansion of macro ‘ASSERT_EQ’
1057 | ASSERT_EQ(0, _test_cmd_trigger_vevents(self->fd, dev_id, nvevents))
| ^~~~~~~~~
iommufd.c:2924:17: note: in expansion of macro ‘test_cmd_trigger_vevents’
2924 | test_cmd_trigger_vevents(dev_id, 3);
| ^~~~~~~~~~~~~~~~~~~~~~~~
The issue can be reproduced, building the tests, with the command: make -C
tools/testing/selftests TARGETS=iommu
Link: https://patch.msgid.link/r/20250924171629.50266-1-alessandro.zanni87@gmail.com
Fixes: 97717a1f28 ("iommufd/selftest: Add IOMMU_VEVENTQ_ALLOC test coverage")
Signed-off-by: Alessandro Zanni <alessandro.zanni87@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 57f55048e5 ]
Dirty page tracking relies on the IOMMU atomically updating the dirty bit
in the paging-structure entry. For this operation to succeed, the paging-
structure memory must be coherent between the IOMMU and the CPU. In
another word, if the iommu page walk is incoherent, dirty page tracking
doesn't work.
The Intel VT-d specification, Section 3.10 "Snoop Behavior" states:
"Remapping hardware encountering the need to atomically update A/EA/D bits
in a paging-structure entry that is not snooped will result in a non-
recoverable fault."
To prevent an IOMMU from being incorrectly configured for dirty page
tracking when it is operating in an incoherent mode, mark SSADS as
supported only when both ecap_slads and ecap_smpwc are supported.
Fixes: f35f22cc76 ("iommu/vt-d: Access/Dirty bit support for SS domains")
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20250924083447.123224-1-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 2bdf1d428f ]
R-Car V4H Reference Manual R19UH0186EJ0130 Rev.1.30 Apr. 21, 2025 page 4581
Figure 104.3b Initial Setting of PCIEC(example), third quarter of the
figure indicates that register 0xf8 should be polled until bit 18 becomes
set to 1.
Register 0xf8, bit 18 is 0 immediately after write to PCIERSTCTRL1 and is
set to 1 in less than 1 ms afterward. The current readl_poll_timeout()
break condition is inverted and returns when register 0xf8, bit 18 is set
to 0, which in most cases means immediately. In case
CONFIG_DEBUG_LOCK_ALLOC=y, the timing changes just enough for the first
readl_poll_timeout() poll to already read register 0xf8, bit 18 as 1 and
afterward never read register 0xf8, bit 18 as 0, which leads to timeout
and failure to start the PCIe controller.
Fix this by inverting the poll condition to match the reference manual
initialization sequence.
Fixes: faf5a975ee ("PCI: rcar-gen4: Add support for R-Car V4H")
Signed-off-by: Marek Vasut <marek.vasut+renesas@mailbox.org>
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://patch.msgid.link/20250915235910.47768-1-marek.vasut+renesas@mailbox.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 0056d29f8c ]
Assure the reset is latched and the core is ready for DBI access. On R-Car
V4H, the PCIe reset is asynchronous and does not take effect immediately,
but needs a short time to complete. In case DBI access happens in that
short time, that access generates an SError. Make sure that condition can
never happen, read back the state of the reset, which should turn the
asynchronous reset into a synchronous one, and wait a little over 1ms to
add additional safety margin.
Fixes: 0d0c551011 ("PCI: rcar-gen4: Add R-Car Gen4 PCIe controller support for host mode")
Signed-off-by: Marek Vasut <marek.vasut+renesas@mailbox.org>
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://patch.msgid.link/20250924005610.96484-1-marek.vasut+renesas@mailbox.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 8795b70581 ]
R-Car V4H Reference Manual R19UH0186EJ0130 Rev.1.30 Apr. 21, 2025 page 585
Figure 9.3.2 Software Reset flow (B) indicates that for peripherals in HSC
domain, after reset has been asserted by writing a matching reset bit into
register SRCR, it is mandatory to wait 1ms.
Because it is the controller driver which can determine whether or not the
controller is in HSC domain based on its compatible string, add the missing
delay in the controller driver.
This 1ms delay is documented on R-Car V4H and V4M; it is currently unclear
whether S4 is affected as well. This patch does apply the extra delay on
R-Car S4 as well.
Fixes: 0d0c551011 ("PCI: rcar-gen4: Add R-Car Gen4 PCIe controller support for host mode")
Suggested-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Marek Vasut <marek.vasut+renesas@mailbox.org>
[mani: added the missing r-b tag from Krzysztof]
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Link: https://patch.msgid.link/20250919134644.208098-1-marek.vasut+renesas@mailbox.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit e40b984b6c ]
The VHCI platform driver aims to forbid entering system suspend when at
least one of the virtual USB ports are bound to an active USB/IP
connection.
However, in some cases, the detection logic doesn't work reliably, i.e.
when all devices attached to the virtual root hub have been already
suspended, leading to a broken suspend state, with unrecoverable resume.
Ensure the virtually attached devices do not enter suspend by setting
the syscore PM flag. Note this is currently limited to the client side
only, since the server side doesn't implement system suspend prevention.
Fixes: 04679b3489 ("Staging: USB/IP: add client driver")
Signed-off-by: Cristian Ciocaltea <cristian.ciocaltea@collabora.com>
Acked-by: Shuah Khan <skhan@linuxfoundation.org>
Link: https://lore.kernel.org/r/20250902-vhci-hcd-suspend-fix-v3-1-864e4e833559@collabora.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 79dde5f7dc ]
The cpu_latency_qos_add/remove/update_request interfaces lack internal
synchronization by design, requiring the caller to ensure thread safety.
The current implementation relies on the 'pm_qos_enabled' flag, which is
insufficient to prevent concurrent access and cannot serve as a proper
synchronization mechanism. This has led to data races and list
corruption issues.
A typical race condition call trace is:
[Thread A]
ufshcd_pm_qos_exit()
--> cpu_latency_qos_remove_request()
--> cpu_latency_qos_apply();
--> pm_qos_update_target()
--> plist_del <--(1) delete plist node
--> memset(req, 0, sizeof(*req));
--> hba->pm_qos_enabled = false;
[Thread B]
ufshcd_devfreq_target
--> ufshcd_devfreq_scale
--> ufshcd_scale_clks
--> ufshcd_pm_qos_update <--(2) pm_qos_enabled is true
--> cpu_latency_qos_update_request
--> pm_qos_update_target
--> plist_del <--(3) plist node use-after-free
Introduces a dedicated mutex to serialize PM QoS operations, preventing
data races and ensuring safe access to PM QoS resources, including sysfs
interface reads.
Fixes: 2777e73fc1 ("scsi: ufs: core: Add CPU latency QoS support for UFS driver")
Signed-off-by: Zhongqiu Han <zhongqiu.han@oss.qualcomm.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Tested-by: Huan Tang <tanghuan@vivo.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit c5ba345b2d ]
ct_seq_show() has an opportunistic garbage collector :
if (nf_ct_should_gc(ct)) {
nf_ct_kill(ct);
goto release;
}
So if one nf_conn is killed there, next time ct_get_next() runs,
we skip the following item in the bucket, even if it should have
been displayed if gc did not take place.
We can decrement st->skip_elems to tell ct_get_next() one of the items
was removed from the chain.
Fixes: 58e207e498 ("netfilter: evict stale entries when user reads /proc/net/nf_conntrack")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 09efbac953 ]
During a batch replay, the nlh pointer is not reset until the parsing of
the commands. Since commit bf2ac490d2 ("netfilter: nfnetlink: Handle
ACK flags for batch messages") that is problematic as the condition to
add an ACK for batch begin will evaluate to true even if NLM_F_ACK
wasn't used for batch begin message.
If there is an error during the command processing, netlink is sending
an ACK despite that. This misleads userspace tools which think that the
return code was 0. Reset the nlh pointer to the original one when a
replay is triggered.
Fixes: bf2ac490d2 ("netfilter: nfnetlink: Handle ACK flags for batch messages")
Signed-off-by: Fernando Fernandez Mancera <fmancera@suse.de>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 134121bfd9 ]
On the netns cleanup path, __ip_vs_ftp_exit() may unregister ip_vs_ftp
before connections with valid cp->app pointers are flushed, leading to a
use-after-free.
Fix this by introducing a global `exiting_module` flag, set to true in
ip_vs_ftp_exit() before unregistering the pernet subsystem. In
__ip_vs_ftp_exit(), skip ip_vs_ftp unregister if called during netns
cleanup (when exiting_module is false) and defer it to
__ip_vs_cleanup_batch(), which unregisters all apps after all connections
are flushed. If called during module exit, unregister ip_vs_ftp
immediately.
Fixes: 61b1ab4583 ("IPVS: netns, add basic init per netns.")
Suggested-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Slavin Liu <slavin452@gmail.com>
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 191512355e ]
When the client max_resp_sz is larger than what the server encodes in
its reply, the nfs4_verify_back_channel_attrs() check fails and this
causes nfs4_proc_create_session() to fail, in cases where the client
page size is larger than that of the server and the server does not want
to negotiate upwards.
While this is not a problem with the linux nfs server that will reflect
the proposed value in its reply irrespective of the local page size,
other nfs server implementations may insist on their own max_resp_sz
value, which could be smaller.
Fix this by accepting smaller max_resp_sz values from the server, as
this does not violate the protocol. The server is allowed to decrease
but not increase proposed the size, and as such values smaller than the
client-proposed ones are valid.
Fixes: 43c2e885be ("nfs4: fix channel attribute sanity-checks")
Signed-off-by: Anthony Iliopoulos <ailiop@suse.com>
Reviewed-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 43e0a92c04 ]
Some TPDM devices support both CMB and DSB datasets, requiring
the system to enable the port with both corresponding element sizes.
Currently, the logic treats tpdm_read_element_size as successful if
the CMB element size is retrieved correctly, regardless of whether
the DSB element size is obtained. This behavior causes issues
when parsing data from TPDM devices that depend on both element sizes.
To address this, the function should explicitly fail if the DSB
element size cannot be read correctly.
Fixes: e6d7f5252f ("coresight-tpda: Add support to configure CMB element")
Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Jie Gan <jie.gan@oss.qualcomm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20250906-fix_element_size_issue-v2-1-dbb0ac2541a9@oss.qualcomm.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit a8f2d480f1 ]
Some CoreSight components have trace bus clocks 'atclk' and are enabled
using clk_prepare_enable(). These clocks are not disabled when modules
exit.
As atclk is optional, use devm_clk_get_optional_enabled() to manage it.
The benefit is the driver model layer can automatically disable and
release clocks.
Check the returned value with IS_ERR() to detect errors but leave the
NULL pointer case if the clock is not found. And remove the error
handling codes which are no longer needed.
Fixes: d1839e6877 ("coresight: etm: retrieve and handle atclk")
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Reviewed-by: Yeoreum Yun <yeoreum.yun@arm.com>
Tested-by: James Clark <james.clark@linaro.org>
Signed-off-by: Leo Yan <leo.yan@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20250731-arm_cs_fix_clock_v4-v6-5-1dfe10bb3f6f@arm.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 1abc1b212e ]
Some CoreSight components have programming clocks (pclk) and are enabled
using clk_get() and clk_prepare_enable(). However, in many cases, these
clocks are not disabled when modules exit and only released by clk_put().
To fix the issue, this commit refactors programming clock by replacing
clk_get() and clk_prepare_enable() with devm_clk_get_optional_enabled()
for enabling APB clock. If the "apb_pclk" clock is not found, a NULL
pointer is returned, and the function proceeds to attempt enabling the
"apb" clock.
Since ACPI platforms rely on firmware to manage clocks, returning a NULL
pointer in this case leaves clock management to the firmware rather than
the driver. This effectively avoids a clock imbalance issue during
module removal - where the clock could be disabled twice: once during
the ACPI runtime suspend and again during the devm resource release.
Callers are updated to reuse the returned error value.
With the change, programming clocks are managed as resources in driver
model layer, allowing clock cleanup to be handled automatically. As a
result, manual cleanup operations are no longer needed and are removed
from the Coresight drivers.
Fixes: 73d779a03a ("coresight: etm4x: Change etm4_platform_driver driver for MMIO devices")
Reviewed-by: Yeoreum Yun <yeoreum.yun@arm.com>
Tested-by: James Clark <james.clark@linaro.org>
Signed-off-by: Leo Yan <leo.yan@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20250731-arm_cs_fix_clock_v4-v6-4-1dfe10bb3f6f@arm.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 40c0cdc9cb ]
The atclk is an optional clock for the CoreSight ETMv4, but the driver
misses to initialize it.
This change enables atclk in probe of the ETMv4 driver, and dynamically
control the clock during suspend and resume.
No need to check the driver data and clock pointer in the runtime
suspend and resume, so remove checks. And add error handling in the
resume function.
Add a minor fix to the comment format when adding the atclk field.
Fixes: 2e1cdfe184 ("coresight-etm4x: Adding CoreSight ETM4x driver")
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Reviewed-by: Yeoreum Yun <yeoreum.yun@arm.com>
Tested-by: James Clark <james.clark@linaro.org>
Signed-off-by: Leo Yan <leo.yan@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20250731-arm_cs_fix_clock_v4-v6-3-1dfe10bb3f6f@arm.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 8a79026926 ]
The atclk is an optional clock for the CoreSight TMC, but the driver
misses to initialize it. In most cases, TMC shares the atclk clock with
other CoreSight components. Since these components enable the clock
before the TMC device is initialized, the TMC continues properly,
which is why we don’t observe any lockup issues.
This change enables atclk in probe of the TMC driver. Given the clock
is optional, it is possible to return NULL if the clock does not exist.
IS_ERR() is tolerant for this case.
Dynamically disable and enable atclk during suspend and resume. The
clock pointers will never be error values if the driver has successfully
probed, and the case of a NULL pointer case will be handled by the clock
core layer. The driver data is always valid after probe. Therefore,
remove the related checks. Also in the resume flow adds error handling.
Fixes: bc4bf7fe98 ("coresight-tmc: add CoreSight TMC driver")
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Reviewed-by: Yeoreum Yun <yeoreum.yun@arm.com>
Tested-by: James Clark <james.clark@linaro.org>
Signed-off-by: Leo Yan <leo.yan@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20250731-arm_cs_fix_clock_v4-v6-1-1dfe10bb3f6f@arm.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 9ddf6d3fcb ]
The return values of VDO_ASSERT calls that validate metadata are not acted
upon.
Return UDS_CORRUPT_DATA in case of an error.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Fixes: a4eb7e2555 ("dm vdo: implement the volume index")
Signed-off-by: Ivan Abramov <i.abramov@mt-integration.ru>
Reviewed-by: Matthew Sakai <msakai@redhat.com>
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 142964960c ]
The ADSP firmware on X1E has separate firmware binaries for the main
firmware and the DTB. The same applies for the "lite" firmware loaded by
the boot firmware.
When preparing to load the new ADSP firmware we shutdown the lite_pas_id
for the main firmware, but we don't shutdown the corresponding lite pas_id
for the DTB. The fact that we're leaving it "running" forever becomes
obvious if you try to reuse (or just access) the memory region used by the
"lite" firmware: The &adsp_boot_mem is accessible, but accessing the
&adsp_boot_dtb_mem results in a crash.
We don't support reusing the memory regions currently, but nevertheless we
should not keep part of the lite firmware running. Fix this by adding the
lite_dtb_pas_id and shutting it down as well.
We don't have a way to detect if the lite firmware is actually running yet,
so ignore the return status of qcom_scm_pas_shutdown() for now. This was
already the case before, the assignment to "ret" is not used anywhere.
Fixes: 62210f7509 ("remoteproc: qcom_q6v5_pas: Unload lite firmware on ADSP")
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
Signed-off-by: Stephan Gerhold <stephan.gerhold@linaro.org>
Link: https://lore.kernel.org/r/20250820-rproc-qcom-q6v5-fixes-v2-3-910b1a3aff71@linaro.org
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 110be46f5a ]
enable_irq() and disable_irq() are reference counted, so we must make sure
that each enable_irq() is always paired with a single disable_irq(). If we
call disable_irq() twice followed by just a single enable_irq(), the IRQ
will remain disabled forever.
For the error handling path in qcom_q6v5_wait_for_start(), disable_irq()
will end up being called twice, because disable_irq() also happens in
qcom_q6v5_unprepare() when rolling back the call to qcom_q6v5_prepare().
Fix this by dropping disable_irq() in qcom_q6v5_wait_for_start(). Since
qcom_q6v5_prepare() is the function that calls enable_irq(), it makes more
sense to have the rollback handled always by qcom_q6v5_unprepare().
Fixes: 3b415c8fb2 ("remoteproc: q6v5: Extract common resource handling")
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
Signed-off-by: Stephan Gerhold <stephan.gerhold@linaro.org>
Link: https://lore.kernel.org/r/20250820-rproc-qcom-q6v5-fixes-v2-1-910b1a3aff71@linaro.org
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 4f152338e3 ]
During PERST# assertion tegra_pcie_bpmp_set_pll_state() is currently
called twice.
pex_ep_event_pex_rst_assert() should do the opposite of
pex_ep_event_pex_rst_deassert(), so it is obvious that the duplicate
tegra_pcie_bpmp_set_pll_state() is a mistake, and that the duplicate
tegra_pcie_bpmp_set_pll_state() call should instead be a call to
tegra_pcie_bpmp_set_ctrl_state().
With this, the uninitialization sequence also matches that of
tegra_pcie_unconfig_controller().
Fixes: a54e190737 ("PCI: tegra194: Add Tegra234 PCIe support")
Signed-off-by: Nagarjuna Kristam <nkristam@nvidia.com>
[cassel: improve commit log]
Signed-off-by: Niklas Cassel <cassel@kernel.org>
Link: https://patch.msgid.link/20250911093021.1454385-2-cassel@kernel.org
[mani: added Fixes tag]
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 570f945117 ]
Lockdep gives a splat [1] when ser_hdl_work item is executed. It is
scheduled at mac80211 workqueue via ieee80211_queue_work() and takes a
wiphy lock inside. However, this workqueue can be flushed when e.g.
closing the interface and wiphy lock is already taken in that case.
Choosing wiphy_work_queue() for SER is likely not suitable. Back on to
the global workqueue.
[1]:
WARNING: possible circular locking dependency detected
6.17.0-rc2 #17 Not tainted
------------------------------------------------------
kworker/u32:1/61 is trying to acquire lock:
ffff88811bc00768 (&rdev->wiphy.mtx){+.+.}-{4:4}, at: ser_state_run+0x5e/0x180 [rtw89_core]
but task is already holding lock:
ffffc9000048fd30 ((work_completion)(&ser->ser_hdl_work)){+.+.}-{0:0}, at: process_one_work+0x7b5/0x1450
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #2 ((work_completion)(&ser->ser_hdl_work)){+.+.}-{0:0}:
process_one_work+0x7c6/0x1450
worker_thread+0x49e/0xd00
kthread+0x313/0x640
ret_from_fork+0x221/0x300
ret_from_fork_asm+0x1a/0x30
-> #1 ((wq_completion)phy0){+.+.}-{0:0}:
touch_wq_lockdep_map+0x8e/0x180
__flush_workqueue+0x129/0x10d0
ieee80211_stop_device+0xa8/0x110
ieee80211_do_stop+0x14ce/0x2880
ieee80211_stop+0x13a/0x2c0
__dev_close_many+0x18f/0x510
__dev_change_flags+0x25f/0x670
netif_change_flags+0x7b/0x160
do_setlink.isra.0+0x1640/0x35d0
rtnl_newlink+0xd8c/0x1d30
rtnetlink_rcv_msg+0x700/0xb80
netlink_rcv_skb+0x11d/0x350
netlink_unicast+0x49a/0x7a0
netlink_sendmsg+0x759/0xc20
____sys_sendmsg+0x812/0xa00
___sys_sendmsg+0xf7/0x180
__sys_sendmsg+0x11f/0x1b0
do_syscall_64+0xbb/0x360
entry_SYSCALL_64_after_hwframe+0x77/0x7f
-> #0 (&rdev->wiphy.mtx){+.+.}-{4:4}:
__lock_acquire+0x124c/0x1d20
lock_acquire+0x154/0x2e0
__mutex_lock+0x17b/0x12f0
ser_state_run+0x5e/0x180 [rtw89_core]
rtw89_ser_hdl_work+0x119/0x220 [rtw89_core]
process_one_work+0x82d/0x1450
worker_thread+0x49e/0xd00
kthread+0x313/0x640
ret_from_fork+0x221/0x300
ret_from_fork_asm+0x1a/0x30
other info that might help us debug this:
Chain exists of:
&rdev->wiphy.mtx --> (wq_completion)phy0 --> (work_completion)(&ser->ser_hdl_work)
Possible unsafe locking scenario:
CPU0 CPU1
---- ----
lock((work_completion)(&ser->ser_hdl_work));
lock((wq_completion)phy0);
lock((work_completion)(&ser->ser_hdl_work));
lock(&rdev->wiphy.mtx);
*** DEADLOCK ***
2 locks held by kworker/u32:1/61:
#0: ffff888103835148 ((wq_completion)phy0){+.+.}-{0:0}, at: process_one_work+0xefa/0x1450
#1: ffffc9000048fd30 ((work_completion)(&ser->ser_hdl_work)){+.+.}-{0:0}, at: process_one_work+0x7b5/0x1450
stack backtrace:
CPU: 0 UID: 0 PID: 61 Comm: kworker/u32:1 Not tainted 6.17.0-rc2 #17 PREEMPT(voluntary)
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS edk2-20250523-14.fc42 05/23/2025
Workqueue: phy0 rtw89_ser_hdl_work [rtw89_core]
Call Trace:
<TASK>
dump_stack_lvl+0x5d/0x80
print_circular_bug.cold+0x178/0x1be
check_noncircular+0x14c/0x170
__lock_acquire+0x124c/0x1d20
lock_acquire+0x154/0x2e0
__mutex_lock+0x17b/0x12f0
ser_state_run+0x5e/0x180 [rtw89_core]
rtw89_ser_hdl_work+0x119/0x220 [rtw89_core]
process_one_work+0x82d/0x1450
worker_thread+0x49e/0xd00
kthread+0x313/0x640
ret_from_fork+0x221/0x300
ret_from_fork_asm+0x1a/0x30
</TASK>
Found by Linux Verification Center (linuxtesting.org).
Fixes: ebfc9199df ("wifi: rtw89: add wiphy_lock() to work that isn't held wiphy_lock() yet")
Signed-off-by: Fedor Pchelkin <pchelkin@ispras.ru>
Acked-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Link: https://patch.msgid.link/20250919210852.823912-5-pchelkin@ispras.ru
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit c56325259a ]
The test will fail as below on x86_64 with cpu la57 support (will skip if
no la57 support). Note, the test requries nr_hugepages to be set first.
# running bash ./va_high_addr_switch.sh
# -------------------------------------
# mmap(addr_switch_hint - pagesize, pagesize): 0x7f55b60fa000 - OK
# mmap(addr_switch_hint - pagesize, (2 * pagesize)): 0x7f55b60f9000 - OK
# mmap(addr_switch_hint, pagesize): 0x800000000000 - OK
# mmap(addr_switch_hint, 2 * pagesize, MAP_FIXED): 0x800000000000 - OK
# mmap(NULL): 0x7f55b60f9000 - OK
# mmap(low_addr): 0x40000000 - OK
# mmap(high_addr): 0x1000000000000 - OK
# mmap(high_addr) again: 0xffff55b6136000 - OK
# mmap(high_addr, MAP_FIXED): 0x1000000000000 - OK
# mmap(-1): 0xffff55b6134000 - OK
# mmap(-1) again: 0xffff55b6132000 - OK
# mmap(addr_switch_hint - pagesize, pagesize): 0x7f55b60fa000 - OK
# mmap(addr_switch_hint - pagesize, 2 * pagesize): 0x7f55b60f9000 - OK
# mmap(addr_switch_hint - pagesize/2 , 2 * pagesize): 0x7f55b60f7000 - OK
# mmap(addr_switch_hint, pagesize): 0x800000000000 - OK
# mmap(addr_switch_hint, 2 * pagesize, MAP_FIXED): 0x800000000000 - OK
# mmap(NULL, MAP_HUGETLB): 0x7f55b5c00000 - OK
# mmap(low_addr, MAP_HUGETLB): 0x40000000 - OK
# mmap(high_addr, MAP_HUGETLB): 0x1000000000000 - OK
# mmap(high_addr, MAP_HUGETLB) again: 0xffff55b5e00000 - OK
# mmap(high_addr, MAP_FIXED | MAP_HUGETLB): 0x1000000000000 - OK
# mmap(-1, MAP_HUGETLB): 0x7f55b5c00000 - OK
# mmap(-1, MAP_HUGETLB) again: 0x7f55b5a00000 - OK
# mmap(addr_switch_hint - pagesize, 2*hugepagesize, MAP_HUGETLB): 0x800000000000 - FAILED
# mmap(addr_switch_hint , 2*hugepagesize, MAP_FIXED | MAP_HUGETLB): 0x800000000000 - OK
# [FAIL]
addr_switch_hint is defined as DFEFAULT_MAP_WINDOW in the failed test (for
x86_64, DFEFAULT_MAP_WINDOW is defined as (1UL<<47) - pagesize) in 64 bit.
Before commit cc92882ee2 ("mm: drop hugetlb_get_unmapped_area{_*}
functions"), for x86_64 hugetlb_get_unmapped_area() is handled in arch
code arch/x86/mm/hugetlbpage.c and addr is checked with
map_address_hint_valid() after align with 'addr &= huge_page_mask(h)'
which is a round down way, and it will fail the check because the addr is
within the DEFAULT_MAP_WINDOW but (addr + len) is above the
DFEFAULT_MAP_WINDOW. So it wil go through the
hugetlb_get_unmmaped_area_top_down() to find an area within the
DFEFAULT_MAP_WINDOW.
After commit cc92882ee2 ("mm: drop hugetlb_get_unmapped_area{_*}
functions"). The addr hint for hugetlb_get_unmmaped_area() will be
rounded up and aligned to hugepage size with ALIGN() for all arches. And
after the align, the addr will be above the default MAP_DEFAULT_WINDOW,
and the map_addresshint_valid() check will pass because both aligned addr
(addr0) and (addr + len) are above the DEFAULT_MAP_WINDOW, and the aligned
hint address (0x800000000000) is returned as an suitable gap is found
there, in arch_get_unmapped_area_topdown().
To still cover the case that addr is within the DEFAULT_MAP_WINDOW, and
addr + len is above the DFEFAULT_MAP_WINDOW, change to choose the last
hugepage aligned address within the DEFAULT_MAP_WINDOW as the hint addr,
and the addr + len (2 hugepages) will be one hugepage above the
DEFAULT_MAP_WINDOW. An aligned address won't be affected by the page
round up or round down from kernel, so it's determistic.
Link: https://lkml.kernel.org/r/20250912013711.3002969-4-chuhu@redhat.com
Fixes: cc92882ee2 ("mm: drop hugetlb_get_unmapped_area{_*} functions")
Signed-off-by: Chunyu Hu <chuhu@redhat.com>
Suggested-by: David Hildenbrand <david@redhat.com>
Acked-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 8ca7eada62 ]
When do_task() exhausts its iteration budget (!ret), it sets the state
to TASK_STATE_IDLE to reschedule, without a secondary check on the
current task->state. This can overwrite the TASK_STATE_DRAINING state
set by a concurrent call to rxe_cleanup_task() or rxe_disable_task().
While state changes are protected by a spinlock, both rxe_cleanup_task()
and rxe_disable_task() release the lock while waiting for the task to
finish draining in the while(!is_done(task)) loop. The race occurs if
do_task() hits its iteration limit and acquires the lock in this window.
The cleanup logic may then proceed while the task incorrectly
reschedules itself, leading to a potential use-after-free.
This bug was introduced during the migration from tasklets to workqueues,
where the special handling for the draining case was lost.
Fix this by restoring the original pre-migration behavior. If the state is
TASK_STATE_DRAINING when iterations are exhausted, set cont to 1 to
force a new loop iteration. This allows the task to finish its work, so
that a subsequent iteration can reach the switch statement and correctly
transition the state to TASK_STATE_DRAINED, stopping the task as intended.
Fixes: 9b4b7c1f9f ("RDMA/rxe: Add workqueue support for rxe tasks")
Reviewed-by: Zhu Yanjun <yanjun.zhu@linux.dev>
Signed-off-by: Gui-Dong Han <hanguidong02@gmail.com>
Link: https://patch.msgid.link/20250919025212.1682087-1-hanguidong02@gmail.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 7ca61ed8b3 ]
In ath12k_dp_mon_rx_deliver_msdu(), peer lookup fails because
rxcb->peer_id is not updated with a valid value. This is expected
in monitor mode, where RX frames bypass the regular RX
descriptor path that typically sets rxcb->peer_id.
As a result, the peer is NULL, and link_id and link_valid fields
in the RX status are not populated. This leads to a WARN_ON in
mac80211 when it receives data frame from an associated station
with invalid link_id.
Fix this potential issue by using ppduinfo->peer_id, which holds
the correct peer id for the received frame. This ensures that the
peer is correctly found and the associated link metadata is updated
accordingly.
Tested-on: QCN9274 hw2.0 PCI WLAN.WBE.1.4.1-00199-QCAHKSWPL_SILICONZ-1
Fixes: bd00cc7e8a ("wifi: ath12k: replace the usage of rx desc with rx_info")
Signed-off-by: Hari Chandrakanthan <quic_haric@quicinc.com>
Signed-off-by: Aishwarya R <aishwarya.r@oss.qualcomm.com>
Reviewed-by: Baochen Qiang <baochen.qiang@oss.qualcomm.com>
Reviewed-by: Vasanthakumar Thiagarajan <vasanthakumar.thiagarajan@oss.qualcomm.com>
Link: https://patch.msgid.link/20250724040552.1170642-1-aishwarya.r@oss.qualcomm.com
Signed-off-by: Jeff Johnson <jeff.johnson@oss.qualcomm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f0cafb02de ]
When the initialization of qm->debug.acc_diff_reg fails,
the probe process does not exit. However, after qm->debug.qm_diff_regs is
freed, it is not set to NULL. This can lead to a double free when the
remove process attempts to free it again. Therefore, qm->debug.qm_diff_regs
should be set to NULL after it is freed.
Fixes: 8be0913389 ("crypto: hisilicon/debugfs - Fix debugfs uninit process issue")
Signed-off-by: Chenghai Huang <huangchenghai2@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f75f66683d ]
In commit 42d9f6c774 ("crypto: acomp - Move scomp stream allocation
code into acomp"), the crypto_acomp_streams struct was made to rely on
having the alloc_ctx and free_ctx operations defined in the same order
as the scomp_alg struct. But in that same commit, the alloc_ctx and
free_ctx members of scomp_alg may be randomized by structure layout
randomization, since they are contained in a pure ops structure
(containing only function pointers). If the pointers within scomp_alg
are randomized, but those in crypto_acomp_streams are not, then
the order may no longer match. This fixes the problem by removing the
union from scomp_alg so that both crypto_acomp_streams and scomp_alg
will share the same definition of alloc_ctx and free_ctx, ensuring
they will always have the same layout.
Signed-off-by: Dan Moulding <dan@danm.net>
Suggested-by: Herbert Xu <herbert@gondor.apana.org.au>
Fixes: 42d9f6c774 ("crypto: acomp - Move scomp stream allocation code into acomp")
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit acb59a4bb8 ]
host_seq_bmp is allocated with vzalloc but is currently freed with
bitmap_free, which uses kfree internally. This mismach prevents the
resource from being released properly and may result in memory leaks
or other issues.
Fix this by freeing host_seq_bmp with vfree to match the vzalloc
allocation.
Fixes: f232836a91 ("vfio/pds: Add support for dirty page tracking")
Signed-off-by: Zilin Guan <zilin@seu.edu.cn>
Reviewed-by: Brett Creeley <brett.creeley@amd.com>
Link: https://lore.kernel.org/r/20250913153154.1028835-1-zilin@seu.edu.cn
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 5a746c1a2c ]
The referenced commit introduced exception handlers on user-space memory
references in copy_from_user and copy_to_user. These handlers return from
the respective function and calculate the remaining bytes left to copy
using the current register contents. This commit fixes a bad calculation.
This will fix the return value of copy_to_user in a specific faulting case.
The behaviour of memcpy stays unchanged.
Fixes: 9570770480 ("sparc64: Convert NG4copy_{from,to}_user to accurate exception reporting.")
Tested-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> # on Oracle SPARC T4-1
Signed-off-by: Michael Karcher <kernel@mkarcher.dialup.fu-berlin.de>
Reviewed-by: Andreas Larsson <andreas@gaisler.com>
Link: https://lore.kernel.org/r/20250905-memcpy_series-v4-4-1ca72dda195b@mkarcher.dialup.fu-berlin.de
Signed-off-by: Andreas Larsson <andreas@gaisler.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 0b67c8fc10 ]
The referenced commit introduced exception handlers on user-space memory
references in copy_from_user and copy_to_user. These handlers return from
the respective function and calculate the remaining bytes left to copy
using the current register contents. This commit fixes a couple of bad
calculations and a broken epilogue in the exception handlers. This will
prevent crashes and ensure correct return values of copy_from_user and
copy_to_user in the faulting case. The behaviour of memcpy stays unchanged.
Fixes: 7ae3aaf53f ("sparc64: Convert NGcopy_{from,to}_user to accurate exception reporting.")
Tested-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> # on SPARC T4 with modified kernel to use Niagara 1 code
Tested-by: Magnus Lindholm <linmag7@gmail.com> # on Sun Fire T2000
Signed-off-by: Michael Karcher <kernel@mkarcher.dialup.fu-berlin.de>
Tested-by: Ethan Hawke <ehawk@ember.systems> # on Sun Fire T2000
Tested-by: Ken Link <iissmart@numberzero.org> # on Sun Fire T1000
Reviewed-by: Andreas Larsson <andreas@gaisler.com>
Link: https://lore.kernel.org/r/20250905-memcpy_series-v4-3-1ca72dda195b@mkarcher.dialup.fu-berlin.de
Signed-off-by: Andreas Larsson <andreas@gaisler.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 47b49c06eb ]
Anthony Yznaga tracked down that a BUG_ON in ext4 code with large folios
enabled resulted from copy_from_user() returning impossibly large values
greater than the size to be copied. This lead to __copy_from_iter()
returning impossible values instead of the actual number of bytes it was
able to copy.
The BUG_ON has been reported in
https://lore.kernel.org/r/b14f55642207e63e907965e209f6323a0df6dcee.camel@physik.fu-berlin.de
The referenced commit introduced exception handlers on user-space memory
references in copy_from_user and copy_to_user. These handlers return from
the respective function and calculate the remaining bytes left to copy
using the current register contents. The exception handlers expect that
%o2 has already been masked during the bulk copy loop, but the masking was
performed after that loop. This will fix the return value of copy_from_user
and copy_to_user in the faulting case. The behaviour of memcpy stays
unchanged.
Fixes: ee841d0aff ("sparc64: Convert U3copy_{from,to}_user to accurate exception reporting.")
Tested-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> # on Sun Netra 240
Reviewed-by: Anthony Yznaga <anthony.yznaga@oracle.com>
Tested-by: René Rebe <rene@exactcode.com> # on UltraSparc III+ and UltraSparc IIIi
Signed-off-by: Michael Karcher <kernel@mkarcher.dialup.fu-berlin.de>
Reviewed-by: Andreas Larsson <andreas@gaisler.com>
Link: https://lore.kernel.org/r/20250905-memcpy_series-v4-2-1ca72dda195b@mkarcher.dialup.fu-berlin.de
Signed-off-by: Andreas Larsson <andreas@gaisler.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 4fba171300 ]
The referenced commit introduced exception handlers on user-space memory
references in copy_from_user and copy_to_user. These handlers return from
the respective function and calculate the remaining bytes left to copy
using the current register contents. This commit fixes a couple of bad
calculations. This will fix the return value of copy_from_user and
copy_to_user in the faulting case. The behaviour of memcpy stays unchanged.
Fixes: cb736fdbb2 ("sparc64: Convert U1copy_{from,to}_user to accurate exception reporting.")
Tested-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> # on QEMU 10.0.3
Tested-by: René Rebe <rene@exactcode.com> # on Ultra 5 UltraSparc IIi
Tested-by: Jonathan 'theJPster' Pallant <kernel@thejpster.org.uk> # on Sun Netra T1
Signed-off-by: Michael Karcher <kernel@mkarcher.dialup.fu-berlin.de>
Reviewed-by: Andreas Larsson <andreas@gaisler.com>
Link: https://lore.kernel.org/r/20250905-memcpy_series-v4-1-1ca72dda195b@mkarcher.dialup.fu-berlin.de
Signed-off-by: Andreas Larsson <andreas@gaisler.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 87cab86925 ]
In create_sdw_dailink() check that sof_end->codec_info->add_sidecar
is not NULL before calling it.
The original code assumed that if include_sidecar is true, the codec
on that link has an add_sidecar callback. But there could be other
codecs on the same link that do not have an add_sidecar callback.
Fixes: da52441802 ("ASoC: Intel: sof_sdw: Add callbacks to register sidecar devices")
Signed-off-by: Richard Fitzgerald <rf@opensource.cirrus.com>
Link: https://patch.msgid.link/20250919140235.1071941-1-rf@opensource.cirrus.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 32d340ae67 ]
In ieee80211_rx_handle_packet(), if the caller does not provide pubsta
information, an attempt is made to find the station using the address 2
(source address) field in the header. Since pubsta is missing, link
information such as link_valid and link_id is also unavailable. Now if such
a situation comes, and if a matching ML station entry is found based on
the source address, currently the packet is dropped due to missing link ID
in the status field which is not correct.
Hence, to fix this issue, if link_valid is not set and the station is an
ML station, make an attempt to find a link station entry using the source
address. If a valid link station is found, derive the link ID and proceed
with packet processing. Otherwise, drop the packet as per the existing
flow.
Fixes: ea9d807b56 ("wifi: mac80211: add link information in ieee80211_rx_status")
Suggested-by: Vasanthakumar Thiagarajan <vasanthakumar.thiagarajan@oss.qualcomm.com>
Signed-off-by: Aditya Kumar Singh <aditya.kumar.singh@oss.qualcomm.com>
Link: https://patch.msgid.link/20250917-fix_data_packet_rx_with_mlo_and_no_pubsta-v1-1-8cf971a958ac@oss.qualcomm.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 26f8fc0b24 ]
Currently, ath12k_dp_rx_h_ppdu() determines the band and frequency
based on the channel number and center frequency from the RX descriptor's
PHY metadata. However, in rare cases, it is observed that frequency
retrieved from the metadata may be invalid or unexpected especially for
6 GHz frames.
This can result in a NULL sband, which prevents proper frequency assignment
in rx_status and potentially leading to incorrect RX packet classification.
To fix this potential issue, add a fallback mechanism that uses
ar->rx_channel to populate the band and frequency when the derived
sband is invalid or missing.
Tested-on: QCN9274 hw2.0 PCI WLAN.WBE.1.4.1-00199-QCAHKSWPL_SILICONZ-1
Fixes: d889913205 ("wifi: ath12k: driver for Qualcomm Wi-Fi 7 devices")
Signed-off-by: Sriram R <quic_srirrama@quicinc.com>
Co-developed-by: Vinith Kumar R <quic_vinithku@quicinc.com>
Signed-off-by: Vinith Kumar R <quic_vinithku@quicinc.com>
Signed-off-by: Aishwarya R <aishwarya.r@oss.qualcomm.com>
Reviewed-by: Baochen Qiang <baochen.qiang@oss.qualcomm.com>
Reviewed-by: Vasanthakumar Thiagarajan <vasanthakumar.thiagarajan@oss.qualcomm.com>
Link: https://patch.msgid.link/20250723190651.699828-1-aishwarya.r@oss.qualcomm.com
Signed-off-by: Jeff Johnson <jeff.johnson@oss.qualcomm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 7695fa71c1 ]
Currently, host fetches combined rssi from rssi_comb in struct
hal_rx_phyrx_rssi_legacy_info.
rssi_comb is 8th to 15th bits of the second to last variable.
rssi_comb_ppdu is the 0th to 7th of the last variable.
When bandwidth = 20MHz, rssi_comb = rssi_comb_ppdu. But when bandwidth >
20MHz, rssi_comb < rssi_comb_ppdu because rssi_comb only includes power
of primary 20 MHz while rssi_comb_ppdu includes power of active
RUs/subchannels. So should fetch combined rssi from rssi_comb_ppdu.
Also related macro definitions are too long, rename them.
Tested-on: WCN7850 hw2.0 PCI WLAN.HMT.1.0.c5-00481-QCAHMTSWPL_V1.0_V2.0_SILICONZ-3
Tested-on: QCN9274 hw2.0 PCI WLAN.WBE.1.4.1-00199-QCAHKSWPL_SILICONZ-1
Fixes: d889913205 ("wifi: ath12k: driver for Qualcomm Wi-Fi 7 devices")
Signed-off-by: Kang Yang <kang.yang@oss.qualcomm.com>
Reviewed-by: Baochen Qiang <baochen.qiang@oss.qualcomm.com>
Reviewed-by: Vasanthakumar Thiagarajan <vasanthakumar.thiagarajan@oss.qualcomm.com>
Link: https://patch.msgid.link/20250722095934.67-4-kang.yang@oss.qualcomm.com
Signed-off-by: Jeff Johnson <jeff.johnson@oss.qualcomm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 6b46e85129 ]
Current monitor mode will parse TLV HAL_PHYRX_OTHER_RECEIVE_INFO with
struct hal_phyrx_common_user_info.
Obviously, they do not match. The original intention here was to parse
HAL_PHYRX_COMMON_USER_INFO. So fix it by correctly parsing
HAL_PHYRX_COMMON_USER_INFO instead.
Also add LTF parsing and report to radiotap along with GI.
Tested-on: WCN7850 hw2.0 PCI WLAN.HMT.1.0.c5-00481-QCAHMTSWPL_V1.0_V2.0_SILICONZ-3
Tested-on: QCN9274 hw2.0 PCI WLAN.WBE.1.4.1-00199-QCAHKSWPL_SILICONZ-1
Fixes: d939919a36 ("wifi: ath12k: Add HAL_PHYRX_OTHER_RECEIVE_INFO TLV parsing support")
Signed-off-by: Kang Yang <kang.yang@oss.qualcomm.com>
Reviewed-by: Baochen Qiang <baochen.qiang@oss.qualcomm.com>
Reviewed-by: Vasanthakumar Thiagarajan <vasanthakumar.thiagarajan@oss.qualcomm.com>
Link: https://patch.msgid.link/20250722095934.67-3-kang.yang@oss.qualcomm.com
Signed-off-by: Jeff Johnson <jeff.johnson@oss.qualcomm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit ea2b0af4c9 ]
In ath12k_mac_parse_tx_pwr_env(), for the non-PSD case num_pwr_levels is
limited by ATH12K_NUM_PWR_LEVELS which is 16:
if (tpc_info->num_pwr_levels > ATH12K_NUM_PWR_LEVELS)
tpc_info->num_pwr_levels = ATH12K_NUM_PWR_LEVELS;
Then it is used to iterate entries in local_non_psd->power[] and
reg_non_psd->power[]:
for (i = 0; i < tpc_info->num_pwr_levels; i++) {
tpc_info->tpe[i] = min(local_non_psd->power[i],
reg_non_psd->power[i]) / 2;
Since the two array are of size 5, Smatch warns:
drivers/net/wireless/ath/ath12k/mac.c:9812
ath12k_mac_parse_tx_pwr_env() error: buffer overflow 'local_non_psd->power' 5 <= 15
drivers/net/wireless/ath/ath12k/mac.c:9812
ath12k_mac_parse_tx_pwr_env() error: buffer overflow 'reg_non_psd->power' 5 <= 15
This is a false positive as there is already implicit limitation:
tpc_info->num_pwr_levels = max(local_non_psd->count,
reg_non_psd->count);
meaning it won't exceed 5.
However, to make robot happy, add explicit limit there.
Also add the same to the PSD case, although no warning due to
ATH12K_NUM_PWR_LEVELS equals IEEE80211_TPE_PSD_ENTRIES_320MHZ.
Tested-on: WCN7850 hw2.0 PCI WLAN.HMT.1.1.c5-00284-QCAHMTSWPL_V1.0_V2.0_SILICONZ-1
Fixes: cccbb9d0dd ("wifi: ath12k: add parse of transmit power envelope element")
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Closes: https://lore.kernel.org/r/202505180703.Kr9OfQRP-lkp@intel.com/
Signed-off-by: Baochen Qiang <quic_bqiang@quicinc.com>
Reviewed-by: Vasanthakumar Thiagarajan <vasanthakumar.thiagarajan@oss.qualcomm.com>
Link: https://patch.msgid.link/20250804-ath12k-fix-smatch-warning-on-6g-vlp-v1-2-56f1e54152ab@oss.qualcomm.com
Signed-off-by: Jeff Johnson <jeff.johnson@oss.qualcomm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit bba2f9faf4 ]
Currently, at the end of ath12k_mac_fill_reg_tpc_info(), the
reg_tpc_info struct is populated, including the following:
reg_tpc_info->is_psd_power = is_psd_power;
reg_tpc_info->eirp_power = eirp_power;
Kernel test robot complains on uninitialized symbol:
drivers/net/wireless/ath/ath12k/mac.c:10069
ath12k_mac_fill_reg_tpc_info() error: uninitialized symbol 'eirp_power'
This is because there are some code paths that never set eirp_power, so
the assignment of reg_tpc_info->eirp_power can come from an
uninitialized variable. Functionally this is OK since the eirp_power
only has meaning when is_psd_power is true, and all code paths which set
is_psd_power to true also set eirp_power. However, to keep the robot
happy, always initialize eirp_power before use.
Tested-on: WCN7850 hw2.0 PCI WLAN.HMT.1.1.c5-00284-QCAHMTSWPL_V1.0_V2.0_SILICONZ-1
Fixes: aeda163bb0 ("wifi: ath12k: fill parameters for vdev set TPC power WMI command")
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Closes: https://lore.kernel.org/r/202505180927.tbNWr3vE-lkp@intel.com/
Signed-off-by: Baochen Qiang <baochen.qiang@oss.qualcomm.com>
Reviewed-by: Vasanthakumar Thiagarajan <vasanthakumar.thiagarajan@oss.qualcomm.com>
Link: https://patch.msgid.link/20250804-ath12k-fix-smatch-warning-on-6g-vlp-v1-1-56f1e54152ab@oss.qualcomm.com
Signed-off-by: Jeff Johnson <jeff.johnson@oss.qualcomm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 62a7b3bbb6 ]
In ipc4_ssp_dai_config_pcm_params_match() when comparing params_channels()
against hw_config->tdm_slots the comparison should be a <= not a ==.
The number of TDM slots must be enough for the number of required channels.
But it can be greater. There are various reason why a I2S/TDM link has more
TDM slots than a particular audio stream needs.
The original comparison would fail on systems that had more TDM slots.
Signed-off-by: Richard Fitzgerald <rf@opensource.cirrus.com>
Fixes: 8a07944a77 ("ASoC: SOF: ipc4-pcm: Look for best matching hw_config for SSP")
Link: https://patch.msgid.link/20250819160525.423416-1-rf@opensource.cirrus.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 200651b9b8 ]
Currently, if the next-hop netdevice does not support ARP resolution,
the destination MAC address is silently set to zero without reporting
an error. This leads to incorrect behavior and may result in packet
transmission failures.
Fix this by deferring MAC resolution to the IP stack via neighbour
lookup, allowing proper resolution or error reporting as appropriate.
Fixes: 7025fcd36b ("IB: address translation to map IP toIB addresses (GIDs)")
Signed-off-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Vlad Dumitrescu <vdumitrescu@nvidia.com>
Signed-off-by: Edward Srouji <edwards@nvidia.com>
Link: https://patch.msgid.link/20250916111103.84069-3-edwards@nvidia.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 08fa726e66 ]
This reverts commit 28a76fcc4c.
No actual HW bugs are known where Endpoint Context shows Running state
but Stop Endpoint fails repeatedly with Context State Error and leaves
the endpoint state unchanged. Stop Endpoint retries on Running EPs have
been performed since early 2021 with no such issues reported so far.
Trying to handle this hypothetical case brings a more realistic danger:
if Stop Endpoint fails on an endpoint which hasn't yet started after a
doorbell ring and enough latency occurs before this completion event is
handled, the driver may time out and begin removing cancelled TDs from
a running endpoint, even though one more retry would stop it reliably.
Such high latency is rare but not impossible, and removing TDs from a
running endpoint can cause more damage than not giving back a cancelled
URB (which wasn't happening anyway). So err on the side of caution and
revert to the old policy of always retrying if the EP appears running.
[Remove stable tag as we are dealing with theoretical cases -Mathias]
Fixes: 28a76fcc4c ("usb: xhci: Avoid Stop Endpoint retry loop if the endpoint seems Running")
Signed-off-by: Michal Pecio <michal.pecio@gmail.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Link: https://lore.kernel.org/r/20250917210726.97100-2-mathias.nyman@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 0b0e4d51c6 ]
smc_vlan_by_tcpsk() fetches sk_dst_get(sk)->dev before RTNL and
passes it to netdev_walk_all_lower_dev(), which is illegal.
Also, smc_vlan_by_tcpsk_walk() does not require RTNL at all.
Let's use __sk_dst_get(), dst_dev_rcu(), and
netdev_walk_all_lower_dev_rcu().
Note that the returned value of smc_vlan_by_tcpsk() is not used
in the caller.
Fixes: 0cfdd8f92c ("smc: connection and link group creation")
Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20250916214758.650211-5-kuniyu@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 235f81045c ]
smc_clc_prfx_match() is called from smc_listen_work() and
not under RCU nor RTNL.
Using sk_dst_get(sk)->dev could trigger UAF.
Let's use __sk_dst_get() and dst_dev_rcu().
Note that the returned value of smc_clc_prfx_match() is not
used in the caller.
Fixes: a046d57da1 ("smc: CLC handshake (incl. preparation steps)")
Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20250916214758.650211-4-kuniyu@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 935d783e5d ]
smc_clc_prfx_set() is called during connect() and not under RCU
nor RTNL.
Using sk_dst_get(sk)->dev could trigger UAF.
Let's use __sk_dst_get() and dev_dst_rcu() under rcu_read_lock()
after kernel_getsockname().
Note that the returned value of smc_clc_prfx_set() is not used
in the caller.
While at it, we change the 1st arg of smc_clc_prfx_set[46]_rcu()
not to touch dst there.
Fixes: a046d57da1 ("smc: CLC handshake (incl. preparation steps)")
Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20250916214758.650211-3-kuniyu@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 75d5546f60 ]
The handling for variable-length ioctl commands in hidraw_ioctl() is
rather complex and the check for the data direction is incomplete.
Simplify this code by factoring out the various ioctls grouped by dir
and size, and using a switch() statement with the size masked out, to
ensure the rest of the command is correctly matched.
Fixes: 9188e79ec3 ("HID: add phys and name ioctls to hidraw")
Reported-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>
Signed-off-by: Jiri Kosina <jkosina@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 9877c004e9 ]
Change the error code EAGAIN to -EAGAIN in qla_nvme_xmt_ls_rsp() to
align with qla2x00_start_sp() returning negative error codes or
QLA_SUCCESS, preventing logical errors.
Fixes: 875386b988 ("scsi: qla2xxx: Add Unsolicited LS Request and Response Support for NVMe")
Signed-off-by: Qianfeng Rong <rongqianfeng@vivo.com>
Message-ID: <20250905075446.381139-4-rongqianfeng@vivo.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 1f037e3acd ]
Change the error code EAGAIN to -EAGAIN in START_SP_W_RETRIES() to align
with qla2x00_start_sp() returning negative error codes or QLA_SUCCESS,
preventing logical errors. Additionally, the '_rval' variable should
store negative error codes to conform to Linux kernel error code
conventions.
Fixes: 9803fb5d27 ("scsi: qla2xxx: Fix task management cmd failure")
Signed-off-by: Qianfeng Rong <rongqianfeng@vivo.com>
Message-ID: <20250905075446.381139-3-rongqianfeng@vivo.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 066b8f3fa8 ]
Change the error code EAGAIN to -EAGAIN in qla24xx_sadb_update() and
qla_edif_process_els() to align with qla2x00_start_sp() returning
negative error codes or QLA_SUCCESS, preventing logical errors.
Fixes: 0b3f3143d4 ("scsi: qla2xxx: edif: Add retry for ELS passthrough")
Signed-off-by: Qianfeng Rong <rongqianfeng@vivo.com>
Message-ID: <20250905075446.381139-2-rongqianfeng@vivo.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit d1a599a813 ]
There appears to be a cut-n-paste error with the incorrect field
ndr_desc->numa_node being reported for the target node. Fix this by
using ndr_desc->target_node instead.
Fixes: f060db9937 ("ACPI: NFIT: Use fallback node id when numa info in NFIT table is incorrect")
Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 9fc4a3da9a ]
snd_pcm_group_lock_irq() acquires a spinlock_t and disables interrupts
via spin_lock_irq(). This also implicitly disables the handling of
softirqs such as TIMER_SOFTIRQ.
On PREEMPT_RT softirqs are preemptible and spin_lock_irq() does not
disable them. That means a timer can be invoked during spin_lock_irq()
on the same CPU. Due to synchronisations reasons local_bh_disable() has
a per-CPU lock named softirq_ctrl.lock which synchronizes individual
softirq against each other.
syz-bot managed to trigger a lockdep report where softirq_ctrl.lock is
acquired in hrtimer_cancel() in addition to hrtimer_run_softirq(). This
is a possible deadlock.
The softirq_ctrl.lock can not be made part of spin_lock_irq() as this
would lead to too much synchronisation against individual threads on the
system. To avoid the possible deadlock, softirqs must be manually
disabled before the lock is acquired.
Disable softirqs before the lock is acquired on PREEMPT_RT.
Reported-by: syzbot+10b4363fb0f46527f3f3@syzkaller.appspotmail.com
Fixes: d2d6422f8b ("x86: Allow to enable PREEMPT_RT.")
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit c2f7c32b25 ]
f2fs_zero_post_eof_page() may cuase more overhead due to invalidate_lock
and page lookup, change as below to mitigate its overhead:
- check new_size before grabbing invalidate_lock
- lookup and invalidate pages only in range of [old_size, new_size]
Fixes: ba8dac350f ("f2fs: fix to zero post-eof page")
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit d625a2b08c ]
It reports a bug from device w/ zufs:
F2FS-fs (dm-64): Inconsistent segment (173822) type [1, 0] in SSA and SIT
F2FS-fs (dm-64): Stopped filesystem due to reason: 4
Thread A Thread B
- f2fs_expand_inode_data
- f2fs_allocate_pinning_section
- f2fs_gc_range
- do_garbage_collect w/ segno #x
- writepage
- f2fs_allocate_data_block
- new_curseg
- allocate segno #x
The root cause is: fallocate on pinning file may race w/ block allocation
as above, result in do_garbage_collect() from fallocate() may migrate
segment which is just allocated by a log, the log will update segment type
in its in-memory structure, however GC will get segment type from on-disk
SSA block, once segment type changes by log, we can detect such
inconsistency, then shutdown filesystem.
In this case, on-disk SSA shows type of segno #173822 is 1 (SUM_TYPE_NODE),
however segno #173822 was just allocated as data type segment, so in-memory
SIT shows type of segno #173822 is 0 (SUM_TYPE_DATA).
Change as below to fix this issue:
- check whether current section is empty before gc
- add sanity checks on do_garbage_collect() to avoid any race case, result
in migrating segment used by log.
- btw, it fixes misc issue in printed logs: "SSA and SIT" -> "SIT and SSA".
Fixes: 9703d69d9d ("f2fs: support file pinning for zoned devices")
Cc: Daeho Jeong <daehojeong@google.com>
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit ae5c2bee16 ]
Rename extra_dw to extra_bytes and document what it's for.
The value is already used as if it were bytes in vcn_v4_0.c
and in amdgpu_ring_init. Just adjust the dword count in
jpeg_v1_0.c so that it becomes a byte count.
v2:
Rename extra_dw to extra_bytes as discussed during review.
Fixes: c8c1a1d2ef ("drm/amdgpu: define and add extra dword for jpeg ring")
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit e6a43aeb71 ]
Currently, the srcu_gp_start_if_needed() is always be invoked in
preempt disable's critical section, this commit therefore remove
redundant preempt_disable/enable() in srcu_gp_start_if_needed()
and adds a call to lockdep_assert_preemption_disabled() in order
to enable lockdep to diagnose mistaken invocations of this function
from preempts-enabled code.
Fixes: 65b4a59557 ("srcu: Make Tiny SRCU explicitly disable preemption")
Signed-off-by: Zqiang <qiang.zhang@linux.dev>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 2b660ee10a ]
In vendor driver, size of group cal and dpd cal for mt7981 includes 6G
although the chip doesn't support it.
mt76 doesn't take this into account which results in reading from the
incorrect offset.
For devices with precal, this would lead to lower bitrate.
Fix this by aligning groupcal size with vendor driver and switch to
freq_list_v2 in mt7915_dpd_freq_idx in order to get the correct offset.
Below are iwinfo of the test device with two clients connected
(iPhone 16, Intel AX210).
Before :
Mode: Master Channel: 36 (5.180 GHz) HT Mode: HE80
Center Channel 1: 42 2: unknown
Tx-Power: 23 dBm Link Quality: 43/70
Signal: -67 dBm Noise: -92 dBm
Bit Rate: 612.4 MBit/s
Encryption: WPA3 SAE (CCMP)
Type: nl80211 HW Mode(s): 802.11ac/ax/n
Hardware: embedded [MediaTek MT7981]
After:
Mode: Master Channel: 36 (5.180 GHz) HT Mode: HE80
Center Channel 1: 42 2: unknown
Tx-Power: 23 dBm Link Quality: 43/70
Signal: -67 dBm Noise: -92 dBm
Bit Rate: 900.6 MBit/s
Encryption: WPA3 SAE (CCMP)
Type: nl80211 HW Mode(s): 802.11ac/ax/n
Hardware: embedded [MediaTek MT7981]
Tested-on: mt7981 20240823
Fixes: 19a954edec ("wifi: mt76: mt7915: add mt7986, mt7916 and mt7981 pre-calibration")
Signed-off-by: Zhi-Jun You <hujy652@gmail.com>
Link: https://patch.msgid.link/20250909064824.16847-1-hujy652@gmail.com
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f272210b28 ]
The doorbell feature temporarily overrides the inbound translation to point
to the address stored in epf_test->db_bar.phys_addr, i.e., it calls
set_bar() twice without ever calling clear_bar(), as calling clear_bar()
would clear the BAR's PCI address assigned by the host.
Thus, when disabling the doorbell, restore the inbound translation to point
to the memory allocated for the BAR.
Without this, running the PCI endpoint kselftest doorbell test case more
than once would fail.
Fixes: eff0c286aa ("PCI: endpoint: pci-epf-test: Add doorbell test support")
Signed-off-by: Niklas Cassel <cassel@kernel.org>
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Link: https://patch.msgid.link/20250908161942.534799-2-cassel@kernel.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 7dfd80f70e ]
When the watchdog gets enabled with this driver, it leaves enough time
for the core watchdog subsystem to start pinging it. But when the
watchdog is already started by hardware or by the boot loader, little
time remains before it fires and it happens that the core watchdog
subsystem doesn't have time to start pinging it.
Until commit 19ce9490aa ("watchdog: mpc8xxx: use the core worker
function") pinging was managed by the driver itself and the watchdog
was immediately pinged by setting the timer expiry to 0.
So restore similar behaviour by pinging it when enabling it so that
if it was already enabled the watchdog timer counter is reloaded.
Fixes: 19ce9490aa ("watchdog: mpc8xxx: use the core worker function")
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 944b6b216c ]
KCSAN reported a data-race on the `ipvs->enable` flag, which is
written in the control path and read concurrently from many other
contexts.
Following a suggestion by Julian, this patch fixes the race by
converting all accesses to use `WRITE_ONCE()/READ_ONCE()`.
This lightweight approach ensures atomic access and acts as a
compiler barrier, preventing unsafe optimizations where the flag
is checked in loops (e.g., in ip_vs_est.c).
Additionally, the `enable` checks in the fast-path hooks
(`ip_vs_in_hook`, `ip_vs_out_hook`, `ip_vs_forward_icmp`) are
removed. These are unnecessary since commit 857ca89711
("ipvs: register hooks only with services"). The `enable=0`
condition they check for can only occur in two rare and non-fatal
scenarios: 1) after hooks are registered but before the flag is set,
and 2) after hooks are unregistered on cleanup_net. In the worst
case, a single packet might be mishandled (e.g., dropped), which
does not lead to a system crash or data corruption. Adding a check
in the performance-critical fast-path to handle this harmless
condition is not a worthwhile trade-off.
Fixes: 857ca89711 ("ipvs: register hooks only with services")
Reported-by: syzbot+1651b5234028c294c339@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=1651b5234028c294c339
Suggested-by: Julian Anastasov <ja@ssi.bg>
Link: https://lore.kernel.org/lvs-devel/2189fc62-e51e-78c9-d1de-d35b8e3657e3@ssi.bg/
Signed-off-by: Zhang Tengfei <zhtfdev@gmail.com>
Acked-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit ba941796d7 ]
Since the ahash_region() macro was redefined to calculate the region
index solely from HTABLE_REGION_BITS, the htable_bits parameter became
unused.
Remove the unused htable_bits argument and its call sites, simplifying
the code without changing semantics.
Fixes: 8478a729c0 ("netfilter: ipset: fix region locking in hash types")
Signed-off-by: Zhen Ni <zhen.ni@easystack.cn>
Reviewed-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 372fdb5c75 ]
When using KSM (Key Scatter-gather Memory) access mode, the HW requires
the IOVA to be aligned to the selected page size.
Without this alignment, the HW may not function correctly.
Currently, mlx5_umem_mkc_find_best_pgsz() does not filter out page sizes
that would result in misaligned IOVAs for KSM mode. This can lead to
selecting page sizes that are incompatible with the given IOVA.
Fix this by filtering the page size bitmap when in KSM mode, keeping
only page sizes to which the IOVA is aligned to.
Fixes: fcfb03597b ("RDMA/mlx5: Align mkc page size capability check to PRM")
Signed-off-by: Edward Srouji <edwards@nvidia.com>
Link: https://patch.msgid.link/20250824144839.154717-1-edwards@nvidia.com
Reviewed-by: Michael Guralnik <michaelgur@nvidia.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit e1c4350327 ]
The comparison function cmp_loc_by_count() used for sorting stack trace
locations in debugfs currently returns -1 if a->count > b->count and 1
otherwise. This breaks the antisymmetry property required by sort(),
because when two counts are equal, both cmp(a, b) and cmp(b, a) return
1.
This can lead to undefined or incorrect ordering results. Fix it by
updating the comparison logic to explicitly handle the case when counts
are equal, and use cmp_int() to ensure the comparison function adheres
to the required mathematical properties of antisymmetry.
Fixes: 553c0369b3 ("mm/slub: sort debugfs output by frequency of stack traces")
Reviewed-by: Joshua Hahn <joshua.hahnjy@gmail.com>
Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com>
Reviewed-by: Harry Yoo <harry.yoo@oracle.com>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 0f85406bf8 ]
There is an issue with the handling of negative channel scales
in iio_convert_raw_to_processed_unlocked() when the channel-scale
is of the IIO_VAL_INT_PLUS_[MICRO|NANO] type:
Things work for channel-scale values > -1.0 and < 0.0 because of
the use of signed values in:
*processed += div_s64(raw64 * (s64)scale_val2 * scale, 1000000LL);
Things will break however for scale values < -1.0. Lets for example say
that raw = 2, (caller-provided)scale = 10 and (channel)scale_val = -1.5.
The result should then be 2 * 10 * -1.5 = -30.
channel-scale = -1.5 means scale_val = -1 and scale_val2 = 500000,
now lets see what gets stored in processed:
1. *processed = raw64 * scale_val * scale;
2. *processed += raw64 * scale_val2 * scale / 1000000LL;
1. Sets processed to 2 * -1 * 10 = -20
2. Adds 2 * 500000 * 10 / 1000000 = 10 to processed
And the end result is processed = -20 + 10 = -10, which is not correct.
Fix this by always using the abs value of both scale_val and scale_val2
and if either is negative multiply the end-result by -1.
Note there seems to be an unwritten rule about negative
IIO_VAL_INT_PLUS_[MICRO|NANO] values that:
i. values > -1.0 and < 0.0 are written as val=0 val2=-xxx
ii. values <= -1.0 are written as val=-xxx val2=xxx
But iio_format_value() will also correctly display a third option:
iii. values <= -1.0 written as val=-xxx val2=-xxx
Since iio_format_value() uses abs(val) when val2 < 0.
This fix also makes iio_convert_raw_to_processed() properly handle
channel-scales using this third option.
Fixes: 48e44ce0f8 ("iio:inkern: Add function to read the processed value")
Cc: Matteo Martelli <matteomartelli3@gmail.com>
Reviewed-by: Andy Shevchenko <andy@kernel.org>
Signed-off-by: Hans de Goede <hansg@kernel.org>
Link: https://patch.msgid.link/20250831104825.15097-2-hansg@kernel.org
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 0dc7117da8 ]
Index allocation requires at least one bit in the $BITMAP attribute to
track usage of index entries. If the bitmap is empty while index blocks
are already present, this reflects on-disk corruption.
syzbot triggered this condition using a malformed NTFS image. During a
rename() operation involving a long filename (which spans multiple
index entries), the empty bitmap allowed the name to be added without
valid tracking. Subsequent deletion of the original entry failed with
-ENOENT, due to unexpected index state.
Reject such cases by verifying that the bitmap is not empty when index
blocks exist.
Reported-by: syzbot+b0373017f711c06ada64@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=b0373017f711c06ada64
Fixes: d99208b919 ("fs/ntfs3: cancle set bad inode after removing name fails")
Tested-by: syzbot+b0373017f711c06ada64@syzkaller.appspotmail.com
Signed-off-by: Moon Hee Lee <moonhee.lee.ca@gmail.com>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 736fc7bf5f ]
The MFT record relative to the file being opened contains its runlist,
an array containing information about the file's location on the physical
disk. Analysis of all Call Stack paths showed that the values of the
runlist array, from which LCNs are calculated, are not validated before
run_unpack function.
The run_unpack function decodes the compressed runlist data format
from MFT attributes (for example, $DATA), converting them into a runs_tree
structure, which describes the mapping of virtual clusters (VCN) to
logical clusters (LCN). The NTFS3 subsystem also has a shortcut for
deleting files from MFT records - in this case, the RUN_DEALLOCATE
command is sent to the run_unpack input, and the function logic
provides that all data transferred to the runlist about file or
directory is deleted without creating a runs_tree structure.
Substituting the runlist in the $DATA attribute of the MFT record for an
arbitrary file can lead either to access to arbitrary data on the disk
bypassing access checks to them (since the inode access check
occurs above) or to destruction of arbitrary data on the disk.
Add overflow check for addition operation.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Fixes: 4342306f0f ("fs/ntfs3: Add file operations and implementation")
Signed-off-by: Vitaly Grigoryev <Vitaly.Grigoryev@kaspersky.com>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit eebccbfea4 ]
Currently, sta_set_sinfo() fails to populate link-level station info
when sinfo->valid_links is initially 0 and sta->sta.valid_links has
bits set for links other than link 0. This typically occurs when
association happens on a non-zero link or link 0 deleted dynamically.
In such cases, the for_each_valid_link(sinfo, link_id) loop only
executes for link 0 and terminates early, since sinfo->valid_links
remains 0. As a result, only MLD-level information is reported to
userspace.
Hence to fix, initialize sinfo->valid_links with sta->sta.valid_links
before entering the loop to ensure loop executes for each valid link.
During iteration, mask out invalid links from sinfo->valid_links if
any of sta->link[link_id], sdata->link[link_id], or sinfo->links[link_id]
are not present, to report only valid link information.
Fixes: 505991fba9 ("wifi: mac80211: extend support to fill link level sinfo structure")
Signed-off-by: Sarika Sharma <quic_sarishar@quicinc.com>
Link: https://patch.msgid.link/20250904104054.790321-1-quic_sarishar@quicinc.com
[clarify comment]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 4336efb59e ]
When an invalid value is passed via quirk option, currently
bytcr_rt5640 driver just ignores and leaves as is, which may lead to
unepxected results like OOB access.
This patch adds the sanity check and corrects the input mapping to the
certain default value if an invalid value is passed.
Fixes: 64484ccee7 ("ASoC: Intel: bytcr_rt5651: Set card long_name based on quirks")
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Message-ID: <20250902171826.27329-4-tiwai@suse.de>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit fba404e4b4 ]
When an invalid value is passed via quirk option, currently
bytcr_rt5640 driver only shows an error message but leaves as is.
This may lead to unepxected results like OOB access.
This patch corrects the input mapping to the certain default value if
an invalid value is passed.
Fixes: 063422ca2a ("ASoC: Intel: bytcr_rt5640: Set card long_name based on quirks")
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Message-ID: <20250902171826.27329-3-tiwai@suse.de>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b20eb0e8de ]
When an invalid value is passed via quirk option, currently
bytcht_es8316 driver just ignores and leaves as is, which may lead to
unepxected results like OOB access.
This patch adds the sanity check and corrects the input mapping to the
certain default value if an invalid value is passed.
Fixes: 249d2fc9e5 ("ASoC: Intel: bytcht_es8316: Set card long_name based on quirks")
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Message-ID: <20250902171826.27329-2-tiwai@suse.de>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit c20edbacc0 ]
No idea what the current barrier position was meant for. At that point,
nothing is read from the descriptor, only the pointer to the actual one
is fetched.
The correct barrier usage here is after the generation check, so that
only the first qword is read if the descriptor is not yet ready and we
need to stop polling. Debatable on coherent DMA as the Rx descriptor
size is <= cacheline size, but anyway, the current barrier position
only makes the codegen worse.
Fixes: 3a8845af66 ("idpf: add RX splitq napi poll support")
Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Tested-by: Ramu R <ramu.r@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b0531cdba5 ]
Similar to previous commit 2a934fdb01 ("media: v4l2-dev: fix error
handling in __video_register_device()"), the release hook should be set
before device_register(). Otherwise, when device_register() return error
and put_device() try to callback the release function, the below warning
may happen.
------------[ cut here ]------------
WARNING: CPU: 1 PID: 4760 at drivers/base/core.c:2567 device_release+0x1bd/0x240 drivers/base/core.c:2567
Modules linked in:
CPU: 1 UID: 0 PID: 4760 Comm: syz.4.914 Not tainted 6.17.0-rc3+ #1 NONE
RIP: 0010:device_release+0x1bd/0x240 drivers/base/core.c:2567
Call Trace:
<TASK>
kobject_cleanup+0x136/0x410 lib/kobject.c:689
kobject_release lib/kobject.c:720 [inline]
kref_put include/linux/kref.h:65 [inline]
kobject_put+0xe9/0x130 lib/kobject.c:737
put_device+0x24/0x30 drivers/base/core.c:3797
pps_register_cdev+0x2da/0x370 drivers/pps/pps.c:402
pps_register_source+0x2f6/0x480 drivers/pps/kapi.c:108
pps_tty_open+0x190/0x310 drivers/pps/clients/pps-ldisc.c:57
tty_ldisc_open+0xa7/0x120 drivers/tty/tty_ldisc.c:432
tty_set_ldisc+0x333/0x780 drivers/tty/tty_ldisc.c:563
tiocsetd drivers/tty/tty_io.c:2429 [inline]
tty_ioctl+0x5d1/0x1700 drivers/tty/tty_io.c:2728
vfs_ioctl fs/ioctl.c:51 [inline]
__do_sys_ioctl fs/ioctl.c:598 [inline]
__se_sys_ioctl fs/ioctl.c:584 [inline]
__x64_sys_ioctl+0x194/0x210 fs/ioctl.c:584
do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
do_syscall_64+0x5f/0x2a0 arch/x86/entry/syscall_64.c:94
entry_SYSCALL_64_after_hwframe+0x76/0x7e
</TASK>
Before commit c79a39dc8d ("pps: Fix a use-after-free"),
pps_register_cdev() call device_create() to create pps->dev, which will
init dev->release to device_create_release(). Now the comment is outdated,
just remove it.
Thanks for the reminder from Calvin Owens, 'kfree_pps' should be removed
in pps_register_source() to avoid a double free in the failure case.
Link: https://lore.kernel.org/all/20250827065010.3208525-1-wangliang74@huawei.com/
Fixes: c79a39dc8d ("pps: Fix a use-after-free")
Signed-off-by: Wang Liang <wangliang74@huawei.com>
Reviewed-By: Calvin Owens <calvin@wbinvd.org>
Link: https://lore.kernel.org/r/20250830075023.3498174-1-wangliang74@huawei.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 3cf0b3c243 ]
Currently gsm_queue() processes incoming frames and when opening
a DLC channel it calls gsm_dlci_open() which calls gsm_modem_update().
If basic mode is used it calls gsm_modem_upd_via_msc() and it
cannot block the input queue by waiting the response to come
into the same input queue.
Instead allow sending Modem Status Command without waiting for remote
end to respond. Define a new function gsm_modem_send_initial_msc()
for this purpose. As MSC is only valid for basic encoding, it does
not do anything for advanced or when convergence layer type 2 is used.
Fixes: 4847380250 ("tty: n_gsm: fix missing update of modem controls after DLCI open")
Signed-off-by: Seppo Takalo <seppo.takalo@nordicsemi.no>
Link: https://lore.kernel.org/r/20250827123221.1148666-1-seppo.takalo@nordicsemi.no
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit e271cc0d25 ]
Once the use_os_string flag is set to true for some functions
(e.g. adb/mtp) which need to response the OS string, and then
if we re-bind the ConfigFS gadget to use the other functions
(e.g. hid) which should not to response the OS string, however,
because the use_os_string flag is still true, so the usb gadget
response the OS string descriptor incorrectly, this can cause
the USB device to be unrecognizable on the Windows system.
An example of this as follows:
echo 1 > os_desc/use
ln -s functions/ffs.adb configs/b.1/function0
start adbd
echo "<udc device>" > UDC #succeed
stop adbd
rm configs/b.1/function0
echo 0 > os_desc/use
ln -s functions/hid.gs0 configs/b.1/function0
echo "<udc device>" > UDC #fail to connect on Windows
This patch sets the use_os_string flag to false at bind if
the functions not support OS Descriptors.
Signed-off-by: William Wu <william.wu@rock-chips.com>
Fixes: 87213d388e ("usb: gadget: configfs: OS String support")
Link: https://lore.kernel.org/r/1755833769-25434-1-git-send-email-william.wu@rock-chips.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit cbda64f3f5 ]
Use negative error code -EINVAL instead of positive EINVAL in the default
case of svm_ioctl() to conform to Linux kernel error code conventions.
Fixes: 42de677f79 ("drm/amdkfd: register svm range")
Signed-off-by: Qianfeng Rong <rongqianfeng@vivo.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 37bf0f4e39 ]
Add lane equalization setting for 8.0 GT/s and 32.0 GT/s to enhance link
stability and avoid AER Correctable Errors reported on some platforms
(eg. SA8775P).
8.0 GT/s, 16.0 GT/s and 32.0 GT/s require the same equalization setting.
This setting is programmed into a group of shadow registers, which can be
switched to configure equalization for different speeds by writing 00b,
01b and 10b to `RATE_SHADOW_SEL`.
Hence, program equalization registers in a loop using link speed as index,
so that equalization setting can be programmed for 8.0 GT/s, 16.0 GT/s
and 32.0 GT/s.
Fixes: 489f14be0e ("arm64: dts: qcom: sa8775p: Add pcie0 and pcie1 nodes")
Co-developed-by: Qiang Yu <qiang.yu@oss.qualcomm.com>
Signed-off-by: Qiang Yu <qiang.yu@oss.qualcomm.com>
Signed-off-by: Ziyue Zhang <ziyue.zhang@oss.qualcomm.com>
[mani: wrapped the warning to fit 100 columns, used post-increment for loop]
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Link: https://patch.msgid.link/20250904065225.1762793-2-ziyue.zhang@oss.qualcomm.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 36b75dcb1e ]
Commit 78a7a126dc ("wifi: mac80211: validate SCAN_FLAG_AP in scan request
during MLO") introduced a check that rejects scan requests if any link is
already beaconing. This works fine when all links share the same radio, but
breaks down in multi-radio setups.
Consider a scenario where a 2.4 GHz link is beaconing and a scan is
requested on a 5 GHz link, each backed by a different physical radio. The
current logic still blocks the scan, even though it should be allowed. As a
result, interface bring-up fails unnecessarily in valid configurations.
Fix this by checking whether the scan is being requested on the same
underlying radio as the beaconing link. Only reject the scan if it targets
a link that is already beaconing and the NL80211_FEATURE_AP_SCAN is not
set. This ensures correct behavior in multi-radio environments and avoids
false rejections.
Fixes: 78a7a126dc ("wifi: mac80211: validate SCAN_FLAG_AP in scan request during MLO")
Signed-off-by: Aditya Kumar Singh <aditya.kumar.singh@oss.qualcomm.com>
Link: https://patch.msgid.link/20250812-fix_scan_ap_flag_requirement_during_mlo-v4-3-383ffb6da213@oss.qualcomm.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit e1a8805e5d ]
Fix incorrect argument order in devm_kcalloc() when allocating port->phys.
The original call used sizeof(phy) as the number of elements and
port->lanes as the element size, which is reversed. While this happens to
produce the correct total allocation size with current pointer size and
lane counts, the argument order is wrong.
Fixes: 6fe7c187e0 ("PCI: tegra: Support per-lane PHYs")
Signed-off-by: Alok Tiwari <alok.a.tiwari@oracle.com>
[mani: added Fixes tag]
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://patch.msgid.link/20250819150436.3105973-1-alok.a.tiwari@oracle.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 7009e3af04 ]
Port of commit 227545b9a0 ("drm/radeon/dpm: Disable sclk
switching on Oland when two 4K 60Hz monitors are connected")
This is an ad-hoc DPM fix, necessary because we don't have
proper bandwidth calculation for DCE 6.
We define "high pixelclock" for SI as higher than necessary
for 4K 30Hz. For example, 4K 60Hz and 1080p 144Hz fall into
this category.
When two high pixel clock displays are connected to Oland,
additionally disable shader clock switching, which results in
a higher voltage, thereby addressing some visible flickering.
v2:
Add more comments.
v3:
Split into two commits for easier review.
Fixes: 841686df9f ("drm/amdgpu: add SI DPM support (v4)")
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit ed3803533c ]
According to pp_pm_compute_clocks the non-DC display code
has "issues with mclk switching with refresh rates over 120 hz".
The workaround is to disable MCLK switching in this case.
Do the same for legacy DPM.
Fixes: 6ddbd37f10 ("drm/amd/pm: optimize the amdgpu_pm_compute_clocks() implementations")
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 9003a07468 ]
Some parts of the code base expect that MCLK switching is turned
off when the vblank time is set to zero.
According to pp_pm_compute_clocks the non-DC code has issues
with MCLK switching with refresh rates over 120 Hz.
v3:
Add code comment to explain this better.
Add an if statement instead of changing the switch_limit.
Fixes: 841686df9f ("drm/amdgpu: add SI DPM support (v4)")
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit ce02513012 ]
Based on some comments in dm_pp_display_configuration
above the crtc_index and line_time fields, these values
are programmed to the SMC to work around an SMC hang
when it switches MCLK.
According to Alex, the Windows driver programs them to:
mclk_change_block_cp_min = 200 / line_time
mclk_change_block_cp_max = 100 / line_time
Let's use the same for the sake of consistency.
Previously we used the watermark values, but it seemed buggy
as the code was mixing up low/high and A/B watermarks, and
was not saving a low watermark value on DCE 6, so
mclk_change_block_cp_max would be always zero previously.
Split this change off from the previous si_upload_smc_data
to make it easier to bisect, in case it causes any issues.
Fixes: 841686df9f ("drm/amdgpu: add SI DPM support (v4)")
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit a43b2cec04 ]
The si_upload_smc_data function uses si_write_smc_soft_register
to set some register values in the SMC, and expects the result
to be PPSMC_Result_OK which is 1.
The PPSMC_Result_OK / PPSMC_Result_Failed values are used for
checking the result of a command sent to the SMC.
However, the si_write_smc_soft_register actually doesn't send
any commands to the SMC and returns zero on success,
so this check was incorrect.
Fix that by not checking the return value, just like other
calls to si_write_smc_soft_register.
v3:
Additionally, when no display is plugged in, there is no need
to restrict MCLK switching, so program the registers to zero.
Fixes: 841686df9f ("drm/amdgpu: add SI DPM support (v4)")
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 3a0c3a4035 ]
Always send PPSMC_MSG_DisableULV to the SMC, even if ULV mode
is unsupported, to make sure it is properly turned off.
v3:
Simplify si_disable_ulv further.
Always check the return value of amdgpu_si_send_msg_to_smc.
Fixes: 841686df9f ("drm/amdgpu: add SI DPM support (v4)")
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit c661219cd7 ]
Unlike later versions, UVD 3 has firmware validation.
For this to work, the UVD should be powered up correctly.
When DPM is enabled and the display clock is off,
the SMU may choose a power state which doesn't power
the UVD, which can result in failure to initialize UVD.
v2:
Add code comments to explain about the UVD power state
and how UVD clock is turned on/off.
Fixes: b38f3e80ec ("drm amdgpu: SI UVD v3_1 (v2)")
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 12d9a9dd9d ]
Ensure that etm_perf_add_symlink_sink() is only called for devices
that implement the alloc_buffer operation. This prevents invalid
symlink creation for dummy sinks that do not implement alloc_buffer.
Without this check, perf may attempt to use a dummy sink that lacks
alloc_buffer operationsu to initialise perf's ring buffer, leading
to runtime failures.
Fixes: 9d3ba0b6c0 ("Coresight: Add coresight dummy driver")
Signed-off-by: Yuanfang Zhang <quic_yuanfang@quicinc.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20250630-etm_perf_sink-v1-1-e4a7211f9ad7@quicinc.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 9228facb30 ]
The device interrupt vector 3 is an error interrupt for
physical function and a reserved interrupt for virtual function.
However, the driver has not registered the reserved interrupt for
virtual function. When allocating interrupts, the number of interrupts
is allocated based on powers of two, which includes this interrupt.
When the system enables GICv4 and the virtual function passthrough
to the virtual machine, releasing the interrupt in the driver
triggers a warning.
The WARNING report is:
WARNING: CPU: 62 PID: 14889 at arch/arm64/kvm/vgic/vgic-its.c:852 its_free_ite+0x94/0xb4
Therefore, register a reserved interrupt for VF and set the
IRQF_NO_AUTOEN flag to avoid that warning.
Fixes: 3536cc55ca ("crypto: hisilicon/qm - support get device irq information from hardware registers")
Signed-off-by: Weili Qian <qianweili@huawei.com>
Signed-off-by: Chenghai Huang <huangchenghai2@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 6a2c9164b5 ]
Function rate limiting is set through physical function driver.
Users configure by providing function information and rate limit values.
Before configuration, it is necessary to check whether the
provided function and PF belong to the same device.
Fixes: 22d7a6c39c ("crypto: hisilicon/qm - add pci bdf number check")
Signed-off-by: Zhushuai Yin <yinzhushuai@huawei.com>
Signed-off-by: Chenghai Huang <huangchenghai2@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 1f9128f121 ]
After enabling address prefetch, check the sva module status. If all
previous prefetch requests from the sva module are not completed, then
disable the address prefetch to ensure normal execution of new task
operations. After disabling address prefetch, check if all requests
from the sva module have been completed.
Fixes: a5c164b195 ("crypto: hisilicon/qm - support address prefetching")
Signed-off-by: Weili Qian <qianweili@huawei.com>
Signed-off-by: Chenghai Huang <huangchenghai2@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 0dcd21443d ]
When the device resumes from a suspended state, it will revert to its
initial state and requires re-enabling. Currently, the address prefetch
function is not re-enabled after device resuming. Move the address prefetch
enable to the initialization process. In this way, the address prefetch
can be enabled when the device resumes by calling the initialization
process.
Fixes: 607c191b37 ("crypto: hisilicon - support runtime PM for accelerator device")
Signed-off-by: Chenghai Huang <huangchenghai2@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit d4e0815104 ]
When configuring the high-performance mode register, there is no
need to verify whether the register has been successfully
enabled, as there is no possibility of a write failure for this
register.
Fixes: a9864bae18 ("crypto: hisilicon/zip - add zip comp high perf mode configuration")
Signed-off-by: Chenghai Huang <huangchenghai2@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b775ecf165 ]
Refactor icmpv6_xrlim_allow() and ip6_dst_hoplimit()
so that we acquire rcu_read_lock() a bit longer
to be able to use dst_dev_rcu() instead of dst_dev().
__ip6_rt_update_pmtu() and rt6_do_redirect can directly
use dst_dev_rcu() in sections already holding rcu_read_lock().
Small changes to use dst_dev_net_rcu() in
ip6_default_advmss(), ipv6_sock_ac_join(),
ip6_mc_find_dev() and ndisc_send_skb().
Fixes: 4a6ce2b6f2 ("net: introduce a new function dst_dev_put()")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://patch.msgid.link/20250828195823.3958522-3-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 05e75ac35e ]
People not very intimate with EFI may not know the meaning of the OVMF
acronym. Write it in full, to help users with making good decisions
when configuring their kernels.
Fixes: f393a76176 ("efi: add ovmf debug log driver")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Richard Lyu <richard.lyu@suse.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 24de3daf61 ]
Change the 'ret' variable from u32 to int to store -EINVAL. Storing the
negative error codes in unsigned type, doesn't cause an issue at runtime
but it's ugly as pants.
Additionally, assigning -EINVAL to u32 ret (i.e., u32 ret = -EINVAL) may
trigger a GCC warning when the -Wsign-conversion flag is enabled.
Fixes: aac243092b ("accel/amdxdna: Add command execution")
Signed-off-by: Qianfeng Rong <rongqianfeng@vivo.com>
Reviewed-by: Lizhi Hou <lizhi.hou@amd.com>
Signed-off-by: Lizhi Hou <lizhi.hou@amd.com>
Link: https://lore.kernel.org/r/20250828033917.113364-1-rongqianfeng@vivo.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 9c0442286f ]
The patch uses power state of VCN instances for requesting video
profile.
In idle worker of a vcn instance, when there is no outstanding
submisssion or fence, the instance is put to power gated state. When
all instances are powered off that means video profile is no longer
required. A request is made to turn off video profile.
A job submission starts with begin_use of ring, and at that time
vcn instance state is changed to power on. Subsequently a check is
made for active video profile, and if not active, a request is made.
Fixes: 3b669df92c ("drm/amdgpu/vcn: adjust workload profile handling")
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Sathishkumar S <sathishkumar.sundararaju@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 5954ad7d1a ]
Building with a reduced stack warning limit shows that delta_mjpeg_decode()
copies a giant structure to the stack each time but only uses three of
its members:
drivers/media/platform/st/sti/delta/delta-mjpeg-dec.c: In function 'delta_mjpeg_decode':
drivers/media/platform/st/sti/delta/delta-mjpeg-dec.c:427:1: error: the frame size of 1296 bytes is larger than 1280 bytes [-Werror=frame-larger-than=]
Open-code the passing of the structure members that are actually used here.
Fixes: 433ff5b4a2 ("[media] st-delta: add mjpeg support")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 4ef353d546 ]
Change the 'ret' variable from u16 to int to store negative error codes or
zero returned by lx_message_send_atomic().
Storing the negative error codes in unsigned type, doesn't cause an issue
at runtime but it's ugly as pants. Additionally, assigning negative error
codes to unsigned type may trigger a GCC warning when the -Wsign-conversion
flag is enabled.
No effect on runtime.
Fixes: 02bec49045 ("ALSA: lx6464es - driver for the digigram lx6464es interface")
Signed-off-by: Qianfeng Rong <rongqianfeng@vivo.com>
Link: https://patch.msgid.link/20250828081312.393148-1-rongqianfeng@vivo.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit ab1d8dda32 ]
x86 does not use CONFIG_GENERIC_MSI_IRQ, and trying to enable it anyway
results in a build failure:
In file included from include/linux/ssb/ssb.h:10,
from drivers/ssb/pcihost_wrapper.c:18:
include/linux/gpio/driver.h:41:33: error: field 'msiinfo' has incomplete type
41 | msi_alloc_info_t msiinfo;
| ^~~~~~~
In file included from include/linux/kvm_host.h:19,
from arch/x86/events/intel/core.c:17:
include/linux/msi.h:528:33: error: field 'alloc_info' has incomplete type
528 | msi_alloc_info_t alloc_info;
Change the driver to actually build without this symbol and remove the
incorrect 'select' statements.
Fixes: e8b18c1173 ("cdx: Fix missing GENERIC_MSI_IRQ on compile test")
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Reviewed-by: Nikhil Agarwal <nikhil.agarwal@amd.com>
Signed-off-by: Arnd Bergmann <arnd@kernel.org>
Signed-off-by: Nipun Gupta <nipun.gupta@amd.com>
Link: https://lore.kernel.org/r/20250826043852.2206008-1-nipun.gupta@amd.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 45df22935b ]
The qcom_pcie_parse_ports() function currently iterates over all available
child nodes of the PCIe controller's device tree node. This includes
unrelated nodes such as OPP (Operating Performance Points) nodes, which do
not contain the expected 'reset' and 'phy' properties. As a result, parsing
fails and the driver falls back to the legacy method of parsing the
controller node directly. However, this fallback also fails when properties
are shifted to the Root Port node, leading to probe failure.
Fix this by restricting the parsing logic to only consider child nodes with
device_type = "pci", which is the expected and required property for PCIe
bridge nodes as per the pci-bus-common.yaml dtschema.
Fixes: a2fbecdbbb ("PCI: qcom: Add support for parsing the new Root Port binding")
Signed-off-by: Krishna Chaitanya Chundru <krishna.chundru@oss.qualcomm.com>
[mani: reworded subject and description]
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Link: https://patch.msgid.link/20250826-pakala-v3-3-721627bd5bb0@oss.qualcomm.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 6fd26f5085 ]
With the change in aee03ea7ff98 ("fuse: support large folios for
writethrough writes"), this old line for setting ap->descs[0].offset is
now obsolete and unneeded. This should have been removed as part of
aee03ea7ff98.
Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
Fixes: aee03ea7ff98 ("fuse: support large folios for writethrough writes")
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit d3fee10e40 ]
Starting with commit dd26c1a23f ("PCI: rcar-host: Switch to
msi_create_parent_irq_domain()"), the MSI parent IRQ domain is NULL because
the object of type struct irq_domain_info passed to:
msi_create_parent_irq_domain() ->
irq_domain_instantiate()() ->
__irq_domain_instantiate()
has no reference to the parent IRQ domain. Using msi->domain->parent as an
argument for generic_handle_domain_irq() leads to below error:
"Unable to handle kernel NULL pointer dereference at virtual address"
This error was identified while switching the upcoming RZ/G3S PCIe host
controller driver to msi_create_parent_irq_domain() (which was using a
similar pattern to handle MSIs (see link section)), but it was not tested
on hardware using the pcie-rcar-host controller driver due to lack of
hardware.
Fixes: dd26c1a23f ("PCI: rcar-host: Switch to msi_create_parent_irq_domain()")
Signed-off-by: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com>
[mani: reworded subject and description]
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Reviewed-by: Nam Cao <namcao@linutronix.de>
Link: https://lore.kernel.org/all/20250704161410.3931884-6-claudiu.beznea.uj@bp.renesas.com
Link: https://patch.msgid.link/20250809144447.3939284-1-claudiu.beznea.uj@bp.renesas.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b4d5cd2050 ]
On soft-reboot, with a reset GPIO defined for an Aeonsemi PHY, the
special match_phy_device fails to correctly identify that the PHY
needs to load the firmware again.
This is caused by the fact that PHY ID is read BEFORE the PHY reset
GPIO (if present) is asserted, so we can be in the scenario where the
phydev have the previous PHY ID (with the PHY firmware loaded) but
after reset the generic AS21xxx PHY is present in the PHY ID registers.
To better handle this, skip reading the PHY ID register only for the PHY
that are not AS21xxx (by matching for the Aeonsemi Vendor) and always
read the PHY ID for the other case to handle both firmware already
loaded or an HW reset.
Fixes: 830877d89e ("net: phy: Add support for Aeonsemi AS21xxx PHYs")
Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Link: https://patch.msgid.link/20250823134431.4854-2-ansuelsmth@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 1abe21ef1a ]
Introduce phy_id_compare_vendor() PHY ID helper to compare a PHY ID with
the PHY ID Vendor using the generic PHY ID Vendor mask.
While at it also rework the PHY_ID_MATCH macro and move the mask to
dedicated define so that PHY driver can make use of the mask if needed.
Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/20250823134431.4854-1-ansuelsmth@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Stable-dep-of: b4d5cd2050 ("net: phy: as21xxx: better handle PHY HW reset on soft-reboot")
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 1a7c18c485 ]
The mlx5 driver currently derives max_qp_wr directly from the
log_max_qp_sz HCA capability:
props->max_qp_wr = 1 << MLX5_CAP_GEN(mdev, log_max_qp_sz);
However, this value represents the number of WQEs in units of Basic
Blocks (see MLX5_SEND_WQE_BB), not actual number of WQEs. Since the size
of a WQE can vary depending on transport type and features (e.g., atomic
operations, UMR, LSO), the actual number of WQEs can be significantly
smaller than the WQEBB count suggests.
This patch introduces a conservative estimation of the worst-case WQE size
— considering largest segments possible with 1 SGE and no inline data or
special features. It uses this to derive a more accurate max_qp_wr value.
Fixes: 938fe83c8d ("net/mlx5_core: New device capabilities handling")
Link: https://patch.msgid.link/r/7d992c9831c997ed5c33d30973406dc2dcaf5e89.1755088725.git.leon@kernel.org
Reported-by: Chuck Lever <cel@kernel.org>
Closes: https://lore.kernel.org/all/20250506142202.GJ2260621@ziepe.ca/
Signed-off-by: Or Har-Toov <ohartoov@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit cbdd16b818 ]
Introduce a new HID quirk to indicate that this device has to be enabled
after the panel's backlight is enabled, and update the driver data for
the elan devices to enable this quirk. This cannot be a I2C HID quirk
because the kernel needs to acknowledge this before powering up the
device and read the VID/PID. When this quirk is enabled, register
.panel_enabled()/.panel_disabling() instead for the panel follower.
Also rename the *panel_prepare* functions into *panel_follower* because
they could be called in other situations now.
Fixes: bd3cba00dc ("HID: i2c-hid: elan: Add support for Elan eKTH6915 i2c-hid touchscreens")
Fixes: d06651bebf ("HID: i2c-hid: elan: Add elan-ekth6a12nay timing")
Reviewed-by: Douglas Anderson <dianders@chromium.org>
Signed-off-by: Pin-yen Lin <treapking@chromium.org>
Acked-by: Jiri Kosina <jkosina@suse.com>
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Link: https://lore.kernel.org/r/20250818115015.2909525-2-treapking@chromium.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit ba4b8886c2 ]
The duster register needs to be disabled on test patterns. While the
code is correctly doing so, the register address contained a typo, thus
not disabling the duster correctly. Fix the typo.
Fixes: e56616d7b2 ("media: i2c: Add driver for ST VD55G1 camera sensor")
Signed-off-by: Benjamin Mugnier <benjamin.mugnier@foss.st.com>
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 01a80b6649 ]
IPU7 ISYS and PSYS auxiliary devices are released after
ipu7_bus_del_devices(), so driver can not reference the MMU devices
from ISYS and PSYS auxiliary devices, so move the MMUs cleanup before
releasing the auxiliary devices.
Fixes: b7fe4c0019 ("media: staging/ipu7: add Intel IPU7 PCI device driver")
Signed-off-by: Bingbu Cao <bingbu.cao@intel.com>
[Sakari Ailus: Drop extra newline.]
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b065bd213c ]
Commit fd40a63c63 ("drm/atomic: Let drivers decide which planes to
async flip") unintentionally disallowed no-op changes on non-primary
planes that the driver doesn't allow async flips on. This broke async
flips for compositors that disable the cursor plane in every async
atomic commit. To fix that, change drm_atomic_set_property to again
only run atomic_async_check if the plane would actually be changed by
the atomic commit.
Fixes: fd40a63c63 ("drm/atomic: Let drivers decide which planes to async flip")
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4263
Signed-off-by: Xaver Hugl <xaver.hugl@kde.org>
Reviewed-by: André Almeida <andrealmeid@igalia.com>
Link: https://lore.kernel.org/r/20250822152849.87843-1-xaver.hugl@kde.org
[andrealmeid: fix checkpatch warning]
Signed-off-by: André Almeida <andrealmeid@igalia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 361fa7f813 ]
In otx2_cpt_dl_custom_egrp_create(), strscpy() is called with the length
of the source string rather than the size of the destination buffer.
This is fine as long as the destination buffer is larger than the source
string, but we should still use the destination buffer size instead to
call strscpy() as intended. And since 'tmp_buf' is a fixed-size buffer,
we can safely omit the size argument and let strscpy() infer it using
sizeof().
Fixes: d9d7749773 ("crypto: octeontx2 - add apis for custom engine groups")
Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 4aa8961b1b ]
At least some panels using the LSB register are not happy with the
unconditional increase of the command buffer to 3 bytes.
With the BOE NE14QDM in my Dell Latitude 7455, the recent patches for
luminance based brightness have introduced a regression: the brightness
range stopped being contiguous and became nonsensical (it probably was
interpreting the last 2 bytes of the buffer and not the first 2).
Change from using a fixed sizeof() to a length variable that's only
set to 3 when luminance is used. Let's leave the default as 2 even for
the single-byte version, since that's how it worked before.
Fixes: f2db78e37f ("drm/dp: Modify drm_edp_backlight_set_level")
Signed-off-by: Val Packett <val@packett.cool>
Tested-by: Abel Vesa <abel.vesa@linaro.org>
Reviewed-by: Suraj Kandpal <suraj.kandpal@intel.com>
Link: https://lore.kernel.org/r/20250706204446.8918-1-val@packett.cool
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 2e8f4c2b2b ]
mount -t f2fs -o checkpoint=disable:10% /dev/vdb /mnt/f2fs/
mount -t f2fs -o remount,checkpoint=enable /dev/vdb /mnt/f2fs/
kernel log:
F2FS-fs (vdb): Adjust unusable cap for checkpoint=disable = 204440 / 10%
If we has assigned checkpoint=enable mount option, unusable_cap{,_perc}
parameters of checkpoint=disable should be reset, then calculation and
log print could be avoid in adjust_unusable_cap_perc().
Fixes: 1ae18f71cb ("f2fs: fix checkpoint=disable:%u%%")
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 68889dfd54 ]
When sk_alloc() allocates a socket, mem_cgroup_sk_alloc() sets
sk->sk_memcg based on the current task.
MPTCP subflow socket creation is triggered from userspace or
an in-kernel worker.
In the latter case, sk->sk_memcg is not what we want. So, we fix
it up from the parent socket's sk->sk_memcg in mptcp_attach_cgroup().
Although the code is placed under #ifdef CONFIG_MEMCG, it is buried
under #ifdef CONFIG_SOCK_CGROUP_DATA.
The two configs are orthogonal. If CONFIG_MEMCG is enabled without
CONFIG_SOCK_CGROUP_DATA, the subflow's memory usage is not charged
correctly.
Let's move the code out of the wrong ifdef guard.
Note that sk->sk_memcg is freed in sk_prot_free() and the parent
sk holds the refcnt of memcg->css here, so we don't need to use
css_tryget().
Fixes: 3764b0c565 ("mptcp: attach subflow socket to parent cgroup")
Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Acked-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Acked-by: Shakeel Butt <shakeel.butt@linux.dev>
Link: https://patch.msgid.link/20250815201712.1745332-2-kuniyu@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit ad70c6bc77 ]
For a direct attached device, attached_phy contains the local phy id.
For a device behind an expander, attached_phy contains the remote phy
id, not the local phy id.
The pm8001_ha->phy array only contains the phys of the HBA. It does not
contain the phys of the expander.
Thus, you cannot use attached_phy to index the pm8001_ha->phy array,
without first verifying that the device is directly attached.
Use the pm80xx_get_local_phy_id() helper to make sure that we use the
local phy id to index the array, regardless if the device is directly
attached or not.
Fixes: 869ddbdcae ("scsi: pm80xx: corrected SATA abort handling sequence.")
Reviewed-by: Igor Pylypiv <ipylypiv@google.com>
Signed-off-by: Niklas Cassel <cassel@kernel.org>
Link: https://lore.kernel.org/r/20250814173215.1765055-21-cassel@kernel.org
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 251be2f603 ]
Since commit f7b705c238 ("scsi: pm80xx: Set phy_attached to zero when
device is gone") UBSAN reports:
UBSAN: array-index-out-of-bounds in drivers/scsi/pm8001/pm8001_sas.c:786:17
index 28 is out of range for type 'pm8001_phy [16]'
on rmmod when using an expander.
For a direct attached device, attached_phy contains the local phy id.
For a device behind an expander, attached_phy contains the remote phy
id, not the local phy id.
I.e. while pm8001_ha will have pm8001_ha->chip->n_phy local phys, for a
device behind an expander, attached_phy can be much larger than
pm8001_ha->chip->n_phy (depending on the amount of phys of the
expander).
E.g. on my system pm8001_ha has 8 phys with phy ids 0-7. One of the
ports has an expander connected. The expander has 31 phys with phy ids
0-30.
The pm8001_ha->phy array only contains the phys of the HBA. It does not
contain the phys of the expander. Thus, it is wrong to use attached_phy
to index the pm8001_ha->phy array for a device behind an expander.
Thus, we can only clear phy_attached for devices that are directly
attached.
Fixes: f7b705c238 ("scsi: pm80xx: Set phy_attached to zero when device is gone")
Reviewed-by: Igor Pylypiv <ipylypiv@google.com>
Signed-off-by: Niklas Cassel <cassel@kernel.org>
Link: https://lore.kernel.org/r/20250814173215.1765055-14-cassel@kernel.org
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Tested-by: Damien Le Moal <dlemoal@kernel.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 7f059e4732 ]
Use kvfree() to free memory allocated by kvzalloc() instead of kfree().
Avoid potential memory management issue considering kvzalloc() can
internally choose to use either kmalloc() or vmalloc() based on memory
request and current system memory state. Hence, use more appropriate
kvfree() which automatically determines correct free method to avoid
potential hard to debug memory issues. Fix this issue discovered by
running spatch static analysis tool using coccinelle script -
scripts/coccinelle/api/kfree_mismatch.cocci
Fixes: 52929c2142 ("fwctl/mlx5: Support for communicating with mlx5 fw")
Link: https://patch.msgid.link/r/aKAjCoF9cT3VEbSE@bhairav-test.ee.iitb.ac.in
Signed-off-by: Akhilesh Patil <akhilesh@ee.iitb.ac.in>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit efaa2d815a ]
Compile-testing this driver is only possible when the AMBA bus driver is
available in the kernel:
x86_64-linux-ld: drivers/char/hw_random/nomadik-rng.o: in function `nmk_rng_remove':
nomadik-rng.c:(.text+0x67): undefined reference to `amba_release_regions'
x86_64-linux-ld: drivers/char/hw_random/nomadik-rng.o: in function `nmk_rng_probe':
nomadik-rng.c:(.text+0xee): undefined reference to `amba_request_regions'
x86_64-linux-ld: nomadik-rng.c:(.text+0x18d): undefined reference to `amba_release_regions'
The was previously implied by the 'depends on ARCH_NOMADIK', but needs to be
specified for the COMPILE_TEST case.
Fixes: d5e93b3374 ("hwrng: Kconfig - Add helper dependency on COMPILE_TEST")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit bf4e4b97d0 ]
The function dc_stream_set_cursor_attributes() currently dereferences
the `stream` pointer and nested members `stream->ctx->dc->current_state`
without checking for NULL.
All callers of these functions, such as in
`dcn30_apply_idle_power_optimizations()` and
`amdgpu_dm_plane_handle_cursor_update()`, already perform NULL checks
before calling these functions.
Fixes below:
drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_stream.c:336 dc_stream_program_cursor_attributes()
error: we previously assumed 'stream' could be null (see line 334)
drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_stream.c
327 bool dc_stream_program_cursor_attributes(
328 struct dc_stream_state *stream,
329 const struct dc_cursor_attributes *attributes)
330 {
331 struct dc *dc;
332 bool reset_idle_optimizations = false;
333
334 dc = stream ? stream->ctx->dc : NULL;
^^^^^^
The old code assumed stream could be NULL.
335
--> 336 if (dc_stream_set_cursor_attributes(stream, attributes)) {
^^^^^^
The refactor added an unchecked dereference.
drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_stream.c
313 bool dc_stream_set_cursor_attributes(
314 struct dc_stream_state *stream,
315 const struct dc_cursor_attributes *attributes)
316 {
317 bool result = false;
318
319 if (dc_stream_check_cursor_attributes(stream, stream->ctx->dc->current_state, attributes)) {
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Here.
This function used to check for if stream as NULL and return false at
the start. Probably we should add that back.
Fixes: 4465dd0e41 ("drm/amd/display: Refactor SubVP cursor limiting logic")
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Cc: Alex Hung <alex.hung@amd.com>
Cc: Alvin Lee <alvin.lee2@amd.com>
Cc: Ray Wu <ray.wu@amd.com>
Cc: Dillon Varone <dillon.varone@amd.com>
Cc: Aurabindo Pillai <aurabindo.pillai@amd.com>
Cc: Roman Li <roman.li@amd.com>
Cc: ChiaHsuan Chung <chiahsuan.chung@amd.com>
Cc: Harry Wentland <harry.wentland@amd.com>
Cc: Daniel Wheeler <daniel.wheeler@amd.com>
Cc: Tom Chung <chiahsuan.chung@amd.com>
Cc: Wenjing Liu <wenjing.liu@amd.com>
Cc: Jun Lei <Jun.Lei@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Dillon Varone <Dillon.varone@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 1ad82f9db1 ]
Commit eefb83790a ("misc: pci_endpoint_test: Add doorbell test case")
added NO_BAR (-1) to the pci_barno enum which, in practical terms,
changes the enum from an unsigned int to a signed int. If the user
passes a negative number in pci_endpoint_test_ioctl() then it results in
an array underflow in pci_endpoint_test_bar().
Fixes: eefb83790a ("misc: pci_endpoint_test: Add doorbell test case")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://patch.msgid.link/aIzzZ4vc6ZrmM9rI@suswa
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit c0485e864a ]
EUD_MODE_MANAGER2 register is mapped to a memory region that is marked
as read-only for operating system running at EL1, enforcing access
restrictions that prohibit direct memory-mapped writes via writel().
Attempts to write to this region from HLOS can result in silent failures
or memory access violations, particularly when toggling EUD (Embedded
USB Debugger) state. To ensure secure register access, modify the driver
to use qcom_scm_io_writel(), which routes the write operation to Qualcomm
Secure Channel Monitor (SCM). SCM has the necessary permissions to access
protected memory regions, enabling reliable control over EUD state.
SC7280, the only user of EUD is also affected, indicating that this could
never have worked on a properly fused device.
Fixes: 9a1bf58ccd ("usb: misc: eud: Add driver support for Embedded USB Debugger(EUD)")
Signed-off-by: Melody Olvera <quic_molvera@quicinc.com>
Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
Reviewed-by: Souradeep Chowdhury <quic_schowdhu@quicinc.com>
Signed-off-by: Komal Bajaj <komal.bajaj@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20250731-eud_mode_manager_secure_access-v8-1-4a5dcbb79f41@oss.qualcomm.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 7bb14b61b7 ]
The boot firmware may disable the U3 port early during boot and leave it
up to the controller or PHY driver to re-enable U3 when needed.
The Rockchip USBDP PHY driver currently does this for RK3576 and RK3588,
something the Rockchip Naneng Combo PHY driver never does for RK3568.
This may result in USB 3.0 ports being limited to only using USB 2.0 or
in special cases not working at all on RK3568.
Write to PIPE_GRF USB3OTGx_CON1 reg to ensure the U3 port is enabled
when a PHY with PHY_TYPE_USB3 mode is used.
Fixes: 7160820d74 ("phy: rockchip: add naneng combo phy for RK3568")
Signed-off-by: Jonas Karlman <jonas@kwiboo.se>
Link: https://lore.kernel.org/r/20250723072324.2246498-1-jonas@kwiboo.se
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit dc322d13cf ]
The zoran_fh structure is a wrapper around v4l2_fh. Its usage has been
mostly removed by commit 83f89a8bcb ("media: zoran: convert to vb2"),
but the structure stayed by mistake. It is now used in a single
location, assigned from a void pointer and then recast to a void
pointer, without being every accessed. Drop it.
Fixes: 83f89a8bcb ("media: zoran: convert to vb2")
Signed-off-by: Jacopo Mondi <jacopo.mondi@ideasonboard.com>
Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit a84eeacbf9 ]
steelseries_srws1_probe() still does not use devm_kzalloc() and
devm_led_classdev_register(), so there is a lot of code to safely manage
heap, which reduces readability and may cause memory leaks due to minor
patch mistakes in the future.
Therefore, it should be changed to use devm_kzalloc() and
devm_led_classdev_register() to easily and safely manage heap.
Also, the current steelseries driver mainly checks sd->quriks to determine
which product a specific HID device is, which is not the correct way.
remove(), unlike probe(), does not receive struct hid_device_id as an
argument, so it must check hdev unconditionally to know which product
it is.
However, since struct steelseries_device and struct steelseries_srws1_data
have different structures, if SRWS1 is removed in remove(), converts
hdev->dev, which is initialized to struct steelseries_srws1_data,
to struct steelseries_device and uses it. This causes various
memory-related bugs as completely unexpected values exist in member
variables of the structure.
Therefore, in order to modify probe() and remove() to work properly,
Arctis 1, 9 should be added to HID_USB_DEVICE and some functions should be
modified to check hdev->product when determining HID device product.
Fixes: a0c76896c3 ("HID: steelseries: Add support for Arctis 1 XBox")
Signed-off-by: Jeongjun Park <aha310510@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 0b2cd50921 ]
generic/091 may fail, then it bisects to the bad commit ba8dac350f
("f2fs: fix to zero post-eof page").
What will cause generic/091 to fail is something like below Testcase #1:
1. write 16k as compressed blocks
2. truncate to 12k
3. truncate to 20k
4. verify data in range of [12k, 16k], however data is not zero as
expected
Script of Testcase #1
mkfs.f2fs -f -O extra_attr,compression /dev/vdb
mount -t f2fs -o compress_extension=* /dev/vdb /mnt/f2fs
dd if=/dev/zero of=/mnt/f2fs/file bs=12k count=1
dd if=/dev/random of=/mnt/f2fs/file bs=4k count=1 seek=3 conv=notrunc
sync
truncate -s $((12*1024)) /mnt/f2fs/file
truncate -s $((20*1024)) /mnt/f2fs/file
dd if=/mnt/f2fs/file of=/mnt/f2fs/data bs=4k count=1 skip=3
od /mnt/f2fs/data
umount /mnt/f2fs
Analisys:
in step 2), we will redirty all data pages from #0 to #3 in compressed
cluster, and zero page #3,
in step 3), f2fs_setattr() will call f2fs_zero_post_eof_page() to drop
all page cache post eof, includeing dirtied page #3,
in step 4) when we read data from page #3, it will decompressed cluster
and extra random data to page #3, finally, we hit the non-zeroed data
post eof.
However, the commit ba8dac350f ("f2fs: fix to zero post-eof page") just
let the issue be reproduced easily, w/o the commit, it can reproduce this
bug w/ below Testcase #2:
1. write 16k as compressed blocks
2. truncate to 8k
3. truncate to 12k
4. truncate to 20k
5. verify data in range of [12k, 16k], however data is not zero as
expected
Script of Testcase #2
mkfs.f2fs -f -O extra_attr,compression /dev/vdb
mount -t f2fs -o compress_extension=* /dev/vdb /mnt/f2fs
dd if=/dev/zero of=/mnt/f2fs/file bs=12k count=1
dd if=/dev/random of=/mnt/f2fs/file bs=4k count=1 seek=3 conv=notrunc
sync
truncate -s $((8*1024)) /mnt/f2fs/file
truncate -s $((12*1024)) /mnt/f2fs/file
truncate -s $((20*1024)) /mnt/f2fs/file
echo 3 > /proc/sys/vm/drop_caches
dd if=/mnt/f2fs/file of=/mnt/f2fs/data bs=4k count=1 skip=3
od /mnt/f2fs/data
umount /mnt/f2fs
Anlysis:
in step 2), we will redirty all data pages from #0 to #3 in compressed
cluster, and zero page #2 and #3,
in step 3), we will truncate page #3 in page cache,
in step 4), expand file size,
in step 5), hit random data post eof w/ the same reason in Testcase #1.
Root Cause:
In f2fs_truncate_partial_cluster(), after we truncate partial data block
on compressed cluster, all pages in cluster including the one post eof
will be dirtied, after another tuncation, dirty page post eof will be
dropped, however on-disk compressed cluster is still valid, it may
include non-zero data post eof, result in exposing previous non-zero data
post eof while reading.
Fix:
In f2fs_truncate_partial_cluster(), let change as below to fix:
- call filemap_write_and_wait_range() to flush dirty page
- call truncate_pagecache() to drop pages or zero partial page post eof
- call f2fs_do_truncate_blocks() to truncate non-compress cluster to
last valid block
Fixes: 3265d3db1f ("f2fs: support partial truncation on compressed inode")
Reported-by: Jan Prusakowski <jprusakowski@google.com>
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 0fe1c6bec5 ]
Should cast type of folio->index from pgoff_t to loff_t to avoid overflow
while left shift operation.
Fixes: 3265d3db1f ("f2fs: support partial truncation on compressed inode")
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit e75ce11790 ]
If reserve_root mount option is not assigned, __allow_reserved_blocks()
will return false, it's not correct, fix it.
Fixes: 7e65be49ed ("f2fs: add reserved blocks for root user")
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 260dcf5b06 ]
GCC 16 enables -Werror=unused-but-set-variable= which results in build
error with the following message.
drivers/gpu/drm/radeon/r600_cs.c: In function ‘r600_texture_size’:
drivers/gpu/drm/radeon/r600_cs.c:1411:29: error: variable ‘level’ set but not used [-Werror=unused-but-set-variable=]
1411 | unsigned offset, i, level;
| ^~~~~
cc1: all warnings being treated as errors
make[6]: *** [scripts/Makefile.build:287: drivers/gpu/drm/radeon/r600_cs.o] Error 1
level although is set, but in never used in the function
r600_texture_size. Thus resulting in dead code and this error getting
triggered.
Fixes: 60b212f8dd ("drm/radeon: overhaul texture checking. (v3)")
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Brahmajit Das <listout@listout.xyz>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 4d22db6d07 ]
When power management is not enabled in the kernel build, the newly
added hibernation changes cause a link failure:
arm-linux-gnueabi-ld: drivers/gpu/drm/amd/amdgpu/amdgpu_drv.o: in function `amdgpu_pmops_thaw':
amdgpu_drv.c:(.text+0x1514): undefined reference to `pm_hibernate_is_recovering'
Make the power management code in this driver conditional on
CONFIG_PM and CONFIG_PM_SLEEP
Fixes: 530694f54d ("drm/amdgpu: do not resume device in thaw for normal hibernation")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://lore.kernel.org/r/20250714081635.4071570-1-arnd@kernel.org
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 1cf1205ef2 ]
The function `dp_retrain_link_dp_test` currently allocates a large
audio_output array on the stack, causing the stack frame size to exceed
the compiler limit (1080 bytes > 1024 bytes).
This change prevents stack overflow issues:
amdgpu/../display/dc/link/accessories/link_dp_cts.c:65:13: warning: stack frame size (1080) exceeds limit (1024) in 'dp_retrain_link_dp_test' [-Wframe-larger-than]
static void dp_retrain_link_dp_test(struct dc_link *link,
v2: Move audio-related data like `audio_output` is kept "per pipe" to
manage the audio for that specific display pipeline/display output path
(stream). (Wenjing)
v3: Update in all the places where `build_audio_output` is currently
called with a separate audio_output variable on the stack & wherever
`audio_output` is passed to other functions
`dce110_apply_single_controller_ctx_to_hw()` &
`dce110_setup_audio_dto()` (like `az_configure`, `wall_dto_setup`)
replace with usage of `pipe_ctx->stream_res.audio_output`
to centralize audio data per pipe.
v4: Remove empty lines before `build_audio_output`. (Alex)
Fixes: 9c6669c2e2 ("drm/amd/display: Fix Link Override Sequencing When Switching Between DIO/HPO")
Cc: Wayne Lin <wayne.lin@amd.com>
Cc: George Shen <george.shen@amd.com>
Cc: Michael Strauss <michael.strauss@amd.com>
Cc: Alvin Lee <Alvin.Lee2@amd.com>
Cc: Ray Wu <ray.wu@amd.com>
Cc: Wenjing Liu <wenjing.liu@amd.com>
Cc: Harry Wentland <harry.wentland@amd.com>
Cc: Tom Chung <chiahsuan.chung@amd.com>
Cc: Roman Li <roman.li@amd.com>
Cc: Alex Hung <alex.hung@amd.com>
Cc: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Wenjing Liu <wenjing.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 1511d3c4d2 ]
Add 50ms disable delay for NV116WHM-N49, NV122WUM-N41, and MNC207QS1-1
to satisfy T9+T10 timing. Add 50ms disable delay for MNE007JA1-2
as well, since MNE007JA1-2 copies the timing of MNC207QS1-1.
Specifically, it should be noted that the MNE007JA1-2 panel was added
by someone who did not have the panel documentation, so they simply
copied the timing from the MNC207QS1-1 panel. Adding an extra 50 ms
of delay should be safe.
Fixes: 0547692ac1 ("drm/panel-edp: Add several generic edp panels")
Fixes: 50625eab39 ("drm/edp-panel: Add panel used by T14s Gen6 Snapdragon")
Signed-off-by: Langyan Ye <yelangyan@huaqin.corp-partner.google.com>
Reviewed-by: Douglas Anderson <dianders@chromium.org>
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Link: https://lore.kernel.org/r/20250723072513.2880369-1-yelangyan@huaqin.corp-partner.google.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 2650bc4007 ]
The "skip reset" test waits for the timeout handler to run for the
duration of 2 * MOCK_TIMEOUT, and because the mock scheduler opted to
remove the "skip reset" flag once it fires, this gives opportunity for the
timeout handler to run twice. Second time the job will be removed from the
mock scheduler job list and the drm_mock_sched_advance() call in the test
will fail.
Fix it by making the "don't reset" flag persist for the lifetime of the
job and add a new flag to verify that the code path had executed as
expected.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Fixes: 1472e7549f ("drm/sched: Add new test for DRM_GPU_SCHED_STAT_NO_HANG")
Cc: Maíra Canal <mcanal@igalia.com>
Cc: Philipp Stanner <phasta@kernel.org>
Reviewed-by: Maíra Canal <mcanal@igalia.com>
Signed-off-by: Philipp Stanner <phasta@kernel.org>
Link: https://lore.kernel.org/r/20250716084817.56797-1-tvrtko.ursulin@igalia.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 605c9820e4 ]
Current implementation describes only MFD's own topsys interrupts.
However, max77705 has a register which indicates interrupt source, i.e.
it acts as an interrupt controller. There's 4 interrupt sources in
max77705: topsys, charger, fuelgauge, usb type-c manager.
Setup max77705 MFD parent as an interrupt controller. Delete topsys
interrupts because currently unused.
Remove shared interrupt flag, because we're are an interrupt controller
now, and subdevices should request interrupts from us.
Fixes: c8d50f0297 ("mfd: Add new driver for MAX77705 PMIC")
Signed-off-by: Dzmitry Sankouski <dsankouski@gmail.com>
Link: https://lore.kernel.org/r/20250909-max77705-fix_interrupt_handling-v3-1-233c5a1a20b5@gmail.com
Signed-off-by: Lee Jones <lee@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit d6ddd9beb1 ]
Short MMIO transfers that are not a multiple of four bytes in size need
a special case for the final bytes, however the existing implementation
is not endian-safe and introduces an incorrect byteswap on big-endian
kernels.
This usually does not cause problems because most systems are
little-endian and most transfers are multiple of four bytes long, but
still needs to be fixed to avoid the extra byteswap.
Change the special case for both i3c_writel_fifo() and i3c_readl_fifo()
to use non-byteswapping writesl() and readsl() with a single element
instead of the byteswapping writel()/readl() that are meant for individual
MMIO registers. As data is copied between a FIFO and a memory buffer,
the writesl()/readsl() loops are typically based on __raw_readl()/
__raw_writel(), resulting in the order of bytes in the FIFO to match
the order in the buffer, regardless of the CPU endianess.
The earlier versions in the dw-i3c and i3c-master-cdns had a correct
implementation, but the generic version that was recently added broke it.
Fixes: 733b439375 ("i3c: master: Add inline i3c_readl_fifo() and i3c_writel_fifo()")
Cc: Manikanta Guntupalli <manikanta.guntupalli@amd.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Jorge Marques <jorge.marques@analog.com>
Link: https://lore.kernel.org/r/20250924201837.3691486-1-arnd@kernel.org
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 4540aed51b ]
Yinhao et al. recently reported:
Our fuzzer tool discovered an uninitialized pointer issue in the
bpf_prog_test_run_xdp() function within the Linux kernel's BPF subsystem.
This leads to a NULL pointer dereference when a BPF program attempts to
deference the txq member of struct xdp_buff object.
The test initializes two programs of BPF_PROG_TYPE_XDP: progA acts as the
entry point for bpf_prog_test_run_xdp() and its expected_attach_type can
neither be of be BPF_XDP_DEVMAP nor BPF_XDP_CPUMAP. progA calls into a slot
of a tailcall map it owns. progB's expected_attach_type must be BPF_XDP_DEVMAP
to pass xdp_is_valid_access() validation. The program returns struct xdp_md's
egress_ifindex, and the latter is only allowed to be accessed under mentioned
expected_attach_type. progB is then inserted into the tailcall which progA
calls.
The underlying issue goes beyond XDP though. Another example are programs
of type BPF_PROG_TYPE_CGROUP_SOCK_ADDR. sock_addr_is_valid_access() as well
as sock_addr_func_proto() have different logic depending on the programs'
expected_attach_type. Similarly, a program attached to BPF_CGROUP_INET4_GETPEERNAME
should not be allowed doing a tailcall into a program which calls bpf_bind()
out of BPF which is only enabled for BPF_CGROUP_INET4_CONNECT.
In short, specifying expected_attach_type allows to open up additional
functionality or restrictions beyond what the basic bpf_prog_type enables.
The use of tailcalls must not violate these constraints. Fix it by enforcing
expected_attach_type in __bpf_prog_map_compatible().
Note that we only enforce this for tailcall maps, but not for BPF devmaps or
cpumaps: There, the programs are invoked through dev_map_bpf_prog_run*() and
cpu_map_bpf_prog_run*() which set up a new environment / context and therefore
these situations are not prone to this issue.
Fixes: 5e43f899b0 ("bpf: Check attach type at prog load time")
Reported-by: Yinhao Hu <dddddd@hust.edu.cn>
Reported-by: Kaiyan Mei <M202472210@hust.edu.cn>
Reviewed-by: Dongliang Mu <dzm91@hust.edu.cn>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/r/20250926171201.188490-1-daniel@iogearbox.net
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 0cc114dc35 ]
When a module registers a struct_ops, the struct_ops type and its
corresponding map_value type ("bpf_struct_ops_") may reside in different
btf objects, here are four possible case:
+--------+---------------+-------------+---------------------------------+
| |bpf_struct_ops_| xxx_ops | |
+--------+---------------+-------------+---------------------------------+
| case 0 | btf_vmlinux | btf_vmlinux | be used and reg only in vmlinux |
+--------+---------------+-------------+---------------------------------+
| case 1 | btf_vmlinux | mod_btf | INVALID |
+--------+---------------+-------------+---------------------------------+
| case 2 | mod_btf | btf_vmlinux | reg in mod but be used both in |
| | | | vmlinux and mod. |
+--------+---------------+-------------+---------------------------------+
| case 3 | mod_btf | mod_btf | be used and reg only in mod |
+--------+---------------+-------------+---------------------------------+
Currently we figure out the mod_btf by searching with the struct_ops type,
which makes it impossible to figure out the mod_btf when the struct_ops
type is in btf_vmlinux while it's corresponding map_value type is in
mod_btf (case 2).
The fix is to use the corresponding map_value type ("bpf_struct_ops_")
as the lookup anchor instead of the struct_ops type to figure out the
`btf` and `mod_btf` via find_ksym_btf_id(), and then we can locate
the kern_type_id via btf__find_by_name_kind() with the `btf` we just
obtained from find_ksym_btf_id().
With this change the lookup obtains the correct btf and mod_btf for case 2,
preserves correct behavior for other valid cases, and still fails as
expected for the invalid scenario (case 1).
Fixes: 590a008882 ("bpf: libbpf: Add STRUCT_OPS support")
Signed-off-by: D. Wythe <alibuda@linux.alibaba.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://lore.kernel.org/bpf/20250926071751.108293-1-alibuda@linux.alibaba.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 70e633bede ]
When the driver is removed, the clocks are first enabled by
calling pm_runtime_get_sync(), and then disabled with
pm_runtime_put_sync().
If CONFIG_PM=y, clocks for this controller are disabled when it's in
the idle state. So the clocks are properly disabled when the driver
exits.
Othewise, the clocks are always enabled and the PM functions have
no effect. Therefore, the driver exits without disabling the clocks.
# cat /sys/kernel/debug/clk/clk-pclk/clk_enable_count
18
# echo 1214a000.i2c > /sys/bus/platform/drivers/i2c_designware/bind
# cat /sys/kernel/debug/clk/clk-pclk/clk_enable_count
20
# echo 1214a000.i2c > /sys/bus/platform/drivers/i2c_designware/unbind
# cat /sys/kernel/debug/clk/clk-pclk/clk_enable_count
20
To ensure that the clocks can be disabled correctly even without
CONFIG_PM=y, should add the following fixes:
- Replace with pm_runtime_put_noidle(), which only decrements the runtime
PM usage count.
- Call i2c_dw_prepare_clk(false) to explicitly disable the clocks.
Fixes: 7272194ed3 ("i2c-designware: add minimal support for runtime PM")
Co-developed-by: Kohei Ito <ito.kohei@socionext.com>
Signed-off-by: Kohei Ito <ito.kohei@socionext.com>
Signed-off-by: Kunihiko Hayashi <hayashi.kunihiko@socionext.com>
Tested-by: Jarkko Nikula <jarkko.nikula@linux.intel.com>
Acked-by: Jarkko Nikula <jarkko.nikula@linux.intel.com>
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 0de6194324 ]
After performing a conditional bus reset, the controller must ensure
that the SDA line is actually released.
Previously, the reset routine only performed a single check,
which could leave the bus in a locked state in some situations.
This patch introduces a loop that toggles the reset cycle and issues
a reset request up to SPACEMIT_BUS_RESET_CLK_CNT_MAX times, checking
SDA after each attempt. If SDA is released before the maximum count,
the function returns early. Otherwise, a warning is emitted.
This change improves bus recovery reliability.
Fixes: 5ea558473f ("i2c: spacemit: add support for SpacemiT K1 SoC")
Signed-off-by: Troy Mitchell <troy.mitchell@linux.spacemit.com>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit db7720ef50 ]
After calling spacemit_i2c_conditionally_reset_bus(),
the controller should ensure that the SDA line is release
before proceeding.
Previously, the driver checked the SCL line instead,
which does not guarantee that the bus is truly idle.
This patch changes the check to verify SDA. This ensures
proper bus recovery and avoids potential communication errors
after a conditional reset.
Fixes: 5ea558473f ("i2c: spacemit: add support for SpacemiT K1 SoC")
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Troy Mitchell <troy.mitchell@linux.spacemit.com>
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 11f40684cc ]
The K1 I2C controller has an SDA glitch fix that introduces a small
delay on restart signals. While this feature can suppress glitches
on SDA when SCL = 0, it also delays the restart signal, which may
cause unexpected behavior in some transfers.
The glitch itself does not affect normal I2C operation, because
the I2C specification allows SDA to change while SCL is low.
To ensure correct transmission for every message, we disable the
SDA glitch fix by setting the RCR.SDA_GLITCH_NOFIX bit during
initialization.
This guarantees that restarts are issued promptly without
unintended delays.
Fixes: 5ea558473f ("i2c: spacemit: add support for SpacemiT K1 SoC")
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Troy Mitchell <troy.mitchell@linux.spacemit.com>
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 445522fe7a ]
Previously, STOP handling was split into two separate steps:
1) clear TB/STOP/START/ACK bits
2) issue STOP by calling spacemit_i2c_stop()
This left a small window where the control register was updated
twice, which can confuse the controller. While this race has not
been observed with interrupt-driven transfers, it reliably causes
bus errors in PIO mode.
Inline the STOP sequence into the IRQ handler and ensure that
control register bits are updated atomically in a single writel().
Fixes: 5ea558473f ("i2c: spacemit: add support for SpacemiT K1 SoC")
Signed-off-by: Troy Mitchell <troy.mitchell@linux.spacemit.com>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 41d6f90ef5 ]
spacemit_i2c_wait_bus_idle() only returns 0 on success or a negative
error code on failure.
Since 'ret' can never be positive, the final 'else' branch was
unreachable, and spacemit_i2c_check_bus_release() was never called.
This commit guarantees we attempt to release the bus whenever waiting for
an idle bus fails.
Fixes: 5ea558473f ("i2c: spacemit: add support for SpacemiT K1 SoC")
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Troy Mitchell <troy.mitchell@linux.spacemit.com>
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b492183652 ]
The old IC does not support the I2C_MASTER_WRRD (write-then-read)
function, but the current code’s handling of i2c->auto_restart may
potentially lead to entering the I2C_MASTER_WRRD software flow,
resulting in unexpected bugs.
Instead of repurposing the auto_restart flag, add a separate flag
to signal I2C_MASTER_WRRD operations.
Also fix handling of msgs. If the operation (i2c->op) is
I2C_MASTER_WRRD, then the msgs pointer is incremented by 2.
For all other operations, msgs is simply incremented by 1.
Fixes: b2ed11e224 ("I2C: mediatek: Add driver for MediaTek MT8173 I2C controller")
Signed-off-by: Leilk.Liu <leilk.liu@mediatek.com>
Suggested-by: Chen-Yu Tsai <wenst@chromium.org>
Reviewed-by: Chen-Yu Tsai <wenst@chromium.org>
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit c02e4644f8 ]
Distinct between fan speed setting request coming for hwmon and
thermal subsystems.
There are fields 'last_hwmon_state' and 'last_thermal_state' in the
structure 'mlxreg_fan_pwm', which respectively store the cooling state
set by the 'hwmon' and 'thermal' subsystem.
The purpose is to make arbitration of fan speed setting. For example, if
fan speed required to be not lower than some limit, such setting is to
be performed through 'hwmon' subsystem, thus 'thermal' subsystem will
not set fan below this limit.
Currently, the 'last_thermal_state' is also be updated by 'hwmon' causing
cooling state to never be set to a lower value.
Eliminate update of 'last_thermal_state', when request is coming from
'hwmon' subsystem.
Fixes: da74944d3a ("hwmon: (mlxreg-fan) Use pwm attribute for setting fan speed low limit")
Signed-off-by: Vadim Pasternak <vadimp@nvidia.com>
Link: https://lore.kernel.org/r/20250113084859.27064-2-vadimp@nvidia.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit edcc8a38b5 ]
The commit c2c60ea37e ("once: use __section(".data.once")") moved
DO_ONCE's ___done variable to .data.once section, which conflicts with
DO_ONCE_LITE() that also uses the same section.
This creates a race condition when clear_warn_once is used:
Thread 1 (DO_ONCE) Thread 2 (DO_ONCE)
__do_once_start
read ___done (false)
acquire once_lock
execute func
__do_once_done
write ___done (true) __do_once_start
release once_lock // Thread 3 clear_warn_once reset ___done
read ___done (false)
acquire once_lock
execute func
schedule once_work __do_once_done
once_deferred: OK write ___done (true)
static_branch_disable release once_lock
schedule once_work
once_deferred:
BUG_ON(!static_key_enabled)
DO_ONCE_LITE() in once_lite.h is used by WARN_ON_ONCE() and other warning
macros. Keep its ___done flag in the .data..once section and allow resetting
by clear_warn_once, as originally intended.
In contrast, DO_ONCE() is used for functions like get_random_once() and
relies on its ___done flag for internal synchronization. We should not reset
DO_ONCE() by clear_warn_once.
Fix it by isolating DO_ONCE's ___done into a separate .data..do_once section,
shielding it from clear_warn_once.
Fixes: c2c60ea37e ("once: use __section(".data.once")")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Qi Xi <xiqi2@huawei.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit d4680a11e1 ]
Some distributions (e.g., CachyOS) support building the kernel with -O3,
but doing so may break kfuncs, resulting in their symbols not being
properly exported.
In fact, with gcc -O3, some kfuncs may be optimized away despite being
annotated as noinline. This happens because gcc can still clone the
function during IPA optimizations, e.g., by duplicating or inlining it
into callers, and then dropping the standalone symbol. This breaks BTF
ID resolution since resolve_btfids relies on the presence of a global
symbol for each kfunc.
Currently, this is not an issue for upstream, because we don't allow
building the kernel with -O3, but it may be safer to address it anyway,
to prevent potential issues in the future if compilers become more
aggressive with optimizations.
Therefore, add __noclone to __bpf_kfunc to ensure kfuncs are never
cloned and remain distinct, globally visible symbols, regardless of
the optimization level.
Fixes: 57e7c169cd ("bpf: Add __bpf_kfunc tag for marking kernel functions as kfuncs")
Acked-by: David Vernet <void@manifault.com>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Signed-off-by: Andrea Righi <arighi@nvidia.com>
Link: https://lore.kernel.org/r/20250924081426.156934-1-arighi@nvidia.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 409f8fe03e ]
The newly added function causes a build failure on 32-bit targets with
older compiler version such as gcc-10:
arm-linux-gnueabi-ld: drivers/clocksource/timer-tegra186.o: in function `tegra186_wdt_get_timeleft':
timer-tegra186.c:(.text+0x3c2): undefined reference to `__aeabi_uldivmod'
The calculation can trivially be changed to avoid the division entirely,
as USEC_PER_SEC is a multiple of 5. Change both such calculation for
consistency, even though gcc apparently managed to optimize the other one
properly already.
[dlezcano : Fixed conflict with 20250614175556.922159-2-linux@roeck-us.net ]
Fixes: 28c842c8b0 ("clocksource/drivers/timer-tegra186: Add WDIOC_GETTIMELEFT support")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20250620111939.3395525-1-arnd@kernel.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 0ff52df6b3 ]
Commit d5094bcb5b ("tools/nolibc: define time_t in terms of
__kernel_old_time_t") made nolibc use the kernel's time type so that
`time_t` matches `timespec::tv_sec` on all ABIs (notably x32).
But since __kernel_old_time_t is fairly new, notably from 2020 in commit
94c467ddb2 ("y2038: add __kernel_old_timespec and __kernel_old_time_t"),
nolibc builds that rely on host headers may fail.
Switch to __kernel_time_t, which is the same as __kernel_old_time_t and
has existed for longer.
Tested in PPC VM of Open Source Lab of Oregon State University
(./tools/testing/selftests/rcutorture/bin/mkinitrd.sh)
Fixes: d5094bcb5b ("tools/nolibc: define time_t in terms of __kernel_old_time_t")
Signed-off-by: Zhouyi Zhou <zhouzhouyi@gmail.com>
[Thomas: Reformat commit and its message a bit]
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 12a1185a06 ]
Current implementation uses handle_post_irq to actually handle chgin
irq. This is not how things are meant to work in regmap-irq.
Remove handle_post_irq, and request a threaded interrupt for chgin.
Fixes: a6a494c8e3 ("power: supply: max77705: Add charger driver for Maxim 77705")
Signed-off-by: Dzmitry Sankouski <dsankouski@gmail.com>
Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit c24928ac69 ]
Active discharge setting is a part of MFD top level i2c device, hence
cannot be controlled by charger. Writing to MAX77705_PMIC_REG_MAINCTRL1
register from charger driver is a mistake.
Move active discharge setting to MFD parent driver.
Fixes: a6a494c8e3 ("power: supply: max77705: Add charger driver for Maxim 77705")
Signed-off-by: Dzmitry Sankouski <dsankouski@gmail.com>
Acked-by: Lee Jones <lee@kernel.org>
Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit ccf09357ff ]
The smp_call_function_many() kerneldoc comment got out of sync with the
function definition (bool parameter "wait" is incorrectly described as a
bitmask in it), so fix it up by copying the "wait" description from the
smp_call_function() kerneldoc and add information regarding the handling
of the local CPU to it.
Fixes: 49b3bd213a ("smp: Fix all kernel-doc warnings")
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit a3c73d629e ]
Syzbot generated a program that triggers a verifier_bug() call in
maybe_exit_scc(). maybe_exit_scc() assumes that, when called for a
state with insn_idx in some SCC, there should be an instance of struct
bpf_scc_visit allocated for that SCC. Turns out the assumption does
not hold for speculative execution paths. See example in the next
patch.
maybe_scc_exit() is called from update_branch_counts() for states that
reach branch count of zero, meaning that path exploration for a
particular path is finished. Path exploration can finish in one of
three ways:
a. Verification error is found. In this case, update_branch_counts()
is called only for non-speculative paths.
b. Top level BPF_EXIT is reached. Such instructions are never a part of
an SCC, so compute_scc_callchain() in maybe_scc_exit() will return
false, and maybe_scc_exit() will return early.
c. A checkpoint is reached and matched. Checkpoints are created by
is_state_visited(), which calls maybe_enter_scc(), which allocates
bpf_scc_visit instances for checkpoints within SCCs.
Hence, for non-speculative symbolic execution paths, the assumption
still holds: if maybe_scc_exit() is called for a state within an SCC,
bpf_scc_visit instance must exist.
This patch removes the verifier_bug() call for speculative paths.
Fixes: c9e31900b5 ("bpf: propagate read/precision marks over state graph backedges")
Reported-by: syzbot+3afc814e8df1af64b653@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/bpf/68c85acd.050a0220.2ff435.03a4.GAE@google.com/
Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20250916212251.3490455-1-eddyz87@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit ed323aeda5 ]
After using numa_set_mempolicy_home_node() the test fails to compile on
systems with libnuma library versioned lower than 2.0.16.
In order to allow lower library version add a pkg-config related check
and exclude that part of the code. Without the proper MPOL setup it
can't be tested.
Make a total number of tests two. The first one is the first batch and
the second is the MPOL related one. The goal is to let the user know if
it has been skipped due to library limitation.
Remove test_futex_mpol(), it was unused and it is now complained by the
compiler if the part is not compiled.
Fixes: 0ecb4232fc ("selftests/futex: Set the home_node in futex_numa_mpol")
Closes: https://lore.kernel.org/oe-lkp/202507150858.bedaf012-lkp@intel.com
Reported-by: kernel test robot <oliver.sang@intel.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit c1c8634577 ]
Instead of just checking if the syscall failed as expected, check as
well if the returned error code matches the expected error code.
[ bigeasy: reword the commmit message ]
Signed-off-by: André Almeida <andrealmeid@igalia.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Waiman Long <longman@redhat.com>
Stable-dep-of: ed323aeda5 ("selftest/futex: Compile also with libnuma < 2.0.16")
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 237bfb76c9 ]
On 32bit ARM systems gcc-12 will use 32bit timestamps while gcc-13 and later
will use 64bit timestamps. The problem is that SYS_futex will continue
pointing at the 32bit system call. This makes the futex_wait test fail like
this:
waiter failed errno 110
not ok 1 futex_wake private returned: 0 Success
waiter failed errno 110
not ok 2 futex_wake shared (page anon) returned: 0 Success
waiter failed errno 110
not ok 3 futex_wake shared (file backed) returned: 0 Success
Instead of compiling differently depending on the gcc version, use the
-D_FILE_OFFSET_BITS=64 -D_TIME_BITS=64 options to ensure that 64bit timestamps
are used. Then use ifdefs to make SYS_futex point to the 64bit system call.
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: André Almeida <andrealmeid@igalia.com>
Tested-by: Anders Roxell <anders.roxell@linaro.org>
Link: https://lore.kernel.org/20250827130011.677600-6-bigeasy@linutronix.de
Stable-dep-of: ed323aeda5 ("selftest/futex: Compile also with libnuma < 2.0.16")
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 336aec7b06 ]
Tightening the throttle activation check in blk_throtl_activated() to
require both q->td presence and policy bit set introduced a memory leak
during disk release:
blkg_destroy_all() clears the policy bit first during queue deactivation,
causing subsequent blk_throtl_exit() to skip throtl_data cleanup when
blk_throtl_activated() fails policy check.
Idealy we should avoid modifying blk_throtl_exit() activation check because
it's intuitive that blk-throtl start from blk_throtl_init() and end in
blk_throtl_exit(). However, call blk_throtl_exit() before
blkg_destroy_all() will make a long term deadlock problem easier to
trigger[1], hence fix this problem by checking if q->td is NULL from
blk_throtl_exit(), and remove policy deactivation as well since it's
useless.
[1] https://lore.kernel.org/all/CAHj4cs9p9H5yx+ywsb3CMUdbqGPhM+8tuBvhW=9ADiCjAqza9w@mail.gmail.com/#t
Fixes: bd9fd5be6b ("blk-throttle: fix access race during throttle policy activation")
Reported-by: Yi Zhang <yi.zhang@redhat.com>
Closes: https://lore.kernel.org/all/CAHj4cs-p-ZwBEKigBj7T6hQCOo-H68-kVwCrV6ZvRovrr9Z+HA@mail.gmail.com/
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit e8cfc524ea ]
Check if watchdog device supports WDIOF_KEEPALIVEPING option before
entering keep_alive() ping test loop. Fix watchdog-test silently looping
if ioctl based ping is not supported by the device. Exit from test in
such case instead of getting stuck in loop executing failing keep_alive()
watchdog_info:
identity: m41t93 rtc Watchdog
firmware_version: 0
Support/Status: Set timeout (in seconds)
Support/Status: Watchdog triggers a management or other external alarm not a reboot
Watchdog card disabled.
Watchdog timeout set to 5 seconds.
Watchdog ping rate set to 2 seconds.
Watchdog card enabled.
WDIOC_KEEPALIVE not supported by this device
without this change
Watchdog card disabled.
Watchdog timeout set to 5 seconds.
Watchdog ping rate set to 2 seconds.
Watchdog card enabled.
Watchdog Ticking Away!
(Where test stuck here forver silently)
Updated change log at commit time:
Shuah Khan <skhan@linuxfoundation.org>
Link: https://lore.kernel.org/r/20250914152840.GA3047348@bhairav-test.ee.iitb.ac.in
Fixes: d89d08ffd2 ("selftests: watchdog: Fix ioctl SET* error paths to take oneshot exit path")
Signed-off-by: Akhilesh Patil <akhilesh@ee.iitb.ac.in>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f2d8c5a2f7 ]
Atomic writes support may not always be possible when stacking devices
which support atomic writes. Such as case is a different atomic write
boundary between stacked devices (which is not supported).
In the case that atomic writes cannot supported, the top device queue HW
limits are set to 0.
However, in blk_stack_atomic_writes_limits(), we detect that we are
stacking the first bottom device by checking the top device
atomic_write_hw_max value == 0. This get confused with the case of atomic
writes not supported, above.
Make the distinction between stacking the first bottom device and no
atomics supported by initializing stacked device atomic_write_hw_max =
UINT_MAX and checking that for stacking the first bottom device.
Fixes: d7f36dc446 ("block: Support atomic writes limits for stacked devices")
Signed-off-by: John Garry <john.g.garry@oracle.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit bfd4037296 ]
In commit 63d092d1c1 ("block: use chunk_sectors when evaluating stacked
atomic write limits"), it was missed to use a chunk sectors limit check
in blk_stack_atomic_writes_boundary_head(), so update that function to
do the proper check.
Fixes: 63d092d1c1 ("block: use chunk_sectors when evaluating stacked atomic write limits")
Signed-off-by: John Garry <john.g.garry@oracle.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit a7869b0a25 ]
Driver wants to nack the IBI request when the target is not in the
known address list. In below code, svc_i3c_master_nack_ibi() will
cause undefined behavior when using AUTOIBI with auto response rule,
because hw always auto ack the IBI request.
switch (ibitype) {
case SVC_I3C_MSTATUS_IBITYPE_IBI:
dev = svc_i3c_master_dev_from_addr(master, ibiaddr);
if (!dev || !is_events_enabled(master, SVC_I3C_EVENT_IBI))
svc_i3c_master_nack_ibi(master);
...
break;
AutoIBI has another issue that the controller doesn't quit AutoIBI state
after IBIWON polling timeout when there is a SDA glitch(high->low->high).
1. SDA high->low: raising an interrupt to execute IBI ISR
2. SDA low->high
3. Driver writes an AutoIBI request
4. AutoIBI process does not start because SDA is not low
5. IBIWON polling times out
6. Controller reamins in AutoIBI state and doesn't accept EmitStop request
Emitting broadcast address with IBIRESP_MANUAL avoids both issues.
Fixes: dd3c52846d ("i3c: master: svc: Add Silvaco I3C master driver")
Signed-off-by: Stanley Chu <yschu@nuvoton.com>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Link: https://lore.kernel.org/r/20250829012309.3562585-2-yschu@nuvoton.com
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit df4666a490 ]
In addition to sending permitted commands such as connect/auth
over the initial unencrypted admin connection as part of secure
channel concatenation, the host also sends commands such as
Property Get and Identify on the same. This is a spec violation
leading to secure concat failures. Fix this by ensuring these
additional commands are avoided on this connection.
Fixes: 104d0e2f62 ("nvme-fabrics: reset admin connection for secure concatenation")
Signed-off-by: Martin George <marting@netapp.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit db5a5406fb ]
It’s possible for more than one async command to be in flight from
__nvmet_fc_send_ls_req. For each command, a tgtport reference is taken.
In the current code, only one put work item is queued at a time, which
results in a leaked reference.
To fix this, move the work item to the nvmet_fc_ls_req_op struct, which
already tracks all resources related to the command.
Fixes: 710c69dbac ("nvmet-fc: avoid deadlock on delete association path")
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Daniel Wagner <wagi@kernel.org>
Signed-off-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 6ff1bd7846 ]
While setting chap->s2 to zero as part of secure channel
concatenation, the host missed out to disable the bi_directional
flag to indicate that controller authentication is not requested.
Fix the same.
Fixes: e88a7595b5 ("nvme-tcp: request secure channel concatenation")
Signed-off-by: Martin George <marting@netapp.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 00f83f0e07 ]
The function set_prescale_div() is responsible for calculating the clock
divisor settings such that the input clock rate is divided down such that
the required period length is at most 0x10000 clock ticks. If period_cycles
is an integer multiple of 0x10000, the divisor period_cycles / 0x10000 is
good enough. So round up in the calculation of the required divisor and
compare it using >= instead of >.
Fixes: 19891b20e7 ("pwm: pwm-tiehrpwm: PWM driver support for EHRPWM")
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@baylibre.com>
Link: https://lore.kernel.org/r/85488616d7bfcd9c32717651d0be7e330e761b9c.1754927682.git.u.kleine-koenig@baylibre.com
Signed-off-by: Uwe Kleine-König <ukleinek@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit bc7ce5bfc5 ]
In Up-Count Mode the timer is reset to zero one tick after it reaches
TBPRD, so the period length is (TBPRD + 1) * T_TBCLK. This matches both
the documentation and measurements. So the value written to the TBPRD has
to be one less than the calculated period_cycles value.
A complication here is that for a 100% relative duty-cycle the value
written to the CMPx register has to be TBPRD + 1 which might overflow if
TBPRD is 0xffff. To handle that the calculation of the AQCTLx register
has to be moved to ehrpwm_pwm_config() and the edge at CTR = CMPx has to
be skipped.
Additionally the AQCTL_PRD register field has to be 0 because that defines
the hardware's action when the maximal counter value is reached, which is
(as above) one clock tick before the period's end. The period start edge
has to happen when the counter is reset and so is defined in the AQCTL_ZRO
field.
Fixes: 19891b20e7 ("pwm: pwm-tiehrpwm: PWM driver support for EHRPWM")
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@baylibre.com>
Link: https://lore.kernel.org/r/dc818c69b7cf05109ecda9ee6b0043a22de757c1.1754927682.git.u.kleine-koenig@baylibre.com
Signed-off-by: Uwe Kleine-König <ukleinek@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 21a5e91fda ]
The pwm driver calls pm_runtime_get_sync() when the hardware becomes
enabled and pm_runtime_put_sync() when it becomes disabled. The PWM's
state is kept when a consumer goes away, so the call to
pm_runtime_put_sync() in the .free() callback is unbalanced resulting in
a non-functional device and a reference underlow for the second consumer.
The easiest fix for that issue is to just not drop the runtime PM
reference in .free(), so do that.
Fixes: 19891b20e7 ("pwm: pwm-tiehrpwm: PWM driver support for EHRPWM")
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@baylibre.com>
Link: https://lore.kernel.org/r/bbb089c4b5650cc1f7b25cf582d817543fd25384.1754927682.git.u.kleine-koenig@baylibre.com
Signed-off-by: Uwe Kleine-König <ukleinek@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit bd1ce7ef6a ]
When the board was added, its external 32.768 KHz crystal was described
but not hooked up correctly. This meant the device had to fall back to
the SoC's internal oscillator or divide a 32 KHz clock from the main
oscillator, neither of which are accurate for the RTC. As a result the
RTC clock will drift badly.
Hook the crystal up to the RTC block and request the correct clock rate.
Fixes: de713ccb99 ("arm64: dts: allwinner: t527: Add OrangePi 4A board")
Acked-by: Jernej Skrabec <jernej.skrabec@gmail.com>
Link: https://patch.msgid.link/20250913102450.3935943-3-wens@kernel.org
Signed-off-by: Chen-Yu Tsai <wens@csie.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 3d5e1ba00a ]
When the board was added, its external 32.768 KHz crystal was described
but not hooked up correctly. This meant the device had to fall back to
the SoC's internal oscillator or divide a 32 KHz clock from the main
oscillator, neither of which are accurate for the RTC. As a result the
RTC clock will drift badly.
Hook the crystal up to the RTC block and request the correct clock rate.
Fixes: dbe54efa32 ("arm64: dts: allwinner: a523: add Avaota-A1 router support")
Acked-by: Jernej Skrabec <jernej.skrabec@gmail.com>
Link: https://patch.msgid.link/20250913102450.3935943-2-wens@kernel.org
Signed-off-by: Chen-Yu Tsai <wens@csie.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 9f01e1e14e ]
The Radxa Cubie A5E has empty pads for a 32.768 KHz crystal, but it is
left unpopulated, as per the schematics and seen on board images. A dead
give away is the RTC's LOSC auto switch register showing the external
OSC to be abnormal.
Drop the external crystal from the device tree. It was not referenced
anyway.
Fixes: c2520cd032 ("arm64: dts: allwinner: a523: add Radxa A5E support")
Reviewed-by: Jernej Skrabec <jernej.skrabec@gmail.com>
Link: https://patch.msgid.link/20250913102450.3935943-1-wens@kernel.org
Signed-off-by: Chen-Yu Tsai <wens@csie.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 4184f01907 ]
The Radxa Cubie A5E has a 3-color LED. The green and blue LEDs are wired
to GPIO pins on the SoC, and the green one is lit by default to serve as
a power indicator. The red LED is wired to the M.2 slot.
Add device nodes for the green and blue LEDs.
A default "heartbeat" trigger is set for the green power LED, though in
practice it might be better if it were inverted, i.e. lit most of the
time.
Acked-by: Jernej Skrabec <jernej.skrabec@gmail.com>
Link: https://patch.msgid.link/20250812175927.2199219-1-wens@kernel.org
Signed-off-by: Chen-Yu Tsai <wens@csie.org>
Stable-dep-of: 9f01e1e14e ("arm64: dts: allwinner: a527: cubie-a5e: Drop external 32.768 KHz crystal")
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 072755cca7 ]
Rename the inner 'frm' variable to 'resp_frm' in the write path of
mmc_route_rpmb_frames() to avoid shadowing the outer 'frm' variable.
The function declares 'frm' at function scope pointing to the request
frame, but then redeclares another 'frm' variable inside the write
block pointing to the response frame. This shadowing makes the code
confusing and error-prone.
Using 'resp_frm' for the response frame makes the distinction clear
and improves code readability.
Fixes: 7852028a35 ("mmc: block: register RPMB partition with the RPMB subsystem")
Reviewed-by: Avri Altman <avri.altman@sandisk.com>
Reviewed-by: Jens Wiklander <jens.wiklander@linaro.org>
Signed-off-by: Bean Huo <beanhuo@micron.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 0370911565 ]
The touchscreen controller used with the original Krabby design is the
Elan eKTH6918, which is in the same family as eKTH6915, but supporting
a larger screen size with more sense lines.
OTOH, the touchscreen controller that actually shipped on the Tentacruel
devices is the Elan eKTH6A12NAY. A compatible string was added for it
specifically because it has different power sequencing timings.
Fix up the touchscreen nodes for both these. This also includes adding
a previously missing reset line. Also add "no-reset-on-power-off" since
the power is always on, and putting it in reset would consume more
power.
Fixes: 8855d01fb8 ("arm64: dts: mediatek: Add MT8186 Krabby platform based Tentacruel / Tentacool")
Signed-off-by: Chen-Yu Tsai <wenst@chromium.org>
Link: https://lore.kernel.org/r/20250812090135.3310374-1-wenst@chromium.org
Signed-off-by: Matthias Brugger <matthias.bgg@gmail.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit c881d1c37b ]
The efuse block in the MT8188 contains the GPU speed bin cell, and like
the MT8186 one, has the same conversion scheme to work with the GPU OPP
binding. This was reflected in a corresponding change to the efuse DT
binding.
Change the fallback compatible of the MT8188's efuse block from the
generic one to the MT8186 one. This also makes GPU DVFS work properly.
Fixes: d39aacd102 ("arm64: dts: mediatek: mt8188: add lvts definitions")
Fixes: 50e7592cb6 ("arm64: dts: mediatek: mt8188: Add GPU speed bin NVMEM cells")
Signed-off-by: Chen-Yu Tsai <wenst@chromium.org>
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Link: https://lore.kernel.org/r/20250610063431.2955757-3-wenst@chromium.org
Signed-off-by: Matthias Brugger <matthias.bgg@gmail.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 932424a925 ]
This reverts commit 1a314099b7.
The C6x carveouts are reversed intentionally. This is due to the
requirement to keep the DMA memory region as non-cached, however the
minimum granular cache region for C6x is 16MB. So, C66x_0 marks the
entire C66x_1 16MB memory carveouts as non-cached, and uses the DMA
memory region of C66x_1 as its own, and vice-versa.
This was also called out in the original commit which introduced these
reversed carveouts:
"The minimum granularity on the Cache settings on C66x DSP
cores is 16MB, so the DMA memory regions are chosen such that
they are in separate 16MB regions for each DSP, while reserving
a total of 16 MB for each DSP and not changing the overall DSP
remoteproc carveouts."
Fixes: 1a314099b7 ("arm64: dts: ti: k3-j721e-beagleboneai64: Fix reversed C6x carveout locations")
Signed-off-by: Beleswar Padhi <b-padhi@ti.com>
Acked-by: Andrew Davis <afd@ti.com>
Link: https://patch.msgid.link/20250908142826.1828676-23-b-padhi@ti.com
Signed-off-by: Nishanth Menon <nm@ti.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 79a1778c78 ]
This reverts commit 9f3814a7c0.
The C6x carveouts are reversed intentionally. This is due to the
requirement to keep the DMA memory region as non-cached, however the
minimum granular cache region for C6x is 16MB. So, C66x_0 marks the
entire C66x_1 16MB memory carveouts as non-cached, and uses the DMA
memory region of C66x_1 as its own, and vice-versa.
This was also called out in the original commit which introduced these
reversed carveouts:
"The minimum granularity on the Cache settings on C66x DSP cores
is 16MB, so the DMA memory regions are chosen such that they are
in separate 16MB regions for each DSP, while reserving a total
of 16 MB for each DSP and not changing the overall DSP
remoteproc carveouts."
Fixes: 9f3814a7c0 ("arm64: dts: ti: k3-j721e-sk: Fix reversed C6x carveout locations")
Signed-off-by: Beleswar Padhi <b-padhi@ti.com>
Acked-by: Andrew Davis <afd@ti.com>
Link: https://patch.msgid.link/20250908142826.1828676-22-b-padhi@ti.com
Signed-off-by: Nishanth Menon <nm@ti.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit aee0678597 ]
Currently, the reserved memory carveouts used by remote processors are
named like 'rproc-name-<dma>-memory-region@addr'. While it is
descriptive, the node label already serves that purpose. Rename reserved
memory nodes to generic 'memory@addr' to align with the device tree
specifications. This is done for all TI K3 based boards.
Signed-off-by: Beleswar Padhi <b-padhi@ti.com>
Reviewed-by: Francesco Dolcini <francesco.dolcini@toradex.com>
Link: https://patch.msgid.link/20250908142826.1828676-14-b-padhi@ti.com
Signed-off-by: Nishanth Menon <nm@ti.com>
Stable-dep-of: 79a1778c78 ("Revert "arm64: dts: ti: k3-j721e-sk: Fix reversed C6x carveout locations"")
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 00c8fdc280 ]
The J742S2 SoC reuses the common k3-j784s4-j742s2-mcu-wakeup-common.dtsi
for its MCU domain, but it does not override the firmware-name property
for its R5F cores. This causes the wrong firmware binaries to be
referenced.
Introduce a new k3-j742s2-mcu-wakeup.dtsi file to override the
firmware-name property with correct names for J742s2.
Fixes: 38fd90a3e1 ("arm64: dts: ti: Introduce J742S2 SoC family")
Signed-off-by: Beleswar Padhi <b-padhi@ti.com>
Reviewed-by: Udit Kumar <u-kumar1@ti.com>
Link: https://patch.msgid.link/20250823163111.2237199-1-b-padhi@ti.com
Signed-off-by: Nishanth Menon <nm@ti.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 843367c7ed ]
The RK3576 EVB1 has a RTL8211F PHY for each GMAC interface with
a dedicated reset line and the 25MHz clock provided by the SoC.
The current description results in non-working Ethernet as the
clocks are only enabled by the PHY driver, but probing the right
PHY driver currently requires that the PHY ID register can be read
for automatic identification.
This fixes up the network description to get the network functionality
working reliably and cleans up usage of deprecated DT properties while
at it.
Fixes: f135a1a073 ("arm64: dts: rockchip: Add rk3576 evb1 board")
Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
Link: https://lore.kernel.org/r/20250910-rk3576-evb-network-v1-1-68ed4df272a2@collabora.com
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 6293e336f6 ]
This helper only support to allocate the default number of requests,
add a new parameter to support specific number of requests.
Prepare to fix potential deadlock in the case nr_requests grow.
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Reviewed-by: Nilay Shroff <nilay@linux.ibm.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Stable-dep-of: b86433721f ("blk-mq: fix potential deadlock while nr_requests grown")
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit e632004044 ]
No functional changes are intended, make code cleaner and prepare to fix
the grow case in following patches.
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Reviewed-by: Nilay Shroff <nilay@linux.ibm.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Stable-dep-of: b86433721f ("blk-mq: fix potential deadlock while nr_requests grown")
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 7f2799c546 ]
For shared tags case, all hctx->sched_tags/tags are the same, it doesn't
make sense to call into blk_mq_tag_update_depth() multiple times for the
same tags.
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Reviewed-by: Nilay Shroff <nilay@linux.ibm.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Stable-dep-of: b86433721f ("blk-mq: fix potential deadlock while nr_requests grown")
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 626ff4f8eb ]
request_queue->nr_requests can be changed by:
a) switch elevator by updating nr_hw_queues
b) switch elevator by elevator sysfs attribute
c) configue queue sysfs attribute nr_requests
Current lock order is:
1) update_nr_hwq_lock, case a,b
2) freeze_queue
3) elevator_lock, case a,b,c
And update nr_requests is seriablized by elevator_lock() already,
however, in the case c, we'll have to allocate new sched_tags if
nr_requests grow, and do this with elevator_lock held and queue
freezed has the risk of deadlock.
Hence use update_nr_hwq_lock instead, make it possible to allocate
memory if tags grow, meanwhile also prevent nr_requests to be changed
concurrently.
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Reviewed-by: Nilay Shroff <nilay@linux.ibm.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Stable-dep-of: b86433721f ("blk-mq: fix potential deadlock while nr_requests grown")
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b46d4c447d ]
queue_requests_store() is the only caller of
blk_mq_update_nr_requests(), and blk_mq_update_nr_requests() is the
only caller of blk_mq_tag_update_depth(), however, they all have
checkings for nr_requests input by user.
Make code cleaner by moving all the checkings to the top function:
1) nr_requests > reserved tags;
2) if there is elevator, 4 <= nr_requests <= 2048;
3) if elevator is none, 4 <= nr_requests <= tag_set->queue_depth;
Meanwhile, case 2 is the only case tags can grow and -ENOMEM might be
returned.
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Reviewed-by: Nilay Shroff <nilay@linux.ibm.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Stable-dep-of: b86433721f ("blk-mq: fix potential deadlock while nr_requests grown")
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 8bd7195fea ]
1) queue_requests_store() is the only caller of
blk_mq_update_nr_requests(), where queue is already freezed, no need to
check mq_freeze_depth;
2) q->tag_set must be set for request based device, and queue_is_mq() is
already checked in blk_mq_queue_attr_visible(), no need to check
q->tag_set.
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Reviewed-by: Nilay Shroff <nilay@linux.ibm.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Stable-dep-of: b86433721f ("blk-mq: fix potential deadlock while nr_requests grown")
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b2f5974079 ]
Currently, split bio will be chained to original bio, and original bio
will be resubmitted to the tail of current->bio_list, waiting for
split bio to be issued. However, if split bio get split again, the IO
order will be messed up. This problem, on the one hand, will cause
performance degradation, especially for mdraid with large IO size; on
the other hand, will cause write errors for zoned block devices[1].
For example, in raid456 IO will first be split by max_sector from
md_submit_bio(), and then later be split again by chunksize for internal
handling:
For example, assume max_sectors is 1M, and chunksize is 512k
1) issue a 2M IO:
bio issuing: 0+2M
current->bio_list: NULL
2) md_submit_bio() split by max_sector:
bio issuing: 0+1M
current->bio_list: 1M+1M
3) chunk_aligned_read() split by chunksize:
bio issuing: 0+512k
current->bio_list: 1M+1M -> 512k+512k
4) after first bio issued, __submit_bio_noacct() will contuine issuing
next bio:
bio issuing: 1M+1M
current->bio_list: 512k+512k
bio issued: 0+512k
5) chunk_aligned_read() split by chunksize:
bio issuing: 1M+512k
current->bio_list: 512k+512k -> 1536k+512k
bio issued: 0+512k
6) no split afterwards, finally the issue order is:
0+512k -> 1M+512k -> 512k+512k -> 1536k+512k
This behaviour will cause large IO read on raid456 endup to be small
discontinuous IO in underlying disks. Fix this problem by placing split
bio to the head of current->bio_list.
Test script: test on 8 disk raid5 with 64k chunksize
dd if=/dev/md0 of=/dev/null bs=4480k iflag=direct
Test results:
Before this patch
1) iostat results:
Device r/s rMB/s rrqm/s %rrqm r_await rareq-sz aqu-sz %util
md0 52430.00 3276.87 0.00 0.00 0.62 64.00 32.60 80.10
sd* 4487.00 409.00 2054.00 31.40 0.82 93.34 3.68 71.20
2) blktrace G stage:
8,0 0 486445 11.357392936 843 G R 14071424 + 128 [dd]
8,0 0 486451 11.357466360 843 G R 14071168 + 128 [dd]
8,0 0 486454 11.357515868 843 G R 14071296 + 128 [dd]
8,0 0 486468 11.357968099 843 G R 14072192 + 128 [dd]
8,0 0 486474 11.358031320 843 G R 14071936 + 128 [dd]
8,0 0 486480 11.358096298 843 G R 14071552 + 128 [dd]
8,0 0 486490 11.358303858 843 G R 14071808 + 128 [dd]
3) io seek for sdx:
Noted io seek is the result from blktrace D stage, statistic of:
ABS((offset of next IO) - (offset + len of previous IO))
Read|Write seek
cnt 55175, zero cnt 25079
>=(KB) .. <(KB) : count ratio |distribution |
0 .. 1 : 25079 45.5% |########################################|
1 .. 2 : 0 0.0% | |
2 .. 4 : 0 0.0% | |
4 .. 8 : 0 0.0% | |
8 .. 16 : 0 0.0% | |
16 .. 32 : 0 0.0% | |
32 .. 64 : 12540 22.7% |##################### |
64 .. 128 : 2508 4.5% |##### |
128 .. 256 : 0 0.0% | |
256 .. 512 : 10032 18.2% |################# |
512 .. 1024 : 5016 9.1% |######### |
After this patch:
1) iostat results:
Device r/s rMB/s rrqm/s %rrqm r_await rareq-sz aqu-sz %util
md0 87965.00 5271.88 0.00 0.00 0.16 61.37 14.03 90.60
sd* 6020.00 658.44 5117.00 45.95 0.44 112.00 2.68 86.50
2) blktrace G stage:
8,0 0 206296 5.354894072 664 G R 7156992 + 128 [dd]
8,0 0 206305 5.355018179 664 G R 7157248 + 128 [dd]
8,0 0 206316 5.355204438 664 G R 7157504 + 128 [dd]
8,0 0 206319 5.355241048 664 G R 7157760 + 128 [dd]
8,0 0 206333 5.355500923 664 G R 7158016 + 128 [dd]
8,0 0 206344 5.355837806 664 G R 7158272 + 128 [dd]
8,0 0 206353 5.355960395 664 G R 7158528 + 128 [dd]
8,0 0 206357 5.356020772 664 G R 7158784 + 128 [dd]
3) io seek for sdx
Read|Write seek
cnt 28644, zero cnt 21483
>=(KB) .. <(KB) : count ratio |distribution |
0 .. 1 : 21483 75.0% |########################################|
1 .. 2 : 0 0.0% | |
2 .. 4 : 0 0.0% | |
4 .. 8 : 0 0.0% | |
8 .. 16 : 0 0.0% | |
16 .. 32 : 0 0.0% | |
32 .. 64 : 7161 25.0% |############## |
BTW, this looks like a long term problem from day one, and large
sequential IO read is pretty common case like video playing.
And even with this patch, in this test case IO is merged to at most 128k
is due to block layer plug limit BLK_PLUG_FLUSH_SIZE, increase such
limit can get even better performance. However, we'll figure out how to do
this properly later.
[1] https://lore.kernel.org/all/e40b076d-583d-406b-b223-005910a9f46f@acm.org/
Fixes: d89d87965d ("When stacked block devices are in-use (e.g. md or dm), the recursive calls")
Reported-by: Tie Ren <tieren@fnnas.com>
Closes: https://lore.kernel.org/all/7dro5o7u5t64d6bgiansesjavxcuvkq5p2pok7dtwkav7b7ape@3isfr44b6352/
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 0b64682e78 ]
Lots of checks are already done while submitting this bio the first
time, and there is no need to check them again when this bio is
resubmitted after split.
Hence open code should_fail_bio() and blk_throtl_bio() that are still
necessary from submit_bio_split_bioset().
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Stable-dep-of: b2f5974079 ("block: fix ordering of recursive split IO")
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit e37b5596a1 ]
No functional changes are intended, some drivers like mdraid will split
bio by internal processing, prepare to unify bio split codes.
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Stable-dep-of: b2f5974079 ("block: fix ordering of recursive split IO")
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 1f963bdd64 ]
bio->issue_time_ns is only used by blk-iolatency, which can only be
enabled for rq-based disk, hence it's not necessary to initialize
the time for bio-based disk.
Meanwhile, if bio is split by blk_crypto_fallback_split_bio_if_needed(),
the issue time is not initialized for new split bio, this can be fixed
as well.
Noted the next patch will optimize better that bio issue time will
only be used when blk-iolatency is really enabled by the disk.
Fixes: 488f6682c8 ("block: blk-crypto-fallback for Inline Encryption")
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 1733e88874 ]
Now that bio->bi_issue is only used by blk-iolatency to get bio issue
time, replace bio_issue with u64 time directly and remove bio_issue to
make code cleaner.
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Stable-dep-of: 1f963bdd64 ("block: initialize bio issue time in blk_mq_submit_bio()")
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit cdc06f9126 ]
Make sure to drop the reference to the saw device taken by
of_find_device_by_node() after retrieving its driver data during
probe().
Also drop the reference to the CPU node sooner to avoid leaking it in
case there is no saw node or device.
Fixes: 60f3692b5f ("cpuidle: qcom_spm: Detach state machine from main SPM handling")
Signed-off-by: Johan Hovold <johan@kernel.org>
Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f89c7fb83a ]
On RK3588 with LPDDR4X memory, the cycle count as returned by
perf stat -a -e rockchip_ddr/cycles/ sleep 1
consistently reads half as much as what the actual DDR frequency is at.
For a LPDDR4X module running at 2112MHz, I get more like 1056059916
cycles per second, which is almost bang-on half what it should be. No,
I'm not mixing up megatransfers and megahertz.
Consulting the downstream driver, this appears to be because the RK3588
hardware specifically (and RK3528 as well, for future reference) needs a
multiplier of 2 to get to the correct frequency with everything but
LPDDR5.
The RK3588's actual memory bandwidth measurements in MB/s are correct
however, as confirmed with stress-ng --stream. This makes me think the
access counters are not affected in the same way. This tracks with the
vendor kernel not multiplying the access counts either.
Solve this by adding a new member to the dfi struct, which each SoC can
set to whatever they want, but defaults to 1 if left unset by the SoC
init functions. The event_get_count op can then use this multiplier if
the cycle count is requested.
The cycle multiplier is not used in rockchip_dfi_get_event because the
vendor driver doesn't use it there either, and we don't do other actual
bandwidth unit conversion stuff in there anyway.
Fixes: 481d97ba61 ("PM / devfreq: rockchip-dfi: add support for RK3588")
Signed-off-by: Nicolas Frattaroli <nicolas.frattaroli@collabora.com>
Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
Link: https://lore.kernel.org/lkml/20250530-rk3588-dfi-improvements-v1-1-6e077c243a95@collabora.com/
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 0aeb7ed4bc ]
A value of 10 is not valid for "mediatek,pull-down-adv" and
"mediatek,pull-up-adv" properties which have defined values of 0-3. It
appears the "10" was written as a binary value. The driver only looks at
the lowest 2 bits, so the value "10" decimal works out the same as if
"2" was used.
Fixes: cd894e274b ("arm64: dts: mt8183: Add krane-sku176 board")
Fixes: 19b6403f1e ("arm64: dts: mt8183: add mt8183 pumpkin board")
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Link: https://lore.kernel.org/r/20250722171152.58923-2-robh@kernel.org
Signed-off-by: Matthias Brugger <matthias.bgg@gmail.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 3374b5fb26 ]
When test suspend resume with 6.8 based kernel, system can't resume
and I got below error which can be also reproduced with 6.16 rc6+
kernel.
mtk-pcie-gen3 112f0000.pcie: PCIe link down, current LTSSM state: detect.quiet (0x0)
mtk-pcie-gen3 112f0000.pcie: PM: dpm_run_callback(): genpd_resume_noirq returns -110
mtk-pcie-gen3 112f0000.pcie: PM: failed to resume noirq: error -110
After investigation, looks pcie0 has the same problem as pcie1 as
decribed in commit 3d7fdd8e38 ("arm64: dts: mediatek: mt8195:
Remove suspend-breaking reset from pcie1").
Fixes: ecc0af6a3f ("arm64: dts: mt8195: Add pcie and pcie phy nodes")
Signed-off-by: Guoqing Jiang <guoqing.jiang@canonical.com>
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Reviewed-by: Macpaul Lin <macpaul.lin@mediatek.com>
Link: https://lore.kernel.org/r/20250721095959.57703-1-guoqing.jiang@canonical.com
Signed-off-by: Matthias Brugger <matthias.bgg@gmail.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit fe2a449a45 ]
tick_shutdown() sets the state of the clockevent device to detached
first and the invokes clockevents_exchange_device(), which in turn
invokes clockevents_switch_state().
But clockevents_switch_state() returns without invoking the device shutdown
callback as the device is already in detached state. As a consequence the
timer device is not shutdown when a CPU goes offline.
tick_shutdown() does this because it was originally invoked on a online CPU
and not on the outgoing CPU. It therefore could not access the clockevent
device of the already offlined CPU and just set the state.
Since commit 3b1596a21f tick_shutdown() is called on the outgoing CPU, so
the hardware device can be accessed.
Remove the state set before calling clockevents_exchange_device(), so that
the subsequent clockevents_switch_state() handles the state transition and
invokes the shutdown callback of the clockevent device.
[ tglx: Massaged change log ]
Fixes: 3b1596a21f ("clockevents: Shutdown and unregister current clockevents at CPUHP_AP_TICK_DYING")
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
Link: https://lore.kernel.org/all/20250906064952.3749122-2-maobibo@loongson.cn
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit a186120c78 ]
Code in gicv5_its_irq_domain_alloc() has two issues:
- it checks the wrong return value/variable when calling gicv5_alloc_lpi()
- The cleanup code does not take previous loop iterations into account
Fix both issues at once by adding the right gicv5_alloc_lpi() variable
check and by reworking the function cleanup code to take into account
current and previous iterations.
[ lpieralisi: Reworded commit message ]
Fixes: 57d72196df ("irqchip/gic-v5: Add GICv5 ITS support")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Zenghui Yu <yuzenghui@huawei.com>
Link: https://lore.kernel.org/all/20250908082745.113718-4-lpieralisi@kernel.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 4b59a9f762 ]
If AT_SYSINFO_EHDR is missing the whole test needs to be skipped.
Currently this results in the following output:
TAP version 13
1..16
# AT_SYSINFO_EHDR is not present!
This output is incorrect, as "1..16" still requires the subtest lines to
be printed, which isn't done however.
Switch to the correct skipping functions, so the output now correctly
indicates that no subtests are being run:
TAP version 13
1..0 # SKIP AT_SYSINFO_EHDR is not present!
Fixes: 693f5ca08c ("kselftest: Extend vDSO selftest")
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20250812-vdso-tests-fixes-v2-2-90f499dd35f8@linutronix.de
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 9f15e0f9ef ]
The _rval register variable is meant to be an output operand of the asm
statement but is instead used as input operand.
clang 20.1 notices this and triggers -Wuninitialized warnings:
tools/testing/selftests/timers/auxclock.c:154:10: error: variable '_rval' is uninitialized when used here [-Werror,-Wuninitialized]
154 | return VDSO_CALL(self->vdso_clock_gettime64, 2, clockid, ts);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
tools/testing/selftests/timers/../vDSO/vdso_call.h:59:10: note: expanded from macro 'VDSO_CALL'
59 | : "r" (_rval) \
| ^~~~~
tools/testing/selftests/timers/auxclock.c:154:10: note: variable '_rval' is declared here
tools/testing/selftests/timers/../vDSO/vdso_call.h:47:2: note: expanded from macro 'VDSO_CALL'
47 | register long _rval asm ("r3"); \
| ^
It seems the list of input and output operands have been switched around.
However as the argument registers are not always initialized they can not
be marked as pure inputs as that would trigger -Wuninitialized warnings.
Adding _rval as another input and output operand does also not work as it
would collide with the existing _r3 variable.
Instead reuse _r3 for both the argument and the return value.
Fixes: 6eda706a53 ("selftests: vDSO: fix the way vDSO functions are called for powerpc")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Link: https://lore.kernel.org/all/20250812-vdso-tests-fixes-v2-1-90f499dd35f8@linutronix.de
Closes: https://lore.kernel.org/oe-kbuild-all/202506180223.BOOk5jDK-lkp@intel.com/
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit bd9fd5be6b ]
On repeated cold boots we occasionally hit a NULL pointer crash in
blk_should_throtl() when throttling is consulted before the throttle
policy is fully enabled for the queue. Checking only q->td != NULL is
insufficient during early initialization, so blkg_to_pd() for the
throttle policy can still return NULL and blkg_to_tg() becomes NULL,
which later gets dereferenced.
Unable to handle kernel NULL pointer dereference
at virtual address 0000000000000156
...
pc : submit_bio_noacct+0x14c/0x4c8
lr : submit_bio_noacct+0x48/0x4c8
sp : ffff800087f0b690
x29: ffff800087f0b690 x28: 0000000000005f90 x27: ffff00068af393c0
x26: 0000000000080000 x25: 000000000002fbc0 x24: ffff000684ddcc70
x23: 0000000000000000 x22: 0000000000000000 x21: 0000000000000000
x20: 0000000000080000 x19: ffff000684ddcd08 x18: ffffffffffffffff
x17: 0000000000000000 x16: ffff80008132a550 x15: 0000ffff98020fff
x14: 0000000000000000 x13: 1fffe000d11d7021 x12: ffff000688eb810c
x11: ffff00077ec4bb80 x10: ffff000688dcb720 x9 : ffff80008068ef60
x8 : 00000a6fb8a86e85 x7 : 000000000000111e x6 : 0000000000000002
x5 : 0000000000000246 x4 : 0000000000015cff x3 : 0000000000394500
x2 : ffff000682e35e40 x1 : 0000000000364940 x0 : 000000000000001a
Call trace:
submit_bio_noacct+0x14c/0x4c8
verity_map+0x178/0x2c8
__map_bio+0x228/0x250
dm_submit_bio+0x1c4/0x678
__submit_bio+0x170/0x230
submit_bio_noacct_nocheck+0x16c/0x388
submit_bio_noacct+0x16c/0x4c8
submit_bio+0xb4/0x210
f2fs_submit_read_bio+0x4c/0xf0
f2fs_mpage_readpages+0x3b0/0x5f0
f2fs_readahead+0x90/0xe8
Tighten blk_throtl_activated() to also require that the throttle policy
bit is set on the queue:
return q->td != NULL &&
test_bit(blkcg_policy_throtl.plid, q->blkcg_pols);
This prevents blk_should_throtl() from accessing throttle group state
until policy data has been attached to blkgs.
Fixes: a3166c5170 ("blk-throttle: delay initialization until configuration")
Co-developed-by: Liang Jie <liangjie@lixiang.com>
Signed-off-by: Liang Jie <liangjie@lixiang.com>
Signed-off-by: Han Guangjiang <hanguangjiang@lixiang.com>
Reviewed-by: Yu Kuai <yukuai3@huawei.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 7942b226e6 ]
When executing modinfo null_blk, there is an error in the description
of module parameter mbps, and the output information of cache_size is
incomplete.The output of modinfo before and after applying this patch
is as follows:
Before:
[...]
parm: cache_size:ulong
[...]
parm: mbps:Cache size in MiB for memory-backed device.
Default: 0 (none) (uint)
[...]
After:
[...]
parm: cache_size:Cache size in MiB for memory-backed device.
Default: 0 (none) (ulong)
[...]
parm: mbps:Limit maximum bandwidth (in MiB/s).
Default: 0 (no limit) (uint)
[...]
Fixes: 058efe000b ("null_blk: add module parameters for 4 options")
Signed-off-by: Genjian Zhang <zhanggenjian@kylinos.cn>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit a6a2f50ab1 ]
Smatch reported the following warning in eic7700_pinctrl_probe():
drivers/pinctrl/pinctrl-eic7700.c:638 eic7700_pinctrl_probe()
warn: passing zero to 'PTR_ERR'
The root cause is that devm_regulator_get() may return NULL when
CONFIG_REGULATOR is disabled. In such case, IS_ERR_OR_NULL() triggers
PTR_ERR(NULL) which evaluates to 0, leading to passing a success code
as an error.
However, this driver cannot work without a regulator. To fix this:
- Change the check from IS_ERR_OR_NULL() to IS_ERR()
- Update Kconfig to explicitly select REGULATOR and
REGULATOR_FIXED_VOLTAGE, ensuring that the regulator framework is
always available.
This resolves the Smatch warning and enforces the correct dependency.
Suggested-by: Dan Carpenter <dan.carpenter@linaro.org>
Fixes: 5b797bcc00 ("pinctrl: eswin: Add EIC7700 pinctrl driver")
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Closes: https://lore.kernel.org/linux-gpio/aKRGiZ-fai0bv0tG@stanley.mountain/
Signed-off-by: Yulin Lu <luyulin@eswincomputing.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit def5612170 ]
Fix the checkpatch warning:
CHECK: Alignment should match open parenthesis
Fixes: 0cb172a491 ("power: supply: cw2015: Use device managed API to simplify the code")
Signed-off-by: Andy Yan <andyshrk@163.com>
Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 27322753c8 ]
The dtbs_check validation for am335x-cm-t335.dtb flags an error
for an unevaluated 'num-serializer' property in the mcasp0 node.
This property is obsolete; it is not defined in the davinci-mcasp-audio
schema and is not used by the corresponding (or any) driver.
Remove this unused property to fix the schema validation warning.
Fixes: 48ab364478 ("ARM: dts: cm-t335: add audio support")
Signed-off-by: Jihed Chaibi <jihed.chaibi.dev@gmail.com>
Link: https://lore.kernel.org/r/20250830215957.285694-1-jihed.chaibi.dev@gmail.com
Signed-off-by: Kevin Hilman <khilman@baylibre.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 5af5b85505 ]
The ti,keep-vref-on property, defined as a boolean flag in the Device
Tree schema, was incorrectly assigned a value (<1>) in the DTS file,
causing a validation error: "size (4) error for type flag". Remove
the value to match the schema and ensure compatibility with the driver
using device_property_read_bool(). This fixes the dtbs_check error.
Fixes: ed05637c30 ("ARM: dts: omap3-devkit8000: Add ADS7846 Touchscreen support")
Signed-off-by: Jihed Chaibi <jihed.chaibi.dev@gmail.com>
Link: https://lore.kernel.org/r/20250822225052.136919-1-jihed.chaibi.dev@gmail.com
Signed-off-by: Kevin Hilman <khilman@baylibre.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 7d337eef4a ]
Current depth_updated has some problems:
1) depth_updated() will be called for each hctx, while all elevators
will update async_depth for the disk level, this is not related to hctx;
2) In blk_mq_update_nr_requests(), if previous hctx update succeed and
this hctx update failed, q->nr_requests will not be updated, while
async_depth is already updated with new nr_reqeuests in previous
depth_updated();
3) All elevators are using q->nr_requests to calculate async_depth now,
however, q->nr_requests is still the old value when depth_updated() is
called from blk_mq_update_nr_requests();
Those problems are first from error path, then mq-deadline, and recently
for bfq and kyber, fix those problems by:
- pass in request_queue instead of hctx;
- move depth_updated() after q->nr_requests is updated in
blk_mq_update_nr_requests();
- add depth_updated() call inside init_sched() method to initialize
async_depth;
- remove init_hctx() method for mq-deadline and bfq that is useless now;
Fixes: 77f1e0a52d ("bfq: update internal depth state when queue depth changes")
Fixes: 39823b47bb ("block/mq-deadline: Fix the tag reservation code")
Fixes: 42e6c6ce03 ("lib/sbitmap: convert shallow_depth from one word to the whole sbitmap")
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Li Nan <linan122@huawei.com>
Reviewed-by: Nilay Shroff <nilay@linux.ibm.com>
Link: https://lore.kernel.org/r/20250821060612.1729939-2-yukuai1@huaweicloud.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit be82483d1b ]
If system suspend is aborted in the "noirq" phase (for instance, due to
an error returned by one of the device callbacks), power.is_noirq_suspended
will not be set for some devices and device_resume_noirq() will return
early for them. Consequently, noirq resume callbacks will not run for
them at all because the noirq suspend callbacks have not run for them
yet.
If any of them has power.must_resume set and late suspend has been
skipped for it (due to power.smart_suspend), early resume should be
skipped for it either, or its state may become inconsistent (for
instance, if the early resume assumes that it will always follow
noirq resume).
Make that happen by clearing power.must_resume in device_resume_noirq()
for devices with power.is_noirq_suspended clear that have been left in
suspend by device_suspend_late(), which will subsequently cause
device_resume_early() to leave the device in suspend and avoid
changing its state.
Fixes: 0d4b54c6fe ("PM / core: Add LEAVE_SUSPENDED driver flag")
Link: https://lore.kernel.org/linux-pm/5d692b81-6f58-4e86-9cb0-ede69a09d799@rowland.harvard.edu/
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Link: https://patch.msgid.link/3381776.aeNJFYEL58@rafael.j.wysocki
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 8ad25ebfa7 ]
It's possible to run these tests on platforms that think they have a
hotpluggable CPU1, but for whatever reason, CPU1 is not online and can't be
brought online:
# irq_cpuhotplug_test: EXPECTATION FAILED at kernel/irq/irq_test.c:210
Expected remove_cpu(1) == 0, but
remove_cpu(1) == 1 (0x1)
CPU1: failed to boot: -38
# irq_cpuhotplug_test: EXPECTATION FAILED at kernel/irq/irq_test.c:214
Expected add_cpu(1) == 0, but
add_cpu(1) == -38 (0xffffffffffffffda)
Check that CPU1 is actually online before trying to run the test.
Fixes: 66067c3c8a ("genirq: Add kunit tests for depth counts")
Reported-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Brian Norris <briannorris@chromium.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Reviewed-by: David Gow <davidgow@google.com>
Link: https://lore.kernel.org/all/20250822190140.2154646-7-briannorris@chromium.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit add03fdb9d ]
Not all platforms use the generic IRQ migration code, even if they select
GENERIC_IRQ_MIGRATION. (See, for example, powerpc / pseries_cpu_disable().)
If such platforms don't perform managed shutdown the same way, the interrupt
may not actually shut down, and these tests fail:
[ 4.357022][ T101] # irq_cpuhotplug_test: EXPECTATION FAILED at kernel/irq/irq_test.c:211
[ 4.357022][ T101] Expected irqd_is_activated(data) to be false, but is true
[ 4.358128][ T101] # irq_cpuhotplug_test: EXPECTATION FAILED at kernel/irq/irq_test.c:212
[ 4.358128][ T101] Expected irqd_is_started(data) to be false, but is true
[ 4.375558][ T101] # irq_cpuhotplug_test: EXPECTATION FAILED at kernel/irq/irq_test.c:216
[ 4.375558][ T101] Expected irqd_is_activated(data) to be false, but is true
[ 4.376088][ T101] # irq_cpuhotplug_test: EXPECTATION FAILED at kernel/irq/irq_test.c:217
[ 4.376088][ T101] Expected irqd_is_started(data) to be false, but is true
[ 4.377851][ T1] # irq_cpuhotplug_test: pass:0 fail:1 skip:0 total:1
[ 4.377901][ T1] not ok 4 irq_cpuhotplug_test
[ 4.378073][ T1] # irq_test_cases: pass:3 fail:1 skip:0 total:4
Rather than test that PowerPC performs migration the same way as the
unterrupt core, just drop the state checks. The point of the test was to
ensure that the code kept |depth| balanced, which still can be tested for.
Fixes: 66067c3c8a ("genirq: Add kunit tests for depth counts")
Reported-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Brian Norris <briannorris@chromium.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Reviewed-by: David Gow <davidgow@google.com>
Link: https://lore.kernel.org/all/20250822190140.2154646-6-briannorris@chromium.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 0c888bc86d ]
Some architectures have a static interrupt layout, with a limited number of
interrupts. Without SPARSE_IRQ, the test may not be able to allocate any
fake interrupts, and the test will fail. (This occurs on ARCH=m68k, for
example.)
Additionally, managed-affinity is only supported with CONFIG_SPARSE_IRQ=y,
so irq_shutdown_depth_test() and irq_cpuhotplug_test() would fail without
it.
Add a 'SPARSE_IRQ' dependency to avoid these problems.
Many architectures 'select SPARSE_IRQ', so this is easy to miss.
Notably, this also excludes ARCH=um from running any of these tests, even
though some of them might work.
Fixes: 66067c3c8a ("genirq: Add kunit tests for depth counts")
Reported-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Brian Norris <briannorris@chromium.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Reviewed-by: David Gow <davidgow@google.com>
Link: https://lore.kernel.org/all/20250822190140.2154646-5-briannorris@chromium.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit c9163915a9 ]
The new irq KUnit tests fail on some architectures (notably PowerPC and
32-bit ARM), as the request_irq() call fails due to the ARCH_IRQ_INIT_FLAGS
containing IRQ_NOREQUEST, yielding the following errors:
[10:17:45] # irq_free_disabled_test: EXPECTATION FAILED at kernel/irq/irq_test.c:88
[10:17:45] Expected ret == 0, but
[10:17:45] ret == -22 (0xffffffffffffffea)
[10:17:45] # irq_free_disabled_test: EXPECTATION FAILED at kernel/irq/irq_test.c:90
[10:17:45] Expected desc->depth == 0, but
[10:17:45] desc->depth == 1 (0x1)
[10:17:45] # irq_free_disabled_test: EXPECTATION FAILED at kernel/irq/irq_test.c:93
[10:17:45] Expected desc->depth == 1, but
[10:17:45] desc->depth == 2 (0x2)
By clearing IRQ_NOREQUEST from the interrupt descriptor, these tests now
pass on ARM and PowerPC.
Fixes: 66067c3c8a ("genirq: Add kunit tests for depth counts")
Signed-off-by: David Gow <davidgow@google.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Tested-by: Brian Norris <briannorris@chromium.org>
Reviewed-by: Brian Norris <briannorris@chromium.org>
Link: https://lore.kernel.org/all/20250816094528.3560222-2-davidgow@google.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 4ed4607327 ]
Add various vendor prefixes which are in use in compatible strings
already. These were found by modifying vendor-prefixes.yaml into a
schema to check compatible strings.
The added prefixes doesn't include various duplicate prefixes in use
such as "lge".
Link: https://lore.kernel.org/r/20250821222136.1027269-1-robh@kernel.org
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit d6058316d1 ]
Memory programming doesn't work for devices without page support.
For example, LP5562 has 3 engines but doesn't support pages,
the start address is changed depending on engine number.
According to datasheet [1], the PROG MEM register addresses for each
engine are as follows:
Engine 1: 0x10
Engine 2: 0x30
Engine 3: 0x50
However, the current implementation incorrectly calculates the address
of PROG MEM register using the engine index starting from 1:
prog_mem_base = 0x10
LP55xx_BYTES_PER_PAGE = 0x20
Engine 1: 0x10 + 0x20 * 1 = 0x30
Engine 2: 0x10 + 0x20 * 2 = 0x50
Engine 3: 0x10 + 0x20 * 3 = 0x70
This results in writing to the wrong engine memory, causing pattern
programming to fail.
To correct it, the engine index should be decreased:
Engine 1: 0x10 + 0x20 * 0 = 0x10
Engine 2: 0x10 + 0x20 * 1 = 0x30
Engine 3: 0x10 + 0x20 * 2 = 0x50
1 - https://www.ti.com/lit/ds/symlink/lp5562.pdf
Fixes: 31379a57cf ("leds: leds-lp55xx: Generalize update_program_memory function")
Signed-off-by: Andrei Lalaev <andrei.lalaev@anton-paar.com>
Link: https://lore.kernel.org/r/20250820-lp5562-prog-mem-address-v1-1-8569647fa71d@anton-paar.com
Signed-off-by: Lee Jones <lee@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit d8e2f91999 ]
The "Memory out of range" subtest of futex_numa_mpol assumes that memory
access outside of the mmap'ed area is invalid. That may not be the case
depending on the actual memory layout of the test application. When that
subtest was run on an x86-64 system with latest upstream kernel, the test
passed as an error was returned from futex_wake(). On another PowerPC system,
the same subtest failed because futex_wake() returned 0.
Bail out! futex2_wake(64, 0x86) should fail, but didn't
Looking further into the passed subtest on x86-64, it was found that an
-EINVAL was returned instead of -EFAULT. The -EINVAL error was returned
because the node value test with FLAGS_NUMA set failed with a node value
of 0x7f7f. IOW, the futex memory was accessible and futex_wake() failed
because the supposed node number wasn't valid. If that memory location
happens to have a very small value (e.g. 0), the test will pass and no
error will be returned.
Since this subtest is non-deterministic, drop it unless a guard page beyond
the mmap region is explicitly set.
The other problematic test is the "Memory too small" test. The futex_wake()
function returns the -EINVAL error code because the given futex address isn't
8-byte aligned, not because only 4 of the 8 bytes are valid and the other
4 bytes are not. So change the name of this subtest to "Mis-aligned futex" to
reflect the reality.
[ bp: Massage commit message. ]
Fixes: 3163369407 ("selftests/futex: Add futex_numa_mpol")
Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Link: https://lore.kernel.org/20250827130011.677600-3-bigeasy@linutronix.de
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 9d35d068fb ]
Change the 'ret' variable from u32 to int to store negative error codes or
zero returned by of_property_read_u32().
Storing the negative error codes in unsigned type, doesn't cause an issue
at runtime but it's ugly as pants. Additionally, assigning negative error
codes to unsigned type may trigger a GCC warning when the -Wsign-conversion
flag is enabled.
No effect on runtime.
Signed-off-by: Qianfeng Rong <rongqianfeng@vivo.com>
Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
Fixes: 0fbeae70ee ("regulator: add SCMI driver")
Link: https://patch.msgid.link/20250829101411.625214-1-rongqianfeng@vivo.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 6e08cdd604 ]
PCIe `port01` of t8103-j457 (iMac, M1, 2 USB-C ports, 2021) is unused
and disabled. Linux' PCI subsystem assigns the ethernet nic from
`port02` to bus 02. This results into assigning `pcie0_dart_1` from the
disabled port as iommu. The `pcie0_dart_1` instance is disabled and
probably fused off (it is on the M2 Pro Mac mini which has a disabled
PCIe port as well).
Without iommu the ethernet nic is not expected work.
Adjusts the "bus-range" and the PCIe devices "reg" property to PCI
subsystem's bus number.
Fixes: 7c77ab91b3 ("arm64: dts: apple: Add missing M1 (t8103) devices")
Reviewed-by: Neal Gompa <neal@gompa.dev>
Reviewed-by: Sven Peter <sven@kernel.org>
Signed-off-by: Janne Grunau <j@jannau.net>
Link: https://lore.kernel.org/r/20250823-apple-dt-sync-6-17-v2-1-6dc0daeb4786@jannau.net
Signed-off-by: Sven Peter <sven@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 296302d3d8 ]
The at91_mckx_ps_restore() assembly function is responsible for setting
back MCKx system bus clocks after exiting low power modes.
Fix a typo and use tmp3 variable instead of tmp2 to correctly set MCKx
to previously saved state.
Tmp2 was used without the needed changes in CSS and DIV. Moreover the
required bit 7, telling that MCR register's content is to be changed
(CMD/write), was not set.
Fix function comment to match tmp variables actually used.
Signed-off-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Fixes: 28eb1d40fe ("ARM: at91: pm: add support for MCK1..4 save/restore for ulp modes")
Link: https://lore.kernel.org/r/20250827145427.46819-3-nicolas.ferre@microchip.com
Reviewed-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
[claudiu.beznea: s/sate/state in commit description]
Signed-off-by: Claudiu Beznea <claudiu.beznea@tuxon.dev>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 2e62688d58 ]
The -g parameter was meant to the test the immutable global hash instead of the
private hash which has been made immutable. The global hash is tested as part
at the end of the regular test. The immutable private hash been removed.
Remove last traces of the immutable private hash.
Fixes: 16adc7f136 ("selftests/futex: Remove support for IMMUTABLE")
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: André Almeida <andrealmeid@igalia.com>
Link: https://lore.kernel.org/20250827130011.677600-2-bigeasy@linutronix.de
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 4c7ef92f6d ]
In __blk_mq_update_nr_hw_queues() the return value of
blk_mq_sysfs_register_hctxs() is not checked. If sysfs creation for hctx
fails, later changing the number of hw_queues or removing disk will
trigger the following warning:
kernfs: can not remove 'nr_tags', no directory
WARNING: CPU: 2 PID: 637 at fs/kernfs/dir.c:1707 kernfs_remove_by_name_ns+0x13f/0x160
Call Trace:
remove_files.isra.1+0x38/0xb0
sysfs_remove_group+0x4d/0x100
sysfs_remove_groups+0x31/0x60
__kobject_del+0x23/0xf0
kobject_del+0x17/0x40
blk_mq_unregister_hctx+0x5d/0x80
blk_mq_sysfs_unregister_hctxs+0x94/0xd0
blk_mq_update_nr_hw_queues+0x124/0x760
nullb_update_nr_hw_queues+0x71/0xf0 [null_blk]
nullb_device_submit_queues_store+0x92/0x120 [null_blk]
kobjct_del() was called unconditionally even if sysfs creation failed.
Fix it by checkig the kobject creation statusbefore deleting it.
Fixes: 477e19dedc ("blk-mq: adjust debugfs and sysfs register when updating nr_hw_queues")
Signed-off-by: Li Nan <linan122@huawei.com>
Reviewed-by: Yu Kuai <yukuai3@huawei.com>
Link: https://lore.kernel.org/r/20250826084854.1030545-1-linan666@huaweicloud.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f87412d18e ]
Unconditionally clear the TCS_AMC_MODE_TRIGGER bit when a
transaction completes. Previously this bit was only cleared when
a wake TCS was borrowed as an AMC TCS but not for dedicated
AMC TCS. Leaving this bit set for AMC TCS and entering deeper low
power modes can generate a false completion IRQ.
Prevent this scenario by always clearing the TCS_AMC_MODE_TRIGGER
bit upon receiving a completion IRQ.
Fixes: 15b3bf61b8 ("soc: qcom: rpmh-rsc: Clear active mode configuration for wake TCS")
Signed-off-by: Sneh Mankad <sneh.mankad@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20250825-rpmh_rsc_change-v1-1-138202c31bf6@oss.qualcomm.com
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 9d8c41816b ]
When using NVME on SG2044, the NVME drvier always complains about "I/O tag
XXX (XXX) QID XX timeout, completion polled", which is caused by the broken
affinity setting mechanism of the sg2042-msi driver.
The PLIC driver can only the set the affinity when enabled, but the
sg2042-msi driver invokes the affinity setter in disabled state, which
causes the change to be lost.
Cure this by implementing the irq_startup()/shutdown() callbacks, which
allow to startup (enabled) the underlying PLIC first.
Fixes: e96b93a97c ("irqchip/sg2042-msi: Add the Sophgo SG2044 MSI interrupt controller")
Reported-by: Han Gao <rabenda.cn@gmail.com>
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Inochi Amaoto <inochiama@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Chen Wang <unicorn_wang@outlook.com> # Pioneerbox
Reviewed-by: Chen Wang <unicorn_wang@outlook.com>
Link: https://lore.kernel.org/all/20250813232835.43458-4-inochiama@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 54f45a30c0 ]
As the RISC-V PLIC cannot apply affinity settings without invoking
irq_enable(), it will make the interrupt unavailble when used as an
underlying interrupt chip for the MSI controller.
Implement the irq_startup() and irq_shutdown() callbacks for the PCI MSI
and MSI-X templates.
For chips that specify MSI_FLAG_PCI_MSI_STARTUP_PARENT, the parent startup
and shutdown functions are invoked. That allows the interrupt on the parent
chip to be enabled if the interrupt has not been enabled during
allocation. This is necessary for MSI controllers which use PLIC as
underlying parent interrupt chip.
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Inochi Amaoto <inochiama@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Chen Wang <unicorn_wang@outlook.com> # Pioneerbox
Reviewed-by: Chen Wang <unicorn_wang@outlook.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://lore.kernel.org/all/20250813232835.43458-3-inochiama@gmail.com
Stable-dep-of: 9d8c41816b ("irqchip/sg2042-msi: Fix broken affinity setting")
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 7a721a2fee ]
As the MSI controller on SG2044 uses PLIC as the underlying interrupt
controller, it needs to call irq_enable() and irq_disable() to
startup/shutdown interrupts. Otherwise, the MSI interrupt can not be
startup correctly and will not respond any incoming interrupt.
Introduce irq_chip_startup_parent() and irq_chip_shutdown_parent() to allow
the interrupt controller to call the irq_startup()/irq_shutdown() callbacks
of the parent interrupt chip.
In case the irq_startup()/irq_shutdown() callbacks are not implemented for
the parent interrupt chip, this will fallback to irq_chip_enable_parent()
or irq_chip_disable_parent().
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Inochi Amaoto <inochiama@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Chen Wang <unicorn_wang@outlook.com> # Pioneerbox
Reviewed-by: Chen Wang <unicorn_wang@outlook.com>
Link: https://lore.kernel.org/all/20250813232835.43458-2-inochiama@gmail.com
Link: https://lore.kernel.org/lkml/20250722224513.22125-1-inochiama@gmail.com/
Stable-dep-of: 9d8c41816b ("irqchip/sg2042-msi: Fix broken affinity setting")
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 6fdaf3b183 ]
According to the imx95 RM, the lpuart7 rx and tx DMA's srcid are 88 and 87,
and the lpuart8 rx and tx DMA's srcid are 90 and 89. So correct them.
Fixes: 915fd2e127 ("arm64: dts: imx95: add edma[1..3] nodes")
Signed-off-by: Joy Zou <joy.zou@nxp.com>
Signed-off-by: Peng Fan <peng.fan@nxp.com>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit c94737568b ]
The assignment of the USB ports is wrong and needs to be swapped.
The OTG (USB-C) port is on the first port and the host port with
the onboard hub is on the second port.
Signed-off-by: Frieder Schrempf <frieder.schrempf@kontron.de>
Fixes: 2b52fd6035 ("arm64: dts: Add support for Kontron i.MX93 OSM-S SoM and BL carrier board")
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit e8faa8a466 ]
After commit 20bda12a0e (“firmware: arm_scmi: Make VirtIO transport a
standalone driver”), the VirtIO transport probes independently. During
scmi_virtio_probe, scmi_probe() is called, which intune invokes
scmi_protocol_acquire() that sends a message over the virtqueue and
waits for a reply.
Previously, DRIVER_OK was only set after scmi_vio_probe, in the core
virtio via virtio_dev_probe(). According to the Virtio spec (3.1 Device
Initialization):
| The driver MUST NOT send any buffer available notifications to the
| device before setting DRIVER_OK.
Some type-1 hypervisors block available-buffer notifications until the
driver is marked OK. In such cases, scmi_vio_probe stalls in
scmi_wait_for_reply(), and the probe never completes.
Resolve this by setting DRIVER_OK immediately after the device-specific
setup, so scmi_probe() can safely send notifications.
Note after splitting the transports into modules, the probe sequence
changed a bit. We can no longer rely on virtio_device_ready() being
called by the core in virtio_dev_probe(), because scmi_vio_probe()
doesn’t complete until the core SCMI stack runs scmi_probe(), which
immediately issues the initial BASE protocol exchanges.
Fixes: 20bda12a0e ("firmware: arm_scmi: Make VirtIO transport a standalone driver")
Signed-off-by: Junnan Wu <junnan01.wu@samsung.com>
Message-Id: <20250812075343.3201365-1-junnan01.wu@samsung.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 7d1e3aa282 ]
The Retronix R-Car V4H Sparrow Hawk EVTB1 uses 1V8 IO voltage supply
for VDDQ18_25_AVB power rail. Update the AVB0 pinmux to reflect the
change in IO voltage. Since the VDDQ18_25_AVB power rail is shared,
all four AVB0, AVB1, AVB2, TSN0 PFC/GPIO POC[7..4] registers have to
be configured the same way. As the EVTA1 boards are from a limited run
and generally not available, update the DT to make it compatible with
EVTB1 IO voltage settings.
Fixes: a719915e76 ("arm64: dts: renesas: r8a779g3: Add Retronix R-Car V4H Sparrow Hawk board support")
Signed-off-by: Marek Vasut <marek.vasut+renesas@mailbox.org>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://lore.kernel.org/20250806192821.133302-1-marek.vasut+renesas@mailbox.org
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit ae95807b00 ]
Invert the polarity of microSD voltage selector on Retronix R-Car V4H
Sparrow Hawk board. The voltage selector was not populated on prototype
EVTA1 boards, and is implemented slightly different on EVTB1 boards. As
the EVTA1 boards are from a limited run and generally not available,
update the DT to make it compatible with EVTB1 microSD voltage selector.
Fixes: a719915e76 ("arm64: dts: renesas: r8a779g3: Add Retronix R-Car V4H Sparrow Hawk board support")
Signed-off-by: Marek Vasut <marek.vasut+renesas@mailbox.org>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://lore.kernel.org/20250727235905.290427-1-marek.vasut+renesas@mailbox.org
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit cd5d4621ba ]
Broadcom STB platforms were early adopters (2017) of the SCMI framework and as
a result, not all deployed systems have a Device Tree entry where SCMI
protocol 0x13 (PERFORMANCE) is declared as a clock provider, nor are the
CPU Device Tree node(s) referencing protocol 0x13 as their clock
provider. This was clarified in commit e11c480b6d ("dt-bindings:
firmware: arm,scmi: Extend bindings for protocol@13") in 2023.
For those platforms, we allow the checks done by scmi_dev_used_by_cpus()
to continue, and in the event of not having done an early return, we key
off the documented compatible string and give them a pass to continue to
use scmi-cpufreq.
Fixes: 6c9bb86922 ("cpufreq: scmi: Skip SCMI devices that aren't used by the CPUs")
Signed-off-by: Florian Fainelli <florian.fainelli@broadcom.com>
Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit bc3905a71f ]
The tailcall_bpf2bpf_hierarchy_fentry test hangs on s390. Its call
graph is as follows:
entry()
subprog_tail()
trampoline()
fentry()
the rest of subprog_tail() # via BPF_TRAMP_F_CALL_ORIG
return to entry()
The problem is that the rest of subprog_tail() increments the tail call
counter, but the trampoline discards the incremented value. This
results in an astronomically large number of tail calls.
Fix by making the trampoline write the incremented tail call counter
back.
Fixes: 528eb2cb87 ("s390/bpf: Implement arch_prepare_bpf_trampoline()")
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20250813121016.163375-4-iii@linux.ibm.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit c861a6b147 ]
The tailcall_bpf2bpf_hierarchy_1 test hangs on s390. Its call graph is
as follows:
entry()
subprog_tail()
bpf_tail_call_static(0) -> entry + tail_call_start
subprog_tail()
bpf_tail_call_static(0) -> entry + tail_call_start
entry() copies its tail call counter to the subprog_tail()'s frame,
which then increments it. However, the incremented result is discarded,
leading to an astronomically large number of tail calls.
Fix by writing the incremented counter back to the entry()'s frame.
Fixes: dd691e847d ("s390/bpf: Implement bpf_jit_supports_subprog_tailcalls()")
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20250813121016.163375-3-iii@linux.ibm.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 6e3779e3c6 ]
Coverity noticed that assigning value -EINVAL to 'ret' in the if
statement is useless because 'ret' is overwritten a few lines later.
However, after inspect the code, this warning reveals that we need to
return -EINVAL instead of the variable assignment. So, fix it.
Coverity-id: 1646104
Fixes: aebb5fc9a0 ("leds: max77705: Add LEDs support")
Signed-off-by: Len Bao <len.bao@gmx.us>
Link: https://lore.kernel.org/r/20250727075649.34496-1-len.bao@gmx.us
Signed-off-by: Lee Jones <lee@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit abdaf49be5 ]
Graph tracer framework ensures we won't migrate, kprobe_multi_link_prog_run
called all the way from graph tracer, which disables preemption in
function_graph_enter_regs, as Jiri and Yonghong suggested, there is no
need to use migrate_disable. As a result, some overhead may will be reduced.
And add cant_sleep check for __this_cpu_inc_return.
Fixes: 0dcac27254 ("bpf: Add multi kprobe link")
Signed-off-by: Tao Chen <chen.dylane@linux.dev>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20250814121430.2347454-1-chen.dylane@linux.dev
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit c80d797206 ]
Based on a bisect, it appears that commit 7ee9887703 ("timers:
Implement the hierarchical pull model") has somehow inadvertently
broken BPF selftest test_tcpnotify_user. The error that is being
generated by this test is as follows:
FAILED: Wrong stats Expected 10 calls, got 8
It looks like the change allows timer functions to be run on CPUs
different from the one they are armed on. The test had pinned itself
to CPU 0, and in the past the retransmit attempts also occurred on CPU
0. The test had set the max_entries attribute for
BPF_MAP_TYPE_PERF_EVENT_ARRAY to 2 and was calling
bpf_perf_event_output() with BPF_F_CURRENT_CPU, so the entry was
likely to be in range. With the change to allow timers to run on other
CPUs, the current CPU tasked with performing the retransmit might be
bumped and in turn fall out of range, as the event will be filtered
out via __bpf_perf_event_output() using:
if (unlikely(index >= array->map.max_entries))
return -E2BIG;
A possible change would be to explicitly set the max_entries attribute
for perf_event_map in test_tcpnotify_kern.c to a value that's at least
as large as the number of CPUs. As it turns out however, if the field
is left unset, then the libbpf will determine the number of CPUs available
on the underlying system and update the max_entries attribute accordingly
in map_set_def_max_entries().
A further problem with the test is that it has a thread that continues
running up until the program exits. The main thread cleans up some
LIBBPF data structures, while the other thread continues to use them,
which inevitably will trigger a SIGSEGV. This can be dealt with by
telling the thread to run for as long as necessary and doing a
pthread_join on it before exiting the program.
Finally, I don't think binding the process to CPU 0 is meaningful for
this test any more, so get rid of that.
Fixes: 435f90a338 ("selftests/bpf: add a test case for sock_ops perf-event notification")
Signed-off-by: Matt Bobrowski <mattbobrowski@google.com>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Link: https://patch.msgid.link/aJ8kHhwgATmA3rLf@google.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 23fca458f6 ]
Unsafe code in CpumaskVar's methods assumes that the type has the same
layout as `bindings::cpumask_var_t`. This is not guaranteed by
the default struct representation in Rust, but requires specifying the
`transparent` representation.
Fixes: 8961b8cb30 ("rust: cpumask: Add initial abstractions")
Signed-off-by: Baptiste Lepers <baptiste.lepers@gmail.com>
Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 07866544e4 ]
Commit d6212d82bf ("selftests/bpf: Consolidate kernel modules into
common directory") consolidated the Makefile of test_kmods. However,
since it removed test_kmods from TEST_GEN_PROGS_EXTENDED, the kernel
modules required by bpf selftests are now missing from kselftest_install
when "make install". Fix it by adding test_kmod to TEST_GEN_FILES.
Fixes: d6212d82bf ("selftests/bpf: Consolidate kernel modules into common directory")
Signed-off-by: Amery Hung <ameryhung@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20250812175039.2323570-1-ameryhung@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit c93c59baa5 ]
Yonghong noticed that error messages for potential verifier bugs often
have a '(1)' at the end. This is happening because verifier_bug_if(cond,
env, fmt, args...) prints "(" #cond ")\n" as part of the message and
verifier_bug() is defined as:
#define verifier_bug(env, fmt, args...) verifier_bug_if(1, env, fmt, ##args)
Hence, verifier_bug() always ends up displaying '(1)'. This small patch
fixes it by having verifier_bug_if conditionally call verifier_bug
instead of the other way around.
Fixes: 1cb0f56d96 ("bpf: WARN_ONCE on verifier bugs")
Reported-by: Yonghong Song <yonghong.song@linux.dev>
Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Tested-by: Eduard Zingerman <eddyz87@gmail.com>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/bpf/aJo9THBrzo8jFXsh@mail.gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 8912b2862b ]
rzg3s_oen_read() returns a u32 value, but previously propagated a negative
error code from rzg3s_pin_to_oen_bit(), resulting in an unintended large
positive value due to unsigned conversion. This caused incorrect
output-enable reporting for certain pins.
Without this patch, pins P1_0-P1_4 and P7_0-P7_4 are incorrectly reported
as "output enabled" in the pinconf-pins debugfs file. With this fix, only
P1_0-P1_1 and P7_0-P7_1 are shown as "output enabled", which matches the
hardware manual.
Fix this by returning 0 when the OEN bit lookup fails, treating the pin
as output-disabled by default.
Fixes: a9024a323a ("pinctrl: renesas: rzg2l: Clean up and refactor OEN read/write functions")
Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://lore.kernel.org/20250709160819.306875-2-prabhakar.mahadev-lad.rj@bp.renesas.com
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 67378b7546 ]
[BUG DURING BS > PS TEST]
When running the following script on a btrfs whose block size is larger
than page size, e.g. 8K block size and 4K page size, it will trigger a
kernel BUG:
# mkfs.btrfs -s 8k $dev
# mount $dev $mnt
# mkdir $mnt/dir
# ln -s dir $mnt/link
# ls $mnt/link
The call trace looks like this:
BTRFS warning (device dm-2): support for block size 8192 with page size 4096 is experimental, some features may be missing
BTRFS info (device dm-2): checking UUID tree
BTRFS info (device dm-2): enabling ssd optimizations
BTRFS info (device dm-2): enabling free space tree
------------[ cut here ]------------
kernel BUG at /home/adam/linux/include/linux/highmem.h:275!
Oops: invalid opcode: 0000 [#1] SMP
CPU: 8 UID: 0 PID: 667 Comm: ls Tainted: G OE 6.17.0-rc4-custom+ #283 PREEMPT(full)
Tainted: [O]=OOT_MODULE, [E]=UNSIGNED_MODULE
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS unknown 02/02/2022
RIP: 0010:zero_user_segments.constprop.0+0xdc/0xe0 [btrfs]
Call Trace:
<TASK>
btrfs_get_extent.cold+0x85/0x101 [btrfs 7453c70c03e631c8d8bfdd4264fa62d3e238da6f]
btrfs_do_readpage+0x244/0x750 [btrfs 7453c70c03e631c8d8bfdd4264fa62d3e238da6f]
btrfs_read_folio+0x9c/0x100 [btrfs 7453c70c03e631c8d8bfdd4264fa62d3e238da6f]
filemap_read_folio+0x37/0xe0
do_read_cache_folio+0x94/0x3e0
__page_get_link.isra.0+0x20/0x90
page_get_link+0x16/0x40
step_into+0x69b/0x830
path_lookupat+0xa7/0x170
filename_lookup+0xf7/0x200
? set_ptes.isra.0+0x36/0x70
vfs_statx+0x7a/0x160
do_statx+0x63/0xa0
__x64_sys_statx+0x90/0xe0
do_syscall_64+0x82/0xae0
entry_SYSCALL_64_after_hwframe+0x4b/0x53
</TASK>
Please note bs > ps support is still under development and the
enablement patch is not even in btrfs development branch.
[CAUSE]
Btrfs reuses its data folio read path to handle symbolic links, as the
symbolic link target is stored as an inline data extent.
But for newly created inodes, btrfs only set the minimal order if the
target inode is a regular file.
Thus for above newly created symbolic link, it doesn't properly respect
the minimal folio order, and triggered the above crash.
[FIX]
Call btrfs_set_inode_mapping_order() unconditionally inside
btrfs_create_new_inode().
For symbolic links this will fix the crash as now the folio will meet
the minimal order.
For regular files this brings no change.
For directory/bdev/char and all the other types of inodes, they won't
go through the data read path, thus no effect either.
Fixes: cc38d178ff ("btrfs: enable large data folio support under CONFIG_BTRFS_EXPERIMENTAL")
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 2d83ed6c6c ]
Since the support of bs < ps support, extent_writepage_io() will submit
multiple blocks inside the folio.
But if we hit error submitting one sector, but the next sector can still
be submitted successfully, the function extent_writepage_io() will still
return 0.
This will make btrfs to silently ignore the error without setting error
flag for the filemap.
Fix it by recording the first error hit, and always return that value.
Fixes: 8bf334beb3 ("btrfs: fix double accounting race when extent_writepage_io() failed")
Reviewed-by: Daniel Vacek <neelx@suse.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 216217ebee ]
The 'isolcpus' parameter specified at boot time can be assigned to an
isolated partition. While it is valid put the 'isolcpus' in an isolated
partition, attempting to change a member cpuset to an isolated partition
will fail if the cpuset contains any 'isolcpus'.
For example, the system boots with 'isolcpus=9', and the following
configuration works correctly:
# cd /sys/fs/cgroup/
# mkdir test
# echo 1 > test/cpuset.cpus
# echo isolated > test/cpuset.cpus.partition
# cat test/cpuset.cpus.partition
isolated
# echo 9 > test/cpuset.cpus
# cat test/cpuset.cpus.partition
isolated
# cat test/cpuset.cpus
9
However, the following steps to convert a member cpuset to an isolated
partition will fail:
# cd /sys/fs/cgroup/
# mkdir test
# echo 9 > test/cpuset.cpus
# echo isolated > test/cpuset.cpus.partition
# cat test/cpuset.cpus.partition
isolated invalid (partition config conflicts with housekeeping setup)
The issue occurs because the new partition state (new_prs) is used for
validation against housekeeping constraints before it has been properly
updated. To resolve this, move the assignment of new_prs before the
housekeeping validation check when enabling a root partition.
Fixes: 4a74e41888 ("cgroup/cpuset: Check partition conflict with housekeeping setup")
Signed-off-by: Chen Ridong <chenridong@huawei.com>
Reviewed-by: Waiman Long <longman@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 54d94c422f ]
When CONFIG_SECURITY is not set, CONFIG_LSM (builtin_lsm_order) does
not need to be visible and settable since builtin_lsm_order is defined in
security.o, which is only built when CONFIG_SECURITY=y.
So make CONFIG_LSM depend on CONFIG_SECURITY.
Fixes: 13e735c0e9 ("LSM: Introduce CONFIG_LSM")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
[PM: subj tweak]
Signed-off-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 661f951e37 ]
Leon [1] and Vinicius [2] noted a topology_span_sane() warning during
their testing starting from v6.16-rc1. Debug that followed pointed to
the tl->mask() for the NODE domain being incorrectly resolved to that of
the highest NUMA domain.
tl->mask() for NODE is set to the sd_numa_mask() which depends on the
global "sched_domains_curr_level" hack. "sched_domains_curr_level" is
set to the "tl->numa_level" during tl traversal in build_sched_domains()
calling sd_init() but was not reset before topology_span_sane().
Since "tl->numa_level" still reflected the old value from
build_sched_domains(), topology_span_sane() for the NODE domain trips
when the span of the last NUMA domain overlaps.
Instead of replicating the "sched_domains_curr_level" hack, get rid of
it entirely and instead, pass the entire "sched_domain_topology_level"
object to tl->cpumask() function to prevent such mishap in the future.
sd_numa_mask() now directly references "tl->numa_level" instead of
relying on the global "sched_domains_curr_level" hack to index into
sched_domains_numa_masks[].
The original warning was reproducible on the following NUMA topology
reported by Leon:
$ sudo numactl -H
available: 5 nodes (0-4)
node 0 cpus: 0 1
node 0 size: 2927 MB
node 0 free: 1603 MB
node 1 cpus: 2 3
node 1 size: 3023 MB
node 1 free: 3008 MB
node 2 cpus: 4 5
node 2 size: 3023 MB
node 2 free: 3007 MB
node 3 cpus: 6 7
node 3 size: 3023 MB
node 3 free: 3002 MB
node 4 cpus: 8 9
node 4 size: 3022 MB
node 4 free: 2718 MB
node distances:
node 0 1 2 3 4
0: 10 39 38 37 36
1: 39 10 38 37 36
2: 38 38 10 37 36
3: 37 37 37 10 36
4: 36 36 36 36 10
The above topology can be mimicked using the following QEMU cmd that was
used to reproduce the warning and test the fix:
sudo qemu-system-x86_64 -enable-kvm -cpu host \
-m 20G -smp cpus=10,sockets=10 -machine q35 \
-object memory-backend-ram,size=4G,id=m0 \
-object memory-backend-ram,size=4G,id=m1 \
-object memory-backend-ram,size=4G,id=m2 \
-object memory-backend-ram,size=4G,id=m3 \
-object memory-backend-ram,size=4G,id=m4 \
-numa node,cpus=0-1,memdev=m0,nodeid=0 \
-numa node,cpus=2-3,memdev=m1,nodeid=1 \
-numa node,cpus=4-5,memdev=m2,nodeid=2 \
-numa node,cpus=6-7,memdev=m3,nodeid=3 \
-numa node,cpus=8-9,memdev=m4,nodeid=4 \
-numa dist,src=0,dst=1,val=39 \
-numa dist,src=0,dst=2,val=38 \
-numa dist,src=0,dst=3,val=37 \
-numa dist,src=0,dst=4,val=36 \
-numa dist,src=1,dst=0,val=39 \
-numa dist,src=1,dst=2,val=38 \
-numa dist,src=1,dst=3,val=37 \
-numa dist,src=1,dst=4,val=36 \
-numa dist,src=2,dst=0,val=38 \
-numa dist,src=2,dst=1,val=38 \
-numa dist,src=2,dst=3,val=37 \
-numa dist,src=2,dst=4,val=36 \
-numa dist,src=3,dst=0,val=37 \
-numa dist,src=3,dst=1,val=37 \
-numa dist,src=3,dst=2,val=37 \
-numa dist,src=3,dst=4,val=36 \
-numa dist,src=4,dst=0,val=36 \
-numa dist,src=4,dst=1,val=36 \
-numa dist,src=4,dst=2,val=36 \
-numa dist,src=4,dst=3,val=36 \
...
[ prateek: Moved common functions to include/linux/sched/topology.h,
reuse the common bits for s390 and ppc, commit message ]
Closes: https://lore.kernel.org/lkml/20250610110701.GA256154@unreal/ [1]
Fixes: ccf74128d6 ("sched/topology: Assert non-NUMA topology masks don't (partially) overlap") # ce29a7da84, f55dac1daf
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reported-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: K Prateek Nayak <kprateek.nayak@amd.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Valentin Schneider <vschneid@redhat.com>
Reviewed-by: Shrikanth Hegde <sshegde@linux.ibm.com>
Tested-by: Valentin Schneider <vschneid@redhat.com> # x86
Tested-by: Shrikanth Hegde <sshegde@linux.ibm.com> # powerpc
Link: https://lore.kernel.org/lkml/a3de98387abad28592e6ab591f3ff6107fe01dc1.1755893468.git.tim.c.chen@linux.intel.com/ [2]
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 3b0dec689a ]
The predicates in test expect event counting from 73e75e6fc3
("cgroup/pids: Separate semantics of pids.events related to pids.max")
and the test would fail on older kernels. We want to have one version of
tests for all, so detect the feature and skip the test on old kernels.
(The test could even switch to check v1 semantics based on the flag but
keep it simple for now.)
Fixes: 9f34c56602 ("selftests: cgroup: Add basic tests for pids controller")
Signed-off-by: Michal Koutný <mkoutny@suse.com>
Tested-by: Sebastian Chlad <sebastian.chlad@suse.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit ac9c408ed1 ]
RDPID instruction outputs to a word-sized register (64-bit on x86_64 and
32-bit on x86_32). Use an unsigned long variable to store the correct size.
LSL outputs to 32-bit register, use %k operand prefix to always print the
32-bit name of the register.
Use RDPID insn mnemonic while at it as the minimum binutils version of
2.30 supports it.
[ bp: Merge two patches touching the same function into a single one. ]
Fixes: ffebbaedc8 ("x86/vdso: Introduce helper functions for CPU and node number")
Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/20250616095315.230620-1-ubizjak@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 43796f3050 ]
When running perf_fuzzer on PTL, sometimes the below "unchecked MSR
access error" is seen when accessing IA32_PMC_x_CFG_B MSRs.
[ 55.611268] unchecked MSR access error: WRMSR to 0x1986 (tried to write 0x0000000200000001) at rIP: 0xffffffffac564b28 (native_write_msr+0x8/0x30)
[ 55.611280] Call Trace:
[ 55.611282] <TASK>
[ 55.611284] ? intel_pmu_config_acr+0x87/0x160
[ 55.611289] intel_pmu_enable_acr+0x6d/0x80
[ 55.611291] intel_pmu_enable_event+0xce/0x460
[ 55.611293] x86_pmu_start+0x78/0xb0
[ 55.611297] x86_pmu_enable+0x218/0x3a0
[ 55.611300] ? x86_pmu_enable+0x121/0x3a0
[ 55.611302] perf_pmu_enable+0x40/0x50
[ 55.611307] ctx_resched+0x19d/0x220
[ 55.611309] __perf_install_in_context+0x284/0x2f0
[ 55.611311] ? __pfx_remote_function+0x10/0x10
[ 55.611314] remote_function+0x52/0x70
[ 55.611317] ? __pfx_remote_function+0x10/0x10
[ 55.611319] generic_exec_single+0x84/0x150
[ 55.611323] smp_call_function_single+0xc5/0x1a0
[ 55.611326] ? __pfx_remote_function+0x10/0x10
[ 55.611329] perf_install_in_context+0xd1/0x1e0
[ 55.611331] ? __pfx___perf_install_in_context+0x10/0x10
[ 55.611333] __do_sys_perf_event_open+0xa76/0x1040
[ 55.611336] __x64_sys_perf_event_open+0x26/0x30
[ 55.611337] x64_sys_call+0x1d8e/0x20c0
[ 55.611339] do_syscall_64+0x4f/0x120
[ 55.611343] entry_SYSCALL_64_after_hwframe+0x76/0x7e
On PTL, GP counter 0 and 1 doesn't support auto counter reload feature,
thus it would trigger a #GP when trying to write 1 on bit 0 of CFG_B MSR
which requires to enable auto counter reload on GP counter 0.
The root cause of causing this issue is the check for auto counter
reload (ACR) counter mask from user space is incorrect in
intel_pmu_acr_late_setup() helper. It leads to an invalid ACR counter
mask from user space could be set into hw.config1 and then written into
CFG_B MSRs and trigger the MSR access warning.
e.g., User may create a perf event with ACR counter mask (config2=0xcb),
and there is only 1 event created, so "cpuc->n_events" is 1.
The correct check condition should be "i + idx >= cpuc->n_events"
instead of "i + idx > cpuc->n_events" (it looks a typo). Otherwise,
the counter mask would traverse twice and an invalid "cpuc->assign[1]"
bit (bit 0) is set into hw.config1 and cause MSR accessing error.
Besides, also check if the ACR counter mask corresponding events are
ACR events. If not, filter out these counter mask. If a event is not a
ACR event, it could be scheduled to an HW counter which doesn't support
ACR. It's invalid to add their counter index in ACR counter mask.
Furthermore, remove the WARN_ON_ONCE() since it's easily triggered as
user could set any invalid ACR counter mask and the warning message
could mislead users.
Fixes: ec980e4fac ("perf/x86/intel: Support auto counter reload")
Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Link: https://lore.kernel.org/r/20250820023032.17128-3-dapeng1.mi@linux.intel.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit d9cf9c6884 ]
After the commit 'd971342d38bf ("perf/x86/intel: Decouple BTS
initialization from PEBS initialization")' is introduced, x86_pmu.bts
would initialized in bts_init() which is hooked by arch_initcall().
Whereas init_hw_perf_events() is hooked by early_initcall(). Once the
core PMU is initialized, nmi watchdog initialization is called
immediately before bts_init() is called. It leads to the BTS buffer is
not really initialized since bts_init() is not called and x86_pmu.bts is
still false at that time. Worse, BTS buffer would never be initialized
then unless all core PMU events are freed and reserve_ds_buffers()
is called again.
Thus aligning with init_hw_perf_events(), use early_initcall() to hook
bts_init() to ensure x86_pmu.bts is initialized before nmi watchdog
initialization.
Fixes: d971342d38 ("perf/x86/intel: Decouple BTS initialization from PEBS initialization")
Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Link: https://lore.kernel.org/r/20250820023032.17128-2-dapeng1.mi@linux.intel.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 2e6fe1bbef ]
When loading the i10nm_edac driver on some Intel Granite Rapids servers,
a call trace may appear as follows:
UBSAN: shift-out-of-bounds in drivers/edac/skx_common.c:453:16
shift exponent -66 is negative
...
__ubsan_handle_shift_out_of_bounds+0x1e3/0x390
skx_get_dimm_info.cold+0x47/0xd40 [skx_edac_common]
i10nm_get_dimm_config+0x23e/0x390 [i10nm_edac]
skx_register_mci+0x159/0x220 [skx_edac_common]
i10nm_init+0xcb0/0x1ff0 [i10nm_edac]
...
This occurs because some BIOS may disable a memory controller if there
aren't any memory DIMMs populated on this memory controller. The DIMMMTR
register of this disabled memory controller contains the invalid value
~0, resulting in the call trace above.
Fix this call trace by skipping DIMM enumeration on a disabled memory
controller.
Fixes: ba987eaaab ("EDAC/i10nm: Add Intel Granite Rapids server support")
Reported-by: Jose Jesus Ambriz Meza <jose.jesus.ambriz.meza@intel.com>
Reported-by: Chia-Lin Kao (AceLan) <acelan.kao@canonical.com>
Closes: https://lore.kernel.org/all/20250730063155.2612379-1-acelan.kao@canonical.com/
Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Tested-by: Chia-Lin Kao (AceLan) <acelan.kao@canonical.com>
Link: https://lore.kernel.org/r/20250806065707.3533345-1-qiuxu.zhuo@intel.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit fad988a215 ]
Already do real negotiation in smb_direct_handle_connect_request()
where we see the requested initiator_depth and responder_resources
from the client.
We should detect legacy iwarp clients using MPA v1
with the custom IRD/ORD negotiation.
We need to send the custom IRD/ORD in big endian,
but we need to try to let clients with broken requests
using little endian (older cifs.ko) to work.
Note the reason why this uses u8 for
initiator_depth and responder_resources is
that the rdma layer also uses it.
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Cc: Steve French <smfrench@gmail.com>
Cc: Tom Talpey <tom@talpey.com>
Cc: linux-cifs@vger.kernel.org
Cc: samba-technical@lists.samba.org
Cc: linux-rdma@vger.kernel.org
Fixes: 0626e6641f ("cifsd: add server handler for central processing and tranport layers")
Signed-off-by: Stefan Metzmacher <metze@samba.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit ef71f1e046 ]
Do a real negotiation and check the servers initiator_depth and
responder_resources.
This should use big endian in order to be useful.
I have captures of windows clients showing this.
The fact that we used little endian up to now
means that we sent very large numbers and the
negotiation with the server truncated them to the
server limits.
Note the reason why this uses u8 for
initiator_depth and responder_resources is
that the rdma layer also uses it.
The inconsitency regarding the initiator_depth
and responder_resources values being reversed
for iwarp devices in RDMA_CM_EVENT_ESTABLISHED
should also be fixed later, but for now we should
fix it.
Cc: Steve French <smfrench@gmail.com>
Cc: Tom Talpey <tom@talpey.com>
Cc: Long Li <longli@microsoft.com>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Cc: linux-cifs@vger.kernel.org
Cc: samba-technical@lists.samba.org
Cc: linux-rdma@vger.kernel.org
Fixes: c739858334 ("CIFS: SMBD: Implement RDMA memory registration")
Signed-off-by: Stefan Metzmacher <metze@samba.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 334c0e493c ]
Since all real encoded extents (directly handled by the decompression
subsystem) have a sane, limited maximum decoded length
(Z_EROFS_PCLUSTER_MAX_DSIZE), and the read-more policy is only applied
if needed.
However, it makes no sense to read more for non-encoded maps, such as
fragment extents, since such extents can be huge (up to i_size) and
there is no benefit to reading more at this layer.
For normal images, it does not really matter, but for crafted images
generated by syzbot, excessively large fragment extents can cause
read-more to run for an overly long time.
Reported-and-tested-by: syzbot+1a9af3ef3c84c5e14dcc@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/r/68c8583d.050a0220.2ff435.03a3.GAE@google.com
Fixes: b44686c839 ("erofs: fix large fragment handling")
Fixes: b15b2e307c ("erofs: support on-disk compressed fragments data")
Reviewed-by: Hongbo Li <lihongbo22@huawei.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit a29fea30dd ]
Cast nr_pages to unsigned long to avoid overflow when handling large
AUX buffer sizes (>= 2 GiB).
Fixes: d5d9696b03 ("drivers/perf: Add support for ARMv8.2 Statistical Profiling Extension")
Signed-off-by: Leo Yan <leo.yan@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 105f56877f ]
Cast nr_pages to unsigned long to avoid overflow when handling large
AUX buffer sizes (>= 2 GiB).
Fixes: 3fbf7f011f ("coresight: sink: Add TRBE driver")
Signed-off-by: Leo Yan <leo.yan@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit ba1afc94de ]
uprobe_warn() is passed a task structure, yet its using current. For
the most part this shouldn't matter, but since a task structure is
provided, lets use it.
Fixes: 248d3a7b2f ("uprobes: Change uprobe_copy_process() to dup return_instances")
Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f6b4df37eb ]
CONFIG_PPC_FTRACE_OUT_OF_LINE introduced setup_ftrace_ool_stubs() to
extend the ppc64le module .stubs section with an array of
ftrace_ool_stub structures for each patchable function.
Fix its ppc64_stub_entry stub reservation loop to properly write across
all of the num_stubs used and not just the first entry.
Fixes: eec37961a5 ("powerpc64/ftrace: Move ftrace sequence out of line")
Signed-off-by: Joe Lawrence <joe.lawrence@redhat.com>
Acked-by: Naveen N Rao (AMD) <naveen@kernel.org>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20250912142740.3581368-3-joe.lawrence@redhat.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 5337609a31 ]
When an ftrace call site is converted to a NOP, its corresponding
dyn_ftrace record should have its ftrace_ops pointer set to
ftrace_nop_ops.
Correct the powerpc implementation to ensure the
ftrace_rec_set_nop_ops() helper is called on all successful NOP
initialization paths. This ensures all ftrace records are consistent
before being handled by the ftrace core.
Fixes: eec37961a5 ("powerpc64/ftrace: Move ftrace sequence out of line")
Suggested-by: Naveen N Rao <naveen@kernel.org>
Signed-off-by: Joe Lawrence <joe.lawrence@redhat.com>
Acked-by: Naveen N Rao (AMD) <naveen@kernel.org>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20250912142740.3581368-2-joe.lawrence@redhat.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit d9e46de4bf ]
Commit ac9f97ff8b ("powerpc/8xx: Inconditionally use task PGDIR in
DTLB misses") removed the test that needed the valeur in SPRN_EPN but
failed to remove the read.
Remove it.
And remove related comments, including the very same comment
in InstructionTLBMiss that should have been removed by
commit 33c527522f ("powerpc/8xx: Inconditionally use task PGDIR in
ITLB misses").
Also update the comment about absence of a second level table which
has been handled implicitely since commit 5ddb75cee5 ("powerpc/8xx:
remove tests on PGDIR entry validity").
Fixes: ac9f97ff8b ("powerpc/8xx: Inconditionally use task PGDIR in DTLB misses")
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/5811c8d1d6187f280ad140d6c0ad6010e41eeaeb.1755361995.git.christophe.leroy@csgroup.eu
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 6ab26555c9 ]
GFS2 has been calling functions like dlm_lock() even after the lockspace
that these functions operate on has been released with
dlm_release_lockspace(). It has always assumed that those functions
would return -EINVAL in that case, but that was never guaranteed, and it
certainly is no longer the case since commit 4db41bf4f0 ("dlm: remove
ls_local_handle from struct dlm_ls").
To fix that, add proper lockspace locking.
Fixes: 3e11e53041 ("GFS2: ignore unlock failures after withdraw")
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Reviewed-by: Andrew Price <anprice@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 2309a01351 ]
Check for asynchronous completion and clear the GLF_PENDING_REPLY flag
earlier in do_xmote(). This will make future changes more readable.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Reviewed-by: Andrew Price <anprice@redhat.com>
Stable-dep-of: 6ab26555c9 ("gfs2: Add proper lockspace locking")
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit bddb53b776 ]
Get rid of the GLF_INVALIDATE_IN_PROGRESS flag: it was originally used
to indicate to add_to_queue() that the ->go_sync() and ->go_invalid()
operations were in progress, but as we have established in commit "gfs2:
Fix LM_FLAG_TRY* logic in add_to_queue", add_to_queue() has no need to
know.
Commit d99724c3c3 describes a race in which GLF_INVALIDATE_IN_PROGRESS
is used to serialize two processes which are both in do_xmote() at the
same time. That analysis is wrong: the serialization happens via the
GLF_LOCK flag, which ensures that at most one glock operation can be
active at any time.
Fixes: d99724c3c3 ("gfs2: Close timing window with GLF_INVALIDATE_IN_PROGRESS")
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Reviewed-by: Andrew Price <anprice@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 9b54770b68 ]
In do_xmote(), remove the duplicate check for the ->go_sync and
->go_inval glock operations. They are either both defined, or none of
them are.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Reviewed-by: Andrew Price <anprice@redhat.com>
Stable-dep-of: bddb53b776 ("gfs2: Get rid of GLF_INVALIDATE_IN_PROGRESS")
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 0c23e24164 ]
The logic in add_to_queue() for determining whether a LM_FLAG_TRY or
LM_FLAG_TRY_1CB holder should be queued does not make any sense: we are
interested in wether or not the new operation will block behind an
existing or future holder in the queue, but the current code checks for
ongoing locking or ->go_inval() operations, which has little to do with
that.
Replace that code with something more sensible, remove the incorrect
add_to_queue() function annotations, remove the similarly misguided
do_error(gl, 0) call in do_xmote(), and add a missing comment to the
same call in do_promote().
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Reviewed-by: Andrew Price <anprice@redhat.com>
Stable-dep-of: bddb53b776 ("gfs2: Get rid of GLF_INVALIDATE_IN_PROGRESS")
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit fd70ab7155 ]
The gl_req field and GLF_BLOCKING flag are only relevant to gdlm_lock(),
its callback gdlm_ast(), and their helpers, so set and clear them inside
lock_dlm.c.
Also, the LM_FLAG_ANY flag is relevant to gdlm_lock(), but do_xmote()
doesn't pass that flag down to gdlm_lock() as it should. Fix that by
passing down all the flags.
In addition, document the effect of the LM_FLAG_ANY flag on locks held
in EX mode locally.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Reviewed-by: Andrew Price <anprice@redhat.com>
Stable-dep-of: bddb53b776 ("gfs2: Get rid of GLF_INVALIDATE_IN_PROGRESS")
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit aa94ad9ab2 ]
There is an extraneous space before a newline in a fs_err message.
Remove it
Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Reviewed-by: Andrew Price <anprice@redhat.com>
Stable-dep-of: bddb53b776 ("gfs2: Get rid of GLF_INVALIDATE_IN_PROGRESS")
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 061df28b82 ]
Commit 865cc3e9cc ("gfs2: fix a deadlock on withdraw-during-mount")
added a statement to do_xmote() to clear the GLF_INVALIDATE_IN_PROGRESS
flag a second time after it has already been cleared. Fix that.
Fixes: 865cc3e9cc ("gfs2: fix a deadlock on withdraw-during-mount")
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Reviewed-by: Andrew Price <anprice@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 740cdafd0d ]
The return value was not assigned to 'ret', so the check afterwards
does not do anything.
Fixes: 3d37d4307e ("kselftest/arm64: Add very basic GCS test program")
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Reviewed-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 50af02425a ]
Thanks to -Waddress, the compiler warns that the ksft_test_result()
invocations in the arm64 tpidr2 selftest are always true. Oops.
Fix the test by, err, actually running the test functions.
Fixes: 6d80cb7313 ("kselftest/arm64: Convert tpidr2 test to use kselftest.h")
Signed-off-by: Bala-Vignesh-Reddy <reddybalavignesh9979@gmail.com>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit a679e5683d ]
Fix -Wunused-result warning generated when compiled with gcc 13.3.0,
by checking fread's return value and handling errors, preventing
potential failures when reading from stdin.
Fixes compiler warning:
warning: ignoring return value of 'fread' declared with attribute
'warn_unused_result' [-Wunused-result]
Fixes: 806a15b254 ("kselftests/arm64: add PAuth test for whether exec() changes keys")
Signed-off-by: Bala-Vignesh-Reddy <reddybalavignesh9979@gmail.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 46104a7d3c ]
In the upstream commit 214c0eea43
("kbuild: add $(objtree)/ prefix to some in-kernel build artifacts")
artifacts required for building out-of-tree kernel modules had
$(objtree) prepended to them to prepare for building in other
directories.
When building external modules for powerpc,
arch/powerpc/lib/crtsavres.o is required for certain
configurations. This artifact is missing the prepended $(objtree).
Fixes: 13b25489b6 ("kbuild: change working directory to external module directory with M=")
Acked-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Nicolas Schier <n.schier@avm.de>
Tested-by: Nicolas Schier <n.schier@avm.de>
Signed-off-by: Kienan Stewart <kstewart@efficios.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20250218-buildfix-extmod-powerpc-v2-1-1e78fcf12b56@efficios.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit cce436aafc ]
Normally the tracee starts in SECCOMP_NOTIFY_INIT, sends an
event to the tracer, and starts to wait interruptibly. With
SECCOMP_FILTER_FLAG_WAIT_KILLABLE_RECV, if the tracer receives the
message (SECCOMP_NOTIFY_SENT is reached) while the tracee was waiting
and is subsequently interrupted, the tracee begins to wait again
uninterruptibly (but killable).
This fails if SECCOMP_NOTIFY_REPLIED is reached before the tracee
is interrupted, as the check only considered SECCOMP_NOTIFY_SENT as a
condition to begin waiting again. In this case the tracee is interrupted
even though the tracer already acted on its behalf. This breaks the
assumption SECCOMP_FILTER_FLAG_WAIT_KILLABLE_RECV wanted to ensure,
namely that the tracer can be sure the syscall is not interrupted or
restarted on the tracee after it is received on the tracer. Fix this
by also considering SECCOMP_NOTIFY_REPLIED when evaluating whether to
switch to uninterruptible waiting.
With the condition changed the loop in seccomp_do_user_notification()
would exit immediately after deciding that noninterruptible waiting
is required if the operation already reached SECCOMP_NOTIFY_REPLIED,
skipping the code that processes pending addfd commands first. Prevent
this by executing the remaining loop body one last time in this case.
Fixes: c2aa2dfef2 ("seccomp: Add wait_killable semantic to seccomp user notifier")
Reported-by: Ali Polatel <alip@chesswob.org>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=220291
Signed-off-by: Johannes Nixdorf <johannes@nixdorf.dev>
Link: https://lore.kernel.org/r/20250725-seccomp-races-v2-1-cf8b9d139596@nixdorf.dev
Signed-off-by: Kees Cook <kees@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit fde0ab43b9 ]
There's a silly problem with the CC_HAS_ASM_GOTO_OUTPUT test: even with
a working compiler it will fail on some architectures simply because it
uses the mnemonic "jmp" for testing the inline asm.
And as reported by Geert, not all architectures use that mnemonic, so
the test fails spuriously on such platforms (including arm and riscv,
but also several other architectures).
This issue avoided any obvious test failures because the build still
works thanks to falling back on the old non-asm-goto code, which just
generates worse code.
Just use an empty asm statement instead.
Reported-and-tested-by: Geert Uytterhoeven <geert@linux-m68k.org>
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Fixes: e2ffa15b9b ("kbuild: Disable CC_HAS_ASM_GOTO_OUTPUT on clang < 17")
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b9cb7e59ac ]
The capability check should not be audited since it is only being used
to determine the inode permissions. A failed check does not indicate a
violation of security policy but, when an LSM is enabled, a denial audit
message was being generated.
The denial audit message can either lead to the capability being
unnecessarily allowed in a security policy, or being silenced potentially
masking a legitimate capability check at a later point in time.
Similar to commit d6169b0206 ("net: Use ns_capable_noaudit() when
determining net sysctl permissions")
Fixes: 7863dcc72d ("pid: allow pid_max to be set per pid namespace")
CC: Christian Brauner <brauner@kernel.org>
CC: linux-security-module@vger.kernel.org
CC: selinux@vger.kernel.org
Signed-off-by: Christian Göttsche <cgzones@googlemail.com>
Acked-by: Serge Hallyn <serge@hallyn.com>
Reviewed-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 7479260860 ]
INITRAMFS_PRESERVE_MTIME is only used in init/initramfs.c and
init/initramfs_test.c. Hence add a dependency on BLK_DEV_INITRD, to
prevent asking the user about this feature when configuring a kernel
without initramfs support.
Fixes: 1274aea127 ("initramfs: add INITRAMFS_PRESERVE_MTIME Kconfig option")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Martin Wilck <mwilck@suse.com>
Reviewed-by: David Disseldorp <ddiss@suse.de>
Signed-off-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit bbc46b23af ]
With the introduction of clone3 in commit 7f192e3cd3 ("fork: add
clone3") the effective bit width of clone_flags on all architectures was
increased from 32-bit to 64-bit, with a new type of u64 for the flags.
However, for most consumers of clone_flags the interface was not
changed from the previous type of unsigned long.
While this works fine as long as none of the new 64-bit flag bits
(CLONE_CLEAR_SIGHAND and CLONE_INTO_CGROUP) are evaluated, this is still
undesirable in terms of the principle of least surprise.
Thus, this commit fixes all relevant interfaces of the copy_thread
function that is called from copy_process to consistently pass
clone_flags as u64, so that no truncation to 32-bit integers occurs on
32-bit architectures.
Signed-off-by: Simon Schuster <schuster.simon@siemens-energy.com>
Link: https://lore.kernel.org/20250901-nios2-implement-clone3-v2-3-53fcf5577d57@siemens-energy.com
Fixes: c5febea095 ("fork: Pass struct kernel_clone_args into copy_thread")
Acked-by: Guo Ren (Alibaba Damo Academy) <guoren@kernel.org>
Acked-by: Andreas Larsson <andreas@gaisler.com> # sparc
Acked-by: David Hildenbrand <david@redhat.com>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> # m68k
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit c18ecd99e0 upstream.
As syzbot reported below:
------------[ cut here ]------------
kernel BUG at fs/f2fs/file.c:1243!
Oops: invalid opcode: 0000 [#1] SMP KASAN NOPTI
CPU: 0 UID: 0 PID: 5354 Comm: syz.0.0 Not tainted 6.17.0-rc1-syzkaller-00211-g90d970cade8e #0 PREEMPT(full)
RIP: 0010:f2fs_truncate_hole+0x69e/0x6c0 fs/f2fs/file.c:1243
Call Trace:
<TASK>
f2fs_punch_hole+0x2db/0x330 fs/f2fs/file.c:1306
f2fs_fallocate+0x546/0x990 fs/f2fs/file.c:2018
vfs_fallocate+0x666/0x7e0 fs/open.c:342
ksys_fallocate fs/open.c:366 [inline]
__do_sys_fallocate fs/open.c:371 [inline]
__se_sys_fallocate fs/open.c:369 [inline]
__x64_sys_fallocate+0xc0/0x110 fs/open.c:369
do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
do_syscall_64+0xfa/0x3b0 arch/x86/entry/syscall_64.c:94
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f1e65f8ebe9
w/ a fuzzed image, f2fs may encounter panic due to it detects inconsistent
truncation range in direct node in f2fs_truncate_hole().
The root cause is: a non-inode dnode may has the same footer.ino and
footer.nid, so the dnode will be parsed as an inode, then ADDRS_PER_PAGE()
may return wrong blkaddr count which may be 923 typically, by chance,
dn.ofs_in_node is equal to 923, then count can be calculated to 0 in below
statement, later it will trigger panic w/ f2fs_bug_on(, count == 0 || ...).
count = min(end_offset - dn.ofs_in_node, pg_end - pg_start);
This patch introduces a new node_type NODE_TYPE_NON_INODE, then allowing
passing the new_type to sanity_check_node_footer in f2fs_get_node_folio()
to detect corruption that a non-inode dnode has the same footer.ino and
footer.nid.
Scripts to reproduce:
mkfs.f2fs -f /dev/vdb
mount /dev/vdb /mnt/f2fs
touch /mnt/f2fs/foo
touch /mnt/f2fs/bar
dd if=/dev/zero of=/mnt/f2fs/foo bs=1M count=8
umount /mnt/f2fs
inject.f2fs --node --mb i_nid --nid 4 --idx 0 --val 5 /dev/vdb
mount /dev/vdb /mnt/f2fs
xfs_io /mnt/f2fs/foo -c "fpunch 6984k 4k"
Cc: stable@kernel.org
Reported-by: syzbot+b9c7ffd609c3f09416ab@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/linux-f2fs-devel/68a68e27.050a0220.1a3988.0002.GAE@google.com
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e750f85391 upstream.
When completing emulation of instruction that generated a userspace exit
for I/O, don't recheck L1 intercepts as KVM has already finished that
phase of instruction execution, i.e. has already committed to allowing L2
to perform I/O. If L1 (or host userspace) modifies the I/O permission
bitmaps during the exit to userspace, KVM will treat the access as being
intercepted despite already having emulated the I/O access.
Pivot on EMULTYPE_NO_DECODE to detect that KVM is completing emulation.
Of the three users of EMULTYPE_NO_DECODE, only complete_emulated_io() (the
intended "recipient") can reach the code in question. gp_interception()'s
use is mutually exclusive with is_guest_mode(), and
complete_emulated_insn_gp() unconditionally pairs EMULTYPE_NO_DECODE with
EMULTYPE_SKIP.
The bad behavior was detected by a syzkaller program that toggles port I/O
interception during the userspace I/O exit, ultimately resulting in a WARN
on vcpu->arch.pio.count being non-zero due to KVM no completing emulation
of the I/O instruction.
WARNING: CPU: 23 PID: 1083 at arch/x86/kvm/x86.c:8039 emulator_pio_in_out+0x154/0x170 [kvm]
Modules linked in: kvm_intel kvm irqbypass
CPU: 23 UID: 1000 PID: 1083 Comm: repro Not tainted 6.16.0-rc5-c1610d2d66b1-next-vm #74 NONE
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
RIP: 0010:emulator_pio_in_out+0x154/0x170 [kvm]
PKRU: 55555554
Call Trace:
<TASK>
kvm_fast_pio+0xd6/0x1d0 [kvm]
vmx_handle_exit+0x149/0x610 [kvm_intel]
kvm_arch_vcpu_ioctl_run+0xda8/0x1ac0 [kvm]
kvm_vcpu_ioctl+0x244/0x8c0 [kvm]
__x64_sys_ioctl+0x8a/0xd0
do_syscall_64+0x5d/0xc60
entry_SYSCALL_64_after_hwframe+0x4b/0x53
</TASK>
Reported-by: syzbot+cc2032ba16cc2018ca25@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/68790db4.a00a0220.3af5df.0020.GAE@google.com
Fixes: 8a76d7f25f ("KVM: x86: Add x86 callback for intercept check")
Cc: stable@vger.kernel.org
Cc: Jim Mattson <jmattson@google.com>
Link: https://lore.kernel.org/r/20250715190638.1899116-1-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 674b56aa57 upstream.
Syzkaller reports a KASAN issue as below:
general protection fault, probably for non-canonical address 0xfbd59c0000000021: 0000 [#1] PREEMPT SMP KASAN NOPTI
KASAN: maybe wild-memory-access in range [0xdead000000000108-0xdead00000000010f]
CPU: 0 PID: 5083 Comm: syz-executor.2 Not tainted 6.1.134-syzkaller-00037-g855bd1d7d838 #0
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
RIP: 0010:__list_del include/linux/list.h:114 [inline]
RIP: 0010:__list_del_entry include/linux/list.h:137 [inline]
RIP: 0010:list_del include/linux/list.h:148 [inline]
RIP: 0010:p9_fd_cancelled+0xe9/0x200 net/9p/trans_fd.c:734
Call Trace:
<TASK>
p9_client_flush+0x351/0x440 net/9p/client.c:614
p9_client_rpc+0xb6b/0xc70 net/9p/client.c:734
p9_client_version net/9p/client.c:920 [inline]
p9_client_create+0xb51/0x1240 net/9p/client.c:1027
v9fs_session_init+0x1f0/0x18f0 fs/9p/v9fs.c:408
v9fs_mount+0xba/0xcb0 fs/9p/vfs_super.c:126
legacy_get_tree+0x108/0x220 fs/fs_context.c:632
vfs_get_tree+0x8e/0x300 fs/super.c:1573
do_new_mount fs/namespace.c:3056 [inline]
path_mount+0x6a6/0x1e90 fs/namespace.c:3386
do_mount fs/namespace.c:3399 [inline]
__do_sys_mount fs/namespace.c:3607 [inline]
__se_sys_mount fs/namespace.c:3584 [inline]
__x64_sys_mount+0x283/0x300 fs/namespace.c:3584
do_syscall_x64 arch/x86/entry/common.c:51 [inline]
do_syscall_64+0x35/0x80 arch/x86/entry/common.c:81
entry_SYSCALL_64_after_hwframe+0x6e/0xd8
This happens because of a race condition between:
- The 9p client sending an invalid flush request and later cleaning it up;
- The 9p client in p9_read_work() canceled all pending requests.
Thread 1 Thread 2
...
p9_client_create()
...
p9_fd_create()
...
p9_conn_create()
...
// start Thread 2
INIT_WORK(&m->rq, p9_read_work);
p9_read_work()
...
p9_client_rpc()
...
...
p9_conn_cancel()
...
spin_lock(&m->req_lock);
...
p9_fd_cancelled()
...
...
spin_unlock(&m->req_lock);
// status rewrite
p9_client_cb(m->client, req, REQ_STATUS_ERROR)
// first remove
list_del(&req->req_list);
...
spin_lock(&m->req_lock)
...
// second remove
list_del(&req->req_list);
spin_unlock(&m->req_lock)
...
Commit 74d6a5d566 ("9p/trans_fd: Fix concurrency del of req_list in
p9_fd_cancelled/p9_read_work") fixes a concurrency issue in the 9p filesystem
client where the req_list could be deleted simultaneously by both
p9_read_work and p9_fd_cancelled functions, but for the case where req->status
equals REQ_STATUS_RCVD.
Update the check for req->status in p9_fd_cancelled to skip processing not
just received requests, but anything that is not SENT, as whatever
changed the state from SENT also removed the request from its list.
Found by Linux Verification Center (linuxtesting.org) with Syzkaller.
Fixes: afd8d65411 ("9P: Add cancelled() to the transport functions.")
Cc: stable@vger.kernel.org
Signed-off-by: Nalivayko Sergey <Sergey.Nalivayko@kaspersky.com>
Message-ID: <20250715154815.3501030-1-Sergey.Nalivayko@kaspersky.com>
[updated the check from status == RECV || status == ERROR to status != SENT]
Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c2ce245341 upstream.
Devices with power.no_pm set are not expected to need any power
management at all, so modify device_set_pm_not_required() to set
power.no_callbacks for them too in case runtime PM will be enabled
for any of them (which in principle may be done for convenience if
such a device participates in a dependency chain).
Since device_set_pm_not_required() must be called before device_add()
or it would not have any effect, it can update power.no_callbacks
without locking, unlike pm_runtime_no_callbacks() that can be called
after registering the target device.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: stable <stable@kernel.org>
Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
Link: https://lore.kernel.org/r/1950054.tdWV9SEqCh@rafael.j.wysocki
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6d07bee10e upstream.
If copy_from_user() fails, write() currently returns -EFAULT, but any
partially written data leaves the TX FIFO in an inconsistent state.
Subsequent write() calls then fail with "transmit length mismatch"
errors.
Once partial data is written to the hardware FIFO, it cannot be removed
without a TX reset. Commit c6e8d85faf ("staging: axis-fifo: Remove
hardware resets for user errors") removed a full FIFO reset for this case,
which fixed a potential RX data loss, but introduced this TX issue.
Fix this by introducing a bounce buffer: copy the full packet from
userspace first, and write to the hardware FIFO only if the copy
was successful.
Fixes: c6e8d85faf ("staging: axis-fifo: Remove hardware resets for user errors")
Cc: stable@vger.kernel.org
Signed-off-by: Ovidiu Panait <ovidiu.panait.oss@gmail.com>
Link: https://lore.kernel.org/r/20250912101322.1282507-1-ovidiu.panait.oss@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 52ff2b840b upstream.
Since commit 2ca34b5087 ("staging: axis-fifo: Correct handling of
tx_fifo_depth for size validation"), write() operations with packets
larger than 'tx_fifo_depth - 4' words are no longer rejected with -EINVAL.
Fortunately, the packets are not actually getting transmitted to hardware,
otherwise they would be raising a 'Transmit Packet Overrun Error'
interrupt, which requires a reset of the TX circuit to recover from.
Instead, the request times out inside wait_event_interruptible_timeout()
and always returns -EAGAIN, since the wake up condition can never be true
for these packets. But still, they unnecessarily block other tasks from
writing to the FIFO and the EAGAIN return code signals userspace to retry
the write() call, even though it will always fail and time out.
According to the AXI4-Stream FIFO reference manual (PG080), the maximum
valid packet length is 'tx_fifo_depth - 4' words, so attempting to send
larger packets is invalid and should not be happening in the first place:
> The maximum packet that can be transmitted is limited by the size of
> the FIFO, which is (C_TX_FIFO_DEPTH–4)*(data interface width/8) bytes.
Therefore, bring back the old behavior and outright reject packets larger
than 'tx_fifo_depth - 4' with -EINVAL. Add a comment to explain why the
check is necessary. The dev_err() message was removed to avoid cluttering
the dmesg log if an invalid packet is received from userspace.
Fixes: 2ca34b5087 ("staging: axis-fifo: Correct handling of tx_fifo_depth for size validation")
Cc: stable@vger.kernel.org
Signed-off-by: Ovidiu Panait <ovidiu.panait.oss@gmail.com>
Link: https://lore.kernel.org/r/20250817171350.872105-1-ovidiu.panait.oss@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3ebcd3460c upstream.
A process might fail to allocate a new bitmap when trying to expand its
proc->dmap. In that case, dbitmap_grow() fails and frees the old bitmap
via dbitmap_free(). However, the driver calls dbitmap_free() again when
the same process terminates, leading to a double-free error:
==================================================================
BUG: KASAN: double-free in binder_proc_dec_tmpref+0x2e0/0x55c
Free of addr ffff00000b7c1420 by task kworker/9:1/209
CPU: 9 UID: 0 PID: 209 Comm: kworker/9:1 Not tainted 6.17.0-rc6-dirty #5 PREEMPT
Hardware name: linux,dummy-virt (DT)
Workqueue: events binder_deferred_func
Call trace:
kfree+0x164/0x31c
binder_proc_dec_tmpref+0x2e0/0x55c
binder_deferred_func+0xc24/0x1120
process_one_work+0x520/0xba4
[...]
Allocated by task 448:
__kmalloc_noprof+0x178/0x3c0
bitmap_zalloc+0x24/0x30
binder_open+0x14c/0xc10
[...]
Freed by task 449:
kfree+0x184/0x31c
binder_inc_ref_for_node+0xb44/0xe44
binder_transaction+0x29b4/0x7fbc
binder_thread_write+0x1708/0x442c
binder_ioctl+0x1b50/0x2900
[...]
==================================================================
Fix this issue by marking proc->map NULL in dbitmap_free().
Cc: stable@vger.kernel.org
Fixes: 15d9da3f81 ("binder: use bitmap for faster descriptor lookup")
Signed-off-by: Carlos Llamas <cmllamas@google.com>
Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Reviewed-by: Tiffany Yang <ynaffit@google.com>
Link: https://lore.kernel.org/r/20250915221248.3470154-1-cmllamas@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5f8f84e286 upstream.
Without CONFIG_REGMAP, rmi-i2c.c fails to build because struct
regmap_config is not defined:
drivers/misc/amd-sbi/rmi-i2c.c: In function ‘sbrmi_i2c_probe’:
drivers/misc/amd-sbi/rmi-i2c.c:57:16: error: variable ‘sbrmi_i2c_regmap_config’ has initializer but incomplete type
57 | struct regmap_config sbrmi_i2c_regmap_config = {
| ^~~~~~~~~~~~~
Additionally, CONFIG_REGMAP_I2C is needed for devm_regmap_init_i2c():
ld: drivers/misc/amd-sbi/rmi-i2c.o: in function `sbrmi_i2c_probe':
drivers/misc/amd-sbi/rmi-i2c.c:69:(.text+0x1c0): undefined reference to `__devm_regmap_init_i2c'
Fixes: 013f7e7131 ("misc: amd-sbi: Use regmap subsystem")
Cc: stable@vger.kernel.org
Signed-off-by: Max Kellermann <max.kellermann@ionos.com>
Tested-by: Akshay Gupta <Akshay.Gupta@amd.com>
Reviewed-by: Akshay Gupta <Akshay.Gupta@amd.com>
Link: https://lore.kernel.org/r/20250829091442.1112106-1-max.kellermann@ionos.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a699213d4e upstream.
Revert commit 1afa70632c ("serial: qcom-geni: Enable PM runtime for
serial driver") and its dependent commit 86fa39dd6f ("serial:
qcom-geni: Enable Serial on SA8255p Qualcomm platforms") because the
first one causes regression - hang task on Qualcomm RB1 board (QRB2210)
and unable to use serial at all during normal boot:
INFO: task kworker/u16:0:12 blocked for more than 42 seconds.
Not tainted 6.17.0-rc1-00004-g53e760d89498 #9
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:kworker/u16:0 state:D stack:0 pid:12 tgid:12 ppid:2 task_flags:0x4208060 flags:0x00000010
Workqueue: async async_run_entry_fn
Call trace:
__switch_to+0xe8/0x1a0 (T)
__schedule+0x290/0x7c0
schedule+0x34/0x118
rpm_resume+0x14c/0x66c
rpm_resume+0x2a4/0x66c
rpm_resume+0x2a4/0x66c
rpm_resume+0x2a4/0x66c
__pm_runtime_resume+0x50/0x9c
__driver_probe_device+0x58/0x120
driver_probe_device+0x3c/0x154
__driver_attach_async_helper+0x4c/0xc0
async_run_entry_fn+0x34/0xe0
process_one_work+0x148/0x290
worker_thread+0x2c4/0x3e0
kthread+0x118/0x1c0
ret_from_fork+0x10/0x20
The issue was reported on 12th of August and was ignored by author of
commits introducing issue for two weeks. Only after complaining author
produced a fix which did not work, so if original commits cannot be
reliably fixed for 5 weeks, they obviously are buggy and need to be
dropped.
Fixes: 1afa70632c ("serial: qcom-geni: Enable PM runtime for serial driver")
Reported-by: Alexey Klimov <alexey.klimov@linaro.org>
Closes: https://lore.kernel.org/all/DC0D53ZTNOBU.E8LSD5E5Z8TX@linaro.org/
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Tested-by: Alexey Klimov <alexey.klimov@linaro.org>
Reviewed-by: Alexey Klimov <alexey.klimov@linaro.org>
Reviewed-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org>
Link: https://lore.kernel.org/r/20250917010437.129912-2-krzysztof.kozlowski@linaro.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 208d7f788e upstream.
This `srctree/` link pointed to a file with an underscore, but the header
used a dash instead.
Thus fix it.
This cleans a future warning that will check our `srctree/` links.
Cc: stable@vger.kernel.org
Fixes: 3253aba340 ("rust: block: introduce `kernel::block::mq` module")
Reviewed-by: Daniel Almeida <daniel.almeida@collabora.com>
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 80eaf32672 upstream.
In 'stm32_csi_start', 'csidev->s_subdev' is dereferenced directly while
assigning a value to the 'src_pad'. However the same value is being
checked against NULL at a later point of time indicating that there
are chances that the value can be NULL.
Move the dereference after the NULL check.
Fixes: e7bad98c20 ("media: v4l: Convert the users of v4l2_get_link_freq to call it on a pad")
Cc: stable@vger.kernel.org
Signed-off-by: Chandra Mohan Sundar <chandramohan.explore@gmail.com>
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 02a24f13b3 upstream.
One internal buffer which is allocated only once per session was not
being freed during session close because it was not being tracked as
part of internal buffer list which resulted in a memory leak.
Add the necessary logic to explicitly free the untracked internal buffer
during session close to ensure all allocated memory is released
properly.
Fixes: 73702f45db ("media: iris: allocate, initialize and queue internal buffers")
Cc: stable@vger.kernel.org
Reviewed-by: Vikash Garodia <quic_vgarodia@quicinc.com>
Tested-by: Vikash Garodia <quic_vgarodia@quicinc.com> # X1E80100
Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8550-HDK
Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8650-HDK
Signed-off-by: Dikshita Agarwal <quic_dikshita@quicinc.com>
Tested-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org> # x1e80100-crd
Signed-off-by: Bryan O'Donoghue <bod@kernel.org>
Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1367da7eb8 upstream.
It is possible to hit a zero entry while traversing the vmas in unuse_mm()
called from swapoff path and accessing it causes the OOPS:
Unable to handle kernel NULL pointer dereference at virtual address
0000000000000446--> Loading the memory from offset 0x40 on the
XA_ZERO_ENTRY as address.
Mem abort info:
ESR = 0x0000000096000005
EC = 0x25: DABT (current EL), IL = 32 bits
SET = 0, FnV = 0
EA = 0, S1PTW = 0
FSC = 0x05: level 1 translation fault
The issue is manifested from the below race between the fork() on a
process and swapoff:
fork(dup_mmap()) swapoff(unuse_mm)
--------------- -----------------
1) Identical mtree is built using
__mt_dup().
2) copy_pte_range()-->
copy_nonpresent_pte():
The dst mm is added into the
mmlist to be visible to the
swapoff operation.
3) Fatal signal is sent to the parent
process(which is the current during the
fork) thus skip the duplication of the
vmas and mark the vma range with
XA_ZERO_ENTRY as a marker for this process
that helps during exit_mmap().
4) swapoff is tried on the
'mm' added to the 'mmlist' as
part of the 2.
5) unuse_mm(), that iterates
through the vma's of this 'mm'
will hit the non-NULL zero entry
and operating on this zero entry
as a vma is resulting into the
oops.
The proper fix would be around not exposing this partially-valid tree to
others when droping the mmap lock, which is being solved with [1]. A
simpler solution would be checking for MMF_UNSTABLE, as it is set if
mm_struct is not fully initialized in dup_mmap().
Thanks to Liam/Lorenzo/David for all the suggestions in fixing this
issue.
Link: https://lkml.kernel.org/r/20250924181138.1762750-1-charan.kalla@oss.qualcomm.com
Link: https://lore.kernel.org/all/20250815191031.3769540-1-Liam.Howlett@oracle.com/ [1]
Fixes: d240629148 ("fork: use __mt_dup() to duplicate maple tree in dup_mmap()")
Signed-off-by: Charan Teja Kalla <charan.kalla@oss.qualcomm.com>
Suggested-by: David Hildenbrand <david@redhat.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Barry Song <baohua@kernel.org>
Cc: Chris Li <chrisl@kernel.org>
Cc: Kairui Song <kasong@tencent.com>
Cc: Kemeng Shi <shikemeng@huaweicloud.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Nhat Pham <nphamcs@gmail.com>
Cc: Peng Zhang <zhangpeng.00@bytedance.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0e2ee70291 upstream.
Per UVC 1.1+ specification 3.7.2, units and terminals must have a non-zero
unique ID.
```
Each Unit and Terminal within the video function is assigned a unique
identification number, the Unit ID (UID) or Terminal ID (TID), contained in
the bUnitID or bTerminalID field of the descriptor. The value 0x00 is
reserved for undefined ID,
```
If we add a new entity with id 0 or a duplicated ID, it will be marked
as UVC_INVALID_ENTITY_ID.
In a previous attempt commit 3dd075fe8e ("media: uvcvideo: Require
entities to have a non-zero unique ID"), we ignored all the invalid units,
this broke a lot of non-compatible cameras. Hopefully we are more lucky
this time.
This also prevents some syzkaller reproducers from triggering warnings due
to a chain of entities referring to themselves. In one particular case, an
Output Unit is connected to an Input Unit, both with the same ID of 1. But
when looking up for the source ID of the Output Unit, that same entity is
found instead of the input entity, which leads to such warnings.
In another case, a backward chain was considered finished as the source ID
was 0. Later on, that entity was found, but its pads were not valid.
Here is a sample stack trace for one of those cases.
[ 20.650953] usb 1-1: new high-speed USB device number 2 using dummy_hcd
[ 20.830206] usb 1-1: Using ep0 maxpacket: 8
[ 20.833501] usb 1-1: config 0 descriptor??
[ 21.038518] usb 1-1: string descriptor 0 read error: -71
[ 21.038893] usb 1-1: Found UVC 0.00 device <unnamed> (2833:0201)
[ 21.039299] uvcvideo 1-1:0.0: Entity type for entity Output 1 was not initialized!
[ 21.041583] uvcvideo 1-1:0.0: Entity type for entity Input 1 was not initialized!
[ 21.042218] ------------[ cut here ]------------
[ 21.042536] WARNING: CPU: 0 PID: 9 at drivers/media/mc/mc-entity.c:1147 media_create_pad_link+0x2c4/0x2e0
[ 21.043195] Modules linked in:
[ 21.043535] CPU: 0 UID: 0 PID: 9 Comm: kworker/0:1 Not tainted 6.11.0-rc7-00030-g3480e43aeccf #444
[ 21.044101] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-1 04/01/2014
[ 21.044639] Workqueue: usb_hub_wq hub_event
[ 21.045100] RIP: 0010:media_create_pad_link+0x2c4/0x2e0
[ 21.045508] Code: fe e8 20 01 00 00 b8 f4 ff ff ff 48 83 c4 30 5b 41 5c 41 5d 41 5e 41 5f 5d c3 cc cc cc cc 0f 0b eb e9 0f 0b eb 0a 0f 0b eb 06 <0f> 0b eb 02 0f 0b b8 ea ff ff ff eb d4 66 2e 0f 1f 84 00 00 00 00
[ 21.046801] RSP: 0018:ffffc9000004b318 EFLAGS: 00010246
[ 21.047227] RAX: ffff888004e5d458 RBX: 0000000000000000 RCX: ffffffff818fccf1
[ 21.047719] RDX: 000000000000007b RSI: 0000000000000000 RDI: ffff888004313290
[ 21.048241] RBP: ffff888004313290 R08: 0001ffffffffffff R09: 0000000000000000
[ 21.048701] R10: 0000000000000013 R11: 0001888004313290 R12: 0000000000000003
[ 21.049138] R13: ffff888004313080 R14: ffff888004313080 R15: 0000000000000000
[ 21.049648] FS: 0000000000000000(0000) GS:ffff88803ec00000(0000) knlGS:0000000000000000
[ 21.050271] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 21.050688] CR2: 0000592cc27635b0 CR3: 000000000431c000 CR4: 0000000000750ef0
[ 21.051136] PKRU: 55555554
[ 21.051331] Call Trace:
[ 21.051480] <TASK>
[ 21.051611] ? __warn+0xc4/0x210
[ 21.051861] ? media_create_pad_link+0x2c4/0x2e0
[ 21.052252] ? report_bug+0x11b/0x1a0
[ 21.052540] ? trace_hardirqs_on+0x31/0x40
[ 21.052901] ? handle_bug+0x3d/0x70
[ 21.053197] ? exc_invalid_op+0x1a/0x50
[ 21.053511] ? asm_exc_invalid_op+0x1a/0x20
[ 21.053924] ? media_create_pad_link+0x91/0x2e0
[ 21.054364] ? media_create_pad_link+0x2c4/0x2e0
[ 21.054834] ? media_create_pad_link+0x91/0x2e0
[ 21.055131] ? _raw_spin_unlock+0x1e/0x40
[ 21.055441] ? __v4l2_device_register_subdev+0x202/0x210
[ 21.055837] uvc_mc_register_entities+0x358/0x400
[ 21.056144] uvc_register_chains+0x1fd/0x290
[ 21.056413] uvc_probe+0x380e/0x3dc0
[ 21.056676] ? __lock_acquire+0x5aa/0x26e0
[ 21.056946] ? find_held_lock+0x33/0xa0
[ 21.057196] ? kernfs_activate+0x70/0x80
[ 21.057533] ? usb_match_dynamic_id+0x1b/0x70
[ 21.057811] ? find_held_lock+0x33/0xa0
[ 21.058047] ? usb_match_dynamic_id+0x55/0x70
[ 21.058330] ? lock_release+0x124/0x260
[ 21.058657] ? usb_match_one_id_intf+0xa2/0x100
[ 21.058997] usb_probe_interface+0x1ba/0x330
[ 21.059399] really_probe+0x1ba/0x4c0
[ 21.059662] __driver_probe_device+0xb2/0x180
[ 21.059944] driver_probe_device+0x5a/0x100
[ 21.060170] __device_attach_driver+0xe9/0x160
[ 21.060427] ? __pfx___device_attach_driver+0x10/0x10
[ 21.060872] bus_for_each_drv+0xa9/0x100
[ 21.061312] __device_attach+0xed/0x190
[ 21.061812] device_initial_probe+0xe/0x20
[ 21.062229] bus_probe_device+0x4d/0xd0
[ 21.062590] device_add+0x308/0x590
[ 21.062912] usb_set_configuration+0x7b6/0xaf0
[ 21.063403] usb_generic_driver_probe+0x36/0x80
[ 21.063714] usb_probe_device+0x7b/0x130
[ 21.063936] really_probe+0x1ba/0x4c0
[ 21.064111] __driver_probe_device+0xb2/0x180
[ 21.064577] driver_probe_device+0x5a/0x100
[ 21.065019] __device_attach_driver+0xe9/0x160
[ 21.065403] ? __pfx___device_attach_driver+0x10/0x10
[ 21.065820] bus_for_each_drv+0xa9/0x100
[ 21.066094] __device_attach+0xed/0x190
[ 21.066535] device_initial_probe+0xe/0x20
[ 21.066992] bus_probe_device+0x4d/0xd0
[ 21.067250] device_add+0x308/0x590
[ 21.067501] usb_new_device+0x347/0x610
[ 21.067817] hub_event+0x156b/0x1e30
[ 21.068060] ? process_scheduled_works+0x48b/0xaf0
[ 21.068337] process_scheduled_works+0x5a3/0xaf0
[ 21.068668] worker_thread+0x3cf/0x560
[ 21.068932] ? kthread+0x109/0x1b0
[ 21.069133] kthread+0x197/0x1b0
[ 21.069343] ? __pfx_worker_thread+0x10/0x10
[ 21.069598] ? __pfx_kthread+0x10/0x10
[ 21.069908] ret_from_fork+0x32/0x40
[ 21.070169] ? __pfx_kthread+0x10/0x10
[ 21.070424] ret_from_fork_asm+0x1a/0x30
[ 21.070737] </TASK>
Reported-by: syzbot+0584f746fde3d52b4675@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=0584f746fde3d52b4675
Reported-by: syzbot+dd320d114deb3f5bb79b@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=dd320d114deb3f5bb79b
Reported-by: Youngjun Lee <yjjuny.lee@samsung.com>
Fixes: a3fbc2e6bb ("media: mc-entity.c: use WARN_ON, validate link pads")
Cc: stable@vger.kernel.org
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@igalia.com>
Co-developed-by: Ricardo Ribalda <ribalda@chromium.org>
Signed-off-by: Ricardo Ribalda <ribalda@chromium.org>
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Reviewed-by: Hans de Goede <hansg@kernel.org>
Signed-off-by: Hans de Goede <hansg@kernel.org>
Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit fa0f61cc1d upstream.
Syzbot reports a KASAN issue as below:
BUG: KASAN: use-after-free in __create_pipe include/linux/usb.h:1945 [inline]
BUG: KASAN: use-after-free in send_packet+0xa2d/0xbc0 drivers/media/rc/imon.c:627
Read of size 4 at addr ffff8880256fb000 by task syz-executor314/4465
CPU: 2 PID: 4465 Comm: syz-executor314 Not tainted 6.0.0-rc1-syzkaller #0
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.14.0-2 04/01/2014
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106
print_address_description mm/kasan/report.c:317 [inline]
print_report.cold+0x2ba/0x6e9 mm/kasan/report.c:433
kasan_report+0xb1/0x1e0 mm/kasan/report.c:495
__create_pipe include/linux/usb.h:1945 [inline]
send_packet+0xa2d/0xbc0 drivers/media/rc/imon.c:627
vfd_write+0x2d9/0x550 drivers/media/rc/imon.c:991
vfs_write+0x2d7/0xdd0 fs/read_write.c:576
ksys_write+0x127/0x250 fs/read_write.c:631
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd
The iMON driver improperly releases the usb_device reference in
imon_disconnect without coordinating with active users of the
device.
Specifically, the fields usbdev_intf0 and usbdev_intf1 are not
protected by the users counter (ictx->users). During probe,
imon_init_intf0 or imon_init_intf1 increments the usb_device
reference count depending on the interface. However, during
disconnect, usb_put_dev is called unconditionally, regardless of
actual usage.
As a result, if vfd_write or other operations are still in
progress after disconnect, this can lead to a use-after-free of
the usb_device pointer.
Thread 1 vfd_write Thread 2 imon_disconnect
...
if
usb_put_dev(ictx->usbdev_intf0)
else
usb_put_dev(ictx->usbdev_intf1)
...
while
send_packet
if
pipe = usb_sndintpipe(
ictx->usbdev_intf0) UAF
else
pipe = usb_sndctrlpipe(
ictx->usbdev_intf0, 0) UAF
Guard access to usbdev_intf0 and usbdev_intf1 after disconnect by
checking ictx->disconnected in all writer paths. Add early return
with -ENODEV in send_packet(), vfd_write(), lcd_write() and
display_open() if the device is no longer present.
Set and read ictx->disconnected under ictx->lock to ensure memory
synchronization. Acquire the lock in imon_disconnect() before setting
the flag to synchronize with any ongoing operations.
Ensure writers exit early and safely after disconnect before the USB
core proceeds with cleanup.
Found by Linux Verification Center (linuxtesting.org) with Syzkaller.
Reported-by: syzbot+f1a69784f6efe748c3bf@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=f1a69784f6efe748c3bf
Fixes: 21677cfc56 ("V4L/DVB: ir-core: add imon driver")
Cc: stable@vger.kernel.org
Signed-off-by: Larshin Sergey <Sergey.Larshin@kaspersky.com>
Signed-off-by: Sean Young <sean@mess.org>
Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 40b7a19f32 upstream.
The original code uses cancel_delayed_work() in xc5000_release(), which
does not guarantee that the delayed work item timer_sleep has fully
completed if it was already running. This leads to use-after-free scenarios
where xc5000_release() may free the xc5000_priv while timer_sleep is still
active and attempts to dereference the xc5000_priv.
A typical race condition is illustrated below:
CPU 0 (release thread) | CPU 1 (delayed work callback)
xc5000_release() | xc5000_do_timer_sleep()
cancel_delayed_work() |
hybrid_tuner_release_state(priv) |
kfree(priv) |
| priv = container_of() // UAF
Replace cancel_delayed_work() with cancel_delayed_work_sync() to ensure
that the timer_sleep is properly canceled before the xc5000_priv memory
is deallocated.
A deadlock concern was considered: xc5000_release() is called in a process
context and is not holding any locks that the timer_sleep work item might
also need. Therefore, the use of the _sync() variant is safe here.
This bug was initially identified through static analysis.
Fixes: f7a27ff1fb ("[media] xc5000: delay tuner sleep to 5 seconds")
Cc: stable@vger.kernel.org
Signed-off-by: Duoming Zhou <duoming@zju.edu.cn>
Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org>
[hverkuil: fix typo in Subject: tunner -> tuner]
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 79d10f4f21 upstream.
The state->timer is a cyclic timer that schedules work_i2c_poll and
delayed_work_enable_hotplug, while rearming itself. Using timer_delete()
fails to guarantee the timer isn't still running when destroyed, similarly
cancel_delayed_work() cannot ensure delayed_work_enable_hotplug has
terminated if already executing. During probe failure after timer
initialization, these may continue running as orphans and reference the
already-freed tc358743_state object through tc358743_irq_poll_timer.
The following is the trace captured by KASAN.
BUG: KASAN: slab-use-after-free in __run_timer_base.part.0+0x7d7/0x8c0
Write of size 8 at addr ffff88800ded83c8 by task swapper/1/0
...
Call Trace:
<IRQ>
dump_stack_lvl+0x55/0x70
print_report+0xcf/0x610
? __pfx_sched_balance_find_src_group+0x10/0x10
? __run_timer_base.part.0+0x7d7/0x8c0
kasan_report+0xb8/0xf0
? __run_timer_base.part.0+0x7d7/0x8c0
__run_timer_base.part.0+0x7d7/0x8c0
? rcu_sched_clock_irq+0xb06/0x27d0
? __pfx___run_timer_base.part.0+0x10/0x10
? try_to_wake_up+0xb15/0x1960
? tmigr_update_events+0x280/0x740
? _raw_spin_lock_irq+0x80/0xe0
? __pfx__raw_spin_lock_irq+0x10/0x10
tmigr_handle_remote_up+0x603/0x7e0
? __pfx_tmigr_handle_remote_up+0x10/0x10
? sched_balance_trigger+0x98/0x9f0
? sched_tick+0x221/0x5a0
? _raw_spin_lock_irq+0x80/0xe0
? __pfx__raw_spin_lock_irq+0x10/0x10
? tick_nohz_handler+0x339/0x440
? __pfx_tmigr_handle_remote_up+0x10/0x10
__walk_groups.isra.0+0x42/0x150
tmigr_handle_remote+0x1f4/0x2e0
? __pfx_tmigr_handle_remote+0x10/0x10
? ktime_get+0x60/0x140
? lapic_next_event+0x11/0x20
? clockevents_program_event+0x1d4/0x2a0
? hrtimer_interrupt+0x322/0x780
handle_softirqs+0x16a/0x550
irq_exit_rcu+0xaf/0xe0
sysvec_apic_timer_interrupt+0x70/0x80
</IRQ>
...
Allocated by task 141:
kasan_save_stack+0x24/0x50
kasan_save_track+0x14/0x30
__kasan_kmalloc+0x7f/0x90
__kmalloc_node_track_caller_noprof+0x198/0x430
devm_kmalloc+0x7b/0x1e0
tc358743_probe+0xb7/0x610 i2c_device_probe+0x51d/0x880
really_probe+0x1ca/0x5c0
__driver_probe_device+0x248/0x310
driver_probe_device+0x44/0x120
__device_attach_driver+0x174/0x220
bus_for_each_drv+0x100/0x190
__device_attach+0x206/0x370
bus_probe_device+0x123/0x170
device_add+0xd25/0x1470
i2c_new_client_device+0x7a0/0xcd0
do_one_initcall+0x89/0x300
do_init_module+0x29d/0x7f0
load_module+0x4f48/0x69e0
init_module_from_file+0xe4/0x150
idempotent_init_module+0x320/0x670
__x64_sys_finit_module+0xbd/0x120
do_syscall_64+0xac/0x280
entry_SYSCALL_64_after_hwframe+0x77/0x7f
Freed by task 141:
kasan_save_stack+0x24/0x50
kasan_save_track+0x14/0x30
kasan_save_free_info+0x3a/0x60
__kasan_slab_free+0x3f/0x50
kfree+0x137/0x370
release_nodes+0xa4/0x100
devres_release_group+0x1b2/0x380
i2c_device_probe+0x694/0x880
really_probe+0x1ca/0x5c0
__driver_probe_device+0x248/0x310
driver_probe_device+0x44/0x120
__device_attach_driver+0x174/0x220
bus_for_each_drv+0x100/0x190
__device_attach+0x206/0x370
bus_probe_device+0x123/0x170
device_add+0xd25/0x1470
i2c_new_client_device+0x7a0/0xcd0
do_one_initcall+0x89/0x300
do_init_module+0x29d/0x7f0
load_module+0x4f48/0x69e0
init_module_from_file+0xe4/0x150
idempotent_init_module+0x320/0x670
__x64_sys_finit_module+0xbd/0x120
do_syscall_64+0xac/0x280
entry_SYSCALL_64_after_hwframe+0x77/0x7f
...
Replace timer_delete() with timer_delete_sync() and cancel_delayed_work()
with cancel_delayed_work_sync() to ensure proper termination of timer and
work items before resource cleanup.
This bug was initially identified through static analysis. For reproduction
and testing, I created a functional emulation of the tc358743 device via a
kernel module and introduced faults through the debugfs interface.
Fixes: 869f38ae07 ("media: i2c: tc358743: Fix crash in the probe error path when using polling")
Fixes: d32d98642d ("[media] Driver for Toshiba TC358743 HDMI to CSI-2 bridge")
Cc: stable@vger.kernel.org
Signed-off-by: Duoming Zhou <duoming@zju.edu.cn>
Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 01e03fb7db upstream.
The original code uses cancel_delayed_work() in flexcop_pci_remove(), which
does not guarantee that the delayed work item irq_check_work has fully
completed if it was already running. This leads to use-after-free scenarios
where flexcop_pci_remove() may free the flexcop_device while irq_check_work
is still active and attempts to dereference the device.
A typical race condition is illustrated below:
CPU 0 (remove) | CPU 1 (delayed work callback)
flexcop_pci_remove() | flexcop_pci_irq_check_work()
cancel_delayed_work() |
flexcop_device_kfree(fc_pci->fc_dev) |
| fc = fc_pci->fc_dev; // UAF
This is confirmed by a KASAN report:
==================================================================
BUG: KASAN: slab-use-after-free in __run_timer_base.part.0+0x7d7/0x8c0
Write of size 8 at addr ffff8880093aa8c8 by task bash/135
...
Call Trace:
<IRQ>
dump_stack_lvl+0x55/0x70
print_report+0xcf/0x610
? __run_timer_base.part.0+0x7d7/0x8c0
kasan_report+0xb8/0xf0
? __run_timer_base.part.0+0x7d7/0x8c0
__run_timer_base.part.0+0x7d7/0x8c0
? __pfx___run_timer_base.part.0+0x10/0x10
? __pfx_read_tsc+0x10/0x10
? ktime_get+0x60/0x140
? lapic_next_event+0x11/0x20
? clockevents_program_event+0x1d4/0x2a0
run_timer_softirq+0xd1/0x190
handle_softirqs+0x16a/0x550
irq_exit_rcu+0xaf/0xe0
sysvec_apic_timer_interrupt+0x70/0x80
</IRQ>
...
Allocated by task 1:
kasan_save_stack+0x24/0x50
kasan_save_track+0x14/0x30
__kasan_kmalloc+0x7f/0x90
__kmalloc_noprof+0x1be/0x460
flexcop_device_kmalloc+0x54/0xe0
flexcop_pci_probe+0x1f/0x9d0
local_pci_probe+0xdc/0x190
pci_device_probe+0x2fe/0x470
really_probe+0x1ca/0x5c0
__driver_probe_device+0x248/0x310
driver_probe_device+0x44/0x120
__driver_attach+0xd2/0x310
bus_for_each_dev+0xed/0x170
bus_add_driver+0x208/0x500
driver_register+0x132/0x460
do_one_initcall+0x89/0x300
kernel_init_freeable+0x40d/0x720
kernel_init+0x1a/0x150
ret_from_fork+0x10c/0x1a0
ret_from_fork_asm+0x1a/0x30
Freed by task 135:
kasan_save_stack+0x24/0x50
kasan_save_track+0x14/0x30
kasan_save_free_info+0x3a/0x60
__kasan_slab_free+0x3f/0x50
kfree+0x137/0x370
flexcop_device_kfree+0x32/0x50
pci_device_remove+0xa6/0x1d0
device_release_driver_internal+0xf8/0x210
pci_stop_bus_device+0x105/0x150
pci_stop_and_remove_bus_device_locked+0x15/0x30
remove_store+0xcc/0xe0
kernfs_fop_write_iter+0x2c3/0x440
vfs_write+0x871/0xd70
ksys_write+0xee/0x1c0
do_syscall_64+0xac/0x280
entry_SYSCALL_64_after_hwframe+0x77/0x7f
...
Replace cancel_delayed_work() with cancel_delayed_work_sync() to ensure
that the delayed work item is properly canceled and any executing delayed
work has finished before the device memory is deallocated.
This bug was initially identified through static analysis. To reproduce
and test it, I simulated the B2C2 FlexCop PCI device in QEMU and introduced
artificial delays within the flexcop_pci_irq_check_work() function to
increase the likelihood of triggering the bug.
Fixes: 382c5546d6 ("V4L/DVB (10694): [PATCH] software IRQ watchdog for Flexcop B2C2 DVB PCI cards")
Cc: stable@vger.kernel.org
Signed-off-by: Duoming Zhou <duoming@zju.edu.cn>
Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3e31a6bc07 upstream.
There is a bug observed when rtw89_core_tx_kick_off_and_wait() tries to
access already freed skb_data:
BUG: KFENCE: use-after-free write in rtw89_core_tx_kick_off_and_wait drivers/net/wireless/realtek/rtw89/core.c:1110
CPU: 6 UID: 0 PID: 41377 Comm: kworker/u64:24 Not tainted 6.17.0-rc1+ #1 PREEMPT(lazy)
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS edk2-20250523-14.fc42 05/23/2025
Workqueue: events_unbound cfg80211_wiphy_work [cfg80211]
Use-after-free write at 0x0000000020309d9d (in kfence-#251):
rtw89_core_tx_kick_off_and_wait drivers/net/wireless/realtek/rtw89/core.c:1110
rtw89_core_scan_complete drivers/net/wireless/realtek/rtw89/core.c:5338
rtw89_hw_scan_complete_cb drivers/net/wireless/realtek/rtw89/fw.c:7979
rtw89_chanctx_proceed_cb drivers/net/wireless/realtek/rtw89/chan.c:3165
rtw89_chanctx_proceed drivers/net/wireless/realtek/rtw89/chan.h:141
rtw89_hw_scan_complete drivers/net/wireless/realtek/rtw89/fw.c:8012
rtw89_mac_c2h_scanofld_rsp drivers/net/wireless/realtek/rtw89/mac.c:5059
rtw89_fw_c2h_work drivers/net/wireless/realtek/rtw89/fw.c:6758
process_one_work kernel/workqueue.c:3241
worker_thread kernel/workqueue.c:3400
kthread kernel/kthread.c:463
ret_from_fork arch/x86/kernel/process.c:154
ret_from_fork_asm arch/x86/entry/entry_64.S:258
kfence-#251: 0x0000000056e2393d-0x000000009943cb62, size=232, cache=skbuff_head_cache
allocated by task 41377 on cpu 6 at 77869.159548s (0.009551s ago):
__alloc_skb net/core/skbuff.c:659
__netdev_alloc_skb net/core/skbuff.c:734
ieee80211_nullfunc_get net/mac80211/tx.c:5844
rtw89_core_send_nullfunc drivers/net/wireless/realtek/rtw89/core.c:3431
rtw89_core_scan_complete drivers/net/wireless/realtek/rtw89/core.c:5338
rtw89_hw_scan_complete_cb drivers/net/wireless/realtek/rtw89/fw.c:7979
rtw89_chanctx_proceed_cb drivers/net/wireless/realtek/rtw89/chan.c:3165
rtw89_chanctx_proceed drivers/net/wireless/realtek/rtw89/chan.c:3194
rtw89_hw_scan_complete drivers/net/wireless/realtek/rtw89/fw.c:8012
rtw89_mac_c2h_scanofld_rsp drivers/net/wireless/realtek/rtw89/mac.c:5059
rtw89_fw_c2h_work drivers/net/wireless/realtek/rtw89/fw.c:6758
process_one_work kernel/workqueue.c:3241
worker_thread kernel/workqueue.c:3400
kthread kernel/kthread.c:463
ret_from_fork arch/x86/kernel/process.c:154
ret_from_fork_asm arch/x86/entry/entry_64.S:258
freed by task 1045 on cpu 9 at 77869.168393s (0.001557s ago):
ieee80211_tx_status_skb net/mac80211/status.c:1117
rtw89_pci_release_txwd_skb drivers/net/wireless/realtek/rtw89/pci.c:564
rtw89_pci_release_tx_skbs.isra.0 drivers/net/wireless/realtek/rtw89/pci.c:651
rtw89_pci_release_tx drivers/net/wireless/realtek/rtw89/pci.c:676
rtw89_pci_napi_poll drivers/net/wireless/realtek/rtw89/pci.c:4238
__napi_poll net/core/dev.c:7495
net_rx_action net/core/dev.c:7557 net/core/dev.c:7684
handle_softirqs kernel/softirq.c:580
do_softirq.part.0 kernel/softirq.c:480
__local_bh_enable_ip kernel/softirq.c:407
rtw89_pci_interrupt_threadfn drivers/net/wireless/realtek/rtw89/pci.c:927
irq_thread_fn kernel/irq/manage.c:1133
irq_thread kernel/irq/manage.c:1257
kthread kernel/kthread.c:463
ret_from_fork arch/x86/kernel/process.c:154
ret_from_fork_asm arch/x86/entry/entry_64.S:258
It is a consequence of a race between the waiting and the signaling side
of the completion:
Waiting thread Completing thread
rtw89_core_tx_kick_off_and_wait()
rcu_assign_pointer(skb_data->wait, wait)
/* start waiting */
wait_for_completion_timeout()
rtw89_pci_tx_status()
rtw89_core_tx_wait_complete()
rcu_read_lock()
/* signals completion and
* proceeds further
*/
complete(&wait->completion)
rcu_read_unlock()
...
/* frees skb_data */
ieee80211_tx_status_ni()
/* returns (exit status doesn't matter) */
wait_for_completion_timeout()
...
/* accesses the already freed skb_data */
rcu_assign_pointer(skb_data->wait, NULL)
The completing side might proceed and free the underlying skb even before
the waiting side is fully awoken and run to execution. Actually the race
happens regardless of wait_for_completion_timeout() exit status, e.g.
the waiting side may hit a timeout and the concurrent completing side is
still able to free the skb.
Skbs which are sent by rtw89_core_tx_kick_off_and_wait() are owned by the
driver. They don't come from core ieee80211 stack so no need to pass them
to ieee80211_tx_status_ni() on completing side.
Introduce a work function which will act as a garbage collector for
rtw89_tx_wait_info objects and the associated skbs. Thus no potentially
heavy locks are required on the completing side.
Found by Linux Verification Center (linuxtesting.org).
Fixes: 1ae5ca6152 ("wifi: rtw89: add function to wait for completion of TX skbs")
Cc: stable@vger.kernel.org
Suggested-by: Zong-Zhe Yang <kevin_yang@realtek.com>
Signed-off-by: Fedor Pchelkin <pchelkin@ispras.ru>
Acked-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Link: https://patch.msgid.link/20250919210852.823912-2-pchelkin@ispras.ru
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9f2c0ac142 upstream.
The previous commit 0718a78f6a ("ALSA: usb-audio: Kill timer properly at
removal") patched a UAF issue caused by the error timer.
However, because the error timer kill added in this patch occurs after the
endpoint delete, a race condition to UAF still occurs, albeit rarely.
Additionally, since kill-cleanup for urb is also missing, freed memory can
be accessed in interrupt context related to urb, which can cause UAF.
Therefore, to prevent this, error timer and urb must be killed before
freeing the heap memory.
Cc: <stable@vger.kernel.org>
Reported-by: syzbot+f02665daa2abeef4a947@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=f02665daa2abeef4a947
Fixes: 0718a78f6a ("ALSA: usb-audio: Kill timer properly at removal")
Signed-off-by: Jeongjun Park <aha310510@gmail.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 27e06650a5 upstream.
A buffer overflow arises from the usage of snprintf to write into the
buffer "buf" in target_lu_gp_members_show function located in
/drivers/target/target_core_configfs.c. This buffer is allocated with
size LU_GROUP_NAME_BUF (256 bytes).
snprintf(...) formats multiple strings into buf with the HBA name
(hba->hba_group.cg_item), a slash character, a devicename (dev->
dev_group.cg_item) and a newline character, the total formatted string
length may exceed the buffer size of 256 bytes.
Since snprintf() returns the total number of bytes that would have been
written (the length of %s/%sn ), this value may exceed the buffer length
(256 bytes) passed to memcpy(), this will ultimately cause function
memcpy reporting a buffer overflow error.
An additional check of the return value of snprintf() can avoid this
buffer overflow.
Reported-by: Wang Haoran <haoranwangsec@gmail.com>
Reported-by: ziiiro <yuanmingbuaa@gmail.com>
Signed-off-by: Wang Haoran <haoranwangsec@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
description:Create a report to help us fix your issue
body:
- type:markdown
attributes:
value:|
**Is this the right place for my bug report?**
This repository contains the Linux kernel used on the Raspberry Pi.
If you believe that the issue you are seeing is kernel-related, this is the right place.
If not, we have other repositories for the GPU firmware at [github.com/raspberrypi/firmware](https://github.com/raspberrypi/firmware) and Raspberry Pi userland applications at [github.com/raspberrypi/userland](https://github.com/raspberrypi/userland).
If you have problems with the Raspbian distribution packages, report them in the [github.com/RPi-Distro/repo](https://github.com/RPi-Distro/repo).
If you simply have a question, then [the Raspberry Pi forums](https://www.raspberrypi.org/forums) are the best place to ask it.
- type:textarea
id:description
attributes:
label:Describe the bug
description:|
Add a clear and concise description of what you think the bug is.
validations:
required:true
- type:textarea
id:reproduce
attributes:
label:Steps to reproduce the behaviour
description:|
List the steps required to reproduce the issue.
validations:
required:true
- type:dropdown
id:model
attributes:
label:Device (s)
description:Onwhich device you are facing the bug?
multiple:true
options:
- Raspberry Pi Zero
- Raspberry Pi Zero W/WH
- Raspberry Pi Zero 2 W
- Raspberry Pi 1 Mod. A
- Raspberry Pi 1 Mod. A+
- Raspberry Pi 1 Mod. B
- Raspberry Pi 1 Mod. B+
- Raspberry Pi 2 Mod. B
- Raspberry Pi 2 Mod. B v1.2
- Raspberry Pi 3 Mod. A+
- Raspberry Pi 3 Mod. B
- Raspberry Pi 3 Mod. B+
- Raspberry Pi 4 Mod. B
- Raspberry Pi 400
- Raspberry Pi 5
- Raspberry Pi 500
- Raspberry Pi 500+
- Raspberry Pi CM0
- Raspberry Pi CM1
- Raspberry Pi CM3
- Raspberry Pi CM3 Lite
- Raspberry Pi CM3+
- Raspberry Pi CM3+ Lite
- Raspberry Pi CM4
- Raspberry Pi CM4 Lite
- Raspberry Pi CM5
- Raspberry Pi CM5 Lite
- Other
validations:
required:true
- type:textarea
id:system
attributes:
label:System
description:|
Copy and paste the URL returned from `raspinfo | pastebinit` into this section.
Alternatively, add answers to the following questions:
* Which OS and version (`cat /etc/rpi-issue`)?
* Which firmware version (`vcgencmd version`)?
* Which kernel version (`uname -a`)?
validations:
required:true
- type:textarea
id:logs
attributes:
label:Logs
description:|
If applicable, add the relevant output from `dmesg` or similar.
about:"Please do not use GitHub for asking questions. If you simply have a question, then the Raspberry Pi forums are the best place to ask it. Thanks in advance for helping us keep the issue tracker clean!"
- name:"⛔ Problems with Raspberry Pi OS packages"
url:https://github.com/RPi-Distro/repo
about:"If you have problems with a Raspberry Pi OS package, please report them at https://github.com/RPi-Distro/repo."
Some files were not shown because too many files have changed in this diff
Show More
Reference in New Issue
Block a user
Blocking a user prevents them from interacting with repositories, such as opening or commenting on pull requests or issues. Learn more about blocking a user.