This reverts commit 063a326cba.
Reverting temporarily because not enabling the composite output has
been seen to break HDMI on older Pis.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
It referenced fragment@13 which used to be part of edt-ft5406.dtsi,
but has now been removed.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
The code that set the scdc_enabled flag to ensure it was
disabled at boot time also ran on Pi0-3 where there is no
SCDC support. This lead to a warning in vc4_hdmi_encoder_post_crtc_disable
due to vc4_hdmi_disable_scrambling being called and trying to
read (and write) register HDMI_SCRAMBLER_CTL which doesn't
exist on those platforms.
Only set the flag should the interface be configured to support
more than HDMI 1.4.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
We currently rely on two functions, vc4_hdmi_supports_scrambling() and
vc4_hdmi_mode_needs_scrambling() to determine if we should enable and
disable the scrambler for any given mode.
Since we might need to disable the controller at boot, we also always
run vc4_hdmi_disable_scrambling() and thus call those functions without
a mode yet, which in turns need to make some special casing in order for
it to work.
Instead of duplicating the check for whether or not we need to take care
of the scrambler in both vc4_hdmi_enable_scrambling() and
vc4_hdmi_disable_scrambling(), we can do that check only when we enable
it and store whether or not it's been enabled in our private structure.
We also need to initialize that flag at true to make sure we disable the
scrambler at boot since we can't really know its state yet.
This allows to simplify a bit that part of the driver, and removes one
user of our copy of the CRTC adjusted mode outside of KMS (since
vc4_hdmi_disable_scrambling() might be called from the hotplug interrupt
handler).
It also removes our last user of the legacy encoder->crtc pointer.
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
We currently poke at encoder->crtc in the ALSA code path to determine
whether the HDMI output is enabled or not, and thus whether we should
allow the audio output.
However, that pointer is deprecated and shouldn't really be used by
atomic drivers anymore. Since we have the infrastructure in place now,
let's just create a flag that we toggle to report whether the controller
is currently enabled and use that instead of encoder->crtc in ALSA.
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
Even though we already check that the encoder->crtc pointer is there
during in startup(), which is part of the open() path in ASoC, nothing
guarantees that our encoder state won't change between the time when we
open the device and the time we prepare it.
Move the sanity checks we do in startup() to a helper and call it from
prepare().
Fixes: 91e99e1139 ("drm/vc4: hdmi: Register HDMI codec")
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
Accessing the crtc->state pointer from outside the modesetting context
is not allowed. We thus need to copy whatever we need from the KMS state
to our structure in order to access it.
However, in the vc4 HDMI driver we do use that pointer in the ALSA code
path, and potentially in the hotplug interrupt handler path.
These paths both need access to the CRTC adjusted mode in order for the
proper dividers to be set for ALSA, and the scrambler state to be
reinstated properly for hotplug.
Let's copy this mode into our private encoder structure and reference it
from there when needed. Since that part is shared between KMS and other
paths, we need to protect it using our mutex.
Link: https://lore.kernel.org/all/YWgteNaNeaS9uWDe@phenom.ffwll.local/
Fixes: bb7d785688 ("drm/vc4: Add HDMI audio support")
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
The vc4 HDMI controller registers into the KMS, CEC and ALSA
frameworks.
However, no particular care is done to prevent the concurrent execution
of different framework hooks from happening at the same time.
In order to protect against that scenario, let's introduce a mutex that
relevant ALSA and KMS hooks will need to take to prevent concurrent
execution.
CEC is left out at the moment though, since the .get_modes and .detect
KMS hooks, when running cec_s_phys_addr_from_edid, can end up calling
CEC's .adap_enable hook. This introduces some reentrancy that isn't easy
to deal with properly.
The CEC hooks also don't share much state with the rest of the driver:
the registers are entirely separate, we don't share any variable, the
only thing that can conflict is the CEC clock divider setup that can be
affected by a mode set.
However, after discussing it, it looks like CEC should be able to
recover from this if it was to happen.
Fixes: bb7d785688 ("drm/vc4: Add HDMI audio support")
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
The vc4 HDMI driver has multiple path shared between the CEC, ALSA and
KMS frameworks, plus two interrupt handlers (CEC and hotplug) that will
read and modify a number of registers.
Even though not bug has been reported so far, it's definitely unsafe, so
let's just add a spinlock to protect the register access of the HDMI
controller.
Fixes: c8b75bca92 ("drm/vc4: Add KMS support for Raspberry Pi.")
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
Accessing the crtc->state pointer from outside the modesetting context
is not allowed. We thus need to copy whatever we need from the KMS state
to our structure in order to access it.
In VC4, a number of users of that pointers have crept in over the years,
and the previous commits removed them all but the HVS channel a CRTC has
been assigned.
Let's move this channel in struct vc4_crtc at atomic_begin() time, drop
it from our private state structure, and remove our use of crtc->state
from our vblank handler entirely.
Link: https://lore.kernel.org/all/YWgteNaNeaS9uWDe@phenom.ffwll.local/
Fixes: 87ebcd42fb ("drm/vc4: crtc: Assign output to channel automatically")
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
In some situation, we can end up being stuck on a non-blocking that went
through properly.
The situation that seems to trigger it reliably is to first start a
non-blocking commit, and then right after, and before we had any vblank
interrupt), start a blocking commit.
This will lead to the first commit workqueue to be scheduled, setup the
display, while the second commit is waiting for the first one to be
completed.
The vblank interrupt will then be raised, vc4_crtc_handle_vblank() will
run and will compare the active dlist in the HVS channel to the one
associated with the crtc->state.
However, at that point, the second commit is waiting using
drm_atomic_helper_wait_for_dependencies that occurs after
drm_atomic_helper_swap_state has been called, so crtc->state points to
the second commit state. vc4_crtc_handle_vblank() will compare the two
dlist addresses and since they don't match will ignore the interrupt.
The vblank event will never be reported, and the first and second commit
will wait for the first commit completion until they timeout.
The underlying reason is that it was never safe to do so. Indeed,
accessing the ->state pointer access synchronization is based on
ownership guarantees that can only occur within the functions and hooks
defined as part of the KMS framework, and obviously the irq handler
isn't one of them. The rework to move to generic helpers only uncovered
the underlying issue.
However, since the code path between
drm_atomic_helper_wait_for_dependencies() and
drm_atomic_helper_wait_for_vblanks() is serialised and we can't get two
commits in that path at the same time, we can work around this issue by
setting a variable associated to struct drm_crtc to the dlist we expect,
and then using it from the vc4_crtc_handle_vblank() function.
Since that state is shared with the modesetting path, we also need to
introduce a spinlock to protect the code shared between the interrupt
handler and the modesetting path, protecting only our new variable for
now.
Link: https://lore.kernel.org/all/YWgteNaNeaS9uWDe@phenom.ffwll.local/
Fixes: 56d1fe0979 ("drm/vc4: Make pageflip completion handling more robust.")
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
Accessing the crtc->state pointer from outside the modesetting context
is not allowed. We thus need to copy whatever we need from the KMS state
to our structure in order to access it.
In VC4, a number of users of that pointers have crept in over the years,
the first one being whether or not the downstream controller of the
pixelvalve is our writeback controller.
Fortunately for us, Since commit 39fcb28083 ("drm/vc4: txp: Turn the
TXP into a CRTC of its own") this is no longer something that can change
from one commit to the other and is hardcoded.
Let's set this flag in struct vc4_crtc if we happen to be the TXP, and
drop the flag from our private state structure.
Link: https://lore.kernel.org/all/YWgteNaNeaS9uWDe@phenom.ffwll.local/
Fixes: 008095e065 ("drm/vc4: Add support for the transposer block")
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
drm_crtc_legacy_gamma_set updates the gamma_lut blob unconditionally,
which leads to unnecessary reprogramming of hardware.
Check whether the blob contents has actually changed before
signalling that it has been updated.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
Add a check to vc4_hvs_gamma_check to ensure a new non-empty
gamma LUT is of the correct length before accepting it.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
edt-ft5406.dtsi is included from vc4-kms-dsi-7inch which was
also setting i2c0mux and i2c0if status fields. This meant that
dtoverlay wouldn't apply the overlay due to multiple fragments
changing the same parameter.
Move the enable from edt-ft5406.dtsi to edt-ft5406-overlay.dts
for when it should be needed as an independent overlay.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
Two calls were made to drm_crtc_enable_color_mgmt to add gamma
and CTM, however they were both set to add the gamma properties,
so they ended up added twice.
Fixes: 766cc6b1f7 "drm/vc4: Add CTM support"
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
With HVS5 the gamma block is now only reprogrammed with
a disable/enable. Loading the table from vc4_hvs_init_channel
(called from vc4_hvs_atomic_enable) appears to be at an
invalid point in time and so isn't applied.
Switch to enabling and disabling the gamma table instead. This
isn't safe if the pipeline is running, but it isn't now.
For HVS4 it is safe to enable and disable dynamically, so
adopt that approach there too.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
On a Pi 4, enabling composite video disables the HDMI output. As a
consequence, the composite output is disabled by default. Change the
vc4-kms-v3d overlay used on older Pis to also disable composite by
default, replacing the "nocomposite" parameter with a "composite"
parameter.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
It is important to reinitialise the firmware array pointers to protect
against the case that the brcmfmac driver is reprobed without first
being unloaded.
The potential hazard was noticed while investigating
https://github.com/raspberrypi/firmware/issues/1644 .
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
BCM43430/2 may be BCM43430B0 or BCM43436P, and BCM43430/1 can be either
BCM43430A1 or BCM43436S, the former being upstream names and the
latter downstream names for differently-sourced sister parts.
Make the choice of firmwares board-specific (without making the actual
firmwares board-specific) by placing the alternative firmware names for
each part in the DT node.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
Add the ability to load the names of alternative firmwares from the
Device Tree node. This permits separate firmwares for 43436s and 43438
and allows downstream firmwares to coexist with upstream.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
Add an option to enable configuration via the Media Controller API
(rather than the video-node-centric /dev/videoN) as about to
be used by libcamera as it enables more complex pipelines to be
handled.
Any source that has a libcamera tuning merged has MC enabled by
default.
Sources with no libcamera tuning merged have it disabled by
default.
In either case it can be overridden with the overlay parameter
"media-controller".
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
An unwanted side effect of enabling the BRCMDBG config setting is
redefining brcmf_info to be brcmf_err. This can be alarming to users
and makes it harder to spot real errors, so don't do it.
See: https://github.com/raspberrypi/linux/issues/4663
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
Retain the old names for backwards compatibility for a while, while the
necessary firmware change rolls out.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
Add predefined modelines for the 240p (NTSC) and 288p (PAL) progressive
modes, and report them through vc4_vec_connector_get_modes().
Signed-off-by: Mateusz Kwiatkowski <kfyatek+publicgit@gmail.com>
Make vc4_vec_encoder_atomic_check() accept arbitrary modelines, as long
as they result in somewhat sane output from the VEC. The bounds have
been determined empirically. Additionally, add support for the
progressive 262-line and 312-line modes.
Signed-off-by: Mateusz Kwiatkowski <kfyatek+publicgit@gmail.com>
trigger_mode == 0 (default) => no effect / no registers written
trigger_mode == 1 => source
trigger_mode == 2 => sink
This can be set e.g. in /boot/cmdline.txt as imx477.trigger_mode=N
Signed-off-by: Jonas Jacob <jonas.jacob@neocortexvision.com>
The HVS Gamma block can only be updated when idle, so we need to disable
the HVS channel when the gamma property is set in an atomic commit.
Since the pixelvalve cannot have its assigned channel halted without
stalling unless it's disabled as well, in our case that means forcing a
full disable / enable cycle on the pipeline.
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
BCM2711 changes from a 256 entry lookup table to a 16 point
piecewise linear function as the pipeline bitdepth has increased
to make a LUT unwieldy.
Implement a simple conversion from a 256 entry LUT that userspace
is likely to expect to 16 evenly spread points in the PWL. This
could be improved with curve fitting at a later date.
Co-developed-by: Juerg Haefliger <juergh@canonical.com>
Signed-off-by: Juerg Haefliger <juergh@canonical.com>
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
clk-2835 is deprecated and gets an innacurate clock for VEC (107MHz).
Switch to clk-raspberrypi which uses the right PLL to get an accurate 108MHz.
Signed-off-by: Dom Cobley <popcornmix@gmail.com>
Both the firmware audio driver and the vc4-kms-v3d driver are capable
of providing HDMI audio, but only one should be active at any time.
The vc4-kms-v3d overlays disable the firmware audio driver, but they
also have a noaudio parameter that as well as disabling the ARM-side
HDMI audio also re-enables the firmware HDMI audio. This is not
guaranteed to work and has been seen to break the display completely.
Modify the noaudio parameters so that the firmware HDMI audio support
remains disabled.
See: https://github.com/raspberrypi/linux/issues/4651
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
Limit the number of allowed video callbacks. This helps with limiting
the size of the coded input FIFO which in turn helps to control latency.
Choose -5 as the magic number as it translates to DPB+5 buffers which
has been proven to be a good number in the past.
Ideally coded buffers would not be returned to the user until they
had been decoded into the DPB or been discarded as bad, but that grade
of control is unavailable to us.
Signed-off-by: John Cox <jc@kynesim.co.uk>
Adds Media Controller API support for more complex pipelines.
libcamera is about to switch to using this mechanism for configuring
sensors.
This can be enabled by either a module parameter, or device tree.
Various functions have been moved to group video-centric and
mc-centric functions together.
Based on a similar conversion done to ti-vpe.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
The driver was making big assumptions about the source device
using pad 0 and 1, which doesn't follow for more complex
devices where Unicam's source device may be a sink device for
something else.
Read the pad numbers through media controller, and reference
them appropriately.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
RAW color spaces are more usually reported as having full range
quantization.
Tested using libcamera.
Signed-off-by: David Plowman <david.plowman@raspberrypi.com>
The get() method does not understand the on-the-wire encoding of the
remote GPIO states, thinking they are simple on/off bits when they are
really pairs of 16-bit counts. Rewrite the get() handler to return the
value last written, which will eventually match the actual GPIO state
if there are no other changes.
See: https://github.com/raspberrypi/linux/issues/4638
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
In order to resolve a potential startup order bug, the vcio driver has
been rewritten as a platform driver that depends on a DT node for
its instantiation and to locate the firmware driver.
Add that DT node.
See: https://github.com/raspberrypi/linux/issues/4620
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
The old vcio driver is a simple character device that manually locates
the firmware driver. Initialising it before the firmware driver causes
a failure, and no retries are attempted.
Rewrite vcio as a platform driver that depends on a DT node for its
instantiation and the location of the firmware driver, making use of
the miscdevice framework to reduce the code size.
N.B. Using miscdevice changes the udev SUBSYSTEM string, so a change
to the companion udev rule is required in order to continue to set
the correct device permissions, e.g.:
KERNEL="vcio", GROUP="video", MODE="0660"
See: https://github.com/raspberrypi/linux/issues/4620
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
Power-on reset after the insertion of a battery does not always complete
successfully, leading to corrupted register content. The EXT_TEST bit
will stop the clock from running, but currently the driver will never
recover.
Safely handle the erroneous state by clearing EXT_TEST as part of the
usual set_time method.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
For cases where spare PWM outputs are available, but are desired
to be addressed a standard outputs instead.
Wraps a PWM channel as a new GPIO chip with the one output.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
This is a copy of the code and approach from our DAC+ driver.
It allows to probe (and activate) an optional TPA6130A2 headphone
amplifier. Updated email address.
Signed-off-by: Joerg Schambacher <joerg@hifiberry.com>
This is a copy of the approach from our DAC+ driver.
It allows to probe (and activate) an optional TPA6130A2 headphone
amplifier.
Signed-off-by: Joerg Schambacher <joerg@hifiberry.com>
A regression introduced in https://github.com/raspberrypi/linux/pull/3887
meant that if the newly scheduled transfer immediately returned data, and
the driver resubmitted a single URB after every transfer, then the effective
polling interval would end up being approx 1ms.
Use the larger of SCHEDULE_SLOP or the configured endpoint interval.
Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
Adds an overlay to configure the TinyDRM driver for ST7735R
based 160x128 and 128x128 (untested) displays such as the
Adafruit 1.8" display.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
This commit updates the imx519 driver to adverise support for embedded
data streams.
The imx519 sensor subdevice overloads the media pad to differentiate
between image stream (pad 0) and embedded data stream (pad 1) when
performing the v4l2_subdev_pad_ops functions.
Signed-off-by: Lee Jackson <info@arducam.com>
Adds a driver for the 16MPix IMX519 CSI2 sensor.
Whilst the sensor supports 2 or 4 CSI2 data lanes, this driver
currently only supports 2 lanes.
The following Bayer modes are currently available:
4656x3496 10-bit @ 10fps
3840x2160 10-bit (cropped) @ 21fps
2328x1748 10-bit (binned) @ 30fps
1920x1080 10-bit (cropped/binned) @ 60fps
1280x720 10-bit (cropped/binned) @ 120fps
Signed-off-by: Lee Jackson <info@arducam.com>
The Linux support for controlling card power via regulators appears to
be contentious. I would argue that the default behaviour is contrary to
the SDHCI spec - turning off the power writes a reserved value to the
SD Bus Voltage Select field of the Power Control Register, which
seems to kill the Arasan/iProc controller - but fortunately there is a
hook in sdhci_ops to override the behaviour.
Signed-off-by: Juerg Haefliger <juergh@canonical.com>
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
This reverts commit aed19399a0.
Commit 6c92ae1e45 ("mmc: sdhci: Introduce sdhci_set_power_and_bus_voltage()")
introduced a generic helper that does the same thing so use that instead in
the following commit.
Signed-off-by: Juerg Haefliger <juergh@canonical.com>
Check for errors only if we actually tried to enable the bvb clock.
Fixes: 01a6d727b4 ("vc4/drm: hdmi: Handle case when bvb clock is null")
Signed-off-by: Juerg Haefliger <juergh@canonical.com>
The read and write functions did not use the correct pointer offset
when dealing with an odd number of bytes after a DMA transfer. Also,
only handle the remaining odd bytes if the DMA transfer completed
successfully.
Submitted-by: @madimario (GitHub)
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
The driver was implementing a get_brightness function that
tried to read back the PWM setting of the display to report
as the current brightness.
The controller on the display does not support that, therefore
we end up reporting a brightness of 0, and that confuses
systemd's backlight service.
Remove the hook so that the framework returns the current
brightness automatically.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
This allows using the video-i2c camera driver with MLX90640 thermal
infrared sensors, connected to Raspberry Pi. CONFIG_VIDEO_V4L2_I2C
has to be selected to use the camera.
Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
The ISP can do H & V flips whilst resizing or converting
the image, so expose that via V4L2_CID_[H|V]FLIP.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
Currently the code was only setting some controls from
bcm2835_codec_set_ctrls, but it's simpler to use
v4l2_ctrl_handler_setup to avoid forgetting to adding new
controls to the list.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
Completions don't reference count, so setting the completion
on the first buffer returned and then not reinitialising it
means that the flush function doesn't behave as intended.
Signal the completion when the last buffer is returned.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
Should we go through the timeout failure case with port_disable
not returning all buffers for whatever reason, the
buffers_with_vpu counter gets left at a non-zero value, which
will cause reference counting issues should the instance be
reused.
Reset the count when the port is enabled again, but before
any buffers have been sent to the VPU.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
When a buffer is returned on a port that is disabled, return it
to the videobuf2 QUEUED state instead of DONE which returns it
to the client.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
The firmware defaults to not stopping video decode if only the
pixel aspect ratio or colourspace change. V4L2 requires us
to stop decoding on any change, therefore tell the firmware
of the desire for this alternate behaviour.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
When a format changed event occurs, the spec says that it
triggers an implicit drain, and that needs to be signalled
via -EPIPE.
For BCM2835, the format changed event happens at the point
the format change occurs, so no further buffers exist from
before the resolution changed point. We therefore signal the
last buffer immediately.
We don't have a V4L2 available to us at this point, so set
the videobuf2 queue last_buffer_dequeued flag directly.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
The list of includes was slightly over generic, and wasn't
in alphabetical order. Clean it up.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
With video decode we now enable both input and output ports on
the component. This means that buffers will get passed to the VPU
earlier than desired if they are queued befoer STREAMON.
Check that the queue is streaming before sending buffers to the VPU.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
The V4L2 stateful video decoder API requires that you can STREAMON
on only the OUTPUT queue, feed in buffers, and wait for the
SOURCE_CHANGE event.
This requires that we enable the MMAL output port at the same time
as the input port, because the output port is the one that creates
the SOURCE_CHANGED event.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
Before uniniting the decode context sync with the IRQ queues to ensure
that decode no longer has any buffers in use. This fixes a problem that
manifested as ffmpeg leaking CMA buffers when it did a stream off on
OUTPUT before CAPTURE, though in reality it was probably much more
dangerous than that.
Signed-off-by: John Cox <jc@kynesim.co.uk>
Remove unused ctx state tracking variable and associated defines.
Their presence implies they might be used, but they aren't.
Signed-off-by: John Cox <jc@kynesim.co.uk>
With EDPWRDOWN set in idle, it must be cleared before checking for
ENERGYON going high, indicating that a link is being established.
The existing code allows 640ms for ENERGYON to go high, but on
Raspberry Pis that appears not to be enough, causing link detection
to fail.
Increase the polling timeout to 1500ms - with a polling interval of
10ms it shouldn't cause unnecessary delays.
See: https://github.com/raspberrypi/linux/issues/4393
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
V4L2 spec says that G/S/TRY_FMT IOCTLs should never return errors for
anything other than wrong buffer types. Improve the capture format
function such that this is so and unsupported values get converted
to supported ones properly.
Signed-off-by: John Cox <jc@kynesim.co.uk>
As we're wanting to wrap the image_fx component for deinterlacing,
add the deinterlace algorithm values to enum mmal_parameter_imagefx
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
If the client provides a bytesperline value in try_fmt/s_fmt then
validate it and correct if necessary.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
Should start_streaming fail, or buffers be queued during
stop_streaming, they should be returned to the core as QUEUED
and not (as currently) as ERROR.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
The video decoder can support decoding interlaced streams, so add
the required plumbing to signal this correctly.
The encoder and ISP do NOT support interlaced data, so trying to
configure an interlaced format on those nodes will be rejected.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
Adds enum mmal_interlace_type and struct
mmal_parameter_video_interlace_type to allow for querying the
interlacing mode on decoders.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
In order to effectively guarantee that a V4L2_EVENT_SOURCE_CHANGE
event occurs, adopt a default resolution of 32x32 so that it
is incredibly unlikely to be decoding a stream of that resolution
and therefore failing to note a "change" requiring the event.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
The older panel-raspberrypi-touchscreen driver had issues in
that it also controlled the power for the touchscreen without
having an appropriate hook for the touchscreen driver to control
that.
Mainline has now added a Toshiba TC358762 bridge driver, and
a regulator/backlight driver for the ATTiny microcontroller on
the board. That allows clean integration with the touchscreen
driver.
Switch the overlays over to using newer drivers.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
We need independent control of the resets for the panel&bridge,
vs the touch controller.
Expose the reset lines that are on the Atmel's port C via the GPIO
API so that they can be controlled appropriately.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
The Atmel was doing a load of automatic sequencing of
control lines, however it was combining the touch controller's
reset with the bridge/panel control.
Change to control the control signals directly rather than
through the automatic POWERON control.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
The initial state of the Atmel is not defined, so ensure the
backlight PWM is set to 0 by default.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
The driver was using the regmap lock to serialise the
individual accesses, but we really need to protect the
timings of enabling the regulators, including any communication
with the Atmel.
Use a mutex within the driver to control overall accesses to
the Atmel, instead of the regmap lock.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
The Atmel is doing some things in the I2C ISR, during which
period it will not respond to further commands. This is
particularly true of the POWERON command.
Increase delays appropriately, and retry should I2C errors be
reported.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
There's no reason why 2 Raspberry Pi DSI displays can't be
attached to a Pi Compute Module, so the backlight names need to
be unique.
Use the parent dev_name. It's not as readable, but is unique.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
If no interrupt is defined then a timer and workqueue are used
to poll the controller.
On remove these were not being cleaned up correctly.
Fixes: ca61fdaba7 "Input: edt-ft5x06: Poll the device if no interrupt is
configured."
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
The Raspberry Pi 7" 800x480 panel uses a Toshiba TC358762 DSI
to DPI bridge chip, so there is a requirement for the timings
to be specified for the end panel. Add such a definition.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
rpi_touchscreen_i2c_read returns any errors from i2c_transfer,
or the 8 bit received value.
Check for error values before trying to process the data as
valid.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
The panel has a prepare call which is before video starts, and an
enable call which is after.
The Toshiba bridge should be configured before video, so move
the relevant power and initialisation calls to prepare.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
If a call to rpi_touchscreen_i2c_write from rpi_touchscreen_probe
fails before mipi_dsi_device_register_full is called, then
in trying to log the error message if uses ts->dsi->dev when
it is still NULL.
Use ts->i2c->dev instead, which is initialised earlier in probe.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
The divider calculations tried to find the divider
just faster than the clock requested. However if
it required a divider of 7 then the for loop
aborted without handling the "error" case, and could
end up with a clock lower than requested.
Correct the loop so that we always have a clock greater
than requested.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
On Pi0-3 the driver allocates a buffer and requests a DMA channel
because the ARM can't write to DSI1's registers directly.
However unbind and the error paths in bind don't release the buffer or
the DMA channel.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
The HDMI block can repeat pixels for double clocked modes,
and the firmware is now configuring the block to do this as
the PV is doing it incorrectly when at 2pixels/clock.
If the kernel doesn't reset it then we end up with strange
modes.
Reset MISC_CONTROL.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
Run the PoE hat fan with reduced noise when possible. The PoE hat fan will spin even at PWM=1 and spins almost silent. Tested at 17ºC, the fan PWM alternates between 1 and 10, which is a lot less noisy then alternating between PWM levels 0 and 31.
The exit path of vc4_hdmi_encoder_pre_crtc_configure() is fairly hard to
maintain given its numerous error conditions.
Switch to a goto based approach to simplify it.
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
Unlike pm_runtime_get_sync(), pm_runtime_resume_and_get() doesn't take a
reference on failure, so we don't need to call pm_runtime_put() on
failure.
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
A module parameter "dpc_enable" is added to allow the control of the
sensor's on-board DPC (Defective Pixel Correction) function.
This is a global setting to be configured before using the sensor;
there is no intention that this would ever be changed on-the-fly.
Signed-off-by: David Plowman <david.plowman@raspberrypi.com>
drm_connector_helper_hpd_irq_event() calls
drm_kms_helper_hotplug_event() with the mode-setting lock taken while
it's supposed to be called without that lock taken.
This results in a lockdep warning, and a deadlock if we were to wake up
a TV through CEC (and possibly other cases).
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
When the firmware doesn't setup the HSM rate (such as when booting
without an HDMI cable plugged in), its rate is 0 and thus any register
access results in a CPU stall, even though HSM is enabled.
Let's enforce a minimum rate at boot to avoid this issue.
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
Commit 9d44abbbb8 ("drm/vc4: Fall back to using an EDID probe in the
absence of a GPIO.") added some code to read the EDID through DDC in the
HDMI driver detect hook since the Pi3 had no HPD GPIO back then.
However, commit b1b8f45b31 ("ARM: dts: bcm2837: Add missing GPIOs of
Expander") changed that a couple of years later.
This causes an issue though since some TV (like the LG 55C8) when it
comes out of standy will deassert the HPD line, but the EDID will
remain readable.
It causes an issues nn platforms without an HPD GPIO, like the Pi4,
where the DDC probing will be our primary mean to detect a display, and
thus we will never detect the HPD pulse. This was fine before since the
pulse was small enough that we would never detect it, and we also didn't
have anything (like the scrambler) that needed to be set up in the
display.
However, now that we have both, the display during the HPD pulse will
clear its scrambler status, and since we won't detect the
disconnect/reconnect cycle we will never enable the scrambler back.
As our main reason for that DDC probing is gone, let's just remove it.
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
The drm_helper_hpd_irq_event() documentation states that this function
is "useful for drivers which can't or don't track hotplug interrupts for
each connector." and that "Drivers which support hotplug interrupts for
each connector individually and which have a more fine-grained detect
logic should bypass this code and directly call
drm_kms_helper_hotplug_event()". This is thus what we ended-up doing.
However, what this actually means, and is further explained in the
drm_kms_helper_hotplug_event() documentation, is that
drm_kms_helper_hotplug_event() should be called by drivers that can
track the connection status change, and if it has changed we should call
that function.
This underlying expectation we failed to provide is that the caller of
drm_kms_helper_hotplug_event() should call drm_helper_probe_detect() to
probe the new status of the connector.
Since we didn't do it, it meant that even though we were sending the
notification to user-space and the DRM clients that something changed we
never probed or updated our internal connector status ourselves.
This went mostly unnoticed since the detect callback usually doesn't
have any side-effect. Also, if we were using the DRM fbdev emulation
(which is a DRM client), or any user-space application that can deal
with hotplug events, chances are they would react to the hotplug event
by probing the connector status eventually.
However, now that we have to enable the scrambler in detect() if it was
enabled it has a side effect, and an application such as Kodi or
modetest doesn't deal with hotplug events. This resulted with a black
screen when Kodi or modetest was running when a screen was disconnected
and then reconnected, or switched off and on.
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
In the bind hook, we actually need the device to have the HSM clock
running during the final part of the display initialisation where we
reset the controller and initialise the CEC component.
Failing to do so will result in a complete, silent, hang of the CPU.
Fixes: 411efa18e4 ("drm/vc4: hdmi: Move the HSM clock enable to runtime_pm")
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
The drm_helper_hpd_irq_event() function is iterating over all the
connectors when an hotplug event is detected.
During that iteration, it will call each connector detect function and
figure out if its status changed.
Finally, if any connector changed, it will notify the user-space and the
clients that something changed on the DRM device.
This is supposed to be used for drivers that don't have a hotplug
interrupt for individual connectors. However, drivers that can use an
interrupt for a single connector are left in the dust and can either
reimplement the logic used during the iteration for each connector or
use that helper and iterate over all connectors all the time.
Since both are suboptimal, let's create a helper that will only perform
the status detection on a single connector.
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
See https://github.com/raspberrypi/linux/issues/3981
An unknown unsafe memory access can result in the ep_state variable
in xhci_virt_ep being trampled with a stuck SET_DEQ_PENDING state
despite successful completion of a Set TR Deq Pointer command.
All URB enqueue/dequeue calls for the endpoint will fail in this state
so no transfers are possible until the device is reconnected.
As a workaround, clear the flag if we see it set and issue a new Set
TR Deq command anyway - this should be harmless, as a prior Set TR Deq
command will only have been issued in the Stopped state, and if the
endpoint is Running then the controller is required to ignore it and
respond with a Context State Error event TRB.
Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
Only create aux entries of mv info for frames where that info might
be used by a later frame. This saves some memory bandwidth and
potentially some memory.
Signed-off-by: John Cox <jc@kynesim.co.uk>
Allows the user to submit a whole frames worth of slice headers in
one lump along with a single bitstream dmabuf for the whole lot.
This saves potentially a lot of bitstream copying.
Signed-off-by: John Cox <jc@kynesim.co.uk>
Implement support for dynamically allocated arrays.
Most of the changes concern keeping track of the number of elements
of the array and the number of elements allocated for the array and
reallocating memory if needed.
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
(cherry picked from commit fd5d45e6561f6f8c406b81aeddecaa11f0bd15af)
Add a new flag that indicates that this control is a dynamically sized
array. Also document this flag.
Currently dynamically sized arrays are limited to one dimensional arrays,
but that might change in the future if there is a need for it.
The initial use-case of dynamic arrays are stateless codecs. A frame
can be divided in many slices, so you want to provide an array containing
slice information for each slice. Typically the number of slices is small,
but the standard allow for hundreds or thousands of slices. Dynamic arrays
are a good solution since sizing the array for the worst case would waste
substantial amounts of memory.
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
(cherry picked from commit c095ad310b223b9de38d491c9f66533d05586b45)
DPB entries have moved from slice params to the new decode params
attribute - update to deal with this. Also fixes fallthrough
warnings which seem to be new in 5.14.
Signed-off-by: John Cox <jc@kynesim.co.uk>
Add code to support V4L2_CID_MPEG_VIDEO_HEVC_SCALING_MATRIX to
v4l2-ctrl-*. This change was in the unsplit v4l2-ctrl.c but failed
to make it through the split.
Signed-off-by: John Cox <jc@kynesim.co.uk>
v4l2-ctrls.c has been split into 4 files v4l2-ctrls-*.c, the original
is redundant, confusing and probably should have been removed by a
previous patch.
Signed-off-by: John Cox <jc@kynesim.co.uk>
Add call to v4l2_ctrl_new_fwnode_properties to read and
create the fwnode based controls.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
Add call to v4l2_ctrl_new_fwnode_properties to read and
create the fwnode based controls.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
Add call to v4l2_ctrl_new_fwnode_properties to read and
create the fwnode based controls.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
In vc4_atomic_commit_tail() we iterate of the set of old CRTCs, and
attempt to wait on any channels which are still in use. When we iterate
over the CRTCs, we have:
* `i` - the index of the CRTC
* `channel` - the channel a CRTC is using
When we check the channel state, we consult:
old_hvs_state->fifo_state[channel].in_use
... but when we wait for the channel, we erroneously wait on:
old_hvs_state->fifo_state[i].pending_commit
... rather than:
old_hvs_state->fifo_state[channel].pending_commit
... and this bogus access has been observed to result in boot-time hangs
on some arm64 configurations, and can be detected using KASAN. FIx this
by using the correct index.
I've tested this on a Raspberry Pi 3 model B v1.2 with KASAN.
Trimmed KASAN splat:
| ==================================================================
| BUG: KASAN: slab-out-of-bounds in vc4_atomic_commit_tail+0x1cc/0x910
| Read of size 8 at addr ffff000007360440 by task kworker/u8:0/7
| CPU: 2 PID: 7 Comm: kworker/u8:0 Not tainted 5.13.0-rc3-00009-g694c523e7267 #3
|
| Hardware name: Raspberry Pi 3 Model B (DT)
| Workqueue: events_unbound deferred_probe_work_func
| Call trace:
| dump_backtrace+0x0/0x2b4
| show_stack+0x1c/0x30
| dump_stack+0xfc/0x168
| print_address_description.constprop.0+0x2c/0x2c0
| kasan_report+0x1dc/0x240
| __asan_load8+0x98/0xd4
| vc4_atomic_commit_tail+0x1cc/0x910
| commit_tail+0x100/0x210
| ...
|
| Allocated by task 7:
| kasan_save_stack+0x2c/0x60
| __kasan_kmalloc+0x90/0xb4
| vc4_hvs_channels_duplicate_state+0x60/0x1a0
| drm_atomic_get_private_obj_state+0x144/0x230
| vc4_atomic_check+0x40/0x73c
| drm_atomic_check_only+0x998/0xe60
| drm_atomic_commit+0x34/0x94
| drm_client_modeset_commit_atomic+0x2f4/0x3a0
| drm_client_modeset_commit_locked+0x8c/0x230
| drm_client_modeset_commit+0x38/0x60
| drm_fb_helper_set_par+0x104/0x17c
| fbcon_init+0x43c/0x970
| visual_init+0x14c/0x1e4
| ...
|
| The buggy address belongs to the object at ffff000007360400
| which belongs to the cache kmalloc-128 of size 128
| The buggy address is located 64 bytes inside of
| 128-byte region [ffff000007360400, ffff000007360480)
| The buggy address belongs to the page:
| page:(____ptrval____) refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x7360
| flags: 0x3fffc0000000200(slab|node=0|zone=0|lastcpupid=0xffff)
| raw: 03fffc0000000200 dead000000000100 dead000000000122 ffff000004c02300
| raw: 0000000000000000 0000000000100010 00000001ffffffff 0000000000000000
| page dumped because: kasan: bad access detected
|
| Memory state around the buggy address:
| ffff000007360300: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
| ffff000007360380: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
| >ffff000007360400: 00 00 00 00 00 00 00 fc fc fc fc fc fc fc fc fc
| ^
| ffff000007360480: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
| ffff000007360500: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
| ==================================================================
Link: https://lore.kernel.org/r/4d0c8318-bad8-2be7-e292-fc8f70c198de@samsung.com
Link: https://lore.kernel.org/linux-arm-kernel/20210607151740.moncryl5zv3ahq4s@gilmour
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reported-by: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: David Airlie <airlied@linux.ie>
Cc: Emma Anholt <emma@anholt.net>
Cc: Maxime Ripard <maxime@cerno.tech>
Cc: Will Deacon <will@kernel.org>
Cc: dri-devel@lists.freedesktop.org
Acked-by: Arnd Bergmann <arnd@arndb.de>
Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
Link: https://patchwork.freedesktop.org/patch/msgid/20210608085513.2069-1-mark.rutland@arm.com
If vc4_crtc_disable_at_boot runs, we call into the HDMI controller hooks
to disable it but we never made sure it was running and the core clock
was running either.
This doesn't fix any known bugs, but it's still something we should make
sure of.
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
On bind we will register the HDMI codec device but we don't unregister
it on unbind, leading to a device leakage. Unregister our device at
unbind.
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
Avoid a compiler warning by using the "fallthrough" pseudo-keyword in
place of the old "/* fall through */" comment convention.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
Avoid compiler warnings by using the "fallthrough" pseudo-keyword in
place of the old "/* fall through */" comment convention.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
Avoid a compiler warning by using the "fallthrough" pseudo-keyword in
place of the old "/* fall through */" comment convention.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
[ drm-misc commit 2eecd93b74 ]
There'a limit to how big a kmalloc buffer can be, and as memory gets
fragmented it becomes more difficult to get big buffers. The downside of
smaller buffers is that the driver has to split the transfer up which
hampers performance. Compression might also take a hit because of the
splitting.
Solve this by allocating the transfer buffer using vmalloc and create a
SG table to be passed on to the USB subsystem. vmalloc_32() is used to
avoid DMA bounce buffers on USB controllers that can only access 32-bit
addresses.
This also solves the problem that split transfers can give host side
tearing since flushing is decoupled from rendering.
usb_sg_wait() doesn't have timeout handling builtin, so it is wrapped in
a timer like 4 out of 6 users in the kernel have done.
v2:
- Use DIV_ROUND_UP (Linus)
- Add timeout note to the commit log (Linus)
- Expand note about upper buffer limit (Linus)
- Change var name s/timer/ctx/ in gud_usb_bulk_timeout()
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20210701170748.58009-2-noralf@tronnes.org
[ drm-misc commit f8ac863b6a ]
Free transfer and compression buffers on device removal instead of at
DRM device removal time. This ensures that the usual 2x8MB buffers are
released when the device is unplugged and not kept around should
userspace keep the DRM device fd open.
At least Ubuntu 20.04 doesn't release the DRM device on unplug.
The damage_lock mutex is not destroyed because it is used outside the
drm_dev_enter/exit block in gud_pipe_update(). AFAICT it's possible for
an open fbdev descriptor to trigger a commit after the USB device is gone.
v2: Don't destroy damage_lock
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20210701170748.58009-1-noralf@tronnes.org
The imx477 driver's line length for this mode had not been updated to
the value supplied to us by the sensor manufacturer. With this
correction the sensor delivers the framerates that are expected.
Signed-off-by: David Plowman <david.plowman@raspberrypi.com>
This is a squash of all firmware-kms related patches from previous
branches, up to and including
"drm/vc4: Set the possible crtcs mask correctly for planes with FKMS"
plus a couple of minor fixups for the 5.9 branch.
Please refer to earlier branches for full history.
This patch includes work by Eric Anholt, James Hughes, Phil Elwell,
Dave Stevenson, Dom Cobley, and Jonathon Bell.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
drm/vc4: Fixup firmware-kms after "drm/atomic: Pass the full state to CRTC atomic enable/disable"
Prototype for those calls changed, so amend fkms (which isn't
upstream) to match.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
drm/vc4: Fixup fkms for API change
Atomic flush and check changed API, so fix up the downstream-only
FKMS driver.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
drm/vc4: Make normalize_zpos conditional on using fkms
Eric's view was that there was no point in having zpos
support on vc4 as all the planes had the same functionality.
Can be later squashed into (and fixes):
drm/vc4: Add firmware-kms mode
Signed-off-by: Dom Cobley <popcornmix@gmail.com>
drm/vc4: FKMS: Change of Broadcast RGB mode needs a mode change
The Broadcast RGB (aka HDMI limited/full range) property is only
notified to the firmware on mode change, so this needs to be
signalled when set.
https://github.com/raspberrypi/firmware/issues/1580
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
vc4/drv: Only notify firmware of display done with kms
fkms driver still wants firmware display to be active
Signed-off-by: Dom Cobley <popcornmix@gmail.com>
ydrm/vc4: fkms: Fix margin calculations for the right/bottom edges
The calculations clipped the right/bottom edge of the clipped
range based on the left/top margins.
https://github.com/raspberrypi/linux/issues/4447
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
drm/vc4: fkms: Use new devm_rpi_firmware_get api
drm/kms: Add allow_fb_modifiers
Signed-off-by: Dom Cobley <popcornmix@gmail.com>
In vc4_hdmi_encoder_pre_crtc_configure, if clk_request_start for the HSM
clock fails, we don't call clk_disable_unprepare on the pixel clock even
though it's enabled by now.
Make sure it's there to avoid leaking that reference.
Fixes: cd4cb49dc5 ("drm/vc4: hdmi: Adjust HSM clock rate depending on pixel rate")
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
In the vc4_hdmi_encoder_pre_crtc_configure() function error path we
never actually call pm_runtime_put() even though
pm_runtime_resume_and_get() is our very first call.
Fixes: 4f6e3d66ac ("drm/vc4: Add runtime PM support to the HDMI encoder driver")
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
Once the call to drm_fb_helper_remove_conflicting_framebuffers() has
been made, simplefb has been unregistered and the KMS driver is entirely
in charge of the display.
Thus, we can notify the firmware it can free whatever resource it was
using to maintain simplefb functional.
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
The bind hooks will modify their controller registers, so simplefb is
going to be unusable anyway. Let's avoid any transient state where it
could still be in the system but no longer functionnal.
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
Before the introduction of the BCM2711 support, the HSM clock rate was
fixed, and was the CEC and audio clock source on the SoCs previously
supported.
The HSM clock is also the source of the internal state machine of the
controller and needs to run faster than the pixel clock. All these
requirements were met by running at 101% of the maximum pixel rate,
meeting the fixed clock requirement for audio and CEC, while remaining
faster than any pixel clock we might need.
However, the BCM2711 brought support for 4k and therefore increased
significantly the rate needed for the HSM, and new, independant, clocks
to feed the audio and CEC clocks. Since the HSM clock can also run much
higher, we also need to lower its rate if possible to reduce its power
consumption.
The CEC support code changes its clock divider when the HSM clock rate
is changed, but the audio support never had a similar feature and will
glitch out if audio is played back during a mode set.
Since the HSM rate was meant to be fixed on the SoCs prior to the
BCM2711 anyway, let's introduce back a fixed HSM rate and fix audio.
Fixes: cd4cb49dc5 ("drm/vc4: hdmi: Adjust HSM clock rate depending on pixel rate")
Signed-off-by: Dom Cobley <popcornmix@gmail.com>
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
The hdmi-codec brings a lot of advanced features, including the HDMI
channel mapping. Let's use it in our driver instead of our own codec.
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
The new clock request API allows us to increase the rate of the HSM
clock to match our pixel rate requirements while decreasing it when
we're done, resulting in a better power-efficiency.
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
Replace drm_encoder_helper_funcs::atomic_check with
drm_connector_helper_funcs::atomic_check - the former is not called
during drm_mode_obj_set_property_ioctl(). Set crtc_state->mode_changed
if TV norm changes even without explicit mode change. This makes things
like "xrandr --output Composite-1 --set mode PAL-M" work properly.
Signed-off-by: Mateusz Kwiatkowski <kfyatek+publicgit@gmail.com>
Similar to the ch7006 and nouveau drivers, introduce a "tv_mode" module
parameter that allow setting the TV norm by specifying vc4.tv_norm= on
the kernel command line.
If that is not specified, try inferring one of the most popular norms
(PAL or NTSC) from the video mode specified on the command line. On
Raspberry Pis, this causes the most common cases of the sdtv_mode
setting in config.txt to be respected.
Signed-off-by: Mateusz Kwiatkowski <kfyatek+publicgit@gmail.com>
Add support for the following composite output modes (all of them are
somewhat more obscure than the previously defined ones):
- NTSC_443 - NTSC-style signal with the chroma subcarrier shifted to
4.43361875 MHz (the PAL subcarrier frequency). Never used for
broadcasting, but sometimes used as a hack to play NTSC content in PAL
regions (e.g. on VCRs).
- PAL_N - PAL with alternative chroma subcarrier frequency,
3.58205625 MHz. Used as a broadcast standard in Argentina, Paraguay
and Uruguay to fit 576i50 with colour in 6 MHz channel raster.
- PAL60 - 480i60 signal with PAL-style color at normal European PAL
frequency. Another non-standard, non-broadcast mode, used in similar
contexts as NTSC_443. Some displays support one but not the other.
- SECAM - French frequency-modulated analog color standard; also have
been broadcast in Eastern Europe and various parts of Africa and Asia.
Uses the same 576i50 timings as PAL.
Also added some comments explaining color subcarrier frequency
registers.
Signed-off-by: Mateusz Kwiatkowski <kfyatek+publicgit@gmail.com>
PAL-M is a Brazilian analog TV standard that uses a PAL-style chroma
subcarrier at 3.575611[888111] MHz on top of 525-line (480i60) timings.
This commit makes the driver actually use the proper VEC preset for this
mode instead of just changing PAL subcarrier frequency.
DRM mode constant names have also been changed, as they no longer
correspond to the "NTSC" or "PAL" terms.
Signed-off-by: Mateusz Kwiatkowski <kfyatek+publicgit@gmail.com>
Change the mode_set function pointer logic to declarative config0,
config1 and custom_freq fields, to make TV mode setting logic more
concise and uniform.
Additionally, remove the superfluous tv_mode field, which was redundant
with the mode field in struct drm_tv_connector_state.
Signed-off-by: Mateusz Kwiatkowski <kfyatek+publicgit@gmail.com>
This commit fixes vertical timings of the VEC (composite output) modes
to accurately represent the 525-line ("NTSC") and 625-line ("PAL") ITU-R
standards.
Previous timings were actually defined as 502 and 601 lines, resulting
in non-standard 62.69 Hz and 52 Hz signals being generated,
respectively.
Changes to vc4_crtc.c have also been made, to make the PixelValve
vertical timings accurately correspond to the DRM modeline in interlaced
modes. The resulting VERTA/VERTB register values have been verified
against the reference values set by the Raspberry Pi firmware.
Signed-off-by: Mateusz Kwiatkowski <kfyatek+publicgit@gmail.com>
The calculations clipped the right/bottom edge of the clipped
range based on the left/top margins.
Fixes: 666e73587f ("drm/vc4: Take margin setup into account when updating planes")
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
Our hotplug handler will currently call the drm_kms_helper_hotplug_event
every time a hotplug interrupt is called.
However, since the device is registered after all the drivers have
finished their bind callback, we have a window between when we install
our interrupt handler and when drm_dev_register() is eventually called
where our handler can run and call drm_kms_helper_hotplug_event but the
device hasn't been registered yet, causing a null pointer dereference.
Fix this by making sure we only call drm_kms_helper_hotplug_event if our
device has been properly registered.
Fixes: f4790083c7 ("drm/vc4: hdmi: Rely on interrupts to handle hotplug")
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
The hotplugs interrupt handlers are registered through the
devm_request_threaded_irq function. However, while free_irq is indeed
called properly when the device is unbound or bind fails, it's called
after unbind or bind is done.
In our particular case, it means that on failure it creates a window
where our interrupt handler can be called, but we're freeing every
resource (CEC adapter, DRM objects, etc.) it might need.
In order to address this, let's switch to the non-devm variant to
control better when the handler will be unregistered and allow us to
make it safe.
Fixes: f4790083c7 ("drm/vc4: hdmi: Rely on interrupts to handle hotplug")
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
NetBSD have changed their licensing requirements such that the 2-clause
licence is preferred. Update usb.h in the downstream dwc_otg code
accordingly.
See https://www.netbsd.org/about/redistribution.html for more
information.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
Commit ecdd08fd9b ("drm/vc4: hdmi: Make sure the device is powered
with CEC") made sure that the device is powered while there is
CEC-related accesses but missed one register read in the variable
declaration.
Move the variable assignment after the pm_runtime_resume_and_get.
Fixes: ecdd08fd9b ("drm/vc4: hdmi: Make sure the device is powered with CEC")
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
We've had many silent hangs where the kernel would look like it just
stalled due to the access to one of the HDMI registers while the
controller was disabled.
Add a warning if we're about to do that so that it's at least not silent
anymore.
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
Similarly to what we encountered with the detect hook with DRM, nothing
actually prevents any of the CEC callback from being run while the HDMI
output is disabled.
However, this is an issue since any register access to the controller
when it's powered down will result in a silent hang.
Let's make sure we run the runtime_pm hooks when the CEC adapter is
opened and closed by the userspace to avoid that issue.
Fixes: 15b4511a4a ("drm/vc4: add HDMI CEC support")
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
In order to ease further additions to the CEC enable and disable, let's
split the function into two functions, one to enable and the other to
disable.
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
If we have a state already and disconnect/reconnect the display, the
SCDC messages won't be sent again since we didn't go through a disable /
enable cycle.
In order to fix this, let's call the vc4_hdmi_enable_scrambling function
in the detect callback if there is a mode and it needs the scrambler to
be enabled.
Fixes: 74465b84fa ("drm/vc4: hdmi: Enable the scrambler")
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
The core clock needs to be raised temporarily during a modeset to
500MHz. However, the HVS core clock requirement might be higher than
500MHz. This rate will be enforced at the end of the mode setting,
meaning that might might end up with a core clock rate lower than
planned on the first mode set.
Use the maximum value of 500MHz and the HVS core clock rate for our
temporary boost to fix this issue.
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
Increase the number of post-sync blanking lines on odd fields instead of
decreasing it on even fields. This makes the total number of lines
properly match the modelines.
Additionally fix the value of PV_VCONTROL_ODD_DELAY, which did not take
pixels_per_clock into account, causing some displays to invert the
fields when driven by bcm2711.
Signed-off-by: Mateusz Kwiatkowski <kfyatek+publicgit@gmail.com>
When we have the entire DRM state, retrieving the connector state only
requires the drm_connector pointer. Fortunately for us, we have
allocated it as a part of the vc4_hdmi structure, so we can retrieve get
a pointer by simply accessing our field in that structure.
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
Depending on a given HVS output (HVS to PixelValves) and input (planes
attached to a channel) load, the HVS needs for the core clock to be
raised above its boot time default.
Failing to do so will result in a vblank timeout and a stalled display
pipeline.
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
The load tracker was initially designed to report and warn about a load
too high for the HVS. To do so, it computes for each plane the impact
it's going to have on the HVS, and will warn (if it's enabled) if we go
over what the hardware can process.
While the limits being used are a bit irrelevant to the BCM2711, the
algorithm to compute the HVS load will be one component used in order to
compute the core clock rate on the BCM2711.
Let's remove the hooks to prevent the load tracker to do its
computation, but since we don't have the same limits, don't check them
against them, and prevent the debugfs file to enable it from being
created.
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
The encoder retrieval code has been a source of bugs and glitches in the
past and the crtc <-> encoder association been wrong in a number of
different ways.
Add some logging to quickly spot issues if they occur.
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
It turns out the encoder retrieval code, in addition to being
unnecessarily complicated, has a bug when only the planes and crtcs are
affected by a given atomic commit.
Indeed, in such a case, either drm_atomic_get_old_connector_state or
drm_atomic_get_new_connector_state will return NULL and thus our encoder
retrieval code will not match on anything.
We can however simplify the code by using drm_for_each_encoder_mask, the
drm_crtc_state storing the encoders a given CRTC is connected to
directly and without relying on any other state.
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
vc4_crtc_config_pv() retrieves the encoder again, even though its only
caller, vc4_crtc_atomic_enable(), already did.
Pass the encoder pointer as an argument instead of going through all the
connectors to retrieve it again.
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
DRM currently polls for the HDMI connector status every 10s, which can
be an issue when we connect/disconnect a display quickly or the device
on the other end only issues a hotplug pulse (for example on EDID
change).
Switch the driver to rely on the internal controller logic for the
BCM2711/RPi4.
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
The current core while setting the min and max rate properly in the
clk_request structure will not make sure that the requested rate is
within these boundaries, leaving it to each and every driver to make
sure it is.
Add a clamp call to make sure it's always done.
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
The 2711 pixel valve can't produce odd horizontal timings, and
checks were added to vc4_hdmi_encoder_atomic_check and
vc4_hdmi_encoder_mode_valid to filter out/block selection of
such modes.
Modes with DRM_MODE_FLAG_DBLCLK double all the horizontal timing
values before programming them into the PV. The PV values,
therefore, can not be odd, and so the modes can be supported.
Amend the filtering appropriately.
See https://github.com/raspberrypi/linux/issues/4307
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
Now that we have the infrastructure in place, we can raise the maximum
pixel rate we can reach for HDMI0 on the BCM2711.
HDMI1 is left untouched since its pixelvalve has a smaller FIFO and
would need a clock faster than what we can provide to support the same
modes.
Acked-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
vc4_dsi_encoder_disable is partially an open coded version of
drm_bridge_chain_disable, but it missed a termination condition
in the loop for ->disable which meant that no post_disable
calls were made.
Add in the termination clause.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
DSI0 seemingly had very little or no testing as a load of
the register mappings were incorrect/missing, so host
transfers always timed out due to enabling/checking incorrect
bits in the interrupt enable and status registers.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
vc4_dsi was registering both dsi0 and dsi1 as VC4_ENCODER_TYPE_DSI1
which seemed to work OK for a single DSI display, but fails
if there are two DSI displays connected.
Update to register the correct type.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
For slightly unknown reasons, dsi0 takes a different pixel format
to dsi1, and that has to be set in the pixel valve.
Amend the setup accordingly.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
The new clock request API allows us to increase the rate of the
core clock as required during mode set while decreasing it when
we're done, resulting in a better power-efficiency.
Signed-off-by: Dom Cobley <popcornmix@gmail.com>
https://gist.github.com/popcornmix/6b3e23103c60170b02b148e0ba5d6ed7
is the script used to generate the 601, 709 and 2020 colourspaces.
I've regenerated the existing ones using script so it is reproducible
but there are lsb differences compared to values here (copied from spec)
whose origin is now lost.
Signed-off-by: Dom Cobley <popcornmix@gmail.com>
There is little harm in ignoring fractional coordinates
(they just get truncated).
Without this:
modetest -M vc4 -F tiles,gradient -s 32:1920x1080-60 -P89@74:1920x1080*.1.1@XR24
is rejected. We have the same issue in Kodi when trying to
use zoom options on video.
Note: even if all coordinates are fully integer. e.g.
src:[0,0,1920,1080] dest:[-10,-10,1940,1100]
it will still get rejected as drm_atomic_helper_check_plane_state
uses drm_rect_clip_scaled which transforms this to fractional src coords
Signed-off-by: Dom Cobley <popcornmix@gmail.com>
Spec says: bits [31:4] of the given address should point to
the 128-bit word containing the desired starting pixel,
and bits[3:0] should be between 0 and 11, indicating which
of the 12-pixels in that 128-bit word is the first pixel to be used
Signed-off-by: Dom Cobley <popcornmix@gmail.com>
We are getting occasional VC4_HD_MAI_CTL_ERRORF in
HDMI_MAI_CTL which seem to correspond with audio dropouts.
Reduce the threshold where we deassert DREQ to avoid the fifo overfilling
Signed-off-by: Dom Cobley <popcornmix@gmail.com>
DPI hasn't really been used up until now, so the default has
been meaningless.
In theory we should be able to pass the desired format for the
adjacent bridge chip through, but framework seems to be missing
for that.
As the main device to use DPI is the VGA666 or Adafruit Kippah,
both of which use RGB666, change the default to being RGB666 instead
of RGB888.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
Using a hdmi analyser the bytes in packet ram
registers beyond the length were visible in the
infoframes and it flagged the checksum as invalid.
Zeroing unused words of packet RAM avoids this
Signed-off-by: Dom Cobley <popcornmix@gmail.com>
With vc4-drv node not being under /soc on Pi4, we need to
adopt the correct DMA parameters from a suitable sub-component.
Add "brcm,bcm2711-hvs" to that list of components.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
The vc5 HDMI registers hadn't been added into the debugfs
register sets, therefore weren't dumped on request.
Add them in.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
Overlays are unable to remove properties in the base DTB, but they
can overwrite them. Allow a present but empty 'dmas' property
to also disable the HDMI audio interface.
See: https://github.com/raspberrypi/linux/issues/2489
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
Under FKMS, the firmware (via FKMS) also requires the VideoCore cache
aliases for image planes, as defined by the dma-ranges under /soc.
Add rpi-firmware-kms to the list of acceptable nodes to look for
to copy dma config from.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
The BT601/BT709 color encoding and limited vs full
range properties were not being exposed, defaulting
always to BT601 limited range.
Expose the parameters and set the registers appropriately.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>
This currently doesn't handle non-zero source rectangles correctly,
but add support for DRM_FORMAT_P030 with DRM_FORMAT_MOD_BROADCOM_SAND128
modifier to planes when running on HVS5.
WIP still for source cropping SAND/P030 formats
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
This follows logic in hdmi-codec.c to use speaker layout
from ELD to choose a suitable speaker mapping based on
number of channels requested and signal that in audio
infoframe and report this back to userspace.
This allows apps like speaker-test and kodi to get the
output to the right speakers.
Signed-off-by: Dom Cobley <popcornmix@gmail.com>
Without this bit set, HDMI_MAI_FORMAT doesn't pick up
the format and samplerate from DVP_CFG_MAI0_FMT and you
can't get HDMI_HDMI_13_AUDIO_STATUS_1 to indicate HBR mode
Signed-off-by: Dom Cobley <popcornmix@gmail.com>
This was a workaround for bugs in hardware on earlier Pi models
and wasn't totally successful.
It makes audio quality worse on a Pi4 at the higher sample rates
Signed-off-by: Dom Cobley <popcornmix@gmail.com>
Configuring HDMI audio registers in prepare allows us to take
IEC958 bits into account which are set by the alsa hook after
the hw_params call.
Signed-off-by: Matthias Reichl <hias@horus.com>
Although vc4 get an IEC958 formatted stream passed in from userspace
the driver needs the info from the channel status bits to properly
set up the hardware, eg for HBR passthrough.
Add iec958 controls so the channel status bits can be passed in
from userspace.
Signed-off-by: Matthias Reichl <hias@horus.com>
The hardware uses this for generating the right audio
data island packets when using formats other than PCM
Signed-off-by: Dom Cobley <popcornmix@gmail.com>
A matching media bus format was added and an overlay for using it,
both with FB and VC4 was added as well.
Signed-off-by: Joerg Quinten <aBUGSworstnightmare@gmail.com>
vc4_drv isn't necessarily under the /soc node in DT as it is a
virtual device, but it is the one that does the allocations.
The DMA addresses are consumed by primarily the HVS or V3D, and
those require VideoCore cache alias address mapping, and so will be
under /soc.
During probe find the a suitable device node for HVS or V3D,
and adopt the DMA configuration of that node.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
The vidioc_enum_input() v4l2 ioctl is capable of returning
sensor/input status as well. This is used in current
GStreamer HEAD for signal detection [1].
bcm2835-unicam does handle this syscall, but it didn't ask
the subdevice driver about the input status. The input then
appeared as always present.
This commit adds the necessary query. There is a precedent for
this - the R-Car VIN V4L2 driver does a similar call [2].
[1]: ce0be27caf/sys/v4l2/gstv4l2src.c (L553)
[2]: 7fb9d006d3/drivers/media/platform/rcar-vin/rcar-v4l2.c (L548)
Signed-off-by: Jakub Vaněk <linuxtardis@gmail.com>
The RPI_FIRMWARE_NOTIFY_DISPLAY_DONE firmware call allows to tell the
firmware the kernel is in charge of the display now and the firmware can
free whatever resources it was using.
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
The vc4 driver will need to tell the firmware that it takes over the
display for the firmware to free its resources (lower the clock, free
some memory, etc.)
Let's add an optional phandle to our firmware node.
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
The raspberrypi,firmware property has been documented as required in the
binding but was never actually used in the final version of the binding.
Remove it.
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
The imx378 sensor is almost identical to the imx477 and can be
supported as a "compatible" sensor with just a few extra register
writes.
Signed-off-by: David Plowman <david.plowman@raspberrypi.com>
The bcm2835 ISP requires the base address of all input/output planes to have 32
byte alignment. Using a Y stride of 32 bytes would not guarantee that the V
plane would fulfil this, e.g. a height of 650 lines would mean the V plane
buffer is not 32 byte aligned for YUV420 formats.
Having a Y stride of 64 bytes would ensure both U and V planes have a 32 byte
alignment, as the luma height will always be an even number of lines.
Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
From the original Rockchip driver, the subdev was renamed
from the default to being "mov9281 <dev_name>" whereas the
default would have been "ov9281 <dev_name>".
Remove the override to drop back to the default rather than
a vendor custom string.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
It is legitimate, though unusual, for an aux ent associated with a slot
to be selected in phase 0 before a previous selection has been used and
released in phase 2. Fix such that if the slot is found to be in use
that the aux ent associated with it is reused rather than an new aux
ent being created. This fixes a problem where when the first aux ent
was released the second was lost track of.
This bug spotted in Nick's testing. It may explain some other occasional,
unreliable decode error reports where dmesg included "Missing DPB AUX
ent" logging.
Signed-off-by: John Cox <jc@kynesim.co.uk>
When the clock setups were added for the alternate external clocks,
the settings for 2 lane 720p and 4 lane 1080p were transposed.
2 lane 720p still worked, but 4 lane 1080p didn't.
Correct the assignments.
Fixes: 6b0c094a5b (media: i2c: imx290: Add support for 74.25MHz clock")
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
Whilst the datasheet lists the link frequency changing between
1080p and 720p modes, reality is that with the default blanking
we have
(1920 + 280) * (1080 + 45) * 60fps = 148.5MPix/s
and
(1280 + 2020) * (720 + 30) * 60fps = 148.5MPix/s
and this reflects reality whether in 10 or 12 bit readout modes.
How this relates to link frequency is unclear as it differs
from the datasheet, but all exposure and frame rate calcs need
the pixel rate to be correct, so make it so.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
Commit "97589ad61c73 media: i2c: imx290: Add support for 2 data lanes"
added support for running in two lane mode (instead of 4), but
without changing the link frequency that resulted in a max of 30fps.
Commit "98e0500eadb7 media: i2c: imx290: Add configurable link frequency
and pixel rate" then doubled the link frequency when in 2 lane mode,
but didn't undo the correction for running at only 30fps, just extending
horizontal blanking instead.
It also didn't update the CSI timing registers in accordance with the
datasheet.
Remove the 30fps limit on 2 lane by correcting the register config
in accordance with the datasheet for 60fps operation over 2 lanes.
Frame rate control (via V4L2_CID_VBLANK or HBLANK) can still reduce
the frame rate on 2 lanes back to 30fps.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
Do not scale IMX477_EXPOSURE_OFFSET with the long exposure factor during
the limit calculations. This allows larger exposure times, and does seem to be
what the sensor is doing internally.
Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
Fix stream on & off such that failures leave the driver in the correct
state. Ensure that the clock is on when we are streaming and off when
all contexts attached to this device have stopped streaming.
Signed-off-by: John Cox <jc@kynesim.co.uk>
Due to overheads in assembling controls and requests it is worth having
the slice assembly phase separate from the h/w pass1 processing. Create
a queue to service pass1 rather than have the pass1 finished callback
trigger the next slice job.
This requires a rework of the logic that splits up the buffer and
request done events. This code contains two ways of doing that, we use
Ezequiel Garcias <ezequiel@collabora.com> solution, but expect that
in the future this will be handled by the framework in a cleaner manner.
Fix up the handling of some of the memory exhaustion crashes uncovered
in the process of writing this code.
Signed-off-by: John Cox <jc@kynesim.co.uk>
This is probably not the API we will want to add, but it
should show what semantics are needed by drivers.
The goal is to allow the OUTPUT (aka source) buffer and the
controls associated to a request to be released from the request,
and in particular return the OUTPUT buffer back to userspace,
without signalling the media request fd.
This is useful for devices that are able to pre-process
the OUTPUT buffer, therefore able to release it before
the decoding is finished. These drivers should signal
the media request fd only after the CAPTURE buffer is done.
Tested-by: John Cox <jc@kynesim.co.uk>
Signed-off-by: Ezequiel Garcia <ezequiel@collabora.com>
Add an enable count to the irq Q structures to allow the irq logic to
block further callbacks if resources associated with the irq are not
yet available.
Signed-off-by: John Cox <jc@kynesim.co.uk>
Use multi-planar interface rather than single plane interface. This
allows dmabufs holding compressed data to be resized.
Signed-off-by: John Cox <jc@kynesim.co.uk>
VAAPI H265 has num entry points but never sets it. Allow a VAAPI
shim to work without requiring rewriting the VAAPI driver.
num_entry_points can be calculated from the slice_segment_addr
of the next slice so delay processing until we have that.
Also includes some minor cosmetics.
Signed-off-by: John Cox <jc@kynesim.co.uk>
Fixes the following v4l2-compliance failure:
fail: v4l2-test-controls.cpp(871): subscribe event for control 'User Controls' failed test
Signed-off-by: David Plowman <david.plowman@raspberrypi.com>
Trial and error reveals that the minimum vblank value appears to be 24
(the OV5647 data sheet does not give any clues). This fixes streaming
lock-ups in full resolution mode.
Fixes: 9b5a5ebedc ("media: i2c: ov5647: Add support for V4L2_CID_VBLANK")
Signed-off-by: David Plowman <david.plowman@raspberrypi.com>
The top offset in the pixel array is actually 6 (see page 3-1 of the
OV5647 data sheet).
Fixes: f2f7ad5ce5 ("media: i2c: ov5647: Selection compliance fixes")
Signed-off-by: David Plowman <david.plowman@raspberrypi.com>
Keeping a copy of the old poweroff handler allows it to be restored
should this module be unloaded, but also provides a fallback if the
power hasn't been removed when the timeout elapses.
See: https://github.com/raspberrypi/rpi-eeprom/issues/330
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
The result of dividing a u32 by a size_t is an unsigned int on arm32
and a long unsigned int on arm64. Use "%zu" (the size_t format) to
remove the build warning for 64-bit builds.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
During decode, setting the CAPTURE queue format was setting the crop
rectangle to the requested height before aligning up the format to
cater for simple clients that weren't expecting to deal with cropping
and the SELECTION API.
This caused problems on some resolution change events if the client
didn't also then use the selection API.
Disable the crop update after a resolution change.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
Whilst the hardware can't achieve the limits of level 4.2 under
all situations, it can exceed level 4.0.
Allow selection of levels 4.1 and 4.2.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
MMAL has the flag MMAL_BUFFER_HEADER_FLAG_CORRUPTED but that
wasn't being passed through, so add it.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
Video decode supports YUV and RGB formats. YUV needs to report SMPTE170M
or REC709 appropriately, whilst RGB should report SRGB.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
The driver said it supported H264 levels 4.1 and 4.2, but
was missing the V4L2 to MMAL mappings.
Add in those mappings.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
The current code will first dereference the req pointer and then test if
it's NULL, resulting in a NULL pointer dereference if req is indeed
NULL. Reorder the test and derefence to avoid the issue
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
There is no reason why the control GPIOs for the panel can not
be connected to I2C or similar GPIO interfaces that may need to
sleep, therefore switch from gpiod_set_value to
gpiod_set_value_cansleep calls to configure them.
Without that you get complaints from gpiolib every time the state
is changed.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
Replace an undeclared variable used by DWC_DEBUGPL with the real endpoint address. DWC_DEBUGPL does nothing with DEBUG undefined so it did not go wrong before.
Signed-off-by: Zixuan Wang <wangzixuan@sjtu.edu.cn>
It's not unusual to find clocks being shared across multiple devices
that need to change the rate depending on what the device is doing at a
given time.
The SoC found on the RaspberryPi4 (BCM2711) is in such a situation
between its two HDMI controllers that share a clock that needs to be
raised depending on the output resolution of each controller.
The current clk_set_rate API doesn't really allow to support that case
since there's really no synchronisation between multiple users, it's
essentially a fire-and-forget solution.
clk_set_min_rate does allow for such a synchronisation, but has another
drawback: it doesn't allow to reduce the clock rate once the work is
over.
In our previous example, this means that if we were to raise the
resolution of one HDMI controller to the largest resolution and then
changing for a smaller one, we would still have the clock running at the
largest resolution rate resulting in a poor power-efficiency.
In order to address both issues, let's create an API that allows user to
create temporary requests to increase the rate to a minimum, before
going back to the initial rate once the request is done.
This introduces mainly two side-effects:
* There's an interaction between clk_set_rate and requests. This has
been addressed by having clk_set_rate increasing the rate if it's
greater than what the requests asked for, and in any case changing
the rate the clock will return to once all the requests are done.
* Similarly, clk_round_rate has been adjusted to take the requests
into account and return a rate that will be greater or equal to the
requested rates.
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
be suspended
Webcams with microphones are composite devices, and autosuspend is set
at the device level. If uvcvideo is probed after snd-usb-audio, the effect
of the quirk applied by snd-usb-audio is undone by uvcvideo's global
application of autosuspend.
Incrementing the interface's PM refcount in such cases prevents runtime PM
from happening, thus the device is left active.
Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
These devices use a type of Sonix chipset that produces broken microphone
data if suspended/resumed.
They also don't support readback of the sample rate.
Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
When importing there was a missing call to detach the buffer,
so each import leaked the sg table entry.
Actually the release process for both locally allocated and
imported buffers is identical, so fix them to both use the same
function.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
warning: address of array 'desc->wMaxPacketSize' will always evaluate to 'true'
The wMaxPacketSize field is actually a two element array which content should
be accessed via the UGETW macro.
The Adafruit Mini-PiTFT13 display needs offsets applying when rotated,
so use the "variant" mechanism to select a custom set_addr_win method
using a dedicated compatible string of "fbtft,minipitft13".
See: https://github.com/raspberrypi/firmware/issues/1524
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
DMABUFs are all handled by videobuf2, so there is no reason not
to enable support for them.
Note that this driver is still using the vmalloc allocator, so
the buffers it allocates will not be compatible with the codec
or ISP driver that require contiguous buffers. However this
driver should be able to import the buffers allocated by them.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
Parse device properties and register controls for them using the V4L2
fwnode properties helpers.
Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Previously they were returning positive non-zero codes for success,
which were getting passed up the call stack. Since release 4.19,
do_dentry_open (fs/open.c) has been catching these and flagging an
error. (So this driver has been broken since that date.)
Fixes: 3c2472a [media] media: i2c: Add support for OV5647 sensor
Signed-off-by: David Plowman <david.plowman@raspberrypi.org>
Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
Breaks out common register set and adds the different registers
for 1280x720 (cropped) and 640x480 (skipped) modes
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
The sensor supports 8 bit mode as well as 10bit, so add the
relevant code to allow selection of this.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
Vision Components have made an OV9281 module which blocks reading
back the majority of registers to comply with NDAs, and in doing
so doesn't allow auto-increment register reading as used when
reading the chip ID.
Use two reads and manually combine the results.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
The Rockchip driver was based on a 4.4 kernel, and had several custom
Rockchip parts.
Update to 5.4 kernel APIs, with the relevant controls required by
libcamera, and remove custom Rockchip parts.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
Adds the ov9281 parts of the Rockchip patch adding enum_frame_interval to
a large number of drivers.
Change-Id: I03344cd6cf278dd7c18fce8e97479089ef185a5c
Signed-off-by: Zefa Chen <zefa.chen@rock-chips.com>
Takes the ov9281 part only from the Rockchip's patch.
Change-Id: I30e833baf2c1bb07d6d87ddb3b00759ab45a90e4
Signed-off-by: Zefa Chen <zefa.chen@rock-chips.com>
v4l_cropcap calls our vidioc_g_pixelaspect function to get the pixel
aspect ratio, but also calls g_selection for V4L2_SEL_TGT_CROP_BOUNDS
and V4L2_SEL_TGT_CROP_DEFAULT. Whilst it allows for vidioc_g_pixelaspect
not to be implemented, it doesn't allow for either of the other two.
Add in support for the additional selection targets.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
If the format is detected by the driver and a V4L2_EVENT_SOURCE_CHANGE
event is generated, then pass on the pixel aspect ratio as well.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
Fixes: "staging/bcm2835-codec: Log the number of excess supported formats"
Which used %u for printing a size_t, and 64bit builds then log a warning.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
In order to get the intended behaviour of the stateful video
decoder API where only the OUTPUT queue needs to be enabled and fed
buffers in order to get the SOURCE_CHANGED event that configures the
CAPTURE queue, we want the device to run should either queue be
streaming.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
The kernel modules aes-neon-blk and aes-neon-bs perform poorly, at least on
Cortex-A72 without crypto extensions. In fact, aes-arm64 outperforms them
on benchmarks, despite it being a simpler implementation (only accelerating
the single-block AES cipher).
For modes of operation where multiple cipher blocks can be processed in
parallel, aes-neon-bs outperforms aes-neon-blk by around 60-70% and aes-arm64
is another 10-20% faster still. But the difference is even more marked with
modes of operation with dependencies between neighbouring blocks, such as
CBC encryption, which defeat parallelism: in these cases, aes-arm64 is
typically around 250% faster than either aes-neon-blk or aes-neon-bs.
The key trade-off with aes-arm64 is that the look-up tables are situated in
RAM. This leaves them potentially open to cache timing attacks. The two other
modules, by contrast, load the look-up tables into NEON registers and so are
able to perform in constant time.
This patch aims to load aes-arm64 more often.
If none of the currently-loaded crypto modules implement a given algorithm,
a new one is typically selected for loading using a platform-neutral alias
describing the required algorithm. To enable users to still
load aes-neon-blk or aes-neon-bs if they really want them, while still
ensuring that aes-arm64 is usually selected, remove the aliases from
aes-neonbs-glue.c and aes-glue.c and apply them to aes-cipher-glue.c, but
still build the two NEON modules.
Since aes-glue.c can also be used to build aes-ce-blk, leave them enabled
if USE_V8_CRYPTO_EXTENSIONS is defined, to ensure they are selected if we
in future use a CPU which has the crypto extensions enabled.
Note that the algorithm priority specifiers are unchanged, so if
aes-neon-bs is loaded at the same time as aes-arm64, the former will be
used in preference. However, aes-neon-blk and aes-arm64 have tied priority,
so whichever module was loaded first will be used (assuming aes-neon-bs is
not loaded).
Signed-off-by: Ben Avison <bavison@riscosopen.org>
If multiple sets of interrupts occur simultaneously, it may be unsafe
to swap buffers, as the hardware may already be re-using the current
buffers. In such cases, avoid swapping buffers, and wait for the next
opportunity at the Frame End interrupt to signal completion.
Additionally, check the packet compare status when watching for frame
end for buffers swaps, as this could also signify a frame end event.
Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
The struct imx477 *ctrl parameter is not used in the function
imx477_adjust_exposure_range(), so remove it.
Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
The only field in this struct that is used is the format code, so
replace the struct with this single field.
Save the format code in imx477_set_pad_format() when setting up a new
mode so that imx477_get_pad_format() performs the right lookup.
Otherwise, this caused a bug where the mode lookup occurred on the
12-bit table rather than the 10-bit table.
Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
The existing 1012x760 120 fps mode has significant IQ problem using
the internal sensor scaler. Replace this mode with a 1332x990 120 fps
mode instead. This new mode has a smaller field of view, but does not
suffer from the bad IQ of the original mode.
Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
When vblank changes we must modify the exposure range. Also, with this
sensor, the effective exposure time implicitly changes when vblank
does, so we have to reset it after every vblank update.
Signed-off-by: David Plowman <david.plowman@raspberrypi.com>
Add support for very long exposures by using the exposure multiplier
register. Userland does not need to pass any additional controls to
enable long exposures, it simply requests a larger vblank to extend the
exposure control range appropriately.
Currently, since hblank is fixed, a maximum of approximately 124 seconds
of exposure time can be used. In a future change, hblank could also be
controlled in userland to give over 200 seconds of exposure time.
Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
The V4L2_CID_EXPOSURE_AUTO_PRIORITY was used to let the sensor control
frame length (effectively framerate) based on the requested exposure
time requested. Remove this feature as it is never used, and goes
against how V4L2 likes to handle exposure and vblank controls.
Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
If realloc to increase coeff size fails then attempt to re-allocate
the original size. If that also fails then flag a fatal error to abort
all further decode.
Signed-off-by: John Cox <jc@kynesim.co.uk>
Each supported format now includes a mask showing the allowed colour
spaces, as well as a default colour space for when one was not
specified.
Additionally we translate the colour space to mmal format and pass it
over to the VideoCore.
Signed-off-by: David Plowman <david.plowman@raspberrypi.com>
The CEC and hotplug interrupts were missing when that binding was
introduced, let's add them in now that we've figured out how it works.
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
Acked-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
The number is only a hint, but may as well be correct.
Fixes: 471e0029e9 ("media: i2c: imx290: Convert HMAX setting into V4L2_CID_HBLANK")
Fixes: be0b9b7ad1 ("media: i2c: imx290: Add support for V4L2_CID_VBLANK")
Fixes: 8483f0d759 ("media: i2c: imx290: Add exposure control to the driver.")
Fixes: 9764f3459c ("media: i2c: imx290: Add H and V flip controls")
Signed-off-by: David Plowman <david.plowman@raspberrypi.com>
Most software (including libcamera) requires V4L2_CID_ANALOGUE_GAIN,
not V4L2_CID_GAIN.
The range for the control is 0 to 100 for which the sensor uses only
analogue gain; higher values would involve digital gain which this
control should not apply.
Signed-off-by: David Plowman <david.plowman@raspberrypi.com>
Much effort has been put into finding ways to avoid warnings from dtc
about overlays, usually to do with the presence of #address-cells and
size-cells, but not exclusively so. Since the issues being warned about
are harmless, suppress the warnings to declutter the build output and
to avoid alarming users.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
A relatively recent commit ([1]) contained optimisation for the PIO
SPI FIFO-filling functions. The commit message includes the phrase
"[t]he blind and counted loops are always called with nonzero count".
This is technically true, but it is still possible for count to become
zero before the loop is entered - if tfr->len is zero. Moving the loop
exit condition to the end of the loop saves a few cycles, but results
in a near-infinite loop should the revised count be zero on entry.
Strangely, zero-lengthed transfers aren't filtered by the SPI framework
and, even more strangely, the Python3 spidev library is triggering them
for no obvious reason.
Avoid the problem completely by bailing out of the main transfer
function early if trf->len is zero, although there may be a case for
moving the mitigation into the framework.
See: https://github.com/raspberrypi/linux/issues/4100
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
[1] 26751de25d ("spi: bcm2835: Micro-optimise FIFO loops")
Add colour denoise control to the bcm2835 driver through a new v4l2
control: V4L2_CID_USER_BCM2835_ISP_CDN.
Add the accompanying MMAL configuration structure definitions as well.
Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
Use (reserved) bits 24 and 25 of the dreq value
(the second cell of the DT DMA descriptor) to request
that wide source reads or wide dest writes are required
Signed-off-by: Dom Cobley <popcornmix@gmail.com>
memset_rpi.S is an optimised memset implementation, but doesn't define
__memset (which was just added to memset.S). As a result, building
for the BCM2835 platform causes a link failure.
Add __memset as yet another alias to our common implementation.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
When logging that the firmware has provided more supported formats
than we had allocated storage for, log the number allocated and
returned.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
Now that the firmware supports the unpacked (16bpp) variants
of the MIPI raw formats, add the mappings.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
When logging that the firmware has provided more supported formats
than we had allocated storage for, log the number allocated and
returned.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
Now that the firmware supports the unpacked (16bpp) variants
of the MIPI raw formats, add the mappings.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
Support has been added for the unpacked (16bpp) versions of
the MIPI raw 10, 12, and 14 formats, so add the 4CCs for them.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
error logs:
drivers/staging/vc04_services/vc-sm-cma/Kconfig:1:error: recursive dependency detected!
drivers/staging/vc04_services/vc-sm-cma/Kconfig:1: symbol BCM_VC_SM_CMA is selected by BCM2835_VCHIQ_MMAL
drivers/staging/vc04_services/vchiq-mmal/Kconfig:1: symbol BCM2835_VCHIQ_MMAL depends on BCM2835_VCHIQ
drivers/staging/vc04_services/Kconfig:14: symbol BCM2835_VCHIQ is selected by BCM_VC_SM_CMA
For a resolution refer to Documentation/kbuild/kconfig-language.rst
subsection "Kconfig recursive dependency limitations"
Tested-by: make ARCH=arm64 bcm2711_defconfig
Test platform: fedora 33
Branch: rpi-5.10.y
To comply with the intended usage of the V4L2 selection target when
used to retrieve a sensor image properties, adjust the rectangles
returned by the imx477 driver.
The top/left crop coordinates of the TGT_CROP rectangle were set to
(0, 0) instead of (8, 16) which is the offset from the larger physical
pixel array rectangle. This was also a mismatch with the default values
crop rectangle value, so this is corrected. Found with v4l2-compliance.
While at it, add V4L2_SEL_TGT_CROP_BOUNDS support: CROP_DEFAULT and
CROP_BOUNDS have the same size as the non-active pixels are not readable
using the selection API. Found with v4l2-compliance.
This commit mirrors 543790f777 done for
the imx219 sensor.
Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
lan78xx_link_reset explicitly clears the MAC's view of the PHY's IRQ
status. In doing so it potentially leaves the PHY with a pending
interrupt that will never be acknowledged, at which point no further
interrupts will be generated.
Avoid the problem by acknowledging any pending PHY interrupt after
clearing the MAC's status bit.
See: https://github.com/raspberrypi/linux/issues/2937
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
Although the BRCMSTB PCIe interface doesn't technically support the
MSI-X spec, in practise it seems to work provided no more than 32
MSI-Xs are required. Add the required flag to the driver to allow
experimentation with devices that demand MSI-X support.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
Commit 65e08c4650 failed to clear the
clock state when the device stopped streaming. Fix this, as it might
again cause the same problems when doing an unprepare.
Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
clk_disable_unprepare() is called unconditionally in stop_streaming().
This is incorrect in the cases where start_streaming() fails, and
unprepares all clocks as part of the failure cleanup. To avoid this,
ensure that clk_disable_unprepare() is only called in stop_streaming()
if the clocks are in a prepared state.
Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
On a failure in start_streaming(), the error code would not propagate to
the calling function on all conditions. This would cause the userland
caller to not know of the failure.
Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
DSI1 on BCM2711 doesn't require the DMA workaround that is used
on BCM2835/6/7, therefore it needs a new compatible string.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
In switching to the hardware I2C controller there is an issue
where we seem to not get back the correct state from the Pi
touchscreen.
Insert a delay before polling to avoid this condition.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
We now have the hardware I2C controller pinmuxed to the drive the
display I2C, but this controller does not support clock stretching.
The Atmel micro-controller in the panel requires clock stretching
to allow it to prepare any data to be read.
Split the rpi_touchscreen_i2c_read into two independent transactions with
a delay between them for the Atmel to prepare the data.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
Not all systems have the interrupt line wired up, so switch to
polling the touchscreen off a timer if no interrupt line is
configured.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
[1] replaced a single reset function with a pointer to one of two
implementations, but also removed the call asserting the reset
at the start of brcm_pcie_setup. Doing so breaks Raspberry Pis with
VL805 XHCI controllers lacking dedicated SPI EEPROMs, which have been
used for USB booting but then need to be reset so that the kernel
can reconfigure them. The lack of a reset causes the firmware's loading
of the EEPROM image to RAM to fail, breaking USB for the kernel.
See: https://www.raspberrypi.org/forums/viewtopic.php?p=1758157#p1758157
Fixes: 04356ac307 ("PCI: brcmstb: Add bcm7278 PERST# support")
[1] 04356ac307 ("PCI: brcmstb: Add bcm7278 PERST# support")
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
The last nibble is a revision ID, and the 54213pe is a later rev
than the 54210e. Running the 54210e setup code on a 54213pe results
in a broken RGMII interface.
Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>
Define a new mailbox (SET_REBOOT_FLAGS) which may be used to
pass optional flags to the Raspberry Pi firmware that changes
the behaviour of the bootloader and firmware during a reboot.
Currently this just defines the 'tryboot' flag which causes
the firmware to load tryboot.txt instead config.txt. This
alternate configuration file can be used to specify the
path of an alternate firmware and kernels allowing a fallback
mechanism to be implemented for OS upgrades.
Delete a large amount of unused declaration from "usb.h", some of which
were causing build warnings, and get the module building cleanly.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
Add a property to allow the headphone output to be disabled. Use an
integer property rather than a boolean so that an overlay can clear it.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
The gpio-fsm driver implements simple state machines that allow GPIOs
to be controlled in response to inputs from other GPIOs - real and
soft/virtual - and time delays. It can:
+ create dummy GPIOs for drivers that demand them,
+ drive multiple GPIOs from a single input, with optional delays,
+ add a debounce circuit to an input,
+ drive pattern sequences onto LEDs
etc.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
gpio-fsm: Fix a build warning
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
gpio-fsm: Rename 'num-soft-gpios' to avoid warning
As of 5.10, the Device Tree parser warns about properties that look
like references to "suppliers" of various services. "num-soft-gpios"
resembles a declaration of a GPIO called "num-soft", causing the value
to be interpreted as a phandle, the owner of which is checked for a
"#gpio-cells" property.
To avoid this warning, rename the gpio-fsm property to "num-swgpios".
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
gpio-fsm: Show state info in /sys/class/gpio-fsm
Add gpio-fsm sysfs entries under /sys/class/gpio-fsm. For each state
machine show the current state, which state (if any) will be entered
after a delay, and the current value of that delay.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
gpio-fsm: Fix shutdown timeout handling
The driver is intended to jump directly to a shutdown state in the
event of a timeout during shutdown, but the sense of the test was
inverted.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
gpio-fsm: Clamp the delay time to zero
The sysfs delay_ms value is calculated live, and it is possible for
the time left to appear to be negative briefly if the timer handling
hasn't completed. Ensure the displayed value never goes below zero,
for the sake of appearances.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
Driver for the BCM2835 ISP hardware block. This driver uses the MMAL
component to program the ISP hardware through the VC firmware.
The ISP component can produce two video stream outputs, and Bayer
image statistics. This can't be encompassed in a simple V4L2
M2M device, so create a new device that registers 4 video nodes.
This patch squashes all the development patches from the earlier
rpi-5.4.y branch into one
Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
This file defines the userland interface to the bcm2835-isp driver
that will follow in a separate commit.
Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
If CONFIG_DMA_BCM2708 isn't enabled there's no need to mask out
one of the already scarce DMA channels.
Signed-off-by: Matthias Reichl <hias@horus.com>
This adds a V4L2 memory to memory device that wraps the MMAL
video decode and video_encode components for H264 and MJPEG encode
and decode, MPEG4, H263, and VP8 decode (and MPEG2 decode
if the appropriate licence has been purchased).
This patch squashes all the work done in developing the driver
on the Raspberry Pi rpi-5.4.y kernel branch.
Thanks to Kieran Bingham, Aman Gupta, Chen-Yu Tsai, and
Marek Behún for their contributions. Please refer to the
rpi-5.4.y branch for the full history.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
staging/bcm2835-codec: Ensure OUTPUT timestamps are always forwarded
The firmware by default tries to ensure that decoded frame
timestamps always increment. This is counter to the V4L2 API
which wants exactly the OUTPUT queue timestamps passed to the
CAPTURE queue buffers.
Disable the firmware option.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
staging/vc04_services/codec: Add support for CID MPEG_HEADER_MODE
Control V4L2_CID_MPEG_VIDEO_HEADER_MODE controls whether the encoder
is meant to emit the header bytes as a separate packet or with the
first encoded frame.
Add support for it.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
staging/vc04_services/codec: Clear last buf dequeued flag on START
It appears that the V4L2 M2M framework requires the driver to manually
call vb2_clear_last_buffer_dequeued on the CAPTURE queue during a
V4L2_DEC_CMD_START.
Add such a call.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
staging/vc04-services/codec: Fix logical precedence issue
Two issues identified with operator precedence in logical
expressions. Fix them.
https://github.com/raspberrypi/linux/issues/4040
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
With the vc-sm-cma driver we can support zero copy of buffers between
the kernel and VPU. Add this support to mmal-vchiq.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
Add Broadcom VideoCore Shared Memory support.
This new driver allows contiguous memory blocks to be imported
into the VideoCore VPU memory map, and manages the lifetime of
those objects, only releasing the source dmabuf once the VPU has
confirmed it has finished with it.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
If the RBUF logic is not reset when the kernel starts then there
may be some data left over from any network boot loader. If the
64-byte packet headers are enabled then this can be fatal.
Extend bcmgenet_dma_disable to do perform the reset, but not when
called from bcmgenet_resume in order to preserve a wake packet.
N.B. This different handling of resume is just based on a hunch -
why else wouldn't one reset the RBUF as well as the TBUF? If this
isn't the case then it's easy to change the patch to make the RBUF
reset unconditional.
See: https://github.com/raspberrypi/linux/issues/3850
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
Increase the delay before entering the lower power state to 2 seconds
(the maximum allowed) in order to reduce the packet latencies,
particularly for inbound packets.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
Display variants are intended as a replacement for the now-deleted
fbtft_device drivers. Drivers can register additional compatible
strings with a custom callback that can make the required changes
to the fbtft_display structure.
Start the ball rolling by adding adafruit18, adafruit18_green and
sainsmart18 displays.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
Since the unicam driver was modified to write to a dummy buffer when no
user-supplied buffer is available, it can now write to and return a
buffer even when there's only a single one. Enable this by changing the
min_buffers_needed in the vb2_queue; it will be useful for enabling
still captures without allocating more memory than absolutely necessary.
Signed-off-by: David Plowman <david.plowman@raspberrypi.com>
The change to retrieve the pixel format always on g_fmt didn't
check whether the native or unpacked version of the format
had been requested, and always returned the packed one.
Correct this so that the packing setting is retained whereever
possible.
Fixes "9d59e89 media: bcm2835-unicam: Re-fetch mbus code from subdev
on a g_fmt call"
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
From when bringing up the driver, there was a check in the isr
to ignore interrupts (claiming them handled) should the driver
not be streaming.
The VPU now will not register a camera driver if it finds a
CSI2 node enabled in device tree, therefore this flawed check is
redundant.
https://github.com/raspberrypi/linux/issues/3602
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
Parse device properties and register controls for them using the V4L2
fwnode properties helpers.
Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Use V4L2_CID_EXPOSURE_AUTO_PRIORITY to control if the driver should
automatically adjust the sensor frame length based on exposure time,
allowing variable frame rates and longer exposures.
Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
Adds a driver for the 12MPix Sony IMX477 CSI2 sensor.
Whilst the sensor supports 2 or 4 CSI2 data lanes, this driver
currently only supports 2 lanes.
The following Bayer modes are currently available:
4056x3040 12-bit @ 10fps
2028x1520 12-bit (binned) @ 40fps
2028x1050 12-bit (cropped/binned) @ 50fps
1012x760 10-bit (scaled) @ 120 fps
Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
During a bulk transfer we request a DMA allocation to hold the
scatter-gather list. Most of the time, this allocation is small
(<< PAGE_SIZE), however it can be requested at a high enough frequency
to cause fragmentation and/or stress the CMA allocator (think time
spent in compaction here, or during allocations elsewhere).
Implement a pool to serve up small DMA allocations, falling back
to a coherent allocation if the request is greater than
VCHIQ_DMA_POOL_SIZE.
Signed-off-by: Oliver Gjoneski <ogjoneski@gmail.com>
Fix commit "media: tc358743: Return an appropriate colorspace from
tc358743_set_fmt" to ensure that the format passed in to set_fmt
is checked to be valid, and reset to the current format if not.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
Pi 0&1 pass all ARM accesses through the VPU L2 cache, therefore
the dma-ranges property sets the cache alias bits to other
than the direct alias, hence this WARN was firing.
It was overprotective coding, so assume that everything is OK
with the dma-ranges, and remove the WARN.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
MEDIA_CONTROLLER_REQUEST_API is a hidden option. If rpivid depends on it,
the user would need to first enable another driver that selects
MEDIA_CONTROLLER_REQUEST_API, and only then rpivid would become available.
By selecting it instead of depending on it, it becomes possible to enable
rpivid without having to enable other potentially unnecessary drivers.
Signed-off-by: Hristo Venev <hristo@venev.name>
Although it is no longer necessary for vchiq's children to have a
different DMA configuration to the parent, they do still need to
explicitly to have their DMA configuration set - to be that of the
parent.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
The actpwr trigger is a meta trigger that cycles between an inverted
mmc0 and default-on. It is written in a way that could fairly easily
be generalised to support alternative sets of source triggers.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
When streaming with Unicam, the VPU must have a clock frequency of at
least 250Mhz. Otherwise, the input fifos could overrun, causing
image corruption.
Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
[g|s]_selection pass in a buffer type that needs to be validated
before passing on to the sensor subdev.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
v4l2-compliance throws a failure if the device doesn't advertise
V4L2_CAP_READWRITE but allows read or write operations.
We do support read, so reinstate the flag.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
The colorspace fields were left untouched in imx290_set_fmt
which lead to a v4l2-compliance failure.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
Userspace needs to know the cropping arrangements for each mode,
so expose this through g_selection.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
__v4l2_ctrl_modify_range only updates the current value should
it be invalid within the new range. That can leave modes producing
odd frame rates.
Explicitly update the HBLANK and VBLANK values so that on mode
change we revert to the default frame rate for the mode.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
Use bit 27 of the dreq value (the second cell of the DT DMA descriptor)
to request that the WAIT_RESP bit is not set.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
Now that the 14bit non-packed Bayer formats are defined, add them
into the supported formats lookup table.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
Now that V4L2_PIX_FMT_Y14 and V4L2_PIX_FMT_Y14P are defined,
allow passing 14bit mono data through the peripheral.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
Now that V4L2_PIX_FMT_Y12P is defined, allow passing raw 12bit
mono packed data through the peripheral.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
imx290_set_hmax was using two independent writes to set up hmax,
when all other multi-register writes were using imx290_write_buffered_reg
which claims the group hold first.
Switch imx290_set_hmax to using imx290_write_buffered_reg too.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
The IMX290 module is available as either mono or colour (Bayer).
Update the driver so that it can advertise the correct mono
formats instead of the colour ones.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
The IMX290 module is available as either monochrome or colour and
the variant is not detectable at runtime.
Add a new compatible string for the monochrome version.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
The sensor supports horizontal and vertical flips, so support them
through V4L2_CID_HFLIP and V4L2_CID_VFLIP.
This sensor does NOT change the Bayer order when changing the
direction of readout, therefore no special handling is required for
that.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
Adds support for V4L2_CID_EXPOSURE so that userspace can control
the sensor exposure time.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
In order to calculate framerate and durations userspace needs
the vertical blanking information. This can be configurable,
and indeed the datasheet lists different values for VBLANK for
the 1080p and 720p modes.
Add the new control, and adopt the datasheet values for each mode.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
Userspace needs to know HBLANK if it is to work out exposure times
and frame rates, therefore convert it to map onto V4L2_CID_HBLANK
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
The datasheet lists the gain as being 0.0 to 72.0dB in 0.3dB steps, which
makes 238 steps total.
Correct the 0-72 range defined in the driver.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
The existing driver only supported a clock of 37.125MHz, but the
sensor also supports 74.25MHz.
Add the relevant register modifications to support this alternate
clock frequency.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
Commit d46cfdc86c upstream.
With the current driver 'media-ctl -p' issued right after the imx290 driver
is loaded prints:
pad0: Source
[fmt:unknown/0x0]
The format value of zero is due to the current_format field of the imx290
struct not being initialized yet.
As imx290_entity_init_cfg() calls imx290_set_fmt(), the current_mode field
is also initialized, so the line which set current_mode to a default value
in driver's probe() function is no longer needed.
Signed-off-by: Andrey Konovalov <andrey.konovalov@linaro.org>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Older gcc versions object to = { 0 } initialisation if the first
elemtn in the structure is a substructure.
Use = { } to avoid this compiler warning.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
Use the get_mbus_config pad subdev call to allow a source to use
fewer than the number of CSI2 lanes defined in device tree.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
Add a driver for the Unicam camera receiver block on BCM283x processors.
Compared to the bcm2835-camera driver present in staging, this driver
handles the Unicam block only (CSI-2 receiver), and doesn't depend on
the VC4 firmware running on the VPU.
The commit is made up of a series of changes cherry-picked from the
rpi-5.4.y branch of https://github.com/raspberrypi/linux/ with
additional enhancements, forward-ported to the mainline kernel.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Reported-by: kbuild test robot <lkp@intel.com>
Commit 8353fe6f1e ("Revert "staging: bcm2835-audio: Drop DT
dependency"") reverts the upstream change and makes bcm2835-audio use
device tree again, however, it also removes the MODULE_ALIAS for the
platform device. This MODULE_ALIAS is needed, because VCHIQ registers
bcm2835-audio as a child platform device since commit 25c7597af2
("staging: vchiq_arm: Register a platform device for audio"), and this
mechanism is adopted also in the downstream kernel.
This commit puts back that MODULE_ALIAS to make bcm2835-audio
autoprobing work again. The rest of VCHIQ children have their
MODULE_ALIASes in place.
Fixes: 8353fe6f1e ("Revert "staging: bcm2835-audio: Drop DT dependency"")
Signed-off-by: Maxim Mikityanskiy <maxtram95@gmail.com>
When closing the video device, the irs1125 is put in power down state.
To keep V4L2 ctrls and the HW in sync, v4l2_ctrl_handler_setup is
called after power up.
The compound ctrl IRS1125_CID_MOD_PLL however has a default value
of all zeros, which puts the imager into a non responding state.
Thus, this ctrl is not written by the driver into HW after power up.
The userspace has to take care to write senseful data.
Signed-off-by: Markus Proeller <markus.proeller@pieye.org>
Instead of changing the exposure and framerate settings for all sequences,
they can be changed for every sequence individually now. Therefore the
IRS1125_CID_SAFE_RECONFIG ctrl has been removed and replaced by
IRS1125_CID_SAFE_RECONFIG_S<seq_num>_EXPO and *_FRAME ctrls.
The consistency check in the sequence ctrl IRS1125_CID_SEQ_CONFIG
is removed.
Signed-off-by: Markus Proeller <markus.proeller@pieye.org>
Reading data over i2c is done by using i2c_transfer to ensure that this
operation can't be interrupted.
Signed-off-by: Markus Proeller <markus.proeller@pieye.org>
The BRCM PCIe block has controls to enable control of the CLKREQ#
signal by the L1SS, and to gate the refclk with the CLKREQ# input.
These controls are mutually exclusive - the upstream code sets the
latter, but some use cases require the former.
Add a Device Tree property - brcm,enable-l1ss - to switch to the
L1SS configuration.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
Upstream Linux deems using output GPIOs to generate IRQs as a bogus
use case, even though the BCM2835 GPIO controller is capable of doing
so. A number of users would like to make use of this facility, so
disable the checks.
See: https://github.com/raspberrypi/linux/issues/2527
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
Things don't work too well when both the vc4 driver and the firmware
driver are trying to control the same audio output:
[ 763.569406] bcm2835_audio bcm2835_audio: vchi message timeout, msg=5
Hence, when the vc4 HDMI driver is used, let it control audio. This is done
by introducing a new device tree property to the audio node, and
extending the vc4-kms-v3d overlays to set it appropriately.
Signed-off-by: Hristo Venev <hristo@venev.name>
Since the unicam driver was modified to write to a dummy buffer when no
user-supplied buffer is available, it can now write to and return a
buffer even when there's only a single one. Enable this by changing the
min_buffers_needed in the vb2_queue; it will be useful for enabling
still captures without allocating more memory than absolutely necessary.
Signed-off-by: David Plowman <david.plowman@raspberrypi.com>
Manage the split between addresses for the VPU and addresses for the
40-bit DMA controller with a dedicated DMA device pointer that on non-
BCM2711 platforms is the same as the main VCHIQ device. This allows
the VCHIQ node to stay in the usual place in the DT, and removes the
ugly VC_SAFE macros.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
staging: vchiq_arm: Use g_dma_dev for dma_unmap_sg
Commit "staging: vchiq_arm: Clean up 40-bit DMA support" failed to
change one of the calls to dma_unmap_sg to pass in g_dma_dev (rather
than g_dev). Correct that oversight.
See: https://github.com/raspberrypi/linux/issues/3647
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
Enabling zswap support in the kernel configuration costs about 1.5MB
of RAM, even when zswap is not enabled at runtime. This cost can be
reduced significantly by deferring initialisation (including pool
creation) until the "enabled" parameter is set to true. There is a
small cost to this in that some initialisation code has to remain in
memory after the init phase, just in case they are needed later,
but the total size increase is negligible.
See: https://github.com/raspberrypi/linux/pull/3432
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
The change to retrieve the pixel format always on g_fmt didn't
check whether the native or unpacked version of the format
had been requested, and always returned the packed one.
Correct this so that the packing setting is retained whereever
possible.
Fixes "9d59e89 media: bcm2835-unicam: Re-fetch mbus code from subdev
on a g_fmt call"
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
V4L2 wishes to have the codec header bytes in the same buffer as the
first encoded frame, so it does become 1-in 1-out for encoding.
The firmware now has an option to do this, so enable it.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
The SC16IS7XX hardware flow control is mishandled by the driver in
a number of ways:
1. The set_baud method accidentally clears it when setting EFR bit.
2. Even though hardware flow control is enabled, it isn't indicated
back to the serial framework.
3. Applying the flow control clears the EFR bit.
4. The CTS support is not indicated in the return from
sc16is7xx_get_mctrl.
Address all of those issues using a mixture of patches found on the
linked pages.
See: https://github.com/raspberrypi/linux/issues/2542
See: https://www.spinics.net/lists/linux-serial/msg21794.html
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
From when bringing up the driver, there was a check in the isr
to ignore interrupts (claiming them handled) should the driver
not be streaming.
The VPU now will not register a camera driver if it finds a
CSI2 node enabled in device tree, therefore this flawed check is
redundant.
https://github.com/raspberrypi/linux/issues/3602
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
If the firmware hasn't detected a display, the driver would assume
one display was available, but because it had failed to retrieve the
display size it would try to allocate a zero-sized buffer.
Avoid the allocation failure by bailing out early if no display is
found.
See: https://github.com/raspberrypi/linux/issues/3598
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
The reference counting of node->open was only incremented after
a check that the node was v4l2_fh_is_singular_file, which resulted
in the counting going wrong and s_power not being called at an
appropriate time.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
unicam_release calls _vb2_fop_release, which will call stop_streaming
if that particular node was streaming. Calling it unconditionally (as
the code was) means that if a second handle was opened eg to alter
a setting, on closing that connection it also stopped Unicam.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
Sensors are now reflecting cropping and scaling parameters through
the selection API, therefore Unicam needs to forward the requests
through to the subdev.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
BCM2711 has 4 DMA channels with a 40-bit address range, allowing them
to access the full 4GB of memory on a Pi 4.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
bcmn2835_isp is a platform driver dependent on vchiq,
therefore add the load/unload functions for it to vchiq.
Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
Add V4L2_META_FMT_BCM2835_ISP_STATS V4L2 format type.
This new format will be used by the BCM2835 ISP device to return
out ISP statistics for 3A.
Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
The sensor subdevice may change the Bayer order if a H/V flip is
requested after a s_fmt call. Unicam g_fmt must call the subdev get_fmt
in case this has happened and return out the correct format 4cc.
Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
The unicam driver supports both the SOURCE_CHANGE and CTRL events. Both
events are only generated on the image video node, so the event-related
ioctls are useless on the medatada node. Disable them.
Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Reviewed-by: Jacopo Mondi <jacopo@jmondi.org>
Reviewed-by: Naushir Patuck <naush@raspberrypi.com>
If no buffer has been queued by a userland application, we use an
internal dummy buffer for the hardware to spin in. This will allow
the driver to release the existing userland buffer back to the
application for processing.
Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
This patch adds a new node in the bcm2835-unicam driver to support
CSI-2 embedded data streams. The subdevice is queried to see if
embedded data is available from the sensor.
Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
Move device node specific state out of the device state structure and
into a new node structure. This separation will be needed for future
changes where we will add an embedded data node to the driver to work
alongside the existing image data node.
Currently only use a single image node, so this commit does not add
any functional changes.
Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
This patch adds MEDIA_BUS_FMT_SENSOR_DATA used by the bcm2835-unicam
driver to support CSI-2 embedded data streams from camera sensors.
Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
Add V4L2_META_FMT_SENSOR_DATA format 4CC.
This new format will be used by the BCM2835 Unicam device to return
out camera sensor embedded data.
Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
Add driver for the Unicam camera receiver block on
BCM283x processors.
This commit is made up of a series of changes cherry-picked from the
rpi-4.19.y branch.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
Limit mappings to the permitted range, but don't map more than asked
for otherwise we walk off the end of the allocated VMA.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
Commit f3186dd876 ("spi: Optionally use GPIO descriptors for CS GPIOs")
amended of_spi_parse_dt() to always set SPI_CS_HIGH for SPI slaves whose
Chip Select is defined by a "cs-gpios" devicetree property.
This change breaks drivers whose probe functions set the mode field of
the spi_device because in doing so they clear the SPI_CS_HIGH flag.
Fix by setting SPI_CS_HIGH in spi_setup (under the same conditions as
in of_spi_parse_dt()).
See also: 83b2a8fe43 ("spi: spidev: Fix CS polarity if GPIO descriptors are used")
Fixes: f3186dd876 ("spi: Optionally use GPIO descriptors for CS GPIOs")
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
SQUASH: spi: Demote SPI_CS_HIGH warning to KERN_DEBUG
This warning is unavoidable from a client's perspective and
doesn't indicate anything wrong (just surprising).
SQUASH with "spi: use_gpio_descriptor fixup moved to spi_setup"
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
framebuffer_check was computing a minimum pitch value and ensuring
that the provided value was greater than this.
That check is only valid if the format is linear.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
The HDMI controllers found in the BCM2711 SoC need some adjustments to the
bindings, especially since the registers have been shuffled around in more
register ranges.
Cc: Rob Herring <robh+dt@kernel.org>
Cc: devicetree@vger.kernel.org
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
When the MMC isn't plugged in, the driver will spam the console which is
pretty annoying when using NFS.
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
This driver is for the HEVC/H265 decoder block on the Raspberry
Pi 4, and conforms to the V4L2 stateless decoder API.
Signed-off-by: John Cox <jc@kynesim.co.uk>
Some of the Broadcom codec blocks use a column based YUV4:2:0 image
format, so add the documentation and defines for both 8 and 10 bit
versions.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
WPP can allow greater parallelism within the decode, but needs
offset information to be passed in.
Adds num_entry_point_offsets and entry_point_offset_minus1 to
v4l2_ctrl_hevc_slice_params.
This is based on Jernej Skrabec's patches for cedrus which
implement the same feature.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
From https://patchwork.linuxtv.org/patch/60725/
Changes requested, but mainly docs.
If HEVC frame consists of multiple slices, segment address has to be
known in order to properly decode it.
Add segment address field to slice parameters.
Signed-off-by: Jernej Skrabec <jernej.skrabec@siol.net>
Adds a format that is 3 10bit YUV 4:2:0 samples packed into
a 32bit work (with 2 spare bits).
Supported on Broadcom BCM2711 chips.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
The DT bindings description of the Brcmstb PCIe device is described. This
node can be used by almost all Broadcom settop box chips, using
ARM, ARM64, or MIPS CPU architectures.
Signed-off-by: Jim Quinlan <jim2101024@gmail.com>
When symbols from overlays are added to the live tree their paths must
be rebased. The translated symbol is normally the result of joining
the fragment-relative path (with a leading "/") to the target path
(either copied directly from the "target-path" property or resolved
from the phandle). This translation fails when the target is the root
node (a common case for Raspberry Pi overlays) because the resulting
path starts with a double slash. For example, if target-path is "/" and
the fragment adds a node called "newnode", the label associated with
that node will be assigned the path "//newnode", which can't be found
in the tree.
Fix the failure case by explicitly replacing a target path of "/" with
an empty string.
Fixes: d1651b03c2 ("of: overlay: add overlay symbols to live device tree")
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
The definition of compat_ptr is now common for most platforms, but
requires the inclusion of <linux/compat.h>.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
The definition of compat_ptr is now common for most platforms, but
requires the inclusion of <linux/compat.h>.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
pinctrl-bcm2835 is a combined pinctrl/gpio driver. Currently the gpio
side is registered first, but this breaks gpio hogs (which are
configured during gpiochip_add_data). Part of the hog initialisation
is a call to pinctrl_gpio_request, and since the pinctrl driver hasn't
yet been registered this results in an -EPROBE_DEFER from which it can
never recover.
Change the initialisation sequence to register the pinctrl driver
first.
See: https://www.raspberrypi.org/forums/viewtopic.php?f=107&t=260600
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
A failure in gpiochip_irqchip_add leads to a leak of a gpiochip. Fix
the leak with the use of devm_gpiochip_add_data.
Fixes: 85ae9e512f ("pinctrl: bcm2835: switch to GPIOLIB_IRQCHIP")
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
vchiq kernel clients are now instantiated as platform drivers rather
than using DT, but the children of the vchiq interface may still
benefit from access to DT properties. Give them the option of a
a sub-node of the vchiq parent for configuration and to allow
them to be disabled.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
The IMA (Integrity Measurement Architecture) looks for a TPM (Trusted
Platform Module) having been registered when it initialises; otherwise
it assumes there is no TPM. It has been observed on BCM2835 that IMA
is initialised before TPM, and that initialising the BCM2835 clock
driver before the firmware driver has the effect of reversing this
order.
Change the firmware driver to initialise at core_initcall, delaying the
BCM2835 clock driver to postcore_initcall.
See: https://github.com/raspberrypi/linux/issues/3291https://github.com/raspberrypi/linux/pull/3297
Signed-off-by: Luke Hinds <lhinds@redhat.com>
Co-authored-by: Phil Elwell <phil@raspberrypi.org>
vchiq on Pi4 is no longer under the soc node, therefore it
doesn't get the dma-ranges for the VPU.
Switch to using the configuration of the old dma controller as
that will set the dma-ranges correctly.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>
The VCHIQ driver now loads the audio, camera, codec, and vc-sm
drivers as platform drivers. However they were not being given
the correct DMA configuration.
Call of_dma_configure with the parent (VCHIQ) parameters to be
inherited by the child.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>
For performance/power it is beneficial to adjust gpu clocks with arm clock.
This is how the downstream cpufreq driver works
Signed-off-by: popcornmix <popcornmix@gmail.com>
Setting the v3d clock to low value allows firmware to handle dvfs in case
where v3d hardware is not being actively used (e.g. console use).
Signed-off-by: popcornmix <popcornmix@gmail.com>
Add device tree entries and code to allow the specification of
the lighting modes for the LED's on the ethernet connector.
Signed-off-by: James Hughes <james.hughes@raspberrypi.org>
net:phy:2711 Change the default ethernet LED actions
This should return default behaviour back to that of previous
releases.
Following the same pattern as bcm2835-camera and bcm2835-audio,
register the V4L2 codec driver as a platform driver
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
Following the same pattern as bcm2835-camera and bcm2835-audio,
register the vcsm-cma driver as a platform driver
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
The irq_fence and done_fence are given a reference that is never
released. The necessary dma_fence_put()s seem to have been
deleted in error in an earlier commit.
Fixes: 0b73676836b2 ("drm/v3d: Clock V3D down when not in use.")
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
The v3d driver currently encounters a lot of MMU PTE exceptions, so
only log the first to avoid swamping the kernel log.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
The Infineon IRS1125 is a time of flight depth sensor that
has a CSI-2 interface.
Add a V4L2 subdevice driver for this device.
Signed-off-by: Markus Proeller <markus.proeller@pieye.org>
HDMI Alsa devices renamed to match names used by DRM, to
HDMI 1 and HDMI 2
Check for which HDMI devices are connected and only create
devices for those that are present.
The rename of the devices might cause some backwards compatibility
issues, but since this particular part of the driver needs to be
specifically enabled, I suspect the number of people who will see
the problem will be very small.
Signed-off-by: James Hughes <james.hughes@raspberrypi.org>
After the decision to use bcm2711 compatible for upstream, we should
switch all accepted compatibles to bcm2711. So we can boot with
one DTB the down- and the upstream kernel.
Signed-off-by: Stefan Wahren <wahrenst@gmx.net>
Adds a simple greyworld white balance preset, mainly for use
with cameras without an IR filter (eg Raspberry Pi NoIR)
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>
staging: bcm2835-camera: Add greyworld AWB mode
This is mainly used for the NoIR camera which has no IR
filter and can completely confuse normal AWB presets.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>
staging:bcm2835-camera: Fix the cherry-pick of AWB Greyworld
The cherry-pick of the patch that added the greyworld AWB mode
was incomplete. Fix it up.
Fixes: b3ef481fe2 "staging: bcm2835-camera: Add greyworld AWB mode"
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
media: v4l2: Add Greyworld AWB control name
Add name for greyworld to white_balance preset names.
This patch previously applied to v4l2-ctrl.c but that was split
and deleted.
Signed-off-by: John Cox <jc@kynesim.co.uk>
The IMX219 is an 8MPix CSI2 sensor, supporting 2 or 4 data lanes.
Document the binding for this device.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>
Users have reported log spam created by "Event Ring Full" xHC event
TRBs. These are caused by interrupt latency in conjunction with a very
busy set of devices on the bus. The errors are benign, but throughput
will suffer as the xHC will pause processing of transfers until the
event ring is drained by the kernel. Expand the number of event TRB slots
available by increasing the number of event ring segments in the ERST.
Controllers have a hardware-defined limit as to the number of ERST
entries they can process, so make the actual number in use
min(ERST_MAX_SEGS, hw_max).
Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>
Some combinations of Pi 4Bs and Ethernet switches don't reliably get a
DCHP-assigned IP address, leaving the unit with a self=assigned 169.254
address. In the failure case, the Pi is left able to receive packets
but not send them, suggesting that the MAC<->PHY link is getting into
a bad state.
It has been found empirically that skipping a reset step by the genet
driver prevents the failures. No downsides have been discovered yet,
and unlike the forced renegotiation it doesn't increase the time to
get an IP address, so the workaround is enabled by default; add
genet.skip_umac_reset=n
to the command line to disable it.
See: https://github.com/raspberrypi/linux/issues/3108
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
These wireless mouse/keyboard combo remote control devices specify
multiple "wheel" events in their report descriptors. The wheel events
are incorrectly defined and apparently map to accelerometer data, leading
to spurious mouse scroll events being generated at an extreme rate when
the device is moved.
As a workaround, use HID_QUIRK_INCREMENT_USAGE_ON_DUPLICATE to mask
feeding the extra wheel events to the input subsystem.
See: https://github.com/raspberrypi/firmware/issues/1189
Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>
Based on the gpiomem driver, allow mapping of the decoder register
spaces such that userspace can access control/status registers.
This driver is intended for use with a custom ffmpeg backend accelerator
prior to a v4l2 driver being written.
Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>
driver: char: rpivid: Destroy the legacy device on remove
The legacy name support created a new device that was never destroyed.
If the driver was unloaded and reloaded, it failed due to the
device already existing.
Fixes: "75f1d14 driver: char: rpivid - also support legacy name"
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
driver: char: rpivid: Clean up error handling use of ERR_PTR/IS_ERR
The driver used an unnecessary intermediate void* variable so it
only called ERR_PTR once to convert to the error value.
Switch to converting as the error arises to remove these intermediate
variables.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
driver: char: rpivid: Add error handling to the legacy device load
The return value from device_create for the legacy device was never
checked or handled. Add the required error handling.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
driver: char: rpivid: Fix coding style whitespace issues.
Makes checkpatch happier.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
driver: char: rpimem: Add SPDX licence header.
Stops checkpatch complaining.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
driver: char: rpivid: Fix access to freed memory
The error path during probe frees the private memory block, and
then promptly dereferences it to log an error message.
Use the base device instead of the pointer to it in the private
structure.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
My various attempts at re-enabling runtime PM have failed, so just
crank the clock down when V3D is idle to reduce power consumption.
Signed-off-by: Eric Anholt <eric@anholt.net>
Something is still unstable -- on starting a new glxgears from an idle
X11, I get an MMU violation in high addresses. The CTS also failed
quite quickly. With this, CTS progresses for an hour before OOMing
(allocating some big buffers when my board only has 600MB available to
Linux)
Signed-off-by: Eric Anholt <eric@anholt.net>
The BCM2835 I2C blocks have a register to set the clock-stretch
timeout - how long the device is allowed to hold SCL low - in bus
cycles. The current driver doesn't write to the register, therefore
the default value of 64 cycles is being used for all devices.
Set the timeout to the value recommended for SMBus - 35ms.
See: https://github.com/raspberrypi/linux/issues/3064
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
In translating the runtime PM code from vc4, I missed the ".pm"
assignment to actually connect them up. Fixes missing MMU setup if
runtime PM resets V3D.
Signed-off-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit ca197699af29baa8236c74c53d4904ca8957ee06)
If it's off, we know it will be reset on poweron, so the MMU won't
have any TLB cached from before this point. Avoids failed waits for
MMU flush to reply.
Signed-off-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit 3ee4e2e0a9e9587eacbb69b067bbc72ab2cdc47b)
Must be called in a non-atomic context, after the endpoint
has been registered with the hardware via xhci_add_endpoint
and before the first URB is submitted for the endpoint.
Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>
xHCI caches device and endpoint data after the interface is configured,
so an explicit command needs to be issued for any device driver wanting
to alter the polling interval of an endpoint.
Add usb_fixup_endpoint() to allow drivers to do this. The fixup must be
called after calculating endpoint bandwidth requirements but before any
URBs are submitted.
If polling intervals are shortened, any bandwidth reservations are no
longer valid but in practice polling intervals are only ever relaxed.
Limit the scope to interrupt transfers for now.
Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>
This falls under the same "we can reprogram glitch-free as long as we
pause generation" rule as updating the div/frac fields. This can be
used for runtime reclocking of V3D to manage power leakage.
Signed-off-by: Eric Anholt <eric@anholt.net>
Without the actual power management part any more, there's a lot less
to set up for V3D. We just need to clear the RSTN field for the power
domain, and expose the reset controller for toggling it again.
This is definitely incomplete -- the old ISP and H264 is in the old
bridge, but since we have no consumers of it I've just done the
minimum to get V3D working.
Signed-off-by: Eric Anholt <eric@anholt.net>
There are several warts surrounding bcmgenet_mii_probe() as this
function is called from ndo_open, but it's calling registration-type
functions. The probe should be called at probe time and refactored
such that the PHY device data can be extracted to limit the scope
of this flag to Broadcom PHYs.
For now, pass this flag in as it puts our attached PHY into a low-power
state when disconnected.
Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>
Set defaults for TX and RX packet coalescing to be equivalent to:
# ethtool -C eth0 tx-frames 10
# ethtool -C eth0 rx-usecs 50
This may be something we want to set via DT parameters in the
future.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
Conditional on a new compatible string, change the pagelist encoding
such that the top 24 bits are the pfn, leaving 8 bits for run length
(-1).
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
staging/vchiq_arm: Fix bcm2711 compatible string
Fixes: "vchiq: Add 36-bit address support"
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
The Linux support for controlling card power via regulators appears to
be contentious. I would argue that the default behaviour is contrary to
the SDHCI spec - turning off the power writes a reserved value to the
SD Bus Voltage Select field of the Power Control Register, which
seems to kill the Arasan/iProc controller - but fortunately there is a
hook in sdhci_ops to override the behaviour. Borrow the implementation
from sdhci_arasan_set_power.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
The HWRNG on the BCM2838 is compatible to iproc-rng200, so add the
support to this driver instead of bcm2835-rng.
Signed-off-by: Stefan Wahren <wahrenst@gmx.net>
hwrng: iproc-rng200: Correct SoC name
The Pi 4 SoC is called BCM2711, not BCM2838.
Fixes: "hwrng: iproc-rng200: Add BCM2838 support"
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
The legacy peripherals can only address the first gigabyte of RAM, so
ensure that DMA allocations are restricted to that region.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
The ioremapping creates mappings within the vmalloc area. The
equivalent early function, create_mapping, now checks that the
requested explicit virtual address is between VMALLOC_START and
VMALLOC_END. As there is no reason to have any correlation between
the physical and virtual addresses, put the required mappings at
VMALLOC_START and above.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
On error, vchiq_mmal_component_init could leave the
event context allocated for ports.
Clean them up in the error path.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>
vchiq_mmal_component_init calls init_event_context for the
control port, but vchiq_mmal_component_finalise didn't free
it, causing a memory leak..
Add the free call.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>
mmal_parameters.h hasn't been updated to reflect additions made
over the last few years. Update it to reflect the currently
supported parameters.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>
The list of formats was copied before Bayer support was added.
The ISP supports Bayer and is being supported by the bcm2835_codec
driver, so add in the encodings for them.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>
The debug text for how many clocks have been registered
uses "%d" with a size_t. Correct it to "%zd".
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>
The MMAL client_component field is used with the event
mechanism to allow the client to identify the component for
which the event is generated.
The field is only 32bits in size, therefore we can't use a
pointer to the component in a 64 bit kernel.
Component handles are already held in an array per VCHI
instance, so use the array index as the client_component handle
to avoid having to create a new IDR for this purpose.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>
videobuf2 only allowed exporting a dmabuf as a file descriptor,
but there are instances where having the struct dma_buf is
useful within the kernel.
Split the current implementation into two, one step which
exports a struct dma_buf, and the second which converts that
into an fd.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>
Add the ability to send data to ports. This only supports
zero copy mode as the required bulk transfer setup calls
are not done.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>
(Preparation for the codec driver).
The codec uses the event mechanism to report things such as
resolution changes. It is signalled by the cmd field of the buffer
being non-zero.
Add support for passing this information out to the client.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>
Fixes up a checkpatch error "Avoid using bool structure members
because of possible alignment issues".
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>
When calling tc358743_set_fmt, the code was calling tc358743_get_fmt
to choose a valid format. However that sets the colorspace
based on what was read back from the chip. When you set the format,
then the driver would choose and program the colorspace based
on the format code.
The result was that if you called try or set format for UYVY
when the current format was RGB3 then you would get told sRGB,
and try RGB3 when current was UYVY and you would get told
SMPTE170M.
The value programmed into the chip is determined by this driver,
therefore there is no need to read back the value. Return the
colorspace based on the format set/tried instead.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>
Document the DT bindings for the CSI2/CCP2 receiver peripheral
(known as Unicam) on BCM283x SoCs.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>
Acked-by: Rob Herring <robh@kernel.org>
New helper defines that allow printing of a FOURCC using
printf(V4L2_FOURCC_CONV, V4L2_FOURCC_CONV_ARGS(fourcc));
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>
The ADV7282M can support YPbPr on AIN1-3, but this was
not selectable from the driver. Add it to the list of
supported input modes.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>
The hardware default is differential CVBS on AIN1 & 2, which
isn't very useful.
Select the first input that is defined as valid for the
chip variant (typically CVBS_AIN1).
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>
The probe for the TC358743 reads the CHIPID register from
the device and compares it to the expected value of 0.
If the I2C request fails then that also returns 0, so
the driver loads thinking that the device is there.
Generally I2C communications are reliable so there is
limited need to check the return value on every transfer,
therefore only amend the one read during probe to check
for I2C errors.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>
Adds register setups for running the CSI lanes at 972Mbit/s,
which allows 1080P50 UYVY down 2 lanes.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>
g_mbus_config was supposed to indicate all supported lane numbers, not
only the number of those currently in active use. Since the TC358743
can dynamically reduce the number of active lanes if the required
bandwidth allows for it, report all lane numbers up to the connected
number of lanes as supported in pdata mode.
In device tree mode, do not report lane count and clock mode at all, as
the receiver driver can determine these from the device tree.
To allow communicating the number of currently active lanes, add a new
bitfield to the v4l2_mbus_config flags. This is a temporary fix, to be
used only until a better solution is found.
Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
The existing fixed value of 16 worked for UYVY 720P60 over
2 lanes at 594MHz, or UYVY 1080P60 over 4 lanes. (RGB888
1080P60 needs 6 lanes at 594MHz).
It doesn't allow for lower resolutions to work as the FIFO
underflows.
374 is required for 1080P24-30 UYVY over 2 lanes @ 972Mbit/s, but
>374 means that the FIFO underflows on 1080P50 UYVY over 2 lanes
@ 972Mbit/s.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>
The lan78xx uses a 12-byte hardware rx header, so there is no need
to allocate SKBs with NET_IP_ALIGN set. Removes alignment faults
in both dwc_otg and in ipv6 processing.
Add the ability to interpret the high bits of the dreq specifier as
flags to be included in the DMA_CS register. The motivation for this
change is the ability to set the DISDEBUG flag for SD card transfers
to avoid corruption when using the VPU debugger.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
The bInterval is set to 4 (i.e. 8 microframes => 1ms) and the only bit
that the driver pays attention to is "link was reset". If there's a
flapping status bit in that endpoint data, (such as if PHY negotiation
needs a few tries to get a stable link) then polling at a slower rate
would act as a de-bounce.
See: https://github.com/raspberrypi/linux/issues/2447
The driver already reported the firmware build date during probe.
The mailbox calls have been extended to also report the variant
1 = standard start.elf
2 = start_x.elf (includes camera stack)
3 = start_db.elf (includes assert logging)
4 = start_cd.elf (cutdown version for smallest memory footprint).
Log the variant during probe.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>
firmware: raspberrypi: Report the fw git hash during probe
The firmware can now report the git hash from which it was built
via the mailbox, so report it during probe.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>
Ethernet cables with faulty or missing pairs (specifically pairs C and
D) allow auto-negotiation to 1000Mbs, but do not support the successful
establishment of a link. Add a DT property, "microchip,downshift-after",
to configure the number of auto-negotiation failures after which it
falls back to 100Mbs. Valid values are 2, 3, 4, 5 and 0, where 0 means
never downshift.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
Step wise governor increases the mitigation level when the temperature
goes above a threshold and will decrease the mitigation when the
temperature falls below the threshold. If it were a case, where the
temperature hovers around a threshold, the mitigation will be applied
and removed at every iteration. This reaction to the temperature is
inefficient for performance.
The use of hysteresis temperature could avoid this ping-pong of
mitigation by relaxing the mitigation to happen only when the
temperature goes below this lower hysteresis value.
Signed-off-by: Ram Chandrasekar <rkumbako@codeaurora.org>
Signed-off-by: Lina Iyer <ilina@codeaurora.org>
Avoid a hard userspace ABI change by adding a compatible get_throttled
sysfs entry. Its value is now feed by the GET_THROTTLED requests of the
new hwmon driver. The first access to get_throttled will generate
a warning.
Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com>
Although the correct fix for low voltage warnings is to
improve the power supply, the current implementation
of the detection can fill the log if the warning
happens freqently. This replaces the logging with
slightly custom ratelimited logging.
Signed-off-by: James Hughes <james.hughes@raspberrypi.org>
Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com>
Enable EEE mode as soon as possible after connecting to the PHY, and
before phy_start. This avoids a second link negotiation, which speeds
up booting and stops the interface failing to become ready.
See: https://github.com/raspberrypi/linux/issues/2437
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
As of 4.18, a firmware that implements the update_connect_params
method but doesn't claim to support roaming causes an error. We
disabled firmware roaming in 4.4 [1] because it appeared to
prevent disconnects, but let's try with the current firmware to see
if things have improved.
[1] dd91880117
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
The published API to the dynamic overlay application mechanism now
takes a Flattened Device Tree blob as input so that it can manage the
lifetime of the unflattened tree. Conveniently, the new API call -
of_overlay_fdt_apply - is virtually a drop-in replacement for
create_overlay, which can now be deleted.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
ad83c7cb2f ("irqchip/irq-bcm2836: Add support for DT interrupt polarity")
changed the way that the BCM2836/7 local interrupts are mapped; instead
of being pre-mapped they are now mapped on-demand. A side effect of this
change is that the call to irq_of_parse_and_map from armctrl_of_init
creates a new mapping, forming a gap between the IRQs and the FIQs. This
gap breaks the FIQ<->IRQ mapping which up to now has been done by assuming:
1) that the value of FIQ_START is the same as the number of normal IRQs
that will be mapped (still true), and
2) that this value is also the offset between an IRQ and its equivalent
FIQ (which is no longer the case).
Remove both assumptions by measuring the interval between the last IRQ
and the last FIQ, passing it as the parameter to init_FIQ().
Fixes: https://github.com/raspberrypi/linux/issues/2432
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
Register for reboot notifications, sending RPI_FIRMWARE_NOTIFY_REBOOT
over the mailbox interface on reception.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
Add two new DT properties:
* microchip,eee-enabled - a boolean to enable EEE
* microchip,tx-lpi-timer - time in microseconds to wait before entering
low power state
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
I2C busses can be assigned specific bus numbers using aliases in
Device Tree - string properties where the name is the alias and the
value is the path to the node. The current DT parameter mechanism
does not allow property names to be derived from a parameter value
in any way, so it isn't possible to generate unique or matching
aliases for nodes from an overlay that can generate multiple
instances, e.g. i2c-gpio.
Work around this limitation (at least temporarily) by allowing
the i2c adapter number to be initialised from the "reg" property
if present.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
There is a new test in __irq_startup that the IRQ is activated, which
hasn't been the case for FIQs since they bypass some of the usual setup.
Augment enable_fiq to include a call to irq_activate to avoid the
warning.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
Create a semi-static mapping for the USB registers early in the boot
process, before additional kernel threads are started, so all threads
will have the mappings from the start. This avoids the need for
data aborts to lazily update them.
See: https://github.com/raspberrypi/linux/issues/2450
Signed-off-by: Floris Bos <bos@je-eigen-domein.nl>
The VideoCore bootloader passes in Serial number and
Revision number through Device Tree. Make these available to
userspace through /proc/cpuinfo.
Mainline status:
There is a commit in linux-next that standardize passing the serial
number through Device Tree (string: /serial-number):
ARM: 8355/1: arch: Show the serial number from devicetree in cpuinfo
There was an attempt to do the same with the revision number, but it
didn't get in:
[PATCH v2 1/2] arm: devtree: Set system_rev from DT revision
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
Uses the debugfs I/F to provide access to the AXI
bus performance monitors.
Requires the new mailbox peripheral access for access
to the VPU performance registers, system bus access
is done using direct register reads.
Signed-off-by: James Hughes <james.hughes@raspberrypi.org>
raspberrypi_axi_monitor: suppress warning
Suppress the following warning by casting the pointer to and uintptr_t
before to u32:
Signed-off-by: Matteo Croce <mcroce@redhat.com>
IRQ-CPU mapping is round robined on ARM64 to increase
concurrency and allow multiple interrupts to be serviced
at a time. This reduces the need for FIQ.
Signed-off-by: Michael Zoran <mzoran@crowfest.net>
brcmfmac: Disable power management
Disable wireless power saving in the brcmfmac WLAN driver. This is a
temporary measure until the connectivity loss resulting from power
saving is resolved.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
brcmfmac: Use original country code as a fallback
Commit 73345fd212:
brcmfmac: Configure country code using device specific settings
prevents region codes from working on devices that lack a region code
translation table. In the event of an absent table, preserve the old
behaviour of using the provided code as-is.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
brcmfmac: Plug memory leak in brcmf_fill_bss_param
See: https://github.com/raspberrypi/linux/issues/1471
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
brcmfmac: do not use internal roaming engine by default
Some evidence of curing disconnects with this disabled, so make it a default.
Can be overridden with module parameter roamoff=0
See: http://projectable.me/optimize-my-pi-wi-fi/
brcmfmac: Change stop_ap sequence
Patch from Broadcom/Cypress to resolve a customer error
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
Revert "brcmfmac: Disable power management"
Shortly after the release of the Pi 3B, a loss of SSH connectivity
over WiFi was traced to the power management handling, so power
management was disabled. And so it has remained ever since.
Enabling power management saves 55mA (~270mW) on a Pi 4B, so is very
much worth the minimal effort of reverting this patch, which was
squashed and rebased many times since then to the commit hash is
meaningless.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
This is a port of Pantelis Antoniou's v3 port that makes use of the
new upstreamed configfs support for binary attributes.
Original commit message:
Add a runtime interface to using configfs for generic device tree overlay
usage. With it its possible to use device tree overlays without having
to use a per-platform overlay manager.
Please see Documentation/devicetree/configfs-overlays.txt for more info.
Changes since v2:
- Removed ifdef CONFIG_OF_OVERLAY (since for now it's required)
- Created a documentation entry
- Slight rewording in Kconfig
Changes since v1:
- of_resolve() -> of_resolve_phandles().
Originally-signed-off-by: Pantelis Antoniou <pantelis.antoniou@konsulko.com>
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
DT configfs: Fix build errors on other platforms
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
DT configfs: fix build error
There is an error when compiling rpi-4.6.y branch:
CC drivers/of/configfs.o
drivers/of/configfs.c:291:21: error: initialization from incompatible pointer type [-Werror=incompatible-pointer-types]
.default_groups = of_cfs_def_groups,
^
drivers/of/configfs.c:291:21: note: (near initialization for 'of_cfs_subsys.su_group.default_groups.next')
The .default_groups is linked list since commit
1ae1602de0.
This commit uses configfs_add_default_group to fix this problem.
Signed-off-by: Slawomir Stepien <sst@poczta.fm>
configfs: New of_overlay API
Add a mailbox-driven backlight controller for the Raspberry Pi DSI
touchscreen display. Requires updated GPU firmware to recognise the
mailbox request.
Signed-off-by: Gordon Hollingworth <gordon@raspberrypi.org>
Add Raspberry Pi firmware driver to the dependencies of backlight driver
Otherwise the backlight driver fails to build if the firmware
loading driver is not in the kernel
Signed-off-by: Alex Riesen <alexander.riesen@cetitec.com>
ASoC: Add support for Rpi-DAC
ASoC: Add prompt for ICS43432 codec
Without a prompt string, a config setting can't be included in a
defconfig. Give CONFIG_SND_SOC_ICS43432 a prompt so that Pi soundcards
can use the driver.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
Add IQaudIO Sound Card support for Raspberry Pi
Set a limit of 0dB on Digital Volume Control
The main volume control in the PCM512x DAC has a range up to
+24dB. This is dangerously loud and can potentially cause massive
clipping in the output stages. Therefore this sets a sensible
limit of 0dB for this control.
Allow up to 24dB digital gain to be applied when using IQAudIO DAC+
24db_digital_gain DT param can be used to specify that PCM512x
codec "Digital" volume control should not be limited to 0dB gain,
and if specified will allow the full 24dB gain.
Modify IQAudIO DAC+ ASoC driver to set card/dai config from dt
Add the ability to set the card name, dai name and dai stream name, from
dt config.
Signed-off-by: DigitalDreamtime <clive.messer@digitaldreamtime.co.uk>
IQaudIO: auto-mute for AMP+ and DigiAMP+
IQAudIO amplifier mute via GPIO22. Add dt params for "one-shot" unmute
and auto mute.
Revision 2, auto mute implementing HiassofT suggestion to mute/unmute
using set_bias_level, rather than startup/shutdown....
"By default DAPM waits 5 seconds (pmdown_time) before shutting down
playback streams so a close/stop immediately followed by open/start
doesn't trigger an amp mute+unmute."
Tested on both AMP+ (via DAC+) and DigiAMP+, with both options...
dtoverlay=iqaudio-dacplus,unmute_amp
"one-shot" unmute when kernel module loads.
dtoverlay=iqaudio-dacplus,auto_mute_amp
Unmute amp when ALSA device opened by a client. Mute, with 5 second delay
when ALSA device closed. (Re-opening the device within the 5 second close
window, will cancel mute.)
Revision 4, using gpiod.
Revision 5, clean-up formatting before adding mute code.
- Convert tab plus 4 space formatting to 2x tab
- Remove '// NOT USED' commented code
Revision 6, don't attempt to "one-shot" unmute amp, unless card is
successfully registered.
Signed-off-by: DigitalDreamtime <clive.messer@digitaldreamtime.co.uk>
ASoC: iqaudio-dac: fix S24_LE format
Remove set_bclk_ratio call so 24-bit data is transmitted in
24 bclk cycles.
Signed-off-by: Matthias Reichl <hias@horus.com>
ASoC: iqaudio-dac: use modern dai_link style
Signed-off-by: Matthias Reichl <hias@horus.com>
Added support for HiFiBerry DAC+
The driver is based on the HiFiBerry DAC driver. However HiFiBerry DAC+ uses
a different codec chip (PCM5122), therefore a new driver is necessary.
Add support for the HiFiBerry DAC+ Pro.
The HiFiBerry DAC+ and DAC+ Pro products both use the existing bcm sound driver with the DAC+ Pro having a special clock device driver representing the two high precision oscillators.
An addition bug fix is included for the PCM512x codec where by the physical size of the sample frame is used in the calculation of the LRCK divisor as it was found to be wrong when using 24-bit depth sample contained in a little endian 4-byte sample frame.
Limit PCM512x "Digital" gain to 0dB by default with HiFiBerry DAC+
24db_digital_gain DT param can be used to specify that PCM512x
codec "Digital" volume control should not be limited to 0dB gain,
and if specified will allow the full 24dB gain.
Add dt param to force HiFiBerry DAC+ Pro into slave mode
"dtoverlay=hifiberry-dacplus,slave"
Add 'slave' param to use HiFiBerry DAC+ Pro in slave mode,
with Pi as master for bit and frame clock.
Signed-off-by: DigitalDreamtime <clive.messer@digitaldreamtime.co.uk>
Fixed a bug when using 352.8kHz sample rate
Signed-off-by: Daniel Matuschek <daniel@hifiberry.com>
ASoC: pcm512x: revert downstream changes
This partially reverts commit 185ea05465
which was added by https://github.com/raspberrypi/linux/pull/1152
The downstream pcm512x changes caused a regression, it broke normal
use of the 24bit format with the codec, eg when using simple-audio-card.
The actual bug with 24bit playback is the incorrect usage
of physical_width in various drivers in the downstream tree
which causes 24bit data to be transmitted with 32 clock
cycles. So it's not the pcm512x that needs fixing, it's the
soundcard drivers.
Signed-off-by: Matthias Reichl <hias@horus.com>
ASoC: hifiberry_dacplus: fix S24_LE format
Remove set_bclk_ratio call so 24-bit data is transmitted in
24 bclk cycles.
Signed-off-by: Matthias Reichl <hias@horus.com>
ASoC: hifiberry_dacplus: transmit S24_LE with 64 BCLK cycles
Signed-off-by: Matthias Reichl <hias@horus.com>
hifiberry_dacplus: switch to snd_soc_dai_set_bclk_ratio
Signed-off-by: Matthias Reichl <hias@horus.com>
ASoC: hifiberry_dacplus: use modern dai_link style
Signed-off-by: Hui Wang <hui.wang@canonical.com>
Add driver for rpi-proto
Forward port of 3.10.x driver from https://github.com/koalo
We are using a custom board and would like to use rpi 3.18.x
kernel. Patch works fine for our embedded system.
URL to the audio chip:
http://www.mikroe.com/add-on-boards/audio-voice/audio-codec-proto/
Playback tested with devicetree enabled.
Signed-off-by: Waldemar Brodkorb <wbrodkorb@conet.de>
ASoC: rpi-proto: use modern dai_link style
Signed-off-by: Hui Wang <hui.wang@canonical.com>
Add Support for JustBoom Audio boards
justboom-dac: Adjust for ALSA API change
As of 4.4, snd_soc_limit_volume now takes a struct snd_soc_card *
rather than a struct snd_soc_codec *.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
ASoC: justboom-dac: fix S24_LE format
Remove set_bclk_ratio call so 24-bit data is transmitted in
24 bclk cycles.
Also remove hw_params as it's no longer needed.
Signed-off-by: Matthias Reichl <hias@horus.com>
ASoC: justboom-dac: use modern dai_link style
Signed-off-by: Matthias Reichl <hias@horus.com>
New AudioInjector.net Pi soundcard with low jitter audio in and out.
Contains the sound/soc/bcm ALSA machine driver and necessary alterations to the Kconfig and Makefile.
Adds the dts overlay and updates the Makefile and README.
Updates the relevant defconfig files to enable building for the Raspberry Pi.
Thanks to Phil Elwell (pelwell) for the review, simple-card concepts and discussion. Thanks to Clive Messer for overlay naming suggestions.
Added support for headphones, microphone and bclk_ratio settings.
This patch adds headphone and microphone capability to the Audio Injector sound card. The patch also sets the bit clock ratio for use in the bcm2835-i2s driver. The bcm2835-i2s can't handle an 8 kHz sample rate when the bit clock is at 12 MHz because its register is only 10 bits wide which can't represent the ch2 offset of 1508. For that reason, the rate constraint is added.
ASoC: audioinjector-pi-soundcard: use modern dai_link style
Signed-off-by: Hui Wang <hui.wang@canonical.com>
New driver for RRA DigiDAC1 soundcard using WM8741 + WM8804
ASoC: digidac1-soundcard: use modern dai_link style
Signed-off-by: Hui Wang <hui.wang@canonical.com>
Add support for Dion Audio LOCO DAC-AMP HAT
Using dedicated machine driver and pcm5102a codec driver.
Signed-off-by: DigitalDreamtime <clive.messer@digitaldreamtime.co.uk>
ASoC: dionaudio_loco: use modern dai_link style
Signed-off-by: Hui Wang <hui.wang@canonical.com>
Allo Piano DAC boards: Initial 2 channel (stereo) support (#1645)
Add initial 2 channel (stereo) support for Allo Piano DAC (2.0/2.1) boards,
using allo-piano-dac-pcm512x-audio overlay and allo-piano-dac ALSA ASoC
machine driver.
NB. The initial support is 2 channel (stereo) ONLY!
(The Piano DAC 2.1 will only support 2 channel (stereo) left/right output,
pending an update to the upstream pcm512x codec driver, which will have
to be submitted via upstream. With the initial downstream support,
provided by this patch, the Piano DAC 2.1 subwoofer outputs will
not function.)
Signed-off-by: Baswaraj K <jaikumar@cem-solutions.net>
Signed-off-by: Clive Messer <clive.messer@digitaldreamtime.co.uk>
Tested-by: Clive Messer <clive.messer@digitaldreamtime.co.uk>
ASoC: allo-piano-dac: fix S24_LE format
Remove set_bclk_ratio call so 24-bit data is transmitted in
24 bclk cycles.
Also remove hw_params and ops as they are no longer needed.
Signed-off-by: Matthias Reichl <hias@horus.com>
ASoC: allo-piano-dac: use modern dai_link style
Signed-off-by: Hui Wang <hui.wang@canonical.com>
Add support for Allo Piano DAC 2.1 plus add-on board for Raspberry Pi.
The Piano DAC 2.1 has support for 4 channels with subwoofer.
Signed-off-by: Baswaraj K <jaikumar@cem-solutions.net>
Reviewed-by: Vijay Kumar B. <vijaykumar@zilogic.com>
Reviewed-by: Raashid Muhammed <raashidmuhammed@zilogic.com>
Add clock changes and mute gpios (#1938)
Also improve code style and adhere to ALSA coding conventions.
Signed-off-by: Baswaraj K <jaikumar@cem-solutions.net>
Reviewed-by: Vijay Kumar B. <vijaykumar@zilogic.com>
Reviewed-by: Raashid Muhammed <raashidmuhammed@zilogic.com>
PianoPlus: Dual Mono & Dual Stereo features added (#2069)
allo-piano-dac-plus: Master volume added + fixes
Master volume added, which controls both DACs volumes.
See: https://github.com/raspberrypi/linux/pull/2149
Also fix initial max volume, default mode value, and unmute.
Signed-off-by: allocom <sparky-dev@allo.com>
ASoC: allo-piano-dac-plus: fix S24_LE format
Remove set_bclk_ratio call so 24-bit data is transmitted in
24 bclk cycles.
Signed-off-by: Matthias Reichl <hias@horus.com>
sound: bcm: Fix memset dereference warning
This warning appears with GCC 6.4.0 from toolchains.bootlin.com:
../sound/soc/bcm/allo-piano-dac-plus.c: In function ‘snd_allo_piano_dac_init’:
../sound/soc/bcm/allo-piano-dac-plus.c:711:30: warning: argument to ‘sizeof’ in ‘memset’ call is the same expression as the destination; did you mean to dereference it? [-Wsizeof-pointer-memaccess]
memset(glb_ptr, 0x00, sizeof(glb_ptr));
^
Suggested-by: Phil Elwell <phil@raspberrypi.org>
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
ASoC: allo-piano-dac-plus: use modern dai_link style
Signed-off-by: Hui Wang <hui.wang@canonical.com>
Add support for Allo Boss DAC add-on board for Raspberry Pi. (#1924)
Signed-off-by: Baswaraj K <jaikumar@cem-solutions.net>
Reviewed-by: Deepak <deepak@zilogic.com>
Reviewed-by: BabuSubashChandar <babusubashchandar@zilogic.com>
Add support for new clock rate and mute gpios.
Signed-off-by: Baswaraj K <jaikumar@cem-solutions.net>
Reviewed-by: Deepak <deepak@zilogic.com>
Reviewed-by: BabuSubashChandar <babusubashchandar@zilogic.com>
ASoC: allo-boss-dac: fix S24_LE format
Remove set_bclk_ratio call so 24-bit data is transmitted in
24 bclk cycles.
Signed-off-by: Matthias Reichl <hias@horus.com>
ASoC: allo-boss-dac: transmit S24_LE with 64 BCLK cycles
Signed-off-by: Matthias Reichl <hias@horus.com>
allo-boss-dac: switch to snd_soc_dai_set_bclk_ratio
Signed-off-by: Matthias Reichl <hias@horus.com>
ASoC: allo-boss-dac: use modern dai_link style
Signed-off-by: Hui Wang <hui.wang@canonical.com>
Support for Blokas Labs pisound board
Pisound dynamic overlay (#1760)
Restructuring pisound-overlay.dts, so it can be loaded and unloaded dynamically using dtoverlay.
Print a logline when the kernel module is removed.
pisound improvements:
* Added a writable sysfs object to enable scripts / user space software
to blink MIDI activity LEDs for variable duration.
* Improved hw_param constraints setting.
* Added compatibility with S16_LE sample format.
* Exposed some simple placeholder volume controls, so the card appears
in volumealsa widget.
Add missing SND_PISOUND selects dependency to SND_RAWMIDI
Without it the Pisound module fails to compile.
See https://github.com/raspberrypi/linux/issues/2366
Updates for Pisound module code:
* Merged 'Fix a warning in DEBUG builds' (1c8b82b).
* Updating some strings and copyright information.
* Fix for handling high load of MIDI input and output.
* Use dual rate oversampling ratio for 96kHz instead of single
rate one.
Signed-off-by: Giedrius Trainavicius <giedrius@blokas.io>
Fixing memset call in pisound.c
Signed-off-by: Giedrius Trainavicius <giedrius@blokas.io>
Fix for Pisound's MIDI Input getting blocked for a while in rare cases.
There was a possible race condition which could lead to Input's FIFO queue
to be underflown, causing high amount of processing in the worker thread for
some period of time.
Signed-off-by: Giedrius Trainavicius <giedrius@blokas.io>
Fix for Pisound kernel module in Real Time kernel configuration.
When handler of data_available interrupt is fired, queue_work ends up
getting called and it can block on a spin lock which is not allowed in
interrupt context. The fix was to run the handler from a thread context
instead.
Pisound: Remove spinlock usage around spi_sync
ASoC: pisound: use modern dai_link style
Signed-off-by: Hui Wang <hui.wang@canonical.com>
ASoC: pisound: fix the parameter for spi_device_match
Signed-off-by: Hui Wang <hui.wang@canonical.com>
ASoC: Add driver for Cirrus Logic Audio Card
Note: due to problems with deferred probing of regulators
the following softdep should be added to a modprobe.d file
softdep arizona-spi pre: arizona-ldo1
Signed-off-by: Matthias Reichl <hias@horus.com>
ASoC: rpi-cirrus: use modern dai_link style
Signed-off-by: Matthias Reichl <hias@horus.com>
sound: Support for Dion Audio LOCO-V2 DAC-AMP HAT
Signed-off-by: Miquel Blauw <info@dionaudio.nl>
ASoC: dionaudio_loco-v2: fix S24_LE format
Remove set_bclk_ratio call so 24-bit data is transmitted in
24 bclk cycles.
Also remove hw_params and ops as they are no longer needed.
Signed-off-by: Matthias Reichl <hias@horus.com>
ASoC: dionaudio_loco-v2: use modern dai_link style
Signed-off-by: Hui Wang <hui.wang@canonical.com>
Add support for Fe-Pi audio sound card. (#1867)
Fe-Pi Audio Sound Card is based on NXP SGTL5000 codec.
Mechanical specification of the board is the same the Raspberry Pi Zero.
3.5mm jacks for Headphone/Mic, Line In, and Line Out.
Signed-off-by: Henry Kupis <fe-pi@cox.net>
ASoC: fe-pi-audio: use modern dai_link style
Signed-off-by: Hui Wang <hui.wang@canonical.com>
Add support for the AudioInjector.net Octo sound card
AudioInjector Octo: sample rates, regulators, reset
This patch adds new sample rates to the Audioinjector Octo sound card. The
new supported rates are (in kHz) :
96, 48, 32, 24, 16, 8, 88.2, 44.1, 29.4, 22.05, 14.7
Reference the bcm270x DT regulators in the overlay.
This patch adds a reset GPIO for the AudioInjector.net octo sound card.
Audioinjector octo : Make the playback and capture symmetric
This patch ensures that the sample rate and channel count of the audioinjector
octo sound card are symmetric.
audioinjector-octo: Add continuous clock feature
By user request, add a switch to prevent the clocks being stopped when
the stream is paused, stopped or shutdown. Provide access to the switch
by adding a 'non-stop-clocks' parameter to the audioinjector-addons
overlay.
See: https://github.com/raspberrypi/linux/issues/2409
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
sound: Fixes for audioinjector-octo under 4.19
1. Move the DT alias declaration to the I2C shim in the cases
where the shim is enabled. This works around a problem caused by a
4.19 commit [1] that generates DT/OF uevents for I2C drivers.
2. Fix the diagnostics in an error path of the soundcard driver to
correctly identify the reason for the failure to load.
3. Move the declaration of the clock node in the overlay outside
the I2C node to avoid warnings.
4. Sort the overlay nodes so that dependencies are only to earlier
fragments, in an attempt to get runtime dtoverlay application to
work (it still doesn't...)
See: https://github.com/Audio-Injector/Octo/issues/14
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
[1] af503716ac ("i2c: core: report OF style module alias for devices registered via OF")
ASoC: audioinjector-octo-soundcard: use modern dai_link style
Signed-off-by: Hui Wang <hui.wang@canonical.com>
Driver support for Google voiceHAT soundcard.
ASoC: googlevoicehat-codec: Use correct device when grabbing GPIO
The fixup for the VoiceHAT in 4.18 incorrectly tried to find the
sdmode GPIO pin under the card device, not the codec device.
This failed, and therefore caused the device probe to fail.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>
ASoC: googlevoicehat-codec: Reformat for kernel coding standards
Fix all whitespace, indentation, and bracing errors.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>
ASoC: googlevoicehat-codec: Make driver function structure const
Make voicehat_component_driver a const structure.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>
ASoC: googlevoicehat-codec: Only convert from ms to jiffies once
Minor optimisation and allows to become checkpatch clean.
A msec value is read out of DT or from a define, and convert once to
jiffies, rather than every time that it is used.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>
Driver and overlay for Allo Katana DAC
Allo Katana DAC: Updated default values
Signed-off-by: Jaikumar <jaikumar@cem-solutions.com>
Added mute stream func
Signed-off-by: Jaikumar <jaikumar@cem-solutions.net>
codecs: Correct Katana minimum volume
Update Katana minimum volume to get the exact 0.5 dB value in each step.
Signed-off-by: Sudeep Kumar <sudeepkumar@cem-solutions.net>
ASoC: Add generic RPI driver for simple soundcards.
The RPI simple sound card driver provides a generic ALSA SOC card driver
supporting a variety of Pi HAT soundcards. The intention is to avoid
the duplication of code for cards that can't be fully supported by
the soc simple/graph cards but are otherwise almost identical.
This initial commit adds support for the ADAU1977 ADC, Google VoiceHat,
HifiBerry AMP, HifiBerry DAC and RPI DAC.
Signed-off-by: Tim Gover <tim.gover@raspberrypi.org>
ASoC: Use correct card name in rpi-simple driver
Use the specific card name from drvdata instead of the snd_rpi_simple
rpi-simple-soundcard: Use nicer driver name "RPi-simple"
Rename the driver from "RPI simple soundcard" to "RPi-simple" so that
the driver name won't be mangled allowing to be used unaltered as the
card conf filename.
ASoC: rpi-simple-soundcard: use modern dai_link style
Signed-off-by: Hui Wang <hui.wang@canonical.com>
ASoC: Add Kconfig and Makefile for sound/soc/bcm
Signed-off-by: popcornmix <popcornmix@gmail.com>
ASoC: Create a generic Pi Hat WM8804 driver
Reduce the amount of duplicated code by creating a generic driver for
Pi Hat digi cards using the WM8804 codec.
This replaces the
Allo DigiOne, Hifiberry Digi/Pro, JustBoom Digi and IQAudIO Digi
dedicate soundcard drivers with a generic driver.
There are no significant changes to the runtime behavior of the drivers
and end users should not have to change any configuration settings
after upgrading.
Minor changes
* Check the return value of snd_soc_component_update_bits
* Added some pr_debug tracing
* Various checkpatch tidyups
* Updated allodigi-one to use use 128FS at > 96 Khz. This appears to
be an omission in the original driver code so followed the Hifiberry
DAC driver approach.
ASoC: rpi-wm8804-soundcard: use modern dai_link style
Signed-off-by: Matthias Reichl <hias@horus.com>
rpi-wm8804-soundcard: drop PWRDN register writes
Since kernel 4.0 the PWRDN register bits are under DAPM
control from the wm8804 driver.
Drop code that modifies that register to avoid interfering
with DAPM.
Signed-off-by: Matthias Reichl <hias@horus.com>
rpi-wm8804-soundcard: configure wm8804 clocks only on rate change
This should avoid clicks when stopping and immediately afterwards
starting a stream with the same samplerate as before.
Signed-off-by: Matthias Reichl <hias@horus.com>
rpi-wm8804-soundcard: Fixed MCLKDIV for Allo Digione
The Allo Digione board wants a fixed MCLKDIV of 256.
See: https://github.com/raspberrypi/linux/issues/3296
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
ASoC: Add support for AudioSense-Pi add-on soundcard
AudioSense-Pi is a RPi HAT based on a TI's TLV320AIC32x4 stereo codec
This hardware provides multiple audio I/O capabilities to the RPi.
The codec connects to the RPi's SoC through the I2S Bus.
The following devices can be connected through a 3.5mm jack
1. Line-In: Plain old audio in from mobile phones, PCs, etc.,
2. Mic-In: Connect a microphone
3. Line-Out: Connect the output to a speaker
4. Headphones: Connect a Headphone w or w/o microphones
Multiple Inputs:
It supports the following combinations
1. Two stereo Line-Inputs and a microphone
2. One stereo Line-Input and two microphones
3. Two stereo Line-Inputs, a microphone and
one mono line-input (with h/w hack)
4. One stereo Line-Input, two microphones and
one mono line-input (with h/w hack)
Multiple Outputs:
Audio output can be routed to the headphones or
speakers (with additional hardware)
Signed-off-by: b-ak <anur.bhargav@gmail.com>
ASoC: audiosense-pi: use modern dai_link style
Signed-off-by: Hui Wang <hui.wang@canonical.com>
Added driver for the HiFiBerry DAC+ ADC (#2694)
Signed-off-by: Daniel Matuschek <daniel@hifiberry.com>
hifiberry_dacplusadc: switch to snd_soc_dai_set_bclk_ratio
Signed-off-by: Matthias Reichl <hias@horus.com>
ASoC: hifiberry_dacplusadc: fix DAI link setup
The driver only defines a single DAI link and the code that tries
to setup the second (non-existent) DAI link looks wrong - using dmic
as a CPU/platform driver doesn't make any sense.
The DT overlay doesn't define a dmic property, so the code was never
executed (otherwise it would have resulted in a memory corruption).
So drop the offending code to prevent issues if a dmic property
should be added to the DT overlay.
Signed-off-by: Matthias Reichl <hias@horus.com>
ASoC: hifiberry_dacplusadc: use modern dai_link style
Signed-off-by: Matthias Reichl <hias@horus.com>
Audiophonics I-Sabre 9038Q2M DAC driver
Signed-off-by: Audiophonics <contact@audiophonics.fr>
ASoC: i-sabre-q2m: use modern dai_link style
Signed-off-by: Hui Wang <hui.wang@canonical.com>
Added IQaudIO Pi-Codec board support (#2969)
Add support for the IQaudIO Pi-Codec board.
Signed-off-by: Gordon <gordon@iqaudio.com>
Fixed 48k timing issue
ASoC: iqaudio-codec: use modern dai_link style
Signed-off-by: Hui Wang <hui.wang@canonical.com>
adds the Hifiberry DAC+ADC PRO version
This adds the driver for the DAC+ADC PRO version of the Hifiberry soundcard with software controlled PCM1863 ADC
Signed-off-by: Joerg Schambacher joerg@i2audio.com
Add Hifiberry DAC+DSP soundcard driver (#3224)
Adds the driver for the Hifiberry DAC+DSP. It supports capture and
playback depending on the DSP firmware.
Signed-off-by: Joerg Schambacher <joerg@i2audio.com>
Allow simultaneous use of JustBoom DAC and Digi
Signed-off-by: Johannes Krude <johannes@krude.de>
Pisound: MIDI communication fixes for scaled down CPU.
* Increased maximum SPI communication speed to avoid running too slow
when the CPU is scaled down and losing MIDI data.
* Keep track of buffer usage in millibytes for higher precision.
Signed-off-by: Giedrius Trainavičius <giedrius@blokas.io>
sound: Add the HiFiBerry DAC+HD version
This adds the driver for the DAC+HD version supporting HiFiBerry's
PCM179x based DACs. It also adds PLL control for clock generation.
Signed-off-by: Joerg Schambacher <joerg@i2audio.com>
Fix master mode settings of HiFiBerry DAC+ADC PRO card (#3424)
This patch fixes the board DAI setting when in master-mode.
Wrong setting could have caused random pop noise.
Signed-off-by: Joerg Schambacher <joerg@i2audio.com>
adds LED OFF feature to HiFiBerry DAC+ADC PRO sound card
This adds a DT overlay parameter 'leds_off' which allows
to switch off the onboard activity LEDs at all times
which has been requested by some users.
Signed-off-by: Joerg Schambacher <joerg@i2audio.com>
adds LED OFF feature to HiFiBerry DAC+ADC sound card
This adds a DT overlay parameter 'leds_off' which allows
to switch off the onboard activity LEDs at all times
which has been requested by some users.
Signed-off-by: Joerg Schambacher <joerg@i2audio.com>
adds LED OFF feature to HiFiBerry DAC+/DAC+PRO sound cards
This adds a DT overlay parameter 'leds_off' which allows
to switch off the onboard activity LEDs at all times
which has been requested by some users.
Signed-off-by: Joerg Schambacher <joerg@i2audio.com>
pisound: Added reading Pisound board hardware revision and exposing it (#3425)
pisound: Added reading Pisound board hardware revision and exposing it in kernel log and sysfs file:
/sys/kernel/pisound/hw_version
Signed-off-by: Giedrius <giedrius@blokas.io>
Added driver for HiFiBerry Amp amplifier add-on board
The driver contains a low-level hardware driver for the TAS5713 and the
drivers for the Raspberry Pi I2S subsystem.
TAS5713: return error if initialisation fails
Existing TAS5713 driver logs errors during initialisation, but does not return
an error code. Therefore even if initialisation fails, the driver will still be
loaded, but won't work. This patch fixes this. I2C communication error will now
reported correctly by a non-zero return code.
HiFiBerry Amp: fix device-tree problems
Some code to load the driver based on device-tree-overlays was missing. This is added by this patch.
According to 5713 pdf doc CLOCK_CTRL is a readonly status register, and it behaves so. Remove useless setting
sound: pcm512x-codec: Adding 352.8kHz samplerate support
sound/soc: only first codec is master in multicodec setup
When using multiple codecs, at most one codec should generate the master
clock. All codecs except the first are therefore configured for slave
mode.
Signed-off-by: Johannes Krude <johannes@krude.de>
ASoC: Fix snd_soc_get_pcm_runtime usage
Commit [1] changed the snd_soc_get_pcm_runtime to take a dai_link
pointer instead of a string. Patch up the downstream drivers to use
the modified API.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
[1] 4468189ff3 ("ASoC: soc-core: find rtd via dai_link pointer at snd_soc_get_pcm_runtime()")
Add support for the AudioInjector.net Isolated sound card
This patch adds support for the Audio Injector Isolated sound card.
Signed-off-by: Matt Flax <flatmax@flatmax.org>
Add support for merus-amp soundcard and ma120x0p codec
Add 96KHz rate support to MA120X0P codec and make enable and mute gpio
pins optional.
Signed-off-by: AMuszkat <ariel.muszkat@gmail.com>
Fixes a problem with clock settings of HiFiBerry DAC+ADC PRO (#3545)
This patch fixes a problem of the re-calculation of
i2s-clock and -parameter settings when only the ADC is activated.
Signed-off-by: Joerg Schambacher <joerg@i2audio.com>
configs: Enable the AD193x codecs
See: https://github.com/raspberrypi/linux/issues/2850
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
Switch to snd_soc_dai_set_bclk_ratio
Replaces obsolete function snd_soc_dai_set_tdm_slot
Signed-off-by: Joerg Schambacher <joerg@i2audio.com>
Enhances the DAC+ driver to control the optional headphone amplifier
Probes on the I2C bus for TPA6130A2, if successful, it sets DT-parameter
'status' from 'disabled' to 'okay' using change_sets to enable
the headphone control.
Signed-off-by: Joerg Schambacher joerg@i2audio.com
Update Allo Piano Dac Driver
Add unique names to the individual dac coded drivers
Remove some of the codec controls that are not used.
Signed-off-by: Paul Hermann <paul@picoreplayer.org>
Fixes an onboard clock detection problem of the PRO versions
Increasing the sleep time after clock selection to 3-4ms
allows the correct detection of all combinations of DAC+ Pro
and DAC+ADC Pro sound cards and the various PI revisions.
Signed-off-by: Joerg Schambacher <joerg@hifiberry.com>
The Raspberry Pi firmware manages the power-down and reboot
process. To do this it installs a pm_power_off handler, causing
the gpio-poweroff module to abort the probe function.
This patch introduces a "force" DT property that overrides that
behaviour, and also adds a DT overlay to enable and control it.
Note that running in an active-low configuration (DT parameter
"active_low") requires a custom dt-blob.bin and probably won't
allow a reboot without switching off, so an external inversion
of the trigger signal may be preferable.
Provide a __copy_from_user that uses memcpy. On BCM2708, use
optimised memcpy/memmove/memcmp/memset implementations.
arch/arm: Add mmiocpy/set aliases for memcpy/set
See: https://github.com/raspberrypi/linux/issues/1082
copy_from_user: CPU_SW_DOMAIN_PAN compatibility
The downstream copy_from_user acceleration must also play nice with
CONFIG_CPU_SW_DOMAIN_PAN.
See: https://github.com/raspberrypi/linux/issues/1381
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
Fix copy_from_user if BCM2835_FAST_MEMCPY=n
The change which introduced CONFIG_BCM2835_FAST_MEMCPY unconditionally
changed the behaviour of arm_copy_from_user. The page pinning code
is not safe on ARMv7 if LPAE & high memory is enabled and causes
crashes which look like PTE corruption.
Make __copy_from_user_memcpy conditional on CONFIG_2835_FAST_MEMCPY=y
which is really an ARMv6 / Pi1 optimization and not necessary on newer
ARM processors.
arm: fix mmap unlocks in uaccess_with_memcpy.c
This is a regression that was added with the commit 192a4e923e as of rpi-5.8.y, since that is when the move to the mmap locking API was introduced - d8ed45c5dc
The issue is that when the patch to improve performance for the __copy_to_user and __copy_from_user functions were added for the Raspberry Pi, some of the mmaps were incorrectly mapped to write instead of read. This would cause a verity of issues, and in my case, prevent the booting of a squashfs filesystem on rpi-5.8-y and above. An example of the panic you would see from this can be seen at https://pastebin.com/raw/jBz5xCzL
Signed-off-by: Christian Lamparter <chunkeey@gmail.com>
Signed-off-by: Christopher Blake <chrisrblake93@gmail.com>
The "input" trigger makes the associated GPIO an input. This is to support
the Raspberry Pi PWR LED, which is driven by external hardware in normal use.
N.B. pwr_led is not available on Model A or B boards.
leds-gpio: Implement the brightness_get method
The power LED uses some clever logic that means it is driven
by a voltage measuring circuit when configured as input, otherwise
it is driven by the GPIO output value. This patch wires up the
brightness_get method for leds-gpio so that user-space can monitor
the LED value via /sys/class/gpio/led1/brightness. Using the input
trigger this returns an indication of the system power health,
otherwise it is just whatever value the trigger has written most
recently.
See: https://github.com/raspberrypi/linux/issues/1064
Add the bare minimum needed to boot BCM2708 from a Device Tree.
Signed-off-by: Noralf Tronnes <notro@tronnes.org>
BCM2708: DT: change 'axi' nodename to 'soc'
Change DT node named 'axi' to 'soc' so it matches ARCH_BCM2835.
The VC4 bootloader fills in certain properties in the 'axi' subtree,
but since this is part of an upstreaming effort, the name is changed.
Signed-off-by: Noralf Tronnes notro@tronnes.org
BCM2708_DT: Correct length of the peripheral space
Use dts-dirs feature for overlays.
The kernel makefiles have a dts-dirs target that is for vendor subdirectories.
Using this fixes the install_dtbs target, which previously did not install the overlays.
BCM270X_DT: configure I2S DMA channels
Signed-off-by: Matthias Reichl <hias@horus.com>
BCM270X_DT: switch to bcm2835-i2s
I2S soundcard drivers with proper devicetree support (i.e. not linking
to the cpu_dai/platform via name but to cpu/platform via of_node)
will work out of the box without any modifications.
When the kernel is compiled without devicetree support the platform
code will instantiate the bcm2708-i2s driver and I2S soundcard drivers
will link to it via name, as before.
Signed-off-by: Matthias Reichl <hias@horus.com>
SDIO-overlay: add poll_once-boolean parameter
Add paramter to toggle sdio-device-polling
done every second or once at boot-time.
Signed-off-by: Patrick Boettcher <patrick.boettcher@posteo.de>
BCM270X_DT: Make mmc overlay compatible with current firmware
The original DT overlay logic followed a merge-then-patch procedure,
i.e. parameters are applied to the loaded overlay before the overlay
is merged into the base DTB. This sequence has been changed to
patch-then-merge, in order to support parameterised node names, and
to protect against bad overlays. As a result, overrides (parameters)
must only target labels in the overlay, but the overlay can obviously target nodes in the base DTB.
mmc-overlay.dts (that switches back to the original mmc sdcard
driver) is the only overlay violating that rule, and this patch
fixes it.
bcm270x_dt: Use the sdhost MMC controller by default
The "mmc" overlay reverts to using the other controller.
squash: Add cprman to dt
BCM270X_DT: Use clk_core for I2C interfaces
BCM270X_DT: Use bcm283x.dtsi, bcm2835.dtsi and bcm2836.dtsi
The mainline Device Tree files are quite close to downstream now.
Let's use bcm283x.dtsi, bcm2835.dtsi and bcm2836.dtsi as base files
for our dts files.
Mainline dts files are based on these files:
bcm2835-rpi.dtsi
bcm2835.dtsi bcm2836.dtsi
bcm283x.dtsi
Current downstream are based on these:
bcm2708.dtsi bcm2709.dtsi bcm2710.dtsi
bcm2708_common.dtsi
This patch introduces this dependency:
bcm2708.dtsi bcm2709.dtsi
bcm2708-rpi.dtsi
bcm270x.dtsi
bcm2835.dtsi bcm2836.dtsi
bcm283x.dtsi
And:
bcm2710.dtsi
bcm2708-rpi.dtsi
bcm270x.dtsi
bcm283x.dtsi
bcm270x.dtsi contains the downstream bcm283x.dtsi diff.
bcm2708-rpi.dtsi is the downstream version of bcm2835-rpi.dtsi.
Other changes:
- The led node has moved from /soc/leds to /leds. This is not a problem
since the label is used to reference it.
- The clk_osc reg property changes from 6 to 3.
- The gpu nodes has their interrupt property set in the base file.
- the clocks label does not point to the /clocks node anymore, but
points to the cprman node. This is not a problem since the overlays
that use the clock node refer to it directly: target-path = "/clocks";
- some nodes now have 2 labels since mainline and downstream differs in
this respect: cprman/clocks, spi0/spi, gpu/vc4.
- some nodes doesn't have an explicit status = "okay" since they're not
disabled in the base file: watchdog and random.
- gpiomem doesn't need an explicit status = "okay".
- bcm2708-rpi-cm.dts got the hpd-gpios property from bcm2708_common.dtsi,
it's now set directly in that file.
- bcm2709-rpi-2-b.dts has the timer node moved from /soc/timer to /timer.
- Removed clock-frequency property on the bcm{2709,2710}.dtsi timer nodes.
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
BCM270X_DT: Use raspberrypi-power to turn on USB power
Use the raspberrypi-power driver to turn on USB power.
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
BCM270X_DT: Add a .dtbo target, use for overlays
Change the filenames and extensions to keep the pre-DDT style of
overlay (<name>-overlay.dtb) distinct from new ones that use a
different style of local fixups (<name>.dtbo), and to match other
platforms.
The RPi firmware uses the DDTK trailer atom to choose which type of
overlay to use for each kernel.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
BCM270X_DT: Don't generate "linux,phandle" props
The EPAPR standard says to use "phandle" properties to store phandles,
rather than the deprecated "linux,phandle" version. By default, dtc
generates both, but adding "-H epapr" causes it to only generate
"phandle"s, saving some space and clutter.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
BCM270X_DT: Add overlay for enc28j60 on SPI2
Works on SPI2 for compute module
BCM270X_DT: Add midi-uart0 overlay
MIDI requires 31.25kbaud, a baudrate unsupported by Linux. The
midi-uart0 overlay configures uart0 (ttyAMA0) to use a fake clock
so that requesting 38.4kbaud actually gets 31.25kbaud.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
BCM270X_DT: Add i2c-sensor overlay
The i2c-sensor overlay is a container for various pressure and
temperature sensors, currently bmp085 and bmp280. The standalone
bmp085_i2c-sensor overlay is now deprecated.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
BCM270X_DT: overlays/*-overlay.dtb -> overlays/*.dtbo (#1752)
We now create overlays as .dtbo files.
build: support for .dtbo files for dtb overlays
Kernel 4.4.6+ on RaspberryPi support .dtbo files for overlays, instead of .dtb.
Patch the kernel, which has faulty rules to generate .dtbo the way yocto does
Signed-off-by: Herve Jourdain <herve.jourdain@neuf.fr>
Signed-off-by: Khem Raj <raj.khem@gmail.com>
BCM270X: Drop position requirement for CMA in VC4 overlay.
No longer necessary since 2aefcd5761,
and will probably let peeople that want to choose a larger CMA
allocation (particularly on pi0/1).
Signed-off-by: Eric Anholt <eric@anholt.net>
BCM270X_DT: RPi Device Tree tidy
Use the upstream sdhost node, add thermal-zones, and factor out some
common elements.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
kbuild: Silence unhelpful DTC warnings
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
BCM270X_DT: DT build rules no longer arch-specific
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
Support booting without Device Tree.
Turn on USB power.
Load driver early because of lacking support for deferred probing
in many drivers.
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
firmware: bcm2835: Don't turn on USB power
The raspberrypi-power driver is now used to turn on USB power.
This partly reverts commit:
firmware: bcm2835: Support ARCH_BCM270x
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
Add module for accessing the mailbox property channel through
/dev/vcio. Was previously in bcm2708-vcio.
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
char: vcio: Add compat ioctl handling
There was no compat ioctl handler, so 32 bit userspace on a
64 bit kernel failed as IOCTL_MBOX_PROPERTY used the size
of char*.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>
char: vcio: Fail probe if rpi_firmware is not found.
Device Tree is now the only supported config mechanism, therefore
uncomment the block of code that fails the probe if the
firmware node can't be found.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>
i2c-bcm2708: fixed baudrate
Fixed issue where the wrong CDIV value was set for baudrates below 3815 Hz (for 250MHz bus clock).
In that case the computed CDIV value was more than 0xffff. However the CDIV register width is only 16 bits.
This resulted in incorrect setting of CDIV and higher baudrate than intended.
Example: 3500Hz -> CDIV=0x11704 -> CDIV(16bit)=0x1704 -> 42430Hz
After correction: 3500Hz -> CDIV=0x11704 -> CDIV(16bit)=0xffff -> 3815Hz
The correct baudrate is shown in the log after the cdiv > 0xffff correction.
Perform I2C combined transactions when possible
Perform I2C combined transactions whenever possible, within the
restrictions of the Broadcomm Serial Controller.
Disable DONE interrupt during TA poll
Prevent interrupt from being triggered if poll is missed and transfer
starts and finishes.
i2c: Make combined transactions optional and disabled by default
i2c: bcm2708: add device tree support
Add DT support to driver and add to .dtsi file.
Setup pins in .dts file.
i2c is disabled by default.
Signed-off-by: Noralf Tronnes <notro@tronnes.org>
bcm2708: don't register i2c controllers when using DT
The devices for the i2c controllers are in the Device Tree.
Only register devices when not using DT.
Signed-off-by: Noralf Tronnes <notro@tronnes.org>
I2C: Only register the I2C device for the current board revision
i2c_bcm2708: Fix clock reference counting
Fix grabbing lock from atomic context in i2c driver
2 main changes:
- check for timeouts in the bcm2708_bsc_setup function as indicated by this comment:
/* poll for transfer start bit (should only take 1-20 polls) */
This implies that the setup function can now fail so account for this everywhere it's called
- Removed the clk_get_rate call from inside the setup function as it locks a mutex and that's not ok since we call it from under a spin lock.
i2c-bcm2708: When using DT, leave the GPIO setup to pinctrl
i2c-bcm2708: Increase timeouts to allow larger transfers
Use the timeout value provided by the I2C_TIMEOUT ioctl when waiting
for completion. The default timeout is 1 second.
See: https://github.com/raspberrypi/linux/issues/260
i2c-bcm2708/BCM270X_DT: Add support for I2C2
The third I2C bus (I2C2) is normally reserved for HDMI use. Careless
use of this bus can break an attached display - use with caution.
It is recommended to disable accesses by VideoCore by setting
hdmi_ignore_edid=1 or hdmi_edid_file=1 in config.txt.
The interface is disabled by default - enable using the
i2c2_iknowwhatimdoing DT parameter.
bcm2708-spi: Don't use static pin configuration with DT
Also remove superfluous error checking - the SPI framework ensures the
validity of the chip_select value.
i2c-bcm2708: Remove non-DT support
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
Set the BSC_CLKT clock streching timeout to 35ms as per SMBus specs.
Fixes i2c_bcm2708: Write to FIFO correctly - v2 (#1574)
* i2c: fix i2c_bcm2708: Clear FIFO before sending data
Make sure FIFO gets cleared before trying to send
data in case of a repeated start (COMBINED=Y).
* i2c: fix i2c_bcm2708: Only write to FIFO when not full
Check if FIFO can accept data before writing.
To avoid a peripheral read on the last iteration of a loop,
both bcm2708_bsc_fifo_fill and ~drain are changed as well.
Signed-off-by: Luke Wren <wren6991@gmail.com>
MISC: bcm2835: smi: use clock manager and fix reload issues
Use clock manager instead of self-made clockmanager.
Also fix some error paths that showd up during development
(especially missing release of dma resources on rmmod)
Signed-off-by: Martin Sperl <kernel@martin.sperl.org>
bcm2835_smi: re-add dereference to fix DMA transfers
Signed-off-by: popcornmix <popcornmix@gmail.com>
BCM270x: Move vc_mem
Make the vc_mem module available for ARCH_BCM2835 by moving it.
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
char: vc_mem: Fix up compat ioctls for 64bit kernel
compat_ioctl wasn't defined, so 32bit user/64bit kernel
always failed.
VC_MEM_IOC_MEM_PHYS_ADDR was defined with parameter size
unsigned long, so the ioctl cmd changes between sizes.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>
char: vc_mem: Fix all coding style issues.
Cleans up all checkpatch errors in vc_mem.c and vc_mem.h
No functional change to the code.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>
char: vc_mem: Delete dead code
There are no error exists once device_create has succeeded, and
therefore no need to call device_destroy from vc_mem_init.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
BCM2835 has two SD card interfaces. This driver uses the other one.
bcm2835-sdhost: Error handling fix, and code clarification
bcm2835-sdhost: Adding overclocking option
Allow a different clock speed to be substitued for a requested 50MHz.
This option is exposed using the "overclock_50" DT parameter.
Note that the sdhost interface is restricted to integer divisions of
core_freq, and the highest sensible option for a core_freq of 250MHz
is 84 (250/3 = 83.3MHz), the next being 125 (250/2) which is much too
high.
Use at your own risk.
bcm2835-sdhost: Round up the overclock, so 62 works for 62.5Mhz
Also only warn once for each overclock setting.
bcm2835-sdhost: Improve error handling and recovery
1) Expose the hw_reset method to the MMC framework, removing many
internal calls by the driver.
2) Reduce overclock setting on error.
3) Increase timeout to cope with high capacity cards.
4) Add properties and parameters to control pio_limit and debug.
5) Reduce messages at probe time.
bcm2835-sdhost: Further improve overclock back-off
bcm2835-sdhost: Clear HBLC for PIO mode
Also update pio_limit default in overlay README.
bcm2835-sdhost: Add the ERASE capability
See: https://github.com/raspberrypi/linux/issues/1076
bcm2835-sdhost: Ignore CRC7 for MMC CMD1
It seems that the sdhost interface returns CRC7 errors for CMD1,
which is the MMC-specific SEND_OP_COND. Returning these errors to
the MMC layer causes a downward spiral, but ignoring them seems
to be harmless.
bcm2835-mmc/sdhost: Remove ARCH_BCM2835 differences
The bcm2835-mmc driver (and -sdhost driver that copied from it)
contains code to handle SDIO interrupts in a threaded interrupt
handler rather than waking the MMC framework thread. The change
follows a patch from Russell King that adds the facility as the
preferred way of working.
However, the new code path is only present in ARCH_BCM2835
builds, which I have taken to be a way of testing the waters
rather than making the change across the board; I can't see
any technical reason why it wouldn't be enabled for MACH_BCM270X
builds. So this patch standardises on the ARCH_BCM2835 code,
removing the old code paths.
bcm2835-sdhost: Don't log timeout errors unless debug=1
The MMC card-discovery process generates timeouts. This is
expected behaviour, so reporting it to the user serves no purpose.
Suppress the reporting of timeout errors unless the debug flag
is on.
bcm2835-sdhost: Add workaround for odd behaviour on some cards
For reasons not understood, the sdhost driver fails when reading
sectors very near the end of some SD cards. The problem could
be related to the similar issue that reading the final sector
of any card as part of a multiple read never completes, and the
workaround is an extension of the mechanism introduced to solve
that problem which ensures those sectors are always read singly.
bcm2835-sdhost: Major revision
This is a significant revision of the bcm2835-sdhost driver. It
improves on the original in a number of ways:
1) Through the use of CMD23 for reads it appears to avoid problems
reading some sectors on certain high speed cards.
2) Better atomicity to prevent crashes.
3) Higher performance.
4) Activity logging included, for easier diagnosis in the event
of a problem.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
bcm2835-sdhost: Restore ATOMIC flag to PIO sg mapping
Allocation problems have been seen in a wireless driver, and
this is the only change which might have been responsible.
SQUASH: bcm2835-sdhost: Only claim one DMA channel
With both MMC controllers enabled there are few DMA channels left. The
bcm2835-sdhost driver only uses DMA in one direction at a time, so it
doesn't need to claim two channels.
See: https://github.com/raspberrypi/linux/issues/1327
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
bcm2835-sdhost: Workaround for "slow" sectors
Some cards have been seen to cause timeouts after certain sectors are
read. This workaround enforces a minimum delay between the stop after
reading one of those sectors and a subsequent data command.
Using CMD23 (SET_BLOCK_COUNT) avoids this problem, so good cards will
not be penalised by this workaround.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
bcm2835-sdhost: Firmware manages the clock divisor
The bcm2835-sdhost driver hands control of the CDIV clock divisor
register to matching firmware, allowing it to adjust to a changing
core clock. This removes the need to use the performance governor or
to enable io_is_busy on the on-demand governor in order to get the
best SD performance.
N.B. As SD clocks must be an integer divisor of the core clock, it is
possible that the SD clock for "turbo" mode can be different (even
lower) than "normal" mode.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
bcm2835-sdhost: Reset the clock in task context
Since reprogramming the clock can now involve a round-trip to the
firmware it must not be done at atomic context, and a tasklet
is not a task.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
bcm2835-sdhost: Don't exit cmd wait loop on error
The FAIL flag can be set in the CMD register before command processing
is complete, leading to spurious "failed to complete" errors. This has
the effect of promoting harmless CRC7 errors during CMD1 processing
into errors that can delay and even prevent booting.
Also:
1) Convert the last KERN_ERROR message in the register dumping to
KERN_INFO.
2) Remove an unnecessary reset call from bcm2835_sdhost_add_host.
See: https://github.com/raspberrypi/linux/pull/1492
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
bcm2835-sdhost: mmc_card_blockaddr fix
Get the definition of mmc_card_blockaddr from drivers/mmc/core/card.h.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
bcm2835-sdhost: New timer API
mmc: bcm2835-sdhost: Support underclocking
Support underclocking of the SD bus in two ways:
1. using the max-frequency DT property (which currently has no DT
parameter), and
2. using the exiting sd_overclock parameter.
The two methods differ slightly - in the former the MMC subsystem is
aware of the underclocking, while in the latter it isn't - but the
end results should be the same.
See: https://github.com/raspberrypi/linux/issues/2350
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
mmc: bcm2835-sdhost: Add include
highmem.h (needed for kmap_atomic) is pulled in by one of the other
include files, but only with some CONFIG settings. Make the inclusion
explicit to cater for cases where the CONFIG setting is absent.
See: https://github.com/raspberrypi/linux/issues/2366
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
mmc/bcm2835-sdhost: Recover from MMC_SEND_EXT_CSD
If the user issues an "mmc extcsd read", the SD controller receives
what it thinks is a SEND_IF_COND command with an unexpected data block.
The resulting operations leave the FSM stuck in READWAIT, a state which
persists until the MMC framework resets the controller, by which point
the root filesystem is likely to have been unmounted.
A less heavyweight solution is to detect the condition and nudge the
FSM by asserting the (self-clearing) FORCE_DATA_MODE bit.
N.B. This workaround was essentially discovered by accident and without
a full understanding the inner workings of the controller, so it is
fortunate that the "fix" only modifies error paths.
See: https://github.com/raspberrypi/linux/issues/2728
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
mmc: bcm2835-sdhost: Fix warnings on arm64
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
bcm2835-sdhost: Allow for sg entries that cross pages
The dma_complete handling code calculates a virtual address for a page
then adds an offset, but if the offset is more than a page and HIGHMEM
is in use then the summed address could be in an unmapped (or just
incorrect) page.
The upstream SDHOST driver allows for this possibility - copy the code
that does so.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
bcm2835-sdhost: Fix DMA channel leak on error/remove
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
mmc: bcm2835-sdhost: Support 64-bit physical addresses
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
bcm2835-sdhost: Replace obsolete struct timeval
struct timeval has been retired due to the impending linux 32-bit tv_sec
rollover (only 18 years to go) - timespec64 is the obvious replacement.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
mmc: sdhost: Pass DT pointer to rpi_firmware_get
Using the rpi_firmware API as intended allows proper reference counting
of the firmware device and means we can remove a downstream patch to
the firmware driver.
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
mmc: Disable CMD23 transfers on all cards
Pending wire-level investigation of these types of transfers
and associated errors on bcm2835-mmc, disable for now. Fallback of
CMD18/CMD25 transfers will be used automatically by the MMC layer.
Reported/Tested-by: Gellert Weisz <gellert@raspberrypi.org>
mmc: bcm2835-mmc: enable DT support for all architectures
Both ARCH_BCM2835 and ARCH_BCM270x are built with OF now.
Enable Device Tree support for all architectures.
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
mmc: bcm2835-mmc: fix probe error handling
Probe error handling is broken in several places.
Simplify error handling by using device managed functions.
Replace pr_{err,info} with dev_{err,info}.
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
bcm2835-mmc: Add locks when accessing sdhost registers
bcm2835-mmc: Add range of debug options for slowing things down
bcm2835-mmc: Add option to disable some delays
bcm2835-mmc: Add option to disable MMC_QUIRK_BLK_NO_CMD23
bcm2835-mmc: Default to disabling MMC_QUIRK_BLK_NO_CMD23
bcm2835-mmc: Adding overclocking option
Allow a different clock speed to be substitued for a requested 50MHz.
This option is exposed using the "overclock_50" DT parameter.
Note that the mmc interface is restricted to EVEN integer divisions of
250MHz, and the highest sensible option is 63 (250/4 = 62.5), the
next being 125 (250/2) which is much too high.
Use at your own risk.
bcm2835-mmc: Round up the overclock, so 62 works for 62.5Mhz
Also only warn once for each overclock setting.
mmc: bcm2835-mmc: Make available on ARCH_BCM2835
Make the bcm2835-mmc driver available for use on ARCH_BCM2835.
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
BCM270x_DT: add bcm2835-mmc entry
Add Device Tree entry for bcm2835-mmc.
In non-DT mode, don't add the device in the board file.
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
bcm2835-mmc: Don't overwrite MMC capabilities from DT
bcm2835-mmc: Don't override bus width capabilities from devicetree
Take out the force setting of the MMC_CAP_4_BIT_DATA host capability
so that the result read from devicetree via mmc_of_parse() is
preserved.
bcm2835-mmc: Only claim one DMA channel
With both MMC controllers enabled there are few DMA channels left. The
bcm2835-mmc driver only uses DMA in one direction at a time, so it
doesn't need to claim two channels.
See: https://github.com/raspberrypi/linux/issues/1327
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
bcm2835-mmc: New timer API
mmc: bcm2835-mmc: Support underclocking
Support underclocking of the SD bus using the max-frequency DT property
(which currently has no DT parameter). The sd_overclock parameter
already provides another way to achieve the same thing which should be
equivalent in end result, but it is a bug not to support max-frequency
as well.
See: https://github.com/raspberrypi/linux/issues/2350
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
mmc/bcm2835: Recover from MMC_SEND_EXT_CSD
If the user issues an "mmc extcsd read", the SD controller receives
what it thinks is a SEND_IF_COND command with an unexpected data block.
The resulting operations leave the FSM stuck in READWAIT, a state which
persists until the MMC framework resets the controller, by which point
the root filesystem is likely to have been unmounted.
A less heavyweight solution is to detect the condition and nudge the
FSM by asserting the (self-clearing) FORCE_DATA_MODE bit.
N.B. This workaround was essentially discovered by accident and without
a full understanding the inner workings of the controller, so it is
fortunate that the "fix" only modifies error paths.
See: https://github.com/raspberrypi/linux/issues/2728
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
bcm2835-mmc: Fix DMA channel leak
The BCM2835 MMC host driver requests a DMA channel on probe but neglects
to release the channel in the probe error path and on driver unbind.
I'm seeing this happen on every boot of the Compute Module 3: On first
driver probe, DMA channel 2 is allocated and then leaked with a "could
not get clk, deferring probe" message. On second driver probe, channel 4
is allocated.
Fix it.
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Cc: Frank Pavlic <f.pavlic@kunbus.de>
bcm2835-mmc: Fix struct mmc_host leak on probe
The BCM2835 MMC host driver requests the bus address of the host's
register map on probe. If that fails, the driver leaks the struct
mmc_host allocated earlier.
Fix it.
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Cc: Frank Pavlic <f.pavlic@kunbus.de>
bcm2835-mmc: Fix duplicate free_irq() on remove
The BCM2835 MMC host driver requests its interrupt as a device-managed
resource, so the interrupt is automatically freed after the driver is
unbound.
However on driver unbind, bcm2835_mmc_remove() frees the interrupt
explicitly to avoid invocation of the interrupt handler after driver
structures have been torn down.
The interrupt is thus freed twice, leading to a WARN splat in
__free_irq(). Fix by not requesting the interrupt as a device-managed
resource.
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Cc: Frank Pavlic <f.pavlic@kunbus.de>
bcm2835-mmc: Handle mmc_add_host() errors
The BCM2835 MMC host driver calls mmc_add_host() but doesn't check its
return value. Errors occurring in that function are therefore not
handled. Fix it.
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Cc: Frank Pavlic <f.pavlic@kunbus.de>
bcm2835-mmc: Deduplicate reset of driver data on remove
The BCM2835 MMC host driver sets the device's driver data pointer to
NULL on ->remove() even though the driver core subsequently does the
same in __device_release_driver(). Drop the duplicate assignment.
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Cc: Frank Pavlic <f.pavlic@kunbus.de>
bcm2835_mmc: Remove vestigial threaded IRQ
With SDIO processing now managed by the MMC framework with a
workqueue, the bcm2835_mmc driver no longer needs a threaded
IRQ.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
Add missing dma_unmap_sg calls to free relevant swiotlb bounce buffers.
This prevents DMA leaks.
Signed-off-by: Yaroslav Rosomakho <yaroslavros@gmail.com>
Limit max_req_size under arm64 (or any other platform that uses swiotlb) to prevent potential buffer overflow due to bouncing.
Signed-off-by: Yaroslav Rosomakho <yaroslavros@gmail.com>
Add support for DMA controller of BCM2708 as used in the Raspberry Pi.
Currently it only supports cyclic DMA.
Signed-off-by: Florian Meier <florian.meier@koalo.de>
dmaengine: expand functionality by supporting scatter/gather transfers sdhci-bcm2708 and dma.c: fix for LITE channels
DMA: fix cyclic LITE length overflow bug
dmaengine: bcm2708: Remove chancnt affectations
Mirror bcm2835-dma.c commit 9eba5536a7:
chancnt is already filled by dma_async_device_register, which uses the channel
list to know how much channels there is.
Since it's already filled, we can safely remove it from the drivers' probe
function.
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
dmaengine: bcm2708: overwrite dreq only if it is not set
dreq is set when the DMA channel is fetched from Device Tree.
slave_id is set using dmaengine_slave_config().
Only overwrite dreq with slave_id if it is not set.
dreq/slave_id in the cyclic DMA case is not touched, because I don't
have hardware to test with.
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
dmaengine: bcm2708: do device registration in the board file
Don't register the device in the driver. Do it in the board file.
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
dmaengine: bcm2708: don't restrict DT support to ARCH_BCM2835
Both ARCH_BCM2835 and ARCH_BCM270x are built with OF now.
Add Device Tree support to the non ARCH_BCM2835 case.
Use the same driver name regardless of architecture.
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
BCM270x_DT: add bcm2835-dma entry
Add Device Tree entry for bcm2835-dma.
The entry doesn't contain any resources since they are handled
by the arch/arm/mach-bcm270x/dma.c driver.
In non-DT mode, don't add the device in the board file.
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
bcm2708-dmaengine: Add debug options
BCM270x: Add memory and irq resources to dmaengine device and DT
Prepare for merging of the legacy DMA API arch driver dma.c
with bcm2708-dmaengine by adding memory and irq resources both
to platform file device and Device Tree node.
Don't use BCM_DMAMAN_DRIVER_NAME so we don't have to include mach/dma.h
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
dmaengine: bcm2708: Merge with arch dma.c driver and disable dma.c
Merge the legacy DMA API driver with bcm2708-dmaengine.
This is done so we can use bcm2708_fb on ARCH_BCM2835 (mailbox
driver is also needed).
Changes to the dma.c code:
- Use BIT() macro.
- Cutdown some comments to one line.
- Add mutex to vc_dmaman and use this, since the dev lock is locked
during probing of the engine part.
- Add global g_dmaman variable since drvdata is used by the engine part.
- Restructure for readability:
vc_dmaman_chan_alloc()
vc_dmaman_chan_free()
bcm_dma_chan_free()
- Restructure bcm_dma_chan_alloc() to simplify error handling.
- Use device irq resources instead of hardcoded bcm_dma_irqs table.
- Remove dev_dmaman_register() and code it directly.
- Remove dev_dmaman_deregister() and code it directly.
- Simplify bcm_dmaman_probe() using devm_* functions.
- Get dmachans from DT if available.
- Keep 'dma.dmachans' module argument name for backwards compatibility.
Make it available on ARCH_BCM2835 as well.
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
dmaengine: bcm2708: set residue_granularity field
bcm2708-dmaengine supports residue reporting at burst level
but didn't report this via the residue_granularity field.
Without this field set properly we get playback issues with I2S cards.
dmaengine: bcm2708-dmaengine: Fix memory leak when stopping a running transfer
bcm2708-dmaengine: Use more DMA channels (but not 12)
1) Only the bcm2708_fb drivers uses the legacy DMA API, and
it requires a BULK-capable channel, so all other types
(FAST, NORMAL and LITE) can be made available to the regular
DMA API.
2) DMA channels 11-14 share an interrupt. The driver can't
handle this, so don't use channels 12-14 (12 was used, probably
because it appears to have an interrupt, but in reality that
interrupt is for activity on ANY channel). This may explain
a lockup encountered when running out of DMA channels.
The combined effect of this patch is to leave 7 DMA channels
available + channel 0 for bcm2708_fb via the legacy API.
See: https://github.com/raspberrypi/linux/issues/1110https://github.com/raspberrypi/linux/issues/1108
dmaengine: bcm2708: Make legacy API available for bcm2835-dma
bcm2708_fb uses the legacy DMA API, so in order to start using
bcm2835-dma, bcm2835-dma has to support the legacy API. Make this
possible by exporting bcm_dmaman_probe() and bcm_dmaman_remove().
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
dmaengine: bcm2708: Change DT compatible string
Both bcm2835-dma and bcm2708-dmaengine have the same compatible string.
So change compatible to "brcm,bcm2708-dma".
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
dmaengine: bcm2708: Remove driver but keep legacy API
Dropping non-DT support means we don't need this driver,
but we still need the legacy DMA API.
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
bcm2708-dmaengine - Fix arm64 portability/build issues
dma-bcm2708: Fix module compilation of CONFIG_DMA_BCM2708
bcm2708-dmaengine.c defines functions like bcm_dma_start which are
defined as well in dma-bcm2708.h as inline versions when
CONFIG_DMA_BCM2708 is not defined. This works fine when
CONFIG_DMA_BCM2708 is built in, but when it is selected as module build
fails with redefinition errors because in the build system when
CONFIG_DMA_BCM2708 is selected as module, the macro becomes
CONFIG_DMA_BCM2708_MODULE.
This patch makes the header use CONFIG_DMA_BCM2708_MODULE too when
available.
Fixes https://github.com/raspberrypi/linux/issues/2056
Signed-off-by: Andrei Gherzan <andrei@gherzan.com>
Especially on platforms with a slower CPU but a relatively high
framebuffer fill bandwidth, like current ARM devices, the existing
console monochrome imageblit function used to draw console text is
suboptimal for common pixel depths such as 16bpp and 32bpp. The existing
code is quite general and can deal with several pixel depths. By creating
special case functions for 16bpp and 32bpp, by far the most common pixel
formats used on modern systems, a significant speed-up is attained
which can be readily felt on ARM-based devices like the Raspberry Pi
and the Allwinner platform, but should help any platform using the
fb layer.
The special case functions allow constant folding, eliminating a number
of instructions including divide operations, and allow the use of an
unrolled loop, eliminating instructions with a variable shift size,
reducing source memory access instructions, and eliminating excessive
branching. These unrolled loops also allow much better code optimization
by the C compiler. The code that selects which optimized variant is used
is also simplified, eliminating integer divide instructions.
The speed-up, measured by timing 'cat file.txt' in the console, varies
between 40% and 70%, when testing on the Raspberry Pi and Allwinner
ARM-based platforms, depending on font size and the pixel depth, with
the greater benefit for 32bpp.
Signed-off-by: Harm Hanemaaijer <fgenfb@yahoo.com>
Based on the patch authored by Ali Gholami Rudi at
https://lkml.org/lkml/2009/7/13/153
Provide an ioctl for userspace applications, but only if this operation
is hardware accelerated (otherwide it does not make any sense).
Signed-off-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
bcm2708_fb: Add ioctl for reading gpu memory through dma
video: bcm2708_fb: Add compat_ioctl support.
When using a 64 bit kernel with 32 bit userspace we need
compat ioctl handling for FBIODMACOPY as one of the
parameters is a pointer.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>
Signed-off-by: popcornmix <popcornmix@gmail.com>
bcm2708_fb : Implement blanking support using the mailbox property interface
bcm2708_fb: Add pan and vsync controls
bcm2708_fb: DMA acceleration for fb_copyarea
Based on http://www.raspberrypi.org/phpBB3/viewtopic.php?p=62425#p62425
Also used Simon's dmaer_master module as a reference for tweaking DMA
settings for better performance.
For now busylooping only. IRQ support might be added later.
With non-overclocked Raspberry Pi, the performance is ~360 MB/s
for simple copy or ~260 MB/s for two-pass copy (used when dragging
windows to the right).
In the case of using DMA channel 0, the performance improves
to ~440 MB/s.
For comparison, VFP optimized CPU copy can only do ~114 MB/s in
the same conditions (hindered by reading uncached source buffer).
Signed-off-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
bcm2708_fb: report number of dma copies
Add a counter (exported via debugfs) reporting the
number of dma copies that the framebuffer driver
has done, in order to help evaluate different
optimization strategies.
Signed-off-by: Luke Diamand <luked@broadcom.com>
bcm2708_fb: use IRQ for DMA copies
The copyarea ioctl() uses DMA to speed things along. This
was busy-waiting for completion. This change supports using
an interrupt instead for larger transfers. For small
transfers, busy-waiting is still likely to be faster.
Signed-off-by: Luke Diamand <luke@diamand.org>
bcm2708: Make ioctl logging quieter
video: fbdev: bcm2708_fb: Don't panic on error
No need to panic the kernel if the video driver fails.
Just print a message and return an error.
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
fbdev: bcm2708_fb: Add ARCH_BCM2835 support
Add Device Tree support.
Pass the device to dma_alloc_coherent() in order to get the
correct bus address on ARCH_BCM2835.
Use the new DMA legacy API header file.
Including <mach/platform.h> is not necessary.
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
BCM270x_DT: Add bcm2708-fb device
Add bcm2708-fb to Device Tree and don't add the
platform device when booting in DT mode.
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
Cleanup of bcm2708_fb file to kernel coding standards
Some minor change to function - remove a use of
in_atomic, plus replacing various debug messages
that manually specify the function name with
("%s",.__func__)
Signed-off-by: James Hughes <james.hughes@raspberrypi.org>
video: bcm2708_fb: Try allocating on the ARM and passing to VPU
Currently the VPU allocates the contiguous buffer for the
framebuffer.
Try an alternate path first where we use dma_alloc_coherent
and pass the buffer to the VPU. Should the VPU firmware not
support that path, then free the buffer and revert to the
old behaviour of using the VPU allocation.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>
Signed-off-by: popcornmix <popcornmix@gmail.com>
usb: dwc: fix lockdep false positive
Signed-off-by: Kari Suvanto <karis79@gmail.com>
usb: dwc: fix inconsistent lock state
Signed-off-by: Kari Suvanto <karis79@gmail.com>
Add FIQ patch to dwc_otg driver. Enable with dwc_otg.fiq_fix_enable=1. Should give about 10% more ARM performance.
Thanks to Gordon and Costas
Avoid dynamic memory allocation for channel lock in USB driver. Thanks ddv2005.
Add NAK holdoff scheme. Enabled by default, disable with dwc_otg.nak_holdoff_enable=0. Thanks gsh
Make sure we wait for the reset to finish
dwc_otg: fix bug in dwc_otg_hcd.c resulting in silent kernel
memory corruption, escalating to OOPS under high USB load.
dwc_otg: Fix unsafe access of QTD during URB enqueue
In dwc_otg_hcd_urb_enqueue during qtd creation, it was possible that the
transaction could complete almost immediately after the qtd was assigned
to a host channel during URB enqueue, which meant the qtd pointer was no
longer valid having been completed and removed. Usually, this resulted in
an OOPS during URB submission. By predetermining whether transactions
need to be queued or not, this unsafe pointer access is avoided.
This bug was only evident on the Pi model A where a device was attached
that had no periodic endpoints (e.g. USB pendrive or some wlan devices).
dwc_otg: Fix incorrect URB allocation error handling
If the memory allocation for a dwc_otg_urb failed, the kernel would OOPS
because for some reason a member of the *unallocated* struct was set to
zero. Error handling changed to fail correctly.
dwc_otg: fix potential use-after-free case in interrupt handler
If a transaction had previously aborted, certain interrupts are
enabled to track error counts and reset where necessary. On IN
endpoints the host generates an ACK interrupt near-simultaneously
with completion of transfer. In the case where this transfer had
previously had an error, this results in a use-after-free on
the QTD memory space with a 1-byte length being overwritten to
0x00.
dwc_otg: add handling of SPLIT transaction data toggle errors
Previously a data toggle error on packets from a USB1.1 device behind
a TT would result in the Pi locking up as the driver never handled
the associated interrupt. Patch adds basic retry mechanism and
interrupt acknowledgement to cater for either a chance toggle error or
for devices that have a broken initial toggle state (FT8U232/FT232BM).
dwc_otg: implement tasklet for returning URBs to usbcore hcd layer
The dwc_otg driver interrupt handler for transfer completion will spend
a very long time with interrupts disabled when a URB is completed -
this is because usb_hcd_giveback_urb is called from within the handler
which for a USB device driver with complicated processing (e.g. webcam)
will take an exorbitant amount of time to complete. This results in
missed completion interrupts for other USB packets which lead to them
being dropped due to microframe overruns.
This patch splits returning the URB to the usb hcd layer into a
high-priority tasklet. This will have most benefit for isochronous IN
transfers but will also have incidental benefit where multiple periodic
devices are active at once.
dwc_otg: fix NAK holdoff and allow on split transactions only
This corrects a bug where if a single active non-periodic endpoint
had at least one transaction in its qh, on frnum == MAX_FRNUM the qh
would get skipped and never get queued again. This would result in
a silent device until error detection (automatic or otherwise) would
either reset the device or flush and requeue the URBs.
Additionally the NAK holdoff was enabled for all transactions - this
would potentially stall a HS endpoint for 1ms if a previous error state
enabled this interrupt and the next response was a NAK. Fix so that
only split transactions get held off.
dwc_otg: Call usb_hcd_unlink_urb_from_ep with lock held in completion handler
usb_hcd_unlink_urb_from_ep must be called with the HCD lock held. Calling it
asynchronously in the tasklet was not safe (regression in
c4564d4a1a).
This change unlinks it from the endpoint prior to queueing it for handling in
the tasklet, and also adds a check to ensure the urb is OK to be unlinked
before doing so.
NULL pointer dereference kernel oopses had been observed in usb_hcd_giveback_urb
when a USB device was unplugged/replugged during data transfer. This effect
was reproduced using automated USB port power control, hundreds of replug
events were performed during active transfers to confirm that the problem was
eliminated.
USB fix using a FIQ to implement split transactions
This commit adds a FIQ implementaion that schedules
the split transactions using a FIQ so we don't get
held off by the interrupt latency of Linux
dwc_otg: fix device attributes and avoid kernel warnings on boot
dcw_otg: avoid logging function that can cause panics
See: https://github.com/raspberrypi/firmware/issues/21
Thanks to cleverca22 for fix
dwc_otg: mask correct interrupts after transaction error recovery
The dwc_otg driver will unmask certain interrupts on a transaction
that previously halted in the error state in order to reset the
QTD error count. The various fine-grained interrupt handlers do not
consider that other interrupts besides themselves were unmasked.
By disabling the two other interrupts only ever enabled in DMA mode
for this purpose, we can avoid unnecessary function calls in the
IRQ handler. This will also prevent an unneccesary FIQ interrupt
from being generated if the FIQ is enabled.
dwc_otg: fiq: prevent FIQ thrash and incorrect state passing to IRQ
In the case of a transaction to a device that had previously aborted
due to an error, several interrupts are enabled to reset the error
count when a device responds. This has the side-effect of making the
FIQ thrash because the hardware will generate multiple instances of
a NAK on an IN bulk/interrupt endpoint and multiple instances of ACK
on an OUT bulk/interrupt endpoint. Make the FIQ mask and clear the
associated interrupts.
Additionally, on non-split transactions make sure that only unmasked
interrupts are cleared. This caused a hard-to-trigger but serious
race condition when you had the combination of an endpoint awaiting
error recovery and a transaction completed on an endpoint - due to
the sequencing and timing of interrupts generated by the dwc_otg core,
it was possible to confuse the IRQ handler.
Fix function tracing
dwc_otg: whitespace cleanup in dwc_otg_urb_enqueue
dwc_otg: prevent OOPSes during device disconnects
The dwc_otg_urb_enqueue function is thread-unsafe. In particular the
access of urb->hcpriv, usb_hcd_link_urb_to_ep, dwc_otg_urb->qtd and
friends does not occur within a critical section and so if a device
was unplugged during activity there was a high chance that the
usbcore hub_thread would try to disable the endpoint with partially-
formed entries in the URB queue. This would result in BUG() or null
pointer dereferences.
Fix so that access of urb->hcpriv, enqueuing to the hardware and
adding to usbcore endpoint URB lists is contained within a single
critical section.
dwc_otg: prevent BUG() in TT allocation if hub address is > 16
A fixed-size array is used to track TT allocation. This was
previously set to 16 which caused a crash because
dwc_otg_hcd_allocate_port would read past the end of the array.
This was hit if a hub was plugged in which enumerated as addr > 16,
due to previous device resets or unplugs.
Also add #ifdef FIQ_DEBUG around hcd->hub_port_alloc[], which grows
to a large size if 128 hub addresses are supported. This field is
for debug only for tracking which frame an allocate happened in.
dwc_otg: make channel halts with unknown state less damaging
If the IRQ received a channel halt interrupt through the FIQ
with no other bits set, the IRQ would not release the host
channel and never complete the URB.
Add catchall handling to treat as a transaction error and retry.
dwc_otg: fiq_split: use TTs with more granularity
This fixes certain issues with split transaction scheduling.
- Isochronous multi-packet OUT transactions now hog the TT until
they are completed - this prevents hubs aborting transactions
if they get a periodic start-split out-of-order
- Don't perform TT allocation on non-periodic endpoints - this
allows simultaneous use of the TT's bulk/control and periodic
transaction buffers
This commit will mainly affect USB audio playback.
dwc_otg: fix potential sleep while atomic during urb enqueue
Fixes a regression introduced with eb1b482a. Kmalloc called from
dwc_otg_hcd_qtd_add / dwc_otg_hcd_qtd_create did not always have
the GPF_ATOMIC flag set. Force this flag when inside the larger
critical section.
dwc_otg: make fiq_split_enable imply fiq_fix_enable
Failing to set up the FIQ correctly would result in
"IRQ 32: nobody cared" errors in dmesg.
dwc_otg: prevent crashes on host port disconnects
Fix several issues resulting in crashes or inconsistent state
if a Model A root port was disconnected.
- Clean up queue heads properly in kill_urbs_in_qh_list by
removing the empty QHs from the schedule lists
- Set the halt status properly to prevent IRQ handlers from
using freed memory
- Add fiq_split related cleanup for saved registers
- Make microframe scheduling reclaim host channels if
active during a disconnect
- Abort URBs with -ESHUTDOWN status response, informing
device drivers so they respond in a more correct fashion
and don't try to resubmit URBs
- Prevent IRQ handlers from attempting to handle channel
interrupts if the associated URB was dequeued (and the
driver state was cleared)
dwc_otg: prevent leaking URBs during enqueue
A dwc_otg_urb would get leaked if the HCD enqueue function
failed for any reason. Free the URB at the appropriate points.
dwc_otg: Enable NAK holdoff for control split transactions
Certain low-speed devices take a very long time to complete a
data or status stage of a control transaction, producing NAK
responses until they complete internal processing - the USB2.0
spec limit is up to 500mS. This causes the same type of interrupt
storm as seen with USB-serial dongles prior to c8edb238.
In certain circumstances, usually while booting, this interrupt
storm could cause SD card timeouts.
dwc_otg: Fix for occasional lockup on boot when doing a USB reset
dwc_otg: Don't issue traffic to LS devices in FS mode
Issuing low-speed packets when the root port is in full-speed mode
causes the root port to stop responding. Explicitly fail when
enqueuing URBs to a LS endpoint on a FS bus.
Fix ARM architecture issue with local_irq_restore()
If local_fiq_enable() is called before a local_irq_restore(flags) where
the flags variable has the F bit set, the FIQ will be erroneously disabled.
Fixup arch_local_irq_restore to avoid trampling the F bit in CPSR.
Also fix some of the hacks previously implemented for previous dwc_otg
incarnations.
dwc_otg: fiq_fsm: Base commit for driver rewrite
This commit removes the previous FIQ fixes entirely and adds fiq_fsm.
This rewrite features much more complete support for split transactions
and takes into account several OTG hardware bugs. High-speed
isochronous transactions are also capable of being performed by fiq_fsm.
All driver options have been removed and replaced with:
- dwc_otg.fiq_enable (bool)
- dwc_otg.fiq_fsm_enable (bool)
- dwc_otg.fiq_fsm_mask (bitmask)
- dwc_otg.nak_holdoff (unsigned int)
Defaults are specified such that fiq_fsm behaves similarly to the
previously implemented FIQ fixes.
fiq_fsm: Push error recovery into the FIQ when fiq_fsm is used
If the transfer associated with a QTD failed due to a bus error, the HCD
would retry the transfer up to 3 times (implementing the USB2.0
three-strikes retry in software).
Due to the masking mechanism used by fiq_fsm, it is only possible to pass
a single interrupt through to the HCD per-transfer.
In this instance host channels would fall off the radar because the error
reset would function, but the subsequent channel halt would be lost.
Push the error count reset into the FIQ handler.
fiq_fsm: Implement timeout mechanism
For full-speed endpoints with a large packet size, interrupt latency
runs the risk of the FIQ starting a transaction too late in a full-speed
frame. If the device is still transmitting data when EOF2 for the
downstream frame occurs, the hub will disable the port. This change is
not reflected in the hub status endpoint and the device becomes
unresponsive.
Prevent high-bandwidth transactions from being started too late in a
frame. The mechanism is not guaranteed: a combination of bit stuffing
and hub latency may still result in a device overrunning.
fiq_fsm: fix bounce buffer utilisation for Isochronous OUT
Multi-packet isochronous OUT transactions were subject to a few bounday
bugs. Fix them.
Audio playback is now much more robust: however, an issue stands with
devices that have adaptive sinks - ALSA plays samples too fast.
dwc_otg: Return full-speed frame numbers in HS mode
The frame counter increments on every *microframe* in high-speed mode.
Most device drivers expect this number to be in full-speed frames - this
caused considerable confusion to e.g. snd_usb_audio which uses the
frame counter to estimate the number of samples played.
fiq_fsm: save PID on completion of interrupt OUT transfers
Also add edge case handling for interrupt transports.
Note that for periodic split IN, data toggles are unimplemented in the
OTG host hardware - it unconditionally accepts any PID.
fiq_fsm: add missing case for fiq_fsm_tt_in_use()
Certain combinations of bitrate and endpoint activity could
result in a periodic transaction erroneously getting started
while the previous Isochronous OUT was still active.
fiq_fsm: clear hcintmsk for aborted transactions
Prevents the FIQ from erroneously handling interrupts
on a timed out channel.
fiq_fsm: enable by default
fiq_fsm: fix dequeues for non-periodic split transactions
If a dequeue happened between the SSPLIT and CSPLIT phases of the
transaction, the HCD would never receive an interrupt.
fiq_fsm: Disable by default
fiq_fsm: Handle HC babble errors
The HCTSIZ transfer size field raises a babble interrupt if
the counter wraps. Handle the resulting interrupt in this case.
dwc_otg: fix interrupt registration for fiq_enable=0
Additionally make the module parameter conditional for wherever
hcd->fiq_state is touched.
fiq_fsm: Enable by default
dwc_otg: Fix various issues with root port and transaction errors
Process the host port interrupts correctly (and don't trample them).
Root port hotplug now functional again.
Fix a few thinkos with the transaction error passthrough for fiq_fsm.
fiq_fsm: Implement hack for Split Interrupt transactions
Hubs aren't too picky about which endpoint we send Control type split
transactions to. By treating Interrupt transfers as Control, it is
possible to use the non-periodic queue in the OTG core as well as the
non-periodic FIFOs in the hub itself. This massively reduces the
microframe exclusivity/contention that periodic split transactions
otherwise have to enforce.
It goes without saying that this is a fairly egregious USB specification
violation, but it works.
Original idea by Hans Petter Selasky @ FreeBSD.org.
dwc_otg: FIQ support on SMP. Set up FIQ stack and handler on Core 0 only.
dwc_otg: introduce fiq_fsm_spin(un|)lock()
SMP safety for the FIQ relies on register read-modify write cycles being
completed in the correct order. Several places in the DWC code modify
registers also touched by the FIQ. Protect these by a bare-bones lock
mechanism.
This also makes it possible to run the FIQ and IRQ handlers on different
cores.
fiq_fsm: fix build on bcm2708 and bcm2709 platforms
dwc_otg: put some barriers back where they should be for UP
bcm2709/dwc_otg: Setup FIQ on core 1 if >1 core active
dwc_otg: fixup read-modify-write in critical paths
Be more careful about read-modify-write on registers that the FIQ
also touches.
Guard fiq_fsm_spin_lock with fiq_enable check
fiq_fsm: Falling out of the state machine isn't fatal
This edge case can be hit if the port is disabled while the FIQ is
in the middle of a transaction. Make the effects less severe.
Also get rid of the useless return value.
squash: dwc_otg: Allow to build without SMP
usb: core: make overcurrent messages more prominent
Hub overcurrent messages are more serious than "debug". Increase loglevel.
usb: dwc_otg: Don't use dma_to_virt()
Commit 6ce0d20 changes dma_to_virt() which breaks this driver.
Open code the old dma_to_virt() implementation to work around this.
Limit the use of __bus_to_virt() to cases where transfer_buffer_length
is set and transfer_buffer is not set. This is done to increase the
chance that this driver will also work on ARCH_BCM2835.
transfer_buffer should not be NULL if the length is set, but the
comment in the code indicates that there are situations where this
might happen. drivers/usb/isp1760/isp1760-hcd.c also has a similar
comment pointing to a possible: 'usb storage / SCSI bug'.
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
dwc_otg: Fix crash when fiq_enable=0
dwc_otg: fiq_fsm: Make high-speed isochronous strided transfers work properly
Certain low-bandwidth high-speed USB devices (specialist audio devices,
compressed-frame webcams) have packet intervals > 1 microframe.
Stride these transfers in the FIQ by using the start-of-frame interrupt
to restart the channel at the right time.
dwc_otg: Force host mode to fix incorrect compute module boards
dwc_otg: Add ARCH_BCM2835 support
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
dwc_otg: Simplify FIQ irq number code
Dropping ATAGS means we can simplify the FIQ irq number code.
Also add error checking on the returned irq number.
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
dwc_otg: Remove duplicate gadget probe/unregister function
dwc_otg: Properly set the HFIR
Douglas Anderson reported:
According to the most up to date version of the dwc2 databook, the FRINT
field of the HFIR register should be programmed to:
* 125 us * (PHY clock freq for HS) - 1
* 1000 us * (PHY clock freq for FS/LS) - 1
This is opposed to older versions of the doc that claimed it should be:
* 125 us * (PHY clock freq for HS)
* 1000 us * (PHY clock freq for FS/LS)
and reported lower timing jitter on a USB analyser
dcw_otg: trim xfer length when buffer larger than allocated size is received
dwc_otg: Don't free qh align buffers in atomic context
dwc_otg: Enable the hack for Split Interrupt transactions by default
dwc_otg.fiq_fsm_mask=0xF has long been a suggestion for users with audio stutters or other USB bandwidth issues.
So far we are aware of many success stories but no failure caused by this setting.
Make it a default to learn more.
See: https://www.raspberrypi.org/forums/viewtopic.php?f=28&t=70437
Signed-off-by: popcornmix <popcornmix@gmail.com>
dwc_otg: Use kzalloc when suitable
dwc_otg: Pass struct device to dma_alloc*()
This makes it possible to get the bus address from Device Tree.
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
dwc_otg: fix summarize urb->actual_length for isochronous transfers
Kernel does not copy input data of ISO transfers to userspace
if actual_length is set only in ISO transfers and not summarized
in urb->actual_length. Fixesraspberrypi/linux#903
fiq_fsm: Use correct states when starting isoc OUT transfers
In fiq_fsm_start_next_periodic() if an isochronous OUT transfer
was selected, no regard was given as to whether this was a single-packet
transfer or a multi-packet staged transfer.
For single-packet transfers, this had the effect of repeatedly sending
OUT packets with bogus data and lengths.
Eventually if the channel was repeatedly enabled enough times, this
would lock up the OTG core and no further bus transfers would happen.
Set the FSM state up properly if we select a single-packet transfer.
Fixes https://github.com/raspberrypi/linux/issues/1842
dwc_otg: make nak_holdoff work as intended with empty queues
If URBs reading from non-periodic split endpoints were dequeued and
the last transfer from the endpoint was a NAK handshake, the resulting
qh->nak_frame value was stale which would result in unnecessarily long
polling intervals for the first subsequent transfer with a fresh URB.
Fixup qh->nak_frame in dwc_otg_hcd_urb_dequeue and also guard against
a case where a single URB is submitted to the endpoint, a NAK was
received on the transfer immediately prior to receiving data and the
device subsequently resubmits another URB past the qh->nak_frame interval.
Fixes https://github.com/raspberrypi/linux/issues/1709
dwc_otg: fix split transaction data toggle handling around dequeues
See https://github.com/raspberrypi/linux/issues/1709
Fix several issues regarding endpoint state when URBs are dequeued
- If the HCD is disconnected, flush FIQ-enabled channels properly
- Save the data toggle state for bulk endpoints if the last transfer
from an endpoint where URBs were dequeued returned a data packet
- Reset hc->start_pkt_count properly in assign_and_init_hc()
dwc_otg: fix several potential crash sources
On root port disconnect events, the host driver state is cleared and
in-progress host channels are forcibly stopped. This doesn't play
well with the FIQ running in the background, so:
- Guard the disconnect callback with both the host spinlock and FIQ
spinlock
- Move qtd dereference in dwc_otg_handle_hc_fsm() after the early-out
so we don't dereference a qtd that has gone away
- Turn catch-all BUG()s in dwc_otg_handle_hc_fsm() into warnings.
dwc_otg: delete hcd->channel_lock
The lock serves no purpose as it is only held while the HCD spinlock
is already being held.
dwc_otg: remove unnecessary dma-mode channel halts on disconnect interrupt
Host channels are already halted in kill_urbs_in_qh_list() with the
subsequent interrupt processing behaving as if the URB was dequeued
via HCD callback.
There's no need to clobber the host channel registers a second time
as this exposes races between the driver and host channel resulting
in hcd->free_hc_list becoming corrupted.
dwcotg: Allow to build without FIQ on ARM64
Signed-off-by: popcornmix <popcornmix@gmail.com>
dwc_otg: make periodic scheduling behave properly for FS buses
If the root port is in full-speed mode, transfer times at 12mbit/s
would be calculated but matched against high-speed quotas.
Reinitialise hcd->frame_usecs[i] on each port enable event so that
full-speed bandwidth can be tracked sensibly.
Also, don't bother using the FIQ for transfers when in full-speed
mode - at the slower bus speed, interrupt frequency is reduced by
an order of magnitude.
Related issue: https://github.com/raspberrypi/linux/issues/2020
dwc_otg: fiq_fsm: Make isochronous compatibility checks work properly
Get rid of the spammy printk and local pointer mangling.
Also, there is a nominal benefit for using fiq_fsm for isochronous
transfers in FS mode (~1.1k IRQs per second vs 2.1k IRQs per second)
so remove the root port speed check.
dwc_otg: add module parameter int_ep_interval_min
Add a module parameter (defaulting to ignored) that clamps the polling rate
of high-speed Interrupt endpoints to a minimum microframe interval.
The parameter is modifiable at runtime as it is used when activating new
endpoints (such as on device connect).
dwc_otg: fiq_fsm: Add non-periodic TT exclusivity constraints
Certain hub types do not discriminate between pipe direction (IN or OUT)
when considering non-periodic transfers. Therefore these hubs get confused
if multiple transfers are issued in different directions with the same
device address and endpoint number.
Constrain queuing non-periodic split transactions so they are performed
serially in such cases.
Related: https://github.com/raspberrypi/linux/issues/2024
dwc_otg: Fixup change to DRIVER_ATTR interface
dwc_otg: Fix compilation warnings
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
USB_DWCOTG: Disable building dwc_otg as a module (#2265)
When dwc_otg is built as a module, build will fail with the following
error:
ERROR: "DWC_TASK_HI_SCHEDULE" [drivers/usb/host/dwc_otg/dwc_otg.ko] undefined!
scripts/Makefile.modpost:91: recipe for target '__modpost' failed
make[1]: *** [__modpost] Error 1
Makefile:1199: recipe for target 'modules' failed
make: *** [modules] Error 2
Even if the error is solved by including the missing
DWC_TASK_HI_SCHEDULE function, the kernel will panic when loading
dwc_otg.
As a workaround, simply prevent user from building dwc_otg as a module
as the current kernel does not support it.
See: https://github.com/raspberrypi/linux/issues/2258
Signed-off-by: Malik Olivier Boussejra <malik@boussejra.com>
dwc_otg: New timer API
dwc_otg: Fix removed ACCESS_ONCE->READ_ONCE
dwc_otg: don't unconditionally force host mode in dwc_otg_cil_init()
Add the ability to disable force_host_mode for those that want to use
dwc_otg in both device and host modes.
dwc_otg: Fix a regression when dequeueing isochronous transfers
In 282bed95 (dwc_otg: make nak_holdoff work as intended with empty queues)
the dequeue mechanism was changed to leave FIQ-enabled transfers to run
to completion - to avoid leaving hub TT buffers with stale packets lying
around.
This broke FIQ-accelerated isochronous transfers, as this then meant that
dozens of transfers were performed after the dequeue function returned.
Restore the state machine fence for isochronous transfers.
fiq_fsm: rewind DMA pointer for OUT transactions that fail (#2288)
See: https://github.com/raspberrypi/linux/issues/2140
dwc_otg: add smp_mb() to prevent driver state corruption on boot
Occasional crashes have been seen where the FIQ code dereferences
invalid/random pointers immediately after being set up, leading to
panic on boot.
The crash occurs as the FIQ code races against hcd_init_fiq() and
the hcd_init_fiq() code races against the outstanding memory stores
from dwc_otg_hcd_init(). Use explicit barriers after touching
driver state.
usb: dwc_otg: fix memory corruption in dwc_otg driver
[Upstream commit 51b1b64917]
The move from the staging tree to the main tree exposed a
longstanding memory corruption bug in the dwc2 driver. The
reordering of the driver initialization caused the dwc2 driver
to corrupt the initialization data of the sdhci driver on the
Raspberry Pi platform, which made the bug show up.
The error is in calling to_usb_device(hsotg->dev), since ->dev
is not a member of struct usb_device. The easiest fix is to
just remove the offending code, since it is not really needed.
Thanks to Stephen Warren for tracking down the cause of this.
Reported-by: Andre Heider <a.heider@gmail.com>
Tested-by: Stephen Warren <swarren@wwwdotorg.org>
Signed-off-by: Paul Zimmerman <paulz@synopsys.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[lukas: port from upstream dwc2 to out-of-tree dwc_otg driver]
Signed-off-by: Lukas Wunner <lukas@wunner.de>
usb: dwb_otg: Fix unreachable switch statement warning
This warning appears with GCC 7.3.0 from toolchains.bootlin.com:
../drivers/usb/host/dwc_otg/dwc_otg_fiq_fsm.c: In function ‘fiq_fsm_update_hs_isoc’:
../drivers/usb/host/dwc_otg/dwc_otg_fiq_fsm.c:595:61: warning: statement will never be executed [-Wswitch-unreachable]
st->hctsiz_copy.b.xfersize = nrpackets * st->hcchar_copy.b.mps;
~~~~~~~~~~~~~~~~~^~~~
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
dwc_otg: fiq_fsm: fix incorrect DMA register offset calculation
Rationalise the offset and update all call sites.
Fixes https://github.com/raspberrypi/linux/issues/2408
dwc_otg: fix bug with port_addr assignment for single-TT hubs
See https://github.com/raspberrypi/linux/issues/2734
The "Hub Port" field in the split transaction packet was always set
to 1 for single-TT hubs. The majority of single-TT hub products
apparently ignore this field and broadcast to all downstream enabled
ports, which masked the issue. A subset of hub devices apparently
need the port number to be exact or split transactions will fail.
usb: dwc_otg: Clean up build warnings on 64bit kernels
No functional changes. Almost all are changes to logging lines.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>
usb: dwc_otg: Use dma allocation for mphi dummy_send buffer
The FIQ driver used a kzalloc'ed buffer for dummy_send,
passing a kernel virtual address to the hardware block.
The buffer is only ever used for a dummy read, so it
should be harmless, but there is the chance that it will
cause exceptions.
Use a dma allocation so that we have a genuine bus address,
and read from that.
Free the allocation when done for good measure.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>
dwc_otg: only do_split when we actually need to do a split
The previous test would fail if the root port was in fullspeed mode
and there was a hub between the FS device and the root port. While
the transfer worked, the schedule mangling performed for high-speed
split transfers would break leading to an 8ms polling interval.
dwc_otg: fix locking around dequeueing and killing URBs
kill_urbs_in_qh_list() is practically only ever called with the fiq lock
already held, so don't spinlock twice in the case where we need to cancel
an isochronous transfer.
Also fix up a case where the global interrupt register could be read with
the fiq lock not held.
Fixes the deadlock seen in https://github.com/raspberrypi/linux/issues/2907
ARM64/DWC_OTG: Port dwc_otg driver to ARM64
In ARM64, the FIQ mechanism used by this driver is not current
implemented. As a workaround, reqular IRQ is used instead
of FIQ.
In a separate change, the IRQ-CPU mapping is round robined
on ARM64 to increase concurrency and allow multiple interrupts
to be serviced at a time. This reduces the need for FIQ.
Tests Run:
This mechanism is most likely to break when multiple USB devices
are attached at the same time. So the system was tested under
stress.
Devices:
1. USB Speakers playing back a FLAC audio through VLC
at 96KHz.(Higher then typically, but supported on my speakers).
2. sftp transferring large files through the buildin ethernet
connection which is connected through USB.
3. Keyboard and mouse attached and being used.
Although I do occasionally hear some glitches, the music seems to
play quite well.
Signed-off-by: Michael Zoran <mzoran@crowfest.net>
usb: dwc_otg: Clean up interrupt claiming code
The FIQ/IRQ interrupt number identification code is scattered through
the dwc_otg driver. Rationalise it, simplifying the code and solving
an existing issue.
See: https://github.com/raspberrypi/linux/issues/2612
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
dwc_otg: Choose appropriate IRQ handover strategy
2711 has no MPHI peripheral, but the ARM Control block can fake
interrupts. Use the size of the DTB "mphi" reg block to determine
which is required.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
usb: host: dwc_otg: fix compiling in separate directory
The dwc_otg Makefile does not respect the O=path argument correctly:
include paths in CFLAGS are given relatively to object path, not source
path. Compiling in a separate directory yields #include errors.
Signed-off-by: Marek Behún <marek.behun@nic.cz>
dwc_otg: use align_buf for small IN control transfers (#3150)
The hardware will do a 4-byte write to memory on any IN packet received
that is between 1 and 3 bytes long. This tramples memory in the uvcvideo
driver, as it uses a sequence of 1- and 2-byte control transfers to
query the min/max/range/step of each individual camera control and
gives us buffers that are offsets into a struct.
Catch small control transfers in the data phase and use the align_buf
to bounce the correct number of bytes into the URB's buffer.
In general, short packets on non-control endpoints should be OK as URBs
should have enough buffer space for a wMaxPacket size transfer.
See: https://github.com/raspberrypi/linux/issues/3148
Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>
dwc_otg: Declare DMA capability with HCD_DMA flag
Following [1], USB controllers have to declare DMA capabilities in
order for them to be used by adding the HCD_DMA flag to their hc_driver
struct.
[1] 7b81cb6bdd ("usb: add a HCD_DMA flag instead of guestimating DMA capabilities")
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
dwc_otg: checking the urb->transfer_buffer too early (#3332)
After enable the HIGHMEM and VMSPLIT_3G, the dwc_otg driver doesn't
work well on Pi2/3 boards with 1G physical ram. Users experience
the failure when copying a file of 600M size to the USB stick. And
at the same time, the dmesg shows:
usb 1-1.1.2: reset high-speed USB device number 8 using dwc_otg
sd 0:0:0:0: [sda] tag#0 FAILED Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK
blk_update_request: I/O error, dev sda, sector 3024048 op 0x1:(WRITE) flags 0x4000 phys_seg 15 prio class 0
When this happens, the sg_buf sent to the driver is located in the
highmem region, the usb_sg_init() in the core/message.c will leave
transfer_buffer to NULL if the sg_buf is in highmem, but in the
dwc_otg driver, it returns -EINVAL unconditionally if transfer_buffer
is NULL.
The driver can handle the situation of buffer to be NULL, if it is in
DMA mode, it will convert an address from transfer_dma.
But if the conversion fails or it is in the PIO mode, we should check
buffer and return -EINVAL if it is NULL.
BugLink: https://bugs.launchpad.net/bugs/1852510
Signed-off-by: Hui Wang <hui.wang@canonical.com>
dwc_otg: constrain endpoint max packet and transfer size on split IN
The hcd would unconditionally set the transfer length to the endpoint
packet size for non-isoc IN transfers. If the remaining buffer length
was less than the length of returned data, random memory would get
scribbled over, with bad effects if it crossed a page boundary.
Force a babble error if this happens by limiting the max transfer size
to the available buffer space. DMA will stop writing to memory on a
babble condition.
The hardware expects xfersize to be an integer multiple of maxpacket
size, so override hcchar.b.mps as well.
Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>
dwc_otg: fiq_fsm: pause when cancelling split transactions
Non-periodic splits will DMA to/from the driver-provided transfer_buffer,
which may be freed immediately after the dequeue call returns. Block until
we know the transfer is complete.
A similar delay is needed when cleaning up disconnects, as the FIQ could
have started a periodic transfer in the previous microframe to the one
that triggered a disconnect.
Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>
dwc_otg: fiq_fsm: add a barrier on entry into FIQ handler(s)
On BCM2835, there is no hardware guarantee that multiple outstanding
reads to different peripherals will complete in-order. The FIQ code
uses peripheral reads without barriers for performance, so in the case
where a read to a slow peripheral was issued immediately prior to FIQ
entry, the first peripheral read that the FIQ did could end up with
wrong read data returned.
Add dsb(sy) on entry so that all outstanding reads are retired.
The FIQ only issues reads to the dwc_otg core, so per-read barriers
in the handler itself are not required.
On BCM2836 and BCM2837 the barrier is not strictly required due to
differences in how the peripheral bus is implemented, but having
arch-specific handlers that introduce different latencies is risky.
Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>
Signed-off-by: popcornmix <popcornmix@gmail.com>
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
bcm2709: Drop platform smp and timer init code
irq-bcm2836 handles this through these functions:
bcm2835_init_local_timer_frequency()
bcm2836_arm_irqchip_smp_init()
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
bcm270x: Use watchdog for reboot/poweroff
The watchdog driver already has support for reboot/poweroff.
Make use of this and remove the code from the platform files.
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
board_bcm2835: Remove coherent dma pool increase - API has gone
Signed-off-by: Noralf Tronnes <notro@tronnes.org>
SQUASH: pinctrl: bcm2835: Set base for bcm2711 GPIO to 0
Without this patch GPIOs don't seem to work properly, primarily
noticeable as broken LEDs.
Squash with "pinctrl-bcm2835: Set base to 0 give expected gpio numbering"
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
Under some circumstances on BCM283x processors data loss can be
observed - a single byte missing from the TX output stream. These bytes
are always the last byte of a batch of 8 written from pl011_tx_chars
when from_irq is true, meaning that the FIFO full flag is not checked
before writing.
The transmit optimisation relies on the FIFO being half-empty when the
TX interrupt is raised. Instrumenting the driver further showed that
the failure case correlated with the TX FIFO full flag being set at the
point where the last byte was written to the data register, which
explains the data loss but not how the FIFO appeared to be prematurely
full. A possible explanation is that a FIFO write was in flight at the
time the interrupt was raised, but as yet there is no hypothesis as to
how this might occur.
In the absence of a clear understanding of the failure mechanism, avoid
the problem by checking the FIFO levels before writing the last byte of
the group, which will have minimal performance impact.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
The PL011 driver lacks throttle and unthrottle methods. As a result,
sending more data to the Pi than it can immediately sink while CRTSCTS
is enabled causes a NULL pointer to be followed.
Add a throttle handler that disables the RX interrupts, and an
unthrottle handler that reenables them.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
The BCM2835 PL011 implementation seems to have a bug that can lead to a
transmission lockup if CTS changes frequently. A workaround was added to
the driver with a vendor-specific flag to enable it, but this flag is
currently not set for ARM implementations.
Add a "cts-event-workaround" property to Pi DTBs and use the presence
of that property to force the flag to be enabled in the driver.
See: https://github.com/raspberrypi/linux/issues/1280
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
The pl011 register accessor functions use the _relaxed versions of the
standard readl() and writel() functions, meaning that there are no
automatic memory barriers. When polling a FIFO status register to check
for fullness, it is necessary to ensure that any outstanding writes have
completed; otherwise the flags are effectively stale, making it possible
that the next write is to a full FIFO.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
The UART clock is initialised to be as close to the requested
frequency as possible without exceeding it. Now that there is a
clock manager that returns the actual frequencies, an expected
48MHz clock is reported as 47999625. If the requested baudrate
== requested clock/16, there is no headroom and the slight
reduction in actual clock rate results in failure.
Detect cases where it looks like a "round" clock was chosen and
adjust the reported clock to match that "round" value. As the
code comment says:
/*
* If increasing a clock by less than 0.1% changes it
* from ..999.. to ..000.., round up.
*/
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
The pl011 driver looks for DT aliases of the form "serial<n>",
and if found uses <n> as the device ID. This can cause
/dev/ttyAMA0 to become /dev/ttyAMA1, which is confusing if the
other serial port is provided by the 8250 driver which doesn't
use the same logic.
For applications of the LAN78xx that don't have valid programmed
EEPROMs or OTPs, enabling both LEDs and auto-negotiation by default
seems reasonable.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
The syscon node defines a register range that duplicates that used by
the local_intc node on bcm2836/7. Since irq-bcm2835 and irq-bcm2836 are
built in and always present together (both drivers are enabled by
CONFIG_ARCH_BCM2835), it is possible to replace the syscon usage with a
global variable that simplifies the code. Doing so does lose the
locking provided by regmap, but as only one side is using the regmap
interface (irq-bcm2835 uses readl and write) there is no loss of
atomicity.
See: https://github.com/raspberrypi/firmware/issues/926
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
This adds a debug module parameter to aid in debugging transfer issues
by printing info to the kernel log. When enabled, status values are
collected in the interrupt routine and msg info in
bcm2835_i2c_start_transfer(). This is done in a way that tries to avoid
affecting timing. Having printk in the isr can mask issues.
debug values (additive):
1: Print info on error
2: Print info on all transfers
3: Print messages before transfer is started
The value can be changed at runtime:
/sys/module/i2c_bcm2835/parameters/debug
Example output, debug=3:
[ 747.114448] bcm2835_i2c_xfer: msg(1/2) write addr=0x54, len=2 flags= [i2c1]
[ 747.114463] bcm2835_i2c_xfer: msg(2/2) read addr=0x54, len=32 flags= [i2c1]
[ 747.117809] start_transfer: msg(1/2) write addr=0x54, len=2 flags= [i2c1]
[ 747.117825] isr: remain=2, status=0x30000055 : TA TXW TXD TXE [i2c1]
[ 747.117839] start_transfer: msg(2/2) read addr=0x54, len=32 flags= [i2c1]
[ 747.117849] isr: remain=32, status=0xd0000039 : TA RXR TXD RXD [i2c1]
[ 747.117861] isr: remain=20, status=0xd0000039 : TA RXR TXD RXD [i2c1]
[ 747.117870] isr: remain=8, status=0x32 : DONE TXD RXD [i2c1]
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
Christopher Alexander Tobias Schulze - May 2, 2015, 11:57 a.m.
This patch fixes a problem with VFP state save and restore related
to exception handling (panic with message "BUG: unsupported FP
instruction in kernel mode") present on VFP11 floating point units
(as used with ARM1176JZF-S CPUs, e.g. on first generation Raspberry
Pi boards). This patch was developed and discussed on
https://github.com/raspberrypi/linux/issues/859
A precondition to see the crashes is that floating point exception
traps are enabled. In this case, the VFP11 might determine that a FPU
operation needs to trap at a point in time when it is not possible to
signal this to the ARM11 core any more. The VFP11 will then set the
FPEXC.EX bit and store the trapped opcode in FPINST. (In some cases,
a second opcode might have been accepted by the VFP11 before the
exception was detected and could be reported to the ARM11 - in this
case, the VFP11 also sets FPEXC.FP2V and stores the second opcode in
FPINST2.)
If FPEXC.EX is set, the VFP11 will "bounce" the next FPU opcode issued
by the ARM11 CPU, which will be seen by the ARM11 as an undefined opcode
trap. The VFP support code examines the FPEXC.EX and FPEXC.FP2V bits
to decide what actions to take, i.e., whether to emulate the opcodes
found in FPINST and FPINST2, and whether to retry the bounced instruction.
If a user space application has left the VFP11 in this "pending trap"
state, the next FPU opcode issued to the VFP11 might actually be the
VSTMIA operation vfp_save_state() uses to store the FPU registers
to memory (in our test cases, when building the signal stack frame).
In this case, the kernel crashes as described above.
This patch fixes the problem by making sure that vfp_save_state() is
always entered with FPEXC.EX cleared. (The current value of FPEXC has
already been saved, so this does not corrupt the context. Clearing
FPEXC.EX has no effects on FPINST or FPINST2. Also note that many
callers already modify FPEXC by setting FPEXC.EN before invoking
vfp_save_state().)
This patch also addresses a second problem related to FPEXC.EX: After
returning from signal handling, the kernel reloads the VFP context
from the user mode stack. However, the current code explicitly clears
both FPEXC.EX and FPEXC.FP2V during reload. As VFP11 requires these
bits to be preserved, this patch disables clearing them for VFP
implementations belonging to architecture 1. There should be no
negative side effects: the user can set both bits by executing FPU
opcodes anyway, and while user code may now place arbitrary values
into FPINST and FPINST2 (e.g., non-VFP ARM opcodes) the VFP support
code knows which instructions can be emulated, and rejects other
opcodes with "unhandled bounce" messages, so there should be no
security impact from allowing reloading FPEXC.EX and FPEXC.FP2V.
Signed-off-by: Christopher Alexander Tobias Schulze <cat.schulze@alice-dsl.net>
At present there is no mechanism to specify driver load order,
which can lead to deferrals and repeated retries until successful.
Since this situation is expected, reduce the dmesg level to
INFO and mention that the operation will be retried.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
The VPU is responsible for managing the core clock, usually under
direction from the bcm2835-cpufreq driver but not via the clk-bcm2835
driver. Since the core frequency can change without warning, it is
safer to report the maximum clock rate to users of the core clock -
I2C, SPI and the mini UART - to err on the safe side when calculating
clock divisors.
If the DT node for the clock driver includes a reference to the
firmware node, use the firmware API to query the maximum core clock
instead of reading the divider registers.
Prior to this patch, a "100KHz" I2C bus was sometimes clocked at about
160KHz. In particular, switching to the 4.9 kernel was likely to break
SenseHAT usage on a Pi3.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
The claim-clocks property can be used to prevent PLLs and dividers
from being marked as critical. It contains a vector of clock IDs,
as defined by dt-bindings/clock/bcm2835.h.
Use this mechanism to claim PLLD_DSI0, PLLD_DSI1, PLLH_AUX and
PLLH_PIX for the vc4_kms_v3d driver.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
The VPU configures and relies on several PLLs and dividers. Mark all
enabled dividers and their PLLs as CRITICAL to prevent the kernel from
switching them off.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
The Raspberry Pi firmware looks at the RSTS register to know which
partition to boot from. The reboot syscall command
LINUX_REBOOT_CMD_RESTART2 supports passing in a string argument.
Add support for passing in a partition number 0..63 to boot from.
Partition 63 is a special partiton indicating halt.
If the partition doesn't exist, the firmware falls back to partition 0.
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
Load driver early since at least bcm2708_fb doesn't support deferred
probing and even if it did, we don't want the video driver deferred.
Support the legacy DMA API which is needed by bcm2708_fb.
Don't mask out channel 2.
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
An alternative strategy would be to use "rpi,spidev" instead, but that
would require many Raspberry Pi Device Tree changes.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
Add a duplicate irq range with an offset on the hwirq's so the
driver can detect that enable_fiq() is used.
Tested with downstream dwc_otg USB controller driver.
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Acked-by: Stephen Warren <swarren@wwwdotorg.org>
smsc95xx is adjusting truesize when it shouldn't, and following a recent patch from Eric this is now triggering warnings.
This patch stops smsc95xx from changing truesize.
Signed-off-by: Steve Glendinning <steve.glendinning@smsc.com>
This reverts commit 92516cd97f.
Thi commit "Bluetooth: Always request for user confirmation for Just
Works" prevents BLE devices pairing in (at least) the Raspberry Pi OS
GUI. After reverting it, pairing works again.
If another solution to the problem is found then this reversion will
be removed.
See: https://github.com/raspberrypi/linux/issues/4139
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
This reverts commit ffee202a78.
The commit "Bluetooth: Always request for user confirmation for Just
Works" prevents BLE devices pairing in (at least) the Raspberry Pi OS
GUI. After reverting it, pairing works again. Although this companion
commit ("... (LE SC)") has not been demonstrated to be problematic,
it follows the same logic and therefore could affect some use cases.
If another solution to the problem is found then this reversion will
be removed.
See: https://github.com/raspberrypi/linux/issues/4139
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
This reverts commit c7dacf5b0f.
The Pi 400 shutdown/poweroff mechanism relies on being able to set
a GPIO on the expander in the pm_power_off handler, something that
requires two mailbox calls - GET_GPIO_STATE and SET_GPIO_STATE. A
recent kernel change introduces a reasonable possibility that the
GET call doesn't completes, and bisecting led to a commit from
October that changes the timer usage of the mailbox.
My theory is that there is a race condition in the new code that breaks
the poll timer, but that it normally goes unnoticed because subsequent
mailbox activity wakes it up again. The power-off mailbox calls happen
at a time when other subsystems have been shut down, so if one of them
fails then there is nothing to allow it to recover.
See: https://github.com/raspberrypi/linux/issues/3941
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
commit 96cfe05051 upstream.
of_parse_thermal_zones() parses the thermal-zones node and registers a
thermal_zone device for each subnode. However, if a thermal zone is
consuming a thermal sensor and that thermal sensor device hasn't probed
yet, an attempt to set trip_point_*_temp for that thermal zone device
can cause a NULL pointer dereference. Fix it.
console:/sys/class/thermal/thermal_zone87 # echo 120000 > trip_point_0_temp
...
Unable to handle kernel NULL pointer dereference at virtual address 0000000000000020
...
Call trace:
of_thermal_set_trip_temp+0x40/0xc4
trip_point_temp_store+0xc0/0x1dc
dev_attr_store+0x38/0x88
sysfs_kf_write+0x64/0xc0
kernfs_fop_write_iter+0x108/0x1d0
vfs_write+0x2f4/0x368
ksys_write+0x7c/0xec
__arm64_sys_write+0x20/0x30
el0_svc_common.llvm.7279915941325364641+0xbc/0x1bc
do_el0_svc+0x28/0xa0
el0_svc+0x14/0x24
el0_sync_handler+0x88/0xec
el0_sync+0x1c0/0x200
While at it, fix the possible NULL pointer dereference in other
functions as well: of_thermal_get_temp(), of_thermal_set_emul_temp(),
of_thermal_get_trend().
Suggested-by: David Collins <quic_collinsd@quicinc.com>
Signed-off-by: Subbaraman Narayanamurthy <quic_subbaram@quicinc.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4716023a8f upstream.
PEBS PERF_SAMPLE_PHYS_ADDR events use perf_virt_to_phys() to convert PMU
sampled virtual addresses to physical using get_user_page_fast_only()
and page_to_phys().
Some get_user_page_fast_only() error cases return false, indicating no
page reference, but still initialize the output page pointer with an
unreferenced page. In these error cases perf_virt_to_phys() calls
put_page(). This causes page reference count underflow, which can lead
to unintentional page sharing.
Fix perf_virt_to_phys() to only put_page() if get_user_page_fast_only()
returns a referenced page.
Fixes: fc7ce9c74c ("perf/core, x86: Add PERF_SAMPLE_PHYS_ADDR")
Signed-off-by: Greg Thelen <gthelen@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20211111021814.757086-1-gthelen@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3ec18fc783 upstream.
commit 8779e05ba8 ("parisc: Fix ptrace check on syscall return")
fixed testing of TI_FLAGS. This uncovered a bug in the test mask.
syscall_restore_rfi is only used when the kernel needs to exit to
usespace with single or block stepping and the recovery counter
enabled. The test however used _TIF_SYSCALL_TRACE_MASK, which
includes a lot of bits that shouldn't be tested here.
Fix this by using TIF_SINGLESTEP and TIF_BLOCKSTEP directly.
I encountered this bug by enabling syscall tracepoints. Both in qemu and
on real hardware. As soon as i enabled the tracepoint (sys_exit_read,
but i guess it doesn't really matter which one), i got random page
faults in userspace almost immediately.
Signed-off-by: Sven Schnelle <svens@stackframe.org>
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 964b7aa0b0 upstream.
In 64-bit mode, x86 instruction encoding allows us to use the low 8 bits
of any GPR as an 8-bit operand. In 32-bit mode, however, we can only use
the [abcd] registers. For which, GCC has the "q" constraint instead of
the less restrictive "r".
Also fix st->preempted, which is an input/output operand rather than an
input.
Fixes: 7e2175ebd6 ("KVM: x86: Fix recording of guest steal time / preempted status")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Message-Id: <89bf72db1b859990355f9c40713a34e0d2d86c98.camel@infradead.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d55c3ee6b4 upstream.
Commit a4b83deb3e ("media: videobuf2: rework vb2_mem_ops API")
added a new vb member to struct vb2_dma_sg_buf, but it only added
code setting this to the vb2_dma_sg_alloc() function and not to the
vb2_dma_sg_get_userptr() and vb2_dma_sg_attach_dmabuf() which also
create vb2_dma_sg_buf objects.
This is causing a crash due to a NULL pointer deref when using
libcamera on devices with an Intel IPU3 (qcam app).
Fix these crashes by assigning buf->vb in the other 2 functions too,
note libcamera tests the vb2_dma_sg_get_userptr() path, the change
to the vb2_dma_sg_attach_dmabuf() path is untested.
Fixes: a4b83deb3e ("media: videobuf2: rework vb2_mem_ops API")
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 541ac97186 upstream.
The size of the exception stacks was increased by the commit in Fixes,
resulting in stack sizes greater than a page in size. The #VC exception
handling was only mapping the first (bottom) page, resulting in an
SEV-ES guest failing to boot.
Make the #VC exception stacks part of the default exception stacks
storage and allocate them with a CONFIG_AMD_MEM_ENCRYPT=y .config. Map
them only when a SEV-ES guest has been detected.
Rip out the custom VC stacks mapping and storage code.
[ bp: Steal and adapt Tom's commit message. ]
Fixes: 7fae4c24a2 ("x86: Increase exception stack sizes")
Signed-off-by: Borislav Petkov <bp@suse.de>
Tested-by: Tom Lendacky <thomas.lendacky@amd.com>
Tested-by: Brijesh Singh <brijesh.singh@amd.com>
Link: https://lkml.kernel.org/r/YVt1IMjIs7pIZTRR@zn.tnic
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit cb181da161 upstream.
The new function validate_hash_algo() assumed that ima_get_hash_algo()
always return a valid 'enum hash_algo', but it returned the
user-supplied value present in the digital signature without
any bounds checks.
Update ima_get_hash_algo() to always return a valid hash algorithm,
defaulting on 'ima_hash_algo' when the user-supplied value inside
the xattr is invalid.
Signed-off-by: THOBY Simon <Simon.THOBY@viveris.fr>
Reported-by: syzbot+e8bafe7b82c739eaf153@syzkaller.appspotmail.com
Fixes: 50f742dd91 ("IMA: block writes of the security.ima xattr with unsupported algorithms")
Reviewed-by: Lakshmi Ramasubramanian <nramas@linux.microsoft.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a20eac0af0 upstream.
Previous fix aded bpf_clamp_umax() helper use to re-validate boundaries.
While that works correctly, it introduces more branches, which blows up
past 1 million instructions in no-alu32 variant of strobemeta selftests.
Switching len variable from u32 to u64 also fixes the issue and reduces
the number of validated instructions, so use that instead. Fix this
patch and bpf_clamp_umax() removed, both alu32 and no-alu32 selftests
pass.
Fixes: 0133c20480 ("selftests/bpf: Fix strobemeta selftest regression")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211101230118.1273019-1-andrii@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a72fdfd21e upstream.
Commit in Fixes changed the iopl emulation to not #GP on CLI and STI
because it would break some insane luserspace tools which would toggle
interrupts.
The corresponding selftest would rely on the fact that executing CLI/STI
would trigger a #GP and thus detect it this way but since that #GP is
not happening anymore, the detection is now wrong too.
Extend the test to actually look at the IF flag and whether executing
those insns had any effect on it. The STI detection needs to have the
fact that interrupts were previously disabled, passed in so do that from
the previous CLI test, i.e., STI test needs to follow a previous CLI one
for it to make sense.
Fixes: b968e84b50 ("x86/iopl: Fake iopl(3) CLI/STI usage")
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20211030083939.13073-1-bp@alien8.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0eab756f88 upstream.
There are several error return paths that dereference the null pointer
host because the pointer has not yet been set to a valid value.
Fix this by adding a new out_mmc label and exiting via this label
to avoid the host clean up and hence the null pointer dereference.
Addresses-Coverity: ("Explicit null dereference")
Fixes: 8105c2abbf ("mmc: moxart: Fix reference count leaks in moxart_probe")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Link: https://lore.kernel.org/r/20211013100052.125461-1-colin.king@canonical.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 937e79c677 upstream.
Using a kernel pointer in place of a dma_addr_t token can
lead to undefined behavior if that makes it into cache
management functions. The compiler caught one such attempt
in a cast:
drivers/net/wireless/ath/ath10k/mac.c: In function 'ath10k_add_interface':
drivers/net/wireless/ath/ath10k/mac.c:5586:47: error: cast from pointer to integer of different size [-Werror=pointer-to-int-cast]
5586 | arvif->beacon_paddr = (dma_addr_t)arvif->beacon_buf;
| ^
Looking through how this gets used down the way, I'm fairly
sure that beacon_paddr is never accessed again for ATH10K_DEV_TYPE_HL
devices, and if it was accessed, that would be a bug.
Change the assignment to use a known-invalid address token
instead, which avoids the warning and makes it easier to catch
bugs if it does end up getting used.
Fixes: e263bdab9c ("ath10k: high latency fixes for beacon buffer")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20211014075153.3655910-1-arnd@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 112024a3b6 upstream.
Adding kfree(dvb) to vidtv_bridge_remove() will remove the memory
too soon: if an application still has an open filehandle to the device
when the driver is unloaded, then when that filehandle is closed, a
use-after-free access takes place to the freed memory.
Move the kfree(dvb) to vidtv_bridge_dev_release() instead.
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Fixes: 76e21bb8be ("media: vidtv: Fix memory leak in remove")
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ea7a1019d8 upstream.
The premise of commit 6f9f17287e ("SUNRPC: Mitigate cond_resched() in
xprt_transmit()") was that cond_resched() is expensive and unnecessary
when there has been just a single send.
The point of cond_resched() is to ensure that tasks that should pre-empt
this one get a chance to do so when it is safe to do so. The code prior
to commit 6f9f17287e failed to take into account that it was keeping a
rpc_task pinned for longer than it needed to, and so rather than doing a
full revert, let's just move the cond_resched.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a4e17d65da upstream.
Change PCIe Max Payload Size setting in PCIe Device Control register to 512
bytes to align with PCIe Link Initialization sequence as defined in Marvell
Armada 3700 Functional Specification. According to the specification,
maximal Max Payload Size supported by this device is 512 bytes.
Without this kernel prints suspicious line:
pci 0000:01:00.0: Upstream bridge's Max Payload Size set to 256 (was 16384, max 512)
With this change it changes to:
pci 0000:01:00.0: Upstream bridge's Max Payload Size set to 256 (was 512, max 512)
Link: https://lore.kernel.org/r/20211005180952.6812-3-kabel@kernel.org
Fixes: 8c39d71036 ("PCI: aardvark: Add Aardvark PCI host controller driver")
Signed-off-by: Pali Rohár <pali@kernel.org>
Signed-off-by: Marek Behún <kabel@kernel.org>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Reviewed-by: Marek Behún <kabel@kernel.org>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c45361abb9 upstream.
When CONFIG_SMP=y, timebase synchronization is required when the second
kernel is started.
arch/powerpc/kernel/smp.c:
int __cpu_up(unsigned int cpu, struct task_struct *tidle)
{
...
if (smp_ops->give_timebase)
smp_ops->give_timebase();
...
}
void start_secondary(void *unused)
{
...
if (smp_ops->take_timebase)
smp_ops->take_timebase();
...
}
When CONFIG_HOTPLUG_CPU=n and CONFIG_KEXEC_CORE=n,
smp_85xx_ops.give_timebase is NULL,
smp_85xx_ops.take_timebase is NULL,
As a result, the timebase is not synchronized.
Timebase synchronization does not depend on CONFIG_HOTPLUG_CPU.
Fixes: 56f1ba2807 ("powerpc/mpc85xx: refactor the PM operations")
Cc: stable@vger.kernel.org # v4.6+
Signed-off-by: Xiaoming Ni <nixiaoming@huawei.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210929033646.39630-3-nixiaoming@huawei.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 319fa1a52e upstream.
On VMs with NX encryption, compression, and/or RNG offload, these
capabilities are described by nodes in the ibm,platform-facilities device
tree hierarchy:
$ tree -d /sys/firmware/devicetree/base/ibm,platform-facilities/
/sys/firmware/devicetree/base/ibm,platform-facilities/
├── ibm,compression-v1
├── ibm,random-v1
└── ibm,sym-encryption-v1
3 directories
The acceleration functions that these nodes describe are not disrupted by
live migration, not even temporarily.
But the post-migration ibm,update-nodes sequence firmware always sends
"delete" messages for this hierarchy, followed by an "add" directive to
reconstruct it via ibm,configure-connector (log with debugging statements
enabled in mobility.c):
mobility: removing node /ibm,platform-facilities/ibm,random-v1:4294967285
mobility: removing node /ibm,platform-facilities/ibm,compression-v1:4294967284
mobility: removing node /ibm,platform-facilities/ibm,sym-encryption-v1:4294967283
mobility: removing node /ibm,platform-facilities:4294967286
...
mobility: added node /ibm,platform-facilities:4294967286
Note we receive a single "add" message for the entire hierarchy, and what
we receive from the ibm,configure-connector sequence is the top-level
platform-facilities node along with its three children. The debug message
simply reports the parent node and not the whole subtree.
Also, significantly, the nodes added are almost completely equivalent to
the ones removed; even phandles are unchanged. ibm,shared-interrupt-pool in
the leaf nodes is the only property I've observed to differ, and Linux does
not use that. So in practice, the sum of update messages Linux receives for
this hierarchy is equivalent to minor property updates.
We succeed in removing the original hierarchy from the device tree. But the
vio bus code is ignorant of this, and does not unbind or relinquish its
references. The leaf nodes, still reachable through sysfs, of course still
refer to the now-freed ibm,platform-facilities parent node, which makes
use-after-free possible:
refcount_t: addition on 0; use-after-free.
WARNING: CPU: 3 PID: 1706 at lib/refcount.c:25 refcount_warn_saturate+0x164/0x1f0
refcount_warn_saturate+0x160/0x1f0 (unreliable)
kobject_get+0xf0/0x100
of_node_get+0x30/0x50
of_get_parent+0x50/0xb0
of_fwnode_get_parent+0x54/0x90
fwnode_count_parents+0x50/0x150
fwnode_full_name_string+0x30/0x110
device_node_string+0x49c/0x790
vsnprintf+0x1c0/0x4c0
sprintf+0x44/0x60
devspec_show+0x34/0x50
dev_attr_show+0x40/0xa0
sysfs_kf_seq_show+0xbc/0x200
kernfs_seq_show+0x44/0x60
seq_read_iter+0x2a4/0x740
kernfs_fop_read_iter+0x254/0x2e0
new_sync_read+0x120/0x190
vfs_read+0x1d0/0x240
Moreover, the "new" replacement subtree is not correctly added to the
device tree, resulting in ibm,platform-facilities parent node without the
appropriate leaf nodes, and broken symlinks in the sysfs device hierarchy:
$ tree -d /sys/firmware/devicetree/base/ibm,platform-facilities/
/sys/firmware/devicetree/base/ibm,platform-facilities/
0 directories
$ cd /sys/devices/vio ; find . -xtype l -exec file {} +
./ibm,sym-encryption-v1/of_node: broken symbolic link to
../../../firmware/devicetree/base/ibm,platform-facilities/ibm,sym-encryption-v1
./ibm,random-v1/of_node: broken symbolic link to
../../../firmware/devicetree/base/ibm,platform-facilities/ibm,random-v1
./ibm,compression-v1/of_node: broken symbolic link to
../../../firmware/devicetree/base/ibm,platform-facilities/ibm,compression-v1
This is because add_dt_node() -> dlpar_attach_node() attaches only the
parent node returned from configure-connector, ignoring any children. This
should be corrected for the general case, but fixing that won't help with
the stale OF node references, which is the more urgent problem.
One way to address that would be to make the drivers respond to node
removal notifications, so that node references can be dropped
appropriately. But this would likely force the drivers to disrupt active
clients for no useful purpose: equivalent nodes are immediately re-added.
And recall that the acceleration capabilities described by the nodes remain
available throughout the whole process.
The solution I believe to be robust for this situation is to convert
remove+add of a node with an unchanged phandle to an update of the node's
properties in the Linux device tree structure. That would involve changing
and adding a fair amount of code, and may take several iterations to land.
Until that can be realized we have a confirmed use-after-free and the
possibility of memory corruption. So add a limited workaround that
discriminates on the node type, ignoring adds and removes. This should be
amenable to backporting in the meantime.
Fixes: 410bccf978 ("powerpc/pseries: Partition migration in the kernel")
Cc: stable@vger.kernel.org
Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20211020194703.2613093-1-nathanl@linux.ibm.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4a5cb51f3d upstream.
The check_return_regs_valid() can cause a false positive if the return
regs are marked as norestart and they are an HSRR type interrupt,
because the low bit in the bottom of regs->trap causes interrupt type
matching to fail.
This can occcur for example on bare metal with a HV privileged doorbell
interrupt that causes a signal, but do_signal returns early because
get_signal() fails, and takes the "No signal to deliver" path. In this
case no signal was delivered so the return location is not changed so
return SRRs are not invalidated, yet set_trap_norestart is called, which
messes up the match. Building go-1.16.6 is known to reproduce this.
Fix it by using the TRAP() accessor which masks out the low bit.
Fixes: 6eaaf9de35 ("powerpc/64s/interrupt: Check and fix srr_valid without crashing")
Cc: stable@vger.kernel.org # v5.14+
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20211026122531.3599918-1-npiggin@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3c12b4df8d upstream.
The mitigation-patching.sh script in the powerpc selftests toggles
all mitigations on and off simultaneously, revealing that rfi_flush
and stf_barrier cannot safely operate at the same time due to races
in updating the static key.
On some systems, the static key code throws a warning and the kernel
remains functional. On others, the kernel will hang or crash.
Fix this by slapping on a mutex.
Fixes: 13799748b9 ("powerpc/64: use interrupt restart table to speed up return from interrupt")
Cc: stable@vger.kernel.org # v5.14+
Signed-off-by: Russell Currey <ruscur@russell.cc>
Acked-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20211027072410.40950-1-ruscur@russell.cc
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 81291383ff upstream.
A e5500 machine running a 32-bit kernel sometimes hangs at boot,
seemingly going into an infinite loop of instruction storage interrupts.
The ESR (Exception Syndrome Register) has a value of 0x800000 (store)
when this happens, which is likely set by a previous store. An
instruction TLB miss interrupt would then leave ESR unchanged, and if no
PTE exists it calls directly to the instruction storage interrupt
handler without changing ESR.
access_error() does not cause a segfault due to a store to a read-only
vma because is_exec is true. Most subsequent fault handling does not
check for a write fault on a read-only vma, and might do strange things
like create a writeable PTE or call page_mkwrite on a read only vma or
file. It's not clear what happens here to cause the infinite faulting in
this case, a fault handler failure or low level PTE or TLB handling.
In any case this can be fixed by having the instruction storage
interrupt zero regs->dsisr rather than storing the ESR value to it.
Fixes: a01a3f2ddb ("powerpc: remove arguments from fault handler functions")
Cc: stable@vger.kernel.org # v5.12+
Reported-by: Jacques de Laval <jacques.delaval@protonmail.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Tested-by: Jacques de Laval <jacques.delaval@protonmail.com>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20211028133043.4159501-1-npiggin@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 44a8214de9 upstream.
Running program with bpf-to-bpf function calls results in data access
exception (0x300) with the below call trace:
bpf_int_jit_compile+0x238/0x750 (unreliable)
bpf_check+0x2008/0x2710
bpf_prog_load+0xb00/0x13a0
__sys_bpf+0x6f4/0x27c0
sys_bpf+0x2c/0x40
system_call_exception+0x164/0x330
system_call_vectored_common+0xe8/0x278
as bpf_int_jit_compile() tries writing to write protected JIT code
location during the extra pass.
Fix it by holding off write protection of JIT code until the extra
pass, where branch target addresses fixup happens.
Fixes: 62e3d4210a ("powerpc/bpf: Write protect JIT code")
Cc: stable@vger.kernel.org # v5.14+
Signed-off-by: Hari Bathini <hbathini@linux.ibm.com>
Reviewed-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20211025055649.114728-1-hbathini@linux.ibm.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7e3cdba176 upstream.
Following the introduction of the generic ECC engine infrastructure, it
was necessary to reorganize the code and move the ECC configuration in
the ->attach_chip() hook. Failing to do that properly lead to a first
series of fixes supposed to stabilize the situation. Unfortunately, this
only fixed the use of software ECC engines, preventing any other kind of
engine to be used, including on-die ones.
It is now time to (finally) fix the situation by ensuring that we still
provide a default (eg. software ECC) but will still support different
ECC engines such as on-die ECC engines if properly described in the
device tree.
There are no changes needed on the core side in order to do this, but we
just need to leverage the logic there which allows:
1- a subsystem default (set to Host engines in the raw NAND world)
2- a driver specific default (here set to software ECC engines)
3- any type of engine requested by the user (ie. described in the DT)
As the raw NAND subsystem has not yet been fully converted to the ECC
engine infrastructure, in order to provide a default ECC engine for this
driver we need to set chip->ecc.engine_type *before* calling
nand_scan(). During the initialization step, the core will consider this
entry as the default engine for this driver. This value may of course
be overloaded by the user if the usual DT properties are provided.
Fixes: dbffc8ccdf ("mtd: rawnand: au1550: Move the ECC initialization to ->attach_chip()")
Cc: stable@vger.kernel.org
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Link: https://lore.kernel.org/linux-mtd/20210928222258.199726-3-miquel.raynal@bootlin.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 325fd539fc upstream.
Following the introduction of the generic ECC engine infrastructure, it
was necessary to reorganize the code and move the ECC configuration in
the ->attach_chip() hook. Failing to do that properly lead to a first
series of fixes supposed to stabilize the situation. Unfortunately, this
only fixed the use of software ECC engines, preventing any other kind of
engine to be used, including on-die ones.
It is now time to (finally) fix the situation by ensuring that we still
provide a default (eg. software ECC) but will still support different
ECC engines such as on-die ECC engines if properly described in the
device tree.
There are no changes needed on the core side in order to do this, but we
just need to leverage the logic there which allows:
1- a subsystem default (set to Host engines in the raw NAND world)
2- a driver specific default (here set to software ECC engines)
3- any type of engine requested by the user (ie. described in the DT)
As the raw NAND subsystem has not yet been fully converted to the ECC
engine infrastructure, in order to provide a default ECC engine for this
driver we need to set chip->ecc.engine_type *before* calling
nand_scan(). During the initialization step, the core will consider this
entry as the default engine for this driver. This value may of course
be overloaded by the user if the usual DT properties are provided.
Fixes: 612e048e6a ("mtd: rawnand: plat_nand: Move the ECC initialization to ->attach_chip()")
Cc: stable@vger.kernel.org
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Link: https://lore.kernel.org/linux-mtd/20210928222258.199726-8-miquel.raynal@bootlin.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 194ac63de6 upstream.
Following the introduction of the generic ECC engine infrastructure, it
was necessary to reorganize the code and move the ECC configuration in
the ->attach_chip() hook. Failing to do that properly lead to a first
series of fixes supposed to stabilize the situation. Unfortunately, this
only fixed the use of software ECC engines, preventing any other kind of
engine to be used, including on-die ones.
It is now time to (finally) fix the situation by ensuring that we still
provide a default (eg. software ECC) but will still support different
ECC engines such as on-die ECC engines if properly described in the
device tree.
There are no changes needed on the core side in order to do this, but we
just need to leverage the logic there which allows:
1- a subsystem default (set to Host engines in the raw NAND world)
2- a driver specific default (here set to software ECC engines)
3- any type of engine requested by the user (ie. described in the DT)
As the raw NAND subsystem has not yet been fully converted to the ECC
engine infrastructure, in order to provide a default ECC engine for this
driver we need to set chip->ecc.engine_type *before* calling
nand_scan(). During the initialization step, the core will consider this
entry as the default engine for this driver. This value may of course
be overloaded by the user if the usual DT properties are provided.
Fixes: 553508cec2 ("mtd: rawnand: orion: Move the ECC initialization to ->attach_chip()")
Cc: stable@vger.kernel.org
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Link: https://lore.kernel.org/linux-mtd/20210928222258.199726-6-miquel.raynal@bootlin.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f16b7d2a5e upstream.
Following the introduction of the generic ECC engine infrastructure, it
was necessary to reorganize the code and move the ECC configuration in
the ->attach_chip() hook. Failing to do that properly lead to a first
series of fixes supposed to stabilize the situation. Unfortunately, this
only fixed the use of software ECC engines, preventing any other kind of
engine to be used, including on-die ones.
It is now time to (finally) fix the situation by ensuring that we still
provide a default (eg. software ECC) but will still support different
ECC engines such as on-die ECC engines if properly described in the
device tree.
There are no changes needed on the core side in order to do this, but we
just need to leverage the logic there which allows:
1- a subsystem default (set to Host engines in the raw NAND world)
2- a driver specific default (here set to software ECC engines)
3- any type of engine requested by the user (ie. described in the DT)
As the raw NAND subsystem has not yet been fully converted to the ECC
engine infrastructure, in order to provide a default ECC engine for this
driver we need to set chip->ecc.engine_type *before* calling
nand_scan(). During the initialization step, the core will consider this
entry as the default engine for this driver. This value may of course
be overloaded by the user if the usual DT properties are provided.
Fixes: 8fc6f1f042 ("mtd: rawnand: pasemi: Move the ECC initialization to ->attach_chip()")
Cc: stable@vger.kernel.org
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Link: https://lore.kernel.org/linux-mtd/20210928222258.199726-7-miquel.raynal@bootlin.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b5b5b4dc6f upstream.
Following the introduction of the generic ECC engine infrastructure, it
was necessary to reorganize the code and move the ECC configuration in
the ->attach_chip() hook. Failing to do that properly lead to a first
series of fixes supposed to stabilize the situation. Unfortunately, this
only fixed the use of software ECC engines, preventing any other kind of
engine to be used, including on-die ones.
It is now time to (finally) fix the situation by ensuring that we still
provide a default (eg. software ECC) but will still support different
ECC engines such as on-die ECC engines if properly described in the
device tree.
There are no changes needed on the core side in order to do this, but we
just need to leverage the logic there which allows:
1- a subsystem default (set to Host engines in the raw NAND world)
2- a driver specific default (here set to software ECC engines)
3- any type of engine requested by the user (ie. described in the DT)
As the raw NAND subsystem has not yet been fully converted to the ECC
engine infrastructure, in order to provide a default ECC engine for this
driver we need to set chip->ecc.engine_type *before* calling
nand_scan(). During the initialization step, the core will consider this
entry as the default engine for this driver. This value may of course
be overloaded by the user if the usual DT properties are provided.
Fixes: f6341f6448 ("mtd: rawnand: gpio: Move the ECC initialization to ->attach_chip()")
Cc: stable@vger.kernel.org
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Link: https://lore.kernel.org/linux-mtd/20210928222258.199726-4-miquel.raynal@bootlin.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f9d8570b7f upstream.
Following the introduction of the generic ECC engine infrastructure, it
was necessary to reorganize the code and move the ECC configuration in
the ->attach_chip() hook. Failing to do that properly lead to a first
series of fixes supposed to stabilize the situation. Unfortunately, this
only fixed the use of software ECC engines, preventing any other kind of
engine to be used, including on-die ones.
It is now time to (finally) fix the situation by ensuring that we still
provide a default (eg. software ECC) but will still support different
ECC engines such as on-die ECC engines if properly described in the
device tree.
There are no changes needed on the core side in order to do this, but we
just need to leverage the logic there which allows:
1- a subsystem default (set to Host engines in the raw NAND world)
2- a driver specific default (here set to software ECC engines)
3- any type of engine requested by the user (ie. described in the DT)
As the raw NAND subsystem has not yet been fully converted to the ECC
engine infrastructure, in order to provide a default ECC engine for this
driver we need to set chip->ecc.engine_type *before* calling
nand_scan(). During the initialization step, the core will consider this
entry as the default engine for this driver. This value may of course
be overloaded by the user if the usual DT properties are provided.
Fixes: 6dd09f775b ("mtd: rawnand: mpc5121: Move the ECC initialization to ->attach_chip()")
Cc: stable@vger.kernel.org
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Link: https://lore.kernel.org/linux-mtd/20210928222258.199726-5-miquel.raynal@bootlin.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6bcd2960af upstream.
Following the introduction of the generic ECC engine infrastructure, it
was necessary to reorganize the code and move the ECC configuration in
the ->attach_chip() hook. Failing to do that properly lead to a first
series of fixes supposed to stabilize the situation. Unfortunately, this
only fixed the use of software ECC engines, preventing any other kind of
engine to be used, including on-die ones.
It is now time to (finally) fix the situation by ensuring that we still
provide a default (eg. software ECC) but will still support different
ECC engines such as on-die ECC engines if properly described in the
device tree.
There are no changes needed on the core side in order to do this, but we
just need to leverage the logic there which allows:
1- a subsystem default (set to Host engines in the raw NAND world)
2- a driver specific default (here set to software ECC engines)
3- any type of engine requested by the user (ie. described in the DT)
As the raw NAND subsystem has not yet been fully converted to the ECC
engine infrastructure, in order to provide a default ECC engine for this
driver we need to set chip->ecc.engine_type *before* calling
nand_scan(). During the initialization step, the core will consider this
entry as the default engine for this driver. This value may of course
be overloaded by the user if the usual DT properties are provided.
Fixes: d525914b5b ("mtd: rawnand: xway: Move the ECC initialization to ->attach_chip()")
Cc: stable@vger.kernel.org
Cc: Jan Hoffmann <jan@3e8.eu>
Cc: Kestrel seventyfour <kestrelseventyfour@gmail.com>
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Tested-by: Jan Hoffmann <jan@3e8.eu>
Link: https://lore.kernel.org/linux-mtd/20210928222258.199726-10-miquel.raynal@bootlin.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d707bb74da upstream.
Following the introduction of the generic ECC engine infrastructure, it
was necessary to reorganize the code and move the ECC configuration in
the ->attach_chip() hook. Failing to do that properly lead to a first
series of fixes supposed to stabilize the situation. Unfortunately, this
only fixed the use of software ECC engines, preventing any other kind of
engine to be used, including on-die ones.
It is now time to (finally) fix the situation by ensuring that we still
provide a default (eg. software ECC) but will still support different
ECC engines such as on-die ECC engines if properly described in the
device tree.
There are no changes needed on the core side in order to do this, but we
just need to leverage the logic there which allows:
1- a subsystem default (set to Host engines in the raw NAND world)
2- a driver specific default (here set to software ECC engines)
3- any type of engine requested by the user (ie. described in the DT)
As the raw NAND subsystem has not yet been fully converted to the ECC
engine infrastructure, in order to provide a default ECC engine for this
driver we need to set chip->ecc.engine_type *before* calling
nand_scan(). During the initialization step, the core will consider this
entry as the default engine for this driver. This value may of course
be overloaded by the user if the usual DT properties are provided.
Fixes: 59d9347332 ("mtd: rawnand: ams-delta: Move the ECC initialization to ->attach_chip()")
Cc: stable@vger.kernel.org
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Link: https://lore.kernel.org/linux-mtd/20210928222258.199726-2-miquel.raynal@bootlin.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9be1446ece upstream.
The introduction of the generic ECC engine API lead to a number of
changes in various drivers which broke some of them. Here is a typical
example: I expected the SM_ORDER option to be handled by the Hamming ECC
engine internals. Problem: the fsmc driver does not instantiate (yet) a
real ECC engine object so we had to use a 'bare' ECC helper instead of
the shiny rawnand functions. However, when not intializing this engine
properly and using the bare helpers, we do not get the SM ORDER feature
handled automatically. It looks like this was lost in the process so
let's ensure we use the right SM ORDER now.
Fixes: ad9ffdce45 ("mtd: rawnand: fsmc: Fix external use of SW Hamming ECC helper")
Cc: stable@vger.kernel.org
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Link: https://lore.kernel.org/linux-mtd/20210928221507.199198-2-miquel.raynal@bootlin.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ad9a145172 upstream.
Since commit 48720ba568 ("virtio/s390: use DMA memory for ccw I/O and
classic notifiers") we were supposed to make sure that
virtio_ccw_release_dev() completes before the ccw device and the
attached dma pool are torn down, but unfortunately we did not. Before
that commit it used to be OK to delay cleaning up the memory allocated
by virtio-ccw indefinitely (which isn't really intuitive for guys used
to destruction happens in reverse construction order), but now we
trigger a BUG_ON if the genpool is destroyed before all memory allocated
from it is deallocated. Which brings down the guest. We can observe this
problem, when unregister_virtio_device() does not give up the last
reference to the virtio_device (e.g. because a virtio-scsi attached scsi
disk got removed without previously unmounting its previously mounted
partition).
To make sure that the genpool is only destroyed after all the necessary
freeing is done let us take a reference on the ccw device on each
ccw_device_dma_zalloc() and give it up on each ccw_device_dma_free().
Actually there are multiple approaches to fixing the problem at hand
that can work. The upside of this one is that it is the safest one while
remaining simple. We don't crash the guest even if the driver does not
pair allocations and frees. The downside is the reference counting
overhead, that the reference counting for ccw devices becomes more
complex, in a sense that we need to pair the calls to the aforementioned
functions for it to be correct, and that if we happen to leak, we leak
more than necessary (the whole ccw device instead of just the genpool).
Some alternatives to this approach are taking a reference in
virtio_ccw_online() and giving it up in virtio_ccw_release_dev() or
making sure virtio_ccw_release_dev() completes its work before
virtio_ccw_remove() returns. The downside of these approaches is that
these are less safe against programming errors.
Cc: <stable@vger.kernel.org> # v5.3
Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
Fixes: 48720ba568 ("virtio/s390: use DMA memory for ccw I/O and classic notifiers")
Reported-by: bfu@redhat.com
Reviewed-by: Vineeth Vijayan <vneethv@linux.ibm.com>
Acked-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3826350e6d upstream.
When a queue is switched to soft offline during heavy load and later
switched to soft online again and now used, it may be that the caller
is blocked forever in the ioctl call.
The failure occurs because there is a pending reply after the queue(s)
have been switched to offline. This orphaned reply is received when
the queue is switched to online and is accidentally counted for the
outstanding replies. So when there was a valid outstanding reply and
this orphaned reply is received it counts as the outstanding one thus
dropping the outstanding counter to 0. Voila, with this counter the
receive function is not called any more and the real outstanding reply
is never received (until another request comes in...) and the ioctl
blocks.
The fix is simple. However, instead of readjusting the counter when an
orphaned reply is detected, I check the queue status for not empty and
compare this to the outstanding counter. So if the queue is not empty
then the counter must not drop to 0 but at least have a value of 1.
Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Cc: stable@vger.kernel.org
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 213fca9e23 upstream.
commit 9c6c273aa4 ("timer: Remove init_timer_on_stack() in favor
of timer_setup_on_stack()") changed the timer setup from
init_timer_on_stack(() to timer_setup(), but missed to change the
mod_timer() call. And while at it, use msecs_to_jiffies() instead
of the open coded timeout calculation.
Cc: stable@vger.kernel.org
Fixes: 9c6c273aa4 ("timer: Remove init_timer_on_stack() in favor of timer_setup_on_stack()")
Signed-off-by: Sven Schnelle <svens@linux.ibm.com>
Reviewed-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9d48c7afed upstream.
When a CPU is hotplugged while the perf stat -e cycles command is
running, a wrong (very large) value is displayed immediately after the
CPU removal:
Check the values, shouldn't be too high as in
time counts unit events
1.001101919 29261846 cycles
2.002454499 17523405 cycles
3.003659292 24361161 cycles
4.004816983 18446744073638406144 cycles
5.005671647 <not counted> cycles
...
The CPU hotplug off took place after 3 seconds.
The issue is the read of the event count value after 4 seconds when
the CPU is not available and the read of the counter returns an
error. This is treated as a counter value of zero. This results
in a very large value (0 - previous_value).
Fix this by detecting the hotplugged off CPU and report 0 instead
of a very large number.
Cc: stable@vger.kernel.org
Fixes: a029a4eab3 ("s390/cpumf: Allow concurrent access for CPU Measurement Counter Facility")
Reported-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Reviewed-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 33a5471f8d upstream.
The note in c2adda27d2 ("video: backlight: Add of_find_backlight helper
in backlight.c") says that gpio-backlight uses brightness as power state.
This has been fixed since in ec665b756e ("backlight: gpio-backlight:
Correct initial power state handling") and other backlight drivers do not
require this workaround. Drop the workaround.
This fixes the case where e.g. pwm-backlight can perfectly well be set to
brightness 0 on boot in DT, which without this patch leads to the display
brightness to be max instead of off.
Fixes: c2adda27d2 ("video: backlight: Add of_find_backlight helper in backlight.c")
Cc: <stable@vger.kernel.org> # 5.4+
Cc: <stable@vger.kernel.org> # 4.19.x: ec665b756e: backlight: gpio-backlight: Correct initial power state handling
Signed-off-by: Marek Vasut <marex@denx.de>
Acked-by: Noralf Trønnes <noralf@tronnes.org>
Reviewed-by: Daniel Thompson <daniel.thompson@linaro.org>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 60e2793d44 upstream.
Any allocation failure during the #PF path will return with VM_FAULT_OOM
which in turn results in pagefault_out_of_memory. This can happen for 2
different reasons. a) Memcg is out of memory and we rely on
mem_cgroup_oom_synchronize to perform the memcg OOM handling or b)
normal allocation fails.
The latter is quite problematic because allocation paths already trigger
out_of_memory and the page allocator tries really hard to not fail
allocations. Anyway, if the OOM killer has been already invoked there
is no reason to invoke it again from the #PF path. Especially when the
OOM condition might be gone by that time and we have no way to find out
other than allocate.
Moreover if the allocation failed and the OOM killer hasn't been invoked
then we are unlikely to do the right thing from the #PF context because
we have already lost the allocation context and restictions and
therefore might oom kill a task from a different NUMA domain.
This all suggests that there is no legitimate reason to trigger
out_of_memory from pagefault_out_of_memory so drop it. Just to be sure
that no #PF path returns with VM_FAULT_OOM without allocation print a
warning that this is happening before we restart the #PF.
[VvS: #PF allocation can hit into limit of cgroup v1 kmem controller.
This is a local problem related to memcg, however, it causes unnecessary
global OOM kills that are repeated over and over again and escalate into a
real disaster. This has been broken since kmem accounting has been
introduced for cgroup v1 (3.8). There was no kmem specific reclaim for
the separate limit so the only way to handle kmem hard limit was to return
with ENOMEM. In upstream the problem will be fixed by removing the
outdated kmem limit, however stable and LTS kernels cannot do it and are
still affected. This patch fixes the problem and should be backported
into stable/LTS.]
Link: https://lkml.kernel.org/r/f5fd8dd8-0ad4-c524-5f65-920b01972a42@virtuozzo.com
Signed-off-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Roman Gushchin <guro@fb.com>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Cc: Uladzislau Rezki <urezki@gmail.com>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0b28179a61 upstream.
Patch series "memcg: prohibit unconditional exceeding the limit of dying tasks", v3.
Memory cgroup charging allows killed or exiting tasks to exceed the hard
limit. It can be misused and allowed to trigger global OOM from inside
a memcg-limited container. On the other hand if memcg fails allocation,
called from inside #PF handler it triggers global OOM from inside
pagefault_out_of_memory().
To prevent these problems this patchset:
(a) removes execution of out_of_memory() from
pagefault_out_of_memory(), becasue nobody can explain why it is
necessary.
(b) allow memcg to fail allocation of dying/killed tasks.
This patch (of 3):
Any allocation failure during the #PF path will return with VM_FAULT_OOM
which in turn results in pagefault_out_of_memory which in turn executes
out_out_memory() and can kill a random task.
An allocation might fail when the current task is the oom victim and
there are no memory reserves left. The OOM killer is already handled at
the page allocator level for the global OOM and at the charging level
for the memcg one. Both have much more information about the scope of
allocation/charge request. This means that either the OOM killer has
been invoked properly and didn't lead to the allocation success or it
has been skipped because it couldn't have been invoked. In both cases
triggering it from here is pointless and even harmful.
It makes much more sense to let the killed task die rather than to wake
up an eternally hungry oom-killer and send him to choose a fatter victim
for breakfast.
Link: https://lkml.kernel.org/r/0828a149-786e-7c06-b70a-52d086818ea3@virtuozzo.com
Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
Suggested-by: Michal Hocko <mhocko@suse.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Roman Gushchin <guro@fb.com>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Cc: Uladzislau Rezki <urezki@gmail.com>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d3e3c102d1 upstream.
We need to ensure that we serialize the stalled and hash bits with the
wait_queue wait handler, or we could be racing with someone modifying
the hashed state after we find it busy, but before we then give up and
wait for it to be cleared. This can cause random delays or stalls when
handling buffered writes for many files, where some of these files cause
hash collisions between the worker threads.
Cc: stable@vger.kernel.org
Reported-by: Daniel Black <daniel@mariadb.org>
Fixes: e941894eae ("io-wq: make buffered file write hashed work map per-ctx")
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0242f6426e upstream.
We need to set the stalled bit early, before we drop the lock for adding
us to the stall hash queue. If not, then we can race with new work being
queued between adding us to the stall hash and io_worker_handle_work()
marking us stalled.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 08bdbd39b5 upstream.
A previous commit removed the IRQ safety of the worker and wqe locks,
but that left one spot of the hash wait lock now being done without
already having IRQs disabled.
Ensure that we use the right locking variant for the hashed waitqueue
lock.
Fixes: a9a4aa9fbf ("io-wq: wqe and worker locks no longer need to be IRQ safe")
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a4ebf1b6ca upstream.
Memory cgroup charging allows killed or exiting tasks to exceed the hard
limit. It is assumed that the amount of the memory charged by those
tasks is bound and most of the memory will get released while the task
is exiting. This is resembling a heuristic for the global OOM situation
when tasks get access to memory reserves. There is no global memory
shortage at the memcg level so the memcg heuristic is more relieved.
The above assumption is overly optimistic though. E.g. vmalloc can
scale to really large requests and the heuristic would allow that. We
used to have an early break in the vmalloc allocator for killed tasks
but this has been reverted by commit b8c8a338f7 ("Revert "vmalloc:
back off when the current task is killed""). There are likely other
similar code paths which do not check for fatal signals in an
allocation&charge loop. Also there are some kernel objects charged to a
memcg which are not bound to a process life time.
It has been observed that it is not really hard to trigger these
bypasses and cause global OOM situation.
One potential way to address these runaways would be to limit the amount
of excess (similar to the global OOM with limited oom reserves). This
is certainly possible but it is not really clear how much of an excess
is desirable and still protects from global OOMs as that would have to
consider the overall memcg configuration.
This patch is addressing the problem by removing the heuristic
altogether. Bypass is only allowed for requests which either cannot
fail or where the failure is not desirable while excess should be still
limited (e.g. atomic requests). Implementation wise a killed or dying
task fails to charge if it has passed the OOM killer stage. That should
give all forms of reclaim chance to restore the limit before the failure
(ENOMEM) and tell the caller to back off.
In addition, this patch renames should_force_charge() helper to
task_is_dying() because now its use is not associated witch forced
charging.
This patch depends on pagefault_out_of_memory() to not trigger
out_of_memory(), because then a memcg failure can unwind to VM_FAULT_OOM
and cause a global OOM killer.
Link: https://lkml.kernel.org/r/8f5cebbb-06da-4902-91f0-6566fc4b4203@virtuozzo.com
Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
Suggested-by: Michal Hocko <mhocko@suse.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: Roman Gushchin <guro@fb.com>
Cc: Uladzislau Rezki <urezki@gmail.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 78cc316e95 ]
If cgroup_sk_alloc() is called from interrupt context, then just assign the
root cgroup to skcd->cgroup. Prior to commit 8520e224f5 ("bpf, cgroups:
Fix cgroup v2 fallback on v1/v2 mixed mode") we would just return, and later
on in sock_cgroup_ptr(), we were NULL-testing the cgroup in fast-path, and
iff indeed NULL returning the root cgroup (v ?: &cgrp_dfl_root.cgrp). Rather
than re-adding the NULL-test to the fast-path we can just assign it once from
cgroup_sk_alloc() given v1/v2 handling has been simplified. The migration from
NULL test with returning &cgrp_dfl_root.cgrp to assigning &cgrp_dfl_root.cgrp
directly does /not/ change behavior for callers of sock_cgroup_ptr().
syzkaller was able to trigger a splat in the legacy netrom code base, where
the RX handler in nr_rx_frame() calls nr_make_new() which calls sk_alloc()
and therefore cgroup_sk_alloc() with in_interrupt() condition. Thus the NULL
skcd->cgroup, where it trips over on cgroup_sk_free() side given it expects
a non-NULL object. There are a few other candidates aside from netrom which
have similar pattern where in their accept-like implementation, they just call
to sk_alloc() and thus cgroup_sk_alloc() instead of sk_clone_lock() with the
corresponding cgroup_sk_clone() which then inherits the cgroup from the parent
socket. None of them are related to core protocols where BPF cgroup programs
are running from. However, in future, they should follow to implement a similar
inheritance mechanism.
Additionally, with a !CONFIG_CGROUP_NET_PRIO and !CONFIG_CGROUP_NET_CLASSID
configuration, the same issue was exposed also prior to 8520e224f5 due to
commit e876ecc67d ("cgroup: memcg: net: do not associate sock with unrelated
cgroup") which added the early in_interrupt() return back then.
Fixes: 8520e224f5 ("bpf, cgroups: Fix cgroup v2 fallback on v1/v2 mixed mode")
Fixes: e876ecc67d ("cgroup: memcg: net: do not associate sock with unrelated cgroup")
Reported-by: syzbot+df709157a4ecaf192b03@syzkaller.appspotmail.com
Reported-by: syzbot+533f389d4026d86a2a95@syzkaller.appspotmail.com
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Tested-by: syzbot+df709157a4ecaf192b03@syzkaller.appspotmail.com
Tested-by: syzbot+533f389d4026d86a2a95@syzkaller.appspotmail.com
Acked-by: Tejun Heo <tj@kernel.org>
Link: https://lore.kernel.org/bpf/20210927123921.21535-1-daniel@iogearbox.net
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 8520e224f5 ]
Fix cgroup v1 interference when non-root cgroup v2 BPF programs are used.
Back in the days, commit bd1060a1d6 ("sock, cgroup: add sock->sk_cgroup")
embedded per-socket cgroup information into sock->sk_cgrp_data and in order
to save 8 bytes in struct sock made both mutually exclusive, that is, when
cgroup v1 socket tagging (e.g. net_cls/net_prio) is used, then cgroup v2
falls back to the root cgroup in sock_cgroup_ptr() (&cgrp_dfl_root.cgrp).
The assumption made was "there is no reason to mix the two and this is in line
with how legacy and v2 compatibility is handled" as stated in bd1060a1d6.
However, with Kubernetes more widely supporting cgroups v2 as well nowadays,
this assumption no longer holds, and the possibility of the v1/v2 mixed mode
with the v2 root fallback being hit becomes a real security issue.
Many of the cgroup v2 BPF programs are also used for policy enforcement, just
to pick _one_ example, that is, to programmatically deny socket related system
calls like connect(2) or bind(2). A v2 root fallback would implicitly cause
a policy bypass for the affected Pods.
In production environments, we have recently seen this case due to various
circumstances: i) a different 3rd party agent and/or ii) a container runtime
such as [0] in the user's environment configuring legacy cgroup v1 net_cls
tags, which triggered implicitly mentioned root fallback. Another case is
Kubernetes projects like kind [1] which create Kubernetes nodes in a container
and also add cgroup namespaces to the mix, meaning programs which are attached
to the cgroup v2 root of the cgroup namespace get attached to a non-root
cgroup v2 path from init namespace point of view. And the latter's root is
out of reach for agents on a kind Kubernetes node to configure. Meaning, any
entity on the node setting cgroup v1 net_cls tag will trigger the bypass
despite cgroup v2 BPF programs attached to the namespace root.
Generally, this mutual exclusiveness does not hold anymore in today's user
environments and makes cgroup v2 usage from BPF side fragile and unreliable.
This fix adds proper struct cgroup pointer for the cgroup v2 case to struct
sock_cgroup_data in order to address these issues; this implicitly also fixes
the tradeoffs being made back then with regards to races and refcount leaks
as stated in bd1060a1d6, and removes the fallback, so that cgroup v2 BPF
programs always operate as expected.
[0] https://github.com/nestybox/sysbox/
[1] https://kind.sigs.k8s.io/
Fixes: bd1060a1d6 ("sock, cgroup: add sock->sk_cgroup")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Stanislav Fomichev <sdf@google.com>
Acked-by: Tejun Heo <tj@kernel.org>
Link: https://lore.kernel.org/bpf/20210913230759.2313-1-daniel@iogearbox.net
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 3dc20f4762 ]
Currently, it is not possible to migrate a neighbor entry between NUD_PERMANENT
state and NTF_USE flag with a dynamic NUD state from a user space control plane.
Similarly, it is not possible to add/remove NTF_EXT_LEARNED flag from an existing
neighbor entry in combination with NTF_USE flag.
This is due to the latter directly calling into neigh_event_send() without any
meta data updates as happening in __neigh_update(). Thus, to enable this use
case, extend the latter with a NEIGH_UPDATE_F_USE flag where we break the
NUD_PERMANENT state in particular so that a latter neigh_event_send() is able
to re-resolve a neighbor entry.
Before fix, NUD_PERMANENT -> NUD_* & NTF_USE:
# ./ip/ip n replace 192.168.178.30 dev enp5s0 lladdr f4:8c:50:5e:71:9a
# ./ip/ip n
192.168.178.30 dev enp5s0 lladdr f4:8c:50:5e:71:9a PERMANENT
[...]
# ./ip/ip n replace 192.168.178.30 dev enp5s0 use extern_learn
# ./ip/ip n
192.168.178.30 dev enp5s0 lladdr f4:8c:50:5e:71:9a PERMANENT
[...]
As can be seen, despite the admin-triggered replace, the entry remains in the
NUD_PERMANENT state.
After fix, NUD_PERMANENT -> NUD_* & NTF_USE:
# ./ip/ip n replace 192.168.178.30 dev enp5s0 lladdr f4:8c:50:5e:71:9a
# ./ip/ip n
192.168.178.30 dev enp5s0 lladdr f4:8c:50:5e:71:9a PERMANENT
[...]
# ./ip/ip n replace 192.168.178.30 dev enp5s0 use extern_learn
# ./ip/ip n
192.168.178.30 dev enp5s0 lladdr f4:8c:50:5e:71:9a extern_learn REACHABLE
[...]
# ./ip/ip n
192.168.178.30 dev enp5s0 lladdr f4:8c:50:5e:71:9a extern_learn STALE
[...]
# ./ip/ip n replace 192.168.178.30 dev enp5s0 lladdr f4:8c:50:5e:71:9a
# ./ip/ip n
192.168.178.30 dev enp5s0 lladdr f4:8c:50:5e:71:9a PERMANENT
[...]
After the fix, the admin-triggered replace switches to a dynamic state from
the NTF_USE flag which triggered a new neighbor resolution. Likewise, we can
transition back from there, if needed, into NUD_PERMANENT.
Similar before/after behavior can be observed for below transitions:
Before fix, NTF_USE -> NTF_USE | NTF_EXT_LEARNED -> NTF_USE:
# ./ip/ip n replace 192.168.178.30 dev enp5s0 use
# ./ip/ip n
192.168.178.30 dev enp5s0 lladdr f4:8c:50:5e:71:9a REACHABLE
[...]
# ./ip/ip n replace 192.168.178.30 dev enp5s0 use extern_learn
# ./ip/ip n
192.168.178.30 dev enp5s0 lladdr f4:8c:50:5e:71:9a REACHABLE
[...]
After fix, NTF_USE -> NTF_USE | NTF_EXT_LEARNED -> NTF_USE:
# ./ip/ip n replace 192.168.178.30 dev enp5s0 use
# ./ip/ip n
192.168.178.30 dev enp5s0 lladdr f4:8c:50:5e:71:9a REACHABLE
[...]
# ./ip/ip n replace 192.168.178.30 dev enp5s0 use extern_learn
# ./ip/ip n
192.168.178.30 dev enp5s0 lladdr f4:8c:50:5e:71:9a extern_learn REACHABLE
[...]
# ./ip/ip n replace 192.168.178.30 dev enp5s0 use
# ./ip/ip n
192.168.178.30 dev enp5s0 lladdr f4:8c:50:5e:71:9a REACHABLE
[..]
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Roopa Prabhu <roopa@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit eb91224e47 upstream.
udma_get_*() checks if rchan/tchan/rflow is already allocated by checking
if it has a NON NULL value. For the error cases, rchan/tchan/rflow will
have error value and udma_get_*() considers this as already allocated
(PASS) since the error values are NON NULL. This results in NULL pointer
dereference error while de-referencing rchan/tchan/rflow.
Reset the value of rchan/tchan/rflow to NULL if a channel request fails.
CC: stable@vger.kernel.org
Acked-by: Peter Ujfalusi <peter.ujfalusi@gmail.com>
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
Link: https://lore.kernel.org/r/20211031032411.27235-3-kishon@ti.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5c6c6d60e4 upstream.
bcdma_get_*() checks if bchan is already allocated by checking if it
has a NON NULL value. For the error cases, bchan will have error value
and bcdma_get_*() considers this as already allocated (PASS) since the
error values are NON NULL. This results in NULL pointer dereference
error while de-referencing bchan.
Reset the value of bchan to NULL if a channel request fails.
CC: stable@vger.kernel.org
Acked-by: Peter Ujfalusi <peter.ujfalusi@gmail.com>
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
Link: https://lore.kernel.org/r/20211031032411.27235-2-kishon@ti.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 86432a6dca upstream.
There are pclusters in runtime marked with Z_EROFS_PCLUSTER_TAIL
before actual I/O submission. Thus, the decompression chain can be
extended if the following pcluster chain hooks such tail pcluster.
As the related comment mentioned, if some page is made of a hooked
pcluster and another followed pcluster, it can be reused for in-place
I/O (since I/O should be submitted anyway):
_______________________________________________________________
| tail (partial) page | head (partial) page |
|_____PRIMARY_HOOKED___|____________PRIMARY_FOLLOWED____________|
However, it's by no means safe to reuse as pagevec since if such
PRIMARY_HOOKED pclusters finally move into bypass chain without I/O
submission. It's somewhat hard to reproduce with LZ4 and I just found
it (general protection fault) by ro_fsstressing a LZMA image for long
time.
I'm going to actively clean up related code together with multi-page
folio adaption in the next few months. Let's address it directly for
easier backporting for now.
Call trace for reference:
z_erofs_decompress_pcluster+0x10a/0x8a0 [erofs]
z_erofs_decompress_queue.isra.36+0x3c/0x60 [erofs]
z_erofs_runqueue+0x5f3/0x840 [erofs]
z_erofs_readahead+0x1e8/0x320 [erofs]
read_pages+0x91/0x270
page_cache_ra_unbounded+0x18b/0x240
filemap_get_pages+0x10a/0x5f0
filemap_read+0xa9/0x330
new_sync_read+0x11b/0x1a0
vfs_read+0xf1/0x190
Link: https://lore.kernel.org/r/20211103182006.4040-1-xiang@kernel.org
Fixes: 3883a79abd ("staging: erofs: introduce VLE decompression support")
Cc: <stable@vger.kernel.org> # 4.19+
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5429c9dbc9 upstream.
if2fs_fill_super
-> f2fs_build_segment_manager
-> create_discard_cmd_control
-> f2fs_start_discard_thread
It invokes kthread_run to create a thread and run issue_discard_thread.
However, if f2fs_build_node_manager fails, the control flow goes to
free_nm and calls f2fs_destroy_node_manager. This function will free
sbi->nm_info. However, if issue_discard_thread accesses sbi->nm_info
after the deallocation, but before the f2fs_stop_discard_thread, it will
cause UAF(Use-after-free).
-> f2fs_destroy_segment_manager
-> destroy_discard_cmd_control
-> f2fs_stop_discard_thread
Fix this by stopping discard thread before f2fs_destroy_node_manager.
Note that, the commit d6d2b491a8 introduces the call of
f2fs_available_free_memory into issue_discard_thread.
Cc: stable@vger.kernel.org
Fixes: d6d2b491a8 ("f2fs: allow to change discard policy based on cached discard cmds")
Signed-off-by: Dongliang Mu <mudongliangabcd@gmail.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 69ea463021 upstream.
When using "devm_request_threaded_irq(,,,,IRQF_ONESHOT,,)" in a driver,
only the first interrupt is handled, and following interrupts are never
delivered (initially reported in [1]).
That's because the RISC-V PLIC cannot EOI masked interrupts, as explained
in the description of Interrupt Completion in the PLIC spec [2]:
<quote>
The PLIC signals it has completed executing an interrupt handler by
writing the interrupt ID it received from the claim to the claim/complete
register. The PLIC does not check whether the completion ID is the same
as the last claim ID for that target. If the completion ID does not match
an interrupt source that *is currently enabled* for the target, the
completion is silently ignored.
</quote>
Re-enable the interrupt before completion if it has been masked during
the handling, and remask it afterwards.
[1] http://lists.infradead.org/pipermail/linux-riscv/2021-July/007441.html
[2] 8bc15a35d0/riscv-plic.adoc
Fixes: bb0fed1c60 ("irqchip/sifive-plic: Switch to fasteoi flow")
Reported-by: Vincent Pelletier <plr.vincent@gmail.com>
Tested-by: Nikita Shubin <nikita.shubin@maquefel.me>
Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
Cc: stable@vger.kernel.org
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Atish Patra <atish.patra@wdc.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
[maz: amended commit message]
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20211105094748.3894453-1-guoren@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ca7752caea upstream.
copy_process currently copies task_struct.posix_cputimers_work as-is. If a
timer interrupt arrives while handling clone and before dup_task_struct
completes then the child task will have:
1. posix_cputimers_work.scheduled = true
2. posix_cputimers_work.work queued.
copy_process clears task_struct.task_works, so (2) will have no effect and
posix_cpu_timers_work will never run (not to mention it doesn't make sense
for two tasks to share a common linked list).
Since posix_cpu_timers_work never runs, posix_cputimers_work.scheduled is
never cleared. Since scheduled is set, future timer interrupts will skip
scheduling work, with the ultimate result that the task will never receive
timer expirations.
Together, the complete flow is:
1. Task 1 calls clone(), enters kernel.
2. Timer interrupt fires, schedules task work on Task 1.
2a. task_struct.posix_cputimers_work.scheduled = true
2b. task_struct.posix_cputimers_work.work added to
task_struct.task_works.
3. dup_task_struct() copies Task 1 to Task 2.
4. copy_process() clears task_struct.task_works for Task 2.
5. Future timer interrupts on Task 2 see
task_struct.posix_cputimers_work.scheduled = true and skip scheduling
work.
Fix this by explicitly clearing contents of task_struct.posix_cputimers_work
in copy_process(). This was never meant to be shared or inherited across
tasks in the first place.
Fixes: 1fb497dd00 ("posix-cpu-timers: Provide mechanisms to defer timer handling to task_work")
Reported-by: Rhys Hiltner <rhys@justin.tv>
Signed-off-by: Michael Pratt <mpratt@google.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20211101210615.716522-1-mpratt@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3e067fd850 upstream.
When UBSAN is enabled, the code emitted for the call to guest_pv_has
includes a call to __ubsan_handle_load_invalid_value. objtool
complains that this call happens with UACCESS enabled; to avoid
the warning, pull the calls to user_access_begin into both arms
of the "if" statement, after the check for guest_pv_has.
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a923a2676e upstream.
Fix assembly errors like:
{standard input}: Assembler messages:
{standard input}:287: Error: opcode not supported on this processor: mips3 (mips3) `dins $10,$7,32,32'
{standard input}:680: Error: opcode not supported on this processor: mips3 (mips3) `dins $10,$7,32,32'
{standard input}:1274: Error: opcode not supported on this processor: mips3 (mips3) `dins $12,$9,32,32'
{standard input}:2175: Error: opcode not supported on this processor: mips3 (mips3) `dins $10,$7,32,32'
make[1]: *** [scripts/Makefile.build:277: mm/highmem.o] Error 1
with code produced from `__cmpxchg64' for MIPS64r2 CPU configurations
using CONFIG_32BIT and CONFIG_PHYS_ADDR_T_64BIT.
This is due to MIPS_ISA_ARCH_LEVEL downgrading the assembly architecture
to `r4000' i.e. MIPS III for MIPS64r2 configurations, while there is a
block of code containing a DINS MIPS64r2 instruction conditionalized on
MIPS_ISA_REV >= 2 within the scope of the downgrade.
The assembly architecture override code pattern has been put there for
LL/SC instructions, so that code compiles for configurations that select
a processor to build for that does not support these instructions while
still providing run-time support for processors that do, dynamically
switched by non-constant `cpu_has_llsc'. It went in with linux-mips.org
commit aac8aa7717 ("Enable a suitable ISA for the assembler around
ll/sc so that code builds even for processors that don't support the
instructions. Plus minor formatting fixes.") back in 2005.
Fix the problem by wrapping these instructions along with the adjacent
SYNC instructions only, following the practice established with commit
cfd54de3b0 ("MIPS: Avoid move psuedo-instruction whilst using
MIPS_ISA_LEVEL") and commit 378ed6f0e3 ("MIPS: Avoid using .set mips0
to restore ISA"). Strictly speaking the SYNC instructions do not have
to be wrapped as they are only used as a Loongson3 erratum workaround,
so they will be enabled in the assembler by default, but do this so as
to keep code consistent with other places.
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk>
Fixes: c7e2d71dda ("MIPS: Fix set_pte() for Netlogic XLR using cmpxchg64()")
Cc: stable@vger.kernel.org # v5.1+
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit cca2aac8ac upstream.
platform-y accumulates platform names with a slash appended.
The current $(patsubst ...) ends up with doubling slashes.
GNU Make still include Platform files, but in case of an error,
a clumsy file path is displayed:
arch/mips/loongson2ef//Platform:36: *** only binutils >= 2.20.2 have needed option -mfix-loongson2f-nop. Stop.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Signed-off-by: Jason Self <jason@bluehome.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 38860b2c8b upstream.
For years, there have been random segmentation faults in userspace on
SMP PA-RISC machines. It occurred to me that this might be a problem in
set_pte_at(). MIPS and some other architectures do cache flushes when
installing PTEs with the present bit set.
Here I have adapted the code in update_mmu_cache() to flush the kernel
mapping when the kernel flush is deferred, or when the kernel mapping
may alias with the user mapping. This simplifies calls to
update_mmu_cache().
I also changed the barrier in set_pte() from a compiler barrier to a
full memory barrier. I know this change is not sufficient to fix the
problem. It might not be needed.
I have had a few days of operation with 5.14.16 to 5.15.1 and haven't
seen any random segmentation faults on rp3440 or c8000 so far.
Signed-off-by: John David Anglin <dave.anglin@bell.net>
Signed-off-by: Helge Deller <deller@gmx.de>
Cc: stable@kernel.org # 5.12+
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 279917e27e upstream.
I noticed that sometimes at kernel startup the backtraces did not
included the function names of init functions. Their address were not
resolved to function names and instead only the address was printed.
Debugging shows that the culprit is is_ksym_addr() which is called
by the backtrace functions to check if an address belongs to a function in
the kernel. The problem occurs only for CONFIG_KALLSYMS_ALL=y.
When looking at is_ksym_addr() one can see that for CONFIG_KALLSYMS_ALL=y
the function only tries to resolve the address via is_kernel() function,
which checks like this:
if (addr >= _stext && addr <= _end)
return 1;
On parisc the init functions are located before _stext, so this check fails.
Other platforms seem to have all functions (including init functions)
behind _stext.
The following patch moves the _stext symbol at the beginning of the
kernel and thus includes the init section. This fixes the check and does
not seem to have any negative side effects on where the kernel mapping
happens in the map_pages() function in arch/parisc/mm/init.c.
Signed-off-by: Helge Deller <deller@gmx.de>
Cc: stable@kernel.org # 5.4+
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 418ace9992 upstream.
Naresh and Antonio ran into a build failure with latest Debian
armhf compilers, with lots of output like
tmp/ccY3nOAs.s:2215: Error: selected processor does not support `cpsid i' in ARM mode
As it turns out, $(cc-option) fails early here when the FPU is not
selected before CPU architecture is selected, as the compiler
option check runs before enabling -msoft-float, which causes
a problem when testing a target architecture level without an FPU:
cc1: error: '-mfloat-abi=hard': selected architecture lacks an FPU
Passing e.g. -march=armv6k+fp in place of -march=armv6k would avoid this
issue, but the fallback logic is already broken because all supported
compilers (gcc-5 and higher) are much more recent than these options,
and building with -march=armv5t as a fallback no longer works.
The best way forward that I see is to just remove all the checks, which
also has the nice side-effect of slightly improving the startup time for
'make'.
The -mtune=marvell-f option was apparently never supported by any mainline
compiler, and the custom Codesourcery gcc build that did support is
now too old to build kernels, so just use -mtune=xscale unconditionally
for those.
This should be safe to apply on all stable kernels, and will be required
in order to keep building them with gcc-11 and higher.
Link: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=996419
Reported-by: Antonio Terceiro <antonio.terceiro@linaro.org>
Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org>
Reported-by: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
Tested-by: Sebastian Reichel <sebastian.reichel@collabora.com>
Tested-by: Klaus Kudielka <klaus.kudielka@gmail.com>
Cc: Matthias Klose <doko@debian.org>
Cc: stable@vger.kernel.org
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0d08e7bf0d upstream.
Currently __set_fixmap() bails out with a warning when called in early boot
from early_iounmap(). Fix it, and while at it, make the comment a bit easier
to understand.
Cc: <stable@vger.kernel.org>
Fixes: b089c31c51 ("ARM: 8667/3: Fix memory attribute inconsistencies when using fixmap")
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 71e6864eac upstream.
Linux allows doing a flush/fsync on a file open for read-only,
but the protocol does not allow that. If the file passed in
on the flush is read-only try to find a writeable handle for
the same inode, if that is not possible skip sending the
fsync call to the server to avoid breaking the apps.
Reported-by: Julian Sikorski <belegdol@gmail.com>
Tested-by: Julian Sikorski <belegdol@gmail.com>
Suggested-by: Jeremy Allison <jra@samba.org>
Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit d336509cb9 ]
The below commit added optional support for passing a bind address.
It configures the sockaddr bind arguments before parsing options and
reconfigures on options -b and -4.
This broke support for passing port (-p) on its own.
Configure sockaddr after parsing all arguments.
Fixes: 3327a9c463 ("selftests: add functionals test for UDP GRO")
Reported-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 4ca110bf8d ]
Ensure diagnostics monitoring support is implemented for the SFF 8472
compliant port module and set the correct length for ethtool port
module eeprom read.
Fixes: f56ec6766d ("cxgb4: Add support for ethtool i2c dump")
Signed-off-by: Manoj Malviya <manojmalviya@chelsio.com>
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit e5d5aadcf3 ]
We got the following WARNING when running ab/nginx
test with RDMA link flapping (up-down-up).
The reason is when smc_sock fallback and at linkdown
happens simultaneously, we may got the following situation:
__smc_lgr_terminate()
--> smc_conn_kill()
--> smc_close_active_abort()
smc_sock->sk_state = SMC_CLOSED
sock_put(smc_sock)
smc_sock was set to SMC_CLOSED and sock_put() been called
when terminate the link group. But later application call
close() on the socket, then we got:
__smc_release():
if (smc_sock->fallback)
smc_sock->sk_state = SMC_CLOSED
sock_put(smc_sock)
Again we set the smc_sock to CLOSED through it's already
in CLOSED state, and double put the refcnt, so the following
warning happens:
refcount_t: underflow; use-after-free.
WARNING: CPU: 5 PID: 860 at lib/refcount.c:28 refcount_warn_saturate+0x8d/0xf0
Modules linked in:
CPU: 5 PID: 860 Comm: nginx Not tainted 5.10.46+ #403
Hardware name: Alibaba Cloud Alibaba Cloud ECS, BIOS 8c24b4c 04/01/2014
RIP: 0010:refcount_warn_saturate+0x8d/0xf0
Code: 05 5c 1e b5 01 01 e8 52 25 bc ff 0f 0b c3 80 3d 4f 1e b5 01 00 75 ad 48
RSP: 0018:ffffc90000527e50 EFLAGS: 00010286
RAX: 0000000000000026 RBX: ffff8881300df2c0 RCX: 0000000000000027
RDX: 0000000000000000 RSI: ffff88813bd58040 RDI: ffff88813bd58048
RBP: 0000000000000000 R08: 0000000000000003 R09: 0000000000000001
R10: ffff8881300df2c0 R11: ffffc90000527c78 R12: ffff8881300df340
R13: ffff8881300df930 R14: ffff88810b3dad80 R15: ffff8881300df4f8
FS: 00007f739de8fb80(0000) GS:ffff88813bd40000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000000000a01b008 CR3: 0000000111b64003 CR4: 00000000003706e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
smc_release+0x353/0x3f0
__sock_release+0x3d/0xb0
sock_close+0x11/0x20
__fput+0x93/0x230
task_work_run+0x65/0xa0
exit_to_user_mode_prepare+0xf9/0x100
syscall_exit_to_user_mode+0x27/0x190
entry_SYSCALL_64_after_hwframe+0x44/0xa9
This patch adds check in __smc_release() to make
sure we won't do an extra sock_put() and set the
socket to CLOSED when its already in CLOSED state.
Fixes: 51f1de79ad (net/smc: replace sock_put worker by socket refcounting)
Signed-off-by: Dust Li <dust.li@linux.alibaba.com>
Reviewed-by: Tony Lu <tonylu@linux.alibaba.com>
Signed-off-by: Dust Li <dust.li@linux.alibaba.com>
Acked-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit c7cd82b905 ]
Currently vosck_connect() increments sock refcount for nonblocking
socket each time it's called, which can lead to memory leak if
it's called multiple times because connect timeout function decrements
sock refcount only once.
Fixes it by making vsock_connect() return -EALREADY immediately when
sock state is already SS_CONNECTING.
Fixes: d021c34405 ("VSOCK: Introduce VM Sockets")
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Eiichi Tsukata <eiichi.tsukata@nutanix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit bb7bbb6e36 ]
Commit bfe301ebbc ("net: mvpp2: convert to use
mac_prepare()/mac_finish()") introduced a bug wherein it leaves the MAC
RESET register asserted after mac_finish(), due to wrong order of
function calls.
Before it was:
.mac_config()
mvpp22_mode_reconfigure()
assert reset
mvpp2_xlg_config()
deassert reset
Now it is:
.mac_prepare()
.mac_config()
mvpp2_xlg_config()
deassert reset
.mac_finish()
mvpp2_xlg_config()
assert reset
Obviously this is wrong.
This bug is triggered when phylink tries to change the PHY interface
mode from a GMAC mode (sgmii, 1000base-x, 2500base-x) to XLG mode
(10gbase-r, xaui). The XLG mode does not work since reset is left
asserted. Only after
ifconfig down && ifconfig up
is called will the XLG mode work.
Move the call to mvpp22_mode_reconfigure() to .mac_prepare()
implementation. Since some of the subsequent functions need to know
whether the interface is being changed, we unfortunately also need to
pass around the new interface mode before setting port->phy_interface.
Fixes: bfe301ebbc ("net: mvpp2: convert to use mac_prepare()/mac_finish()")
Signed-off-by: Marek Behún <kabel@kernel.org>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 7a166854b4 ]
It is spurious to allocate a bitmap without initializing it.
So, better safe than sorry, initialize it to 0 at least to have some known
values.
While at it, switch to the devm_bitmap_ API which is less verbose.
Fixes: 4b41d34367 ("net: ethernet: ti: cpsw: allow untagged traffic on host port")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f64ab8e4f3 ]
Commit fe28c53ed7 ("net: stmmac: fix taprio configuration when
base_time is in the past") allowed some base time values in the past,
but apparently not all, the base-time value of 0 (Jan 1st 1970) is still
explicitly denied by the driver.
Remove the bogus check.
Fixes: b60189e039 ("net: stmmac: Integrate EST with TAPRIO scheduler API")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 688db0c7a4 ]
Currently, driver only allow configuring ETS bandwidth of TCs according
to the max TC number queried from firmware. However, the hardware actually
supports 8 TCs and users may need to configure ETS bandwidth of all TCs,
so remove the restriction.
Fixes: 330baff542 ("net: hns3: add ETS TC weight setting in SSU module")
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 0b653a81a2 ]
Currently, driver will send command to firmware to query pfc packet number
when user uses dcb tool to get pfc parameters. However, the periodic
service task will also periodically query and record MAC statistics,
including pfc packet number.
As the hardware registers of statistics is cleared after reading, it will
cause pfc packet number of MAC statistics are not correct after using dcb
tool to get pfc parameters.
To fix this problem, when user uses dcb tool to get pfc parameters, driver
updates MAC statistics firstly and then get pfc packet number from MAC
statistics.
Fixes: 64fd2300fc ("net: hns3: add support for querying pfc puase packets statistic")
Signed-off-by: Jie Wang <wangjie125@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit beb27ca451 ]
Currently, NIC init ROCE interrupt vector with MSIX interrupt. But ROCE use
pci_irq_vector() to get interrupt vector, which adds the relative interrupt
vector again and gets wrong interrupt vector.
So fixes it by assign relative interrupt vector to ROCE instead of MSIX
interrupt vector and delete the unused struct member base_msi_vector
declaration of hclgevf_dev.
Fixes: 46a3df9f97 ("net: hns3: Add HNS3 Acceleration Engine & Compatibility Layer Support")
Signed-off-by: Jie Wang <wangjie125@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 1c360cc1cc ]
The priv->ntfy_blocks[] has "priv->num_ntfy_blks" elements so this >
needs to be >= to prevent an off by one bug. The priv->ntfy_blocks[]
array is allocated in gve_alloc_notify_blocks().
Fixes: 87a7f321bb ("gve: Recover from queue stall due to missed IRQ")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 2498363310 ]
Using the % operator on a 64-bit variable is expensive and can
cause a link failure:
arm-linux-gnueabi-ld: drivers/dma/stm32-dma.o: in function `stm32_dma_get_max_width':
stm32-dma.c:(.text+0x170): undefined reference to `__aeabi_uldivmod'
arm-linux-gnueabi-ld: drivers/dma/stm32-dma.o: in function `stm32_dma_set_xfer_param':
stm32-dma.c:(.text+0x1cd4): undefined reference to `__aeabi_uldivmod'
As we know that we just want to check the alignment in
stm32_dma_get_max_width(), there is no need for a full division, and
using a simple mask is a faster replacement.
Same in stm32_dma_set_xfer_param(), change this to only allow burst
transfers if the address is a multiple of the length.
stm32_dma_get_best_burst just after will take buf_len into account to fix
burst in case of misalignment.
Fixes: b20fd5fa31 ("dmaengine: stm32-dma: fix stm32_dma_get_max_width")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Amelie Delaunay <amelie.delaunay@foss.st.com>
Link: https://lore.kernel.org/r/20211103153312.41483-1-amelie.delaunay@foss.st.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit af229d2c25 ]
Theorically, address pointers used by STM32 DMA must be chosen so as to
ensure that all transfers within a burst block are aligned on the address
boundary equal to the size of the transfer.
If this is always the case for peripheral addresses on STM32, it is not for
memory addresses if the user doesn't respect this alignment constraint.
To avoid a weird behavior of the DMA controller in this case (no error
triggered but data are not transferred as expected), force no burst.
Signed-off-by: Amelie Delaunay <amelie.delaunay@foss.st.com>
Link: https://lore.kernel.org/r/20211011094259.315023-4-amelie.delaunay@foss.st.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b2c4618162 ]
The current conversion of skb->data_end reads like this:
; data_end = (void*)(long)skb->data_end;
559: (79) r1 = *(u64 *)(r2 +200) ; r1 = skb->data
560: (61) r11 = *(u32 *)(r2 +112) ; r11 = skb->len
561: (0f) r1 += r11
562: (61) r11 = *(u32 *)(r2 +116)
563: (1f) r1 -= r11
But similar to the case in 84f44df664 ("bpf: sock_ops sk access may stomp
registers when dst_reg = src_reg"), the code will read an incorrect skb->len
when src == dst. In this case we end up generating this xlated code:
; data_end = (void*)(long)skb->data_end;
559: (79) r1 = *(u64 *)(r1 +200) ; r1 = skb->data
560: (61) r11 = *(u32 *)(r1 +112) ; r11 = (skb->data)->len
561: (0f) r1 += r11
562: (61) r11 = *(u32 *)(r1 +116)
563: (1f) r1 -= r11
... where line 560 is the reading 4B of (skb->data + 112) instead of the
intended skb->len Here the skb pointer in r1 gets set to skb->data and the
later deref for skb->len ends up following skb->data instead of skb.
This fixes the issue similarly to the patch mentioned above by creating an
additional temporary variable and using to store the register when dst_reg =
src_reg. We name the variable bpf_temp_reg and place it in the cb context for
sk_skb. Then we restore from the temp to ensure nothing is lost.
Fixes: 16137b09a6 ("bpf: Compute data_end dynamically with JIT code")
Signed-off-by: Jussi Maki <joamaki@gmail.com>
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
Link: https://lore.kernel.org/bpf/20211103204736.248403-6-john.fastabend@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit e0dc3b93bd ]
Strparser is reusing the qdisc_skb_cb struct to stash the skb message handling
progress, e.g. offset and length of the skb. First this is poorly named and
inherits a struct from qdisc that doesn't reflect the actual usage of cb[] at
this layer.
But, more importantly strparser is using the following to access its metadata.
(struct _strp_msg *)((void *)skb->cb + offsetof(struct qdisc_skb_cb, data))
Where _strp_msg is defined as:
struct _strp_msg {
struct strp_msg strp; /* 0 8 */
int accum_len; /* 8 4 */
/* size: 12, cachelines: 1, members: 2 */
/* last cacheline: 12 bytes */
};
So we use 12 bytes of ->data[] in struct. However in BPF code running parser
and verdict the user has read capabilities into the data[] array as well. Its
not too problematic, but we should not be exposing internal state to BPF
program. If its really needed then we can use the probe_read() APIs which allow
reading kernel memory. And I don't believe cb[] layer poses any API breakage by
moving this around because programs can't depend on cb[] across layers.
In order to fix another issue with a ctx rewrite we need to stash a temp
variable somewhere. To make this work cleanly this patch builds a cb struct
for sk_skb types called sk_skb_cb struct. Then we can use this consistently
in the strparser, sockmap space. Additionally we can start allowing ->cb[]
write access after this.
Fixes: 604326b41a ("bpf, sockmap: convert to generic sk_msg interface")
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Jussi Maki <joamaki@gmail.com>
Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
Link: https://lore.kernel.org/bpf/20211103204736.248403-5-john.fastabend@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit c5d2177a72 ]
A socket in a sockmap may have different combinations of programs attached
depending on configuration. There can be no programs in which case the socket
acts as a sink only. There can be a TX program in this case a BPF program is
attached to sending side, but no RX program is attached. There can be an RX
program only where sends have no BPF program attached, but receives are hooked
with BPF. And finally, both TX and RX programs may be attached. Giving us the
permutations:
None, Tx, Rx, and TxRx
To date most of our use cases have been TX case being used as a fast datapath
to directly copy between local application and a userspace proxy. Or Rx cases
and TxRX applications that are operating an in kernel based proxy. The traffic
in the first case where we hook applications into a userspace application looks
like this:
AppA redirect AppB
Tx <-----------> Rx
| |
+ +
TCP <--> lo <--> TCP
In this case all traffic from AppA (after 3whs) is copied into the AppB
ingress queue and no traffic is ever on the TCP recieive_queue.
In the second case the application never receives, except in some rare error
cases, traffic on the actual user space socket. Instead the send happens in
the kernel.
AppProxy socket pool
sk0 ------------->{sk1,sk2, skn}
^ |
| |
| v
ingress lb egress
TCP TCP
Here because traffic is never read off the socket with userspace recv() APIs
there is only ever one reader on the sk receive_queue. Namely the BPF programs.
However, we've started to introduce a third configuration where the BPF program
on receive should process the data, but then the normal case is to push the
data into the receive queue of AppB.
AppB
recv() (userspace)
-----------------------
tcp_bpf_recvmsg() (kernel)
| |
| |
| |
ingress_msgQ |
| |
RX_BPF |
| |
v v
sk->receive_queue
This is different from the App{A,B} redirect because traffic is first received
on the sk->receive_queue.
Now for the issue. The tcp_bpf_recvmsg() handler first checks the ingress_msg
queue for any data handled by the BPF rx program and returned with PASS code
so that it was enqueued on the ingress msg queue. Then if no data exists on
that queue it checks the socket receive queue. Unfortunately, this is the same
receive_queue the BPF program is reading data off of. So we get a race. Its
possible for the recvmsg() hook to pull data off the receive_queue before the
BPF hook has a chance to read it. It typically happens when an application is
banging on recv() and getting EAGAINs. Until they manage to race with the RX
BPF program.
To fix this we note that before this patch at attach time when the socket is
loaded into the map we check if it needs a TX program or just the base set of
proto bpf hooks. Then it uses the above general RX hook regardless of if we
have a BPF program attached at rx or not. This patch now extends this check to
handle all cases enumerated above, TX, RX, TXRX, and none. And to fix above
race when an RX program is attached we use a new hook that is nearly identical
to the old one except now we do not let the recv() call skip the RX BPF program.
Now only the BPF program pulls data from sk->receive_queue and recv() only
pulls data from the ingress msgQ post BPF program handling.
With this resolved our AppB from above has been up and running for many hours
without detecting any errors. We do this by correlating counters in RX BPF
events and the AppB to ensure data is never skipping the BPF program. Selftests,
was not able to detect this because we only run them for a short period of time
on well ordered send/recvs so we don't get any of the noise we see in real
application environments.
Fixes: 51199405f9 ("bpf: skb_verdict, support SK_PASS on RX BPF path")
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Jussi Maki <joamaki@gmail.com>
Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
Link: https://lore.kernel.org/bpf/20211103204736.248403-4-john.fastabend@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b8b8315e39 ]
We do not need to handle unhash from BPF side we can simply wait for the
close to happen. The original concern was a socket could transition from
ESTABLISHED state to a new state while the BPF hook was still attached.
But, we convinced ourself this is no longer possible and we also improved
BPF sockmap to handle listen sockets so this is no longer a problem.
More importantly though there are cases where unhash is called when data is
in the receive queue. The BPF unhash logic will flush this data which is
wrong. To be correct it should keep the data in the receive queue and allow
a receiving application to continue reading the data. This may happen when
tcp_abort() is received for example. Instead of complicating the logic in
unhash simply moving all this to tcp_close() hook solves this.
Fixes: 51199405f9 ("bpf: skb_verdict, support SK_PASS on RX BPF path")
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Jussi Maki <joamaki@gmail.com>
Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
Link: https://lore.kernel.org/bpf/20211103204736.248403-3-john.fastabend@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit c7c386fbc2 ]
gcc warns about undefined behavior the vmalloc code when building
with CONFIG_ARM64_PA_BITS_52, when the 'idx++' in the argument to
__phys_to_pte_val() is evaluated twice:
mm/vmalloc.c: In function 'vmap_pfn_apply':
mm/vmalloc.c:2800:58: error: operation on 'data->idx' may be undefined [-Werror=sequence-point]
2800 | *pte = pte_mkspecial(pfn_pte(data->pfns[data->idx++], data->prot));
| ~~~~~~~~~^~
arch/arm64/include/asm/pgtable-types.h:25:37: note: in definition of macro '__pte'
25 | #define __pte(x) ((pte_t) { (x) } )
| ^
arch/arm64/include/asm/pgtable.h:80:15: note: in expansion of macro '__phys_to_pte_val'
80 | __pte(__phys_to_pte_val((phys_addr_t)(pfn) << PAGE_SHIFT) | pgprot_val(prot))
| ^~~~~~~~~~~~~~~~~
mm/vmalloc.c:2800:30: note: in expansion of macro 'pfn_pte'
2800 | *pte = pte_mkspecial(pfn_pte(data->pfns[data->idx++], data->prot));
| ^~~~~~~
I have no idea why this never showed up earlier, but the safest
workaround appears to be changing those macros into inline functions
so the arguments get evaluated only once.
Cc: Matthew Wilcox <willy@infradead.org>
Fixes: 75387b9263 ("arm64: handle 52-bit physical addresses in page table entries")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20211105075414.2553155-1-arnd@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 9dc232a8ab ]
The id argument of ARM64_FTR_REG_OVERRIDE() is used for two purposes:
one as the system register encoding (used for the sys_id field of
__ftr_reg_entry), and the other as the register name (stringified
and used for the name field of arm64_ftr_reg), which is debug
information. The id argument is supposed to be a macro that
indicates an encoding of the register (eg. SYS_ID_AA64PFR0_EL1, etc).
ARM64_FTR_REG(), which also has the same id argument,
uses ARM64_FTR_REG_OVERRIDE() and passes the id to the macro.
Since the id argument is completely macro-expanded before it is
substituted into a macro body of ARM64_FTR_REG_OVERRIDE(),
the stringified id in the body of ARM64_FTR_REG_OVERRIDE is not
a human-readable register name, but a string of numeric bitwise
operations.
Fix this so that human-readable register names are available as
debug information.
Fixes: 8f266a5d87 ("arm64: cpufeature: Add global feature override facility")
Signed-off-by: Reiji Watanabe <reijiw@google.com>
Reviewed-by: Oliver Upton <oupton@google.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20211101045421.2215822-1-reijiw@google.com
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 9fec40f850 ]
skb is already freed by dev_kfree_skb in pn533_fill_fragment_skbs,
but follow error handler branch when pn533_fill_fragment_skbs()
fails, skb is freed again, results in double free issue. Fix this
by not free skb in error path of pn533_fill_fragment_skbs.
Fixes: 963a82e07d ("NFC: pn533: Split large Tx frames in chunks")
Fixes: 93ad42020c ("NFC: pn533: Target mode Tx fragmentation support")
Signed-off-by: Chengfeng Ye <cyeaa@connect.ust.hk>
Reviewed-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 8ac9dfd58b ]
Both ifindex and LLC_SK_DEV_HASH_ENTRIES are signed.
This means that (ifindex % LLC_SK_DEV_HASH_ENTRIES) is negative
if @ifindex is negative.
We could simply make LLC_SK_DEV_HASH_ENTRIES unsigned.
In this patch I chose to use hash_32() to get more entropy
from @ifindex, like llc_sk_laddr_hashfn().
UBSAN: array-index-out-of-bounds in ./include/net/llc.h:75:26
index -43 is out of range for type 'hlist_head [64]'
CPU: 1 PID: 20999 Comm: syz-executor.3 Not tainted 5.15.0-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106
ubsan_epilogue+0xb/0x5a lib/ubsan.c:151
__ubsan_handle_out_of_bounds.cold+0x62/0x6c lib/ubsan.c:291
llc_sk_dev_hash include/net/llc.h:75 [inline]
llc_sap_add_socket+0x49c/0x520 net/llc/llc_conn.c:697
llc_ui_bind+0x680/0xd70 net/llc/af_llc.c:404
__sys_bind+0x1e9/0x250 net/socket.c:1693
__do_sys_bind net/socket.c:1704 [inline]
__se_sys_bind net/socket.c:1702 [inline]
__x64_sys_bind+0x6f/0xb0 net/socket.c:1702
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7fa503407ae9
Fixes: 6d2e3ea284 ("llc: use a device based hash table to speed up multicast delivery")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit afe8605ca4 ]
There is one possible race window between zs_pool_dec_isolated() and
zs_unregister_migration() because wait_for_isolated_drain() checks the
isolated count without holding class->lock and there is no order inside
zs_pool_dec_isolated(). Thus the below race window could be possible:
zs_pool_dec_isolated zs_unregister_migration
check pool->destroying != 0
pool->destroying = true;
smp_mb();
wait_for_isolated_drain()
wait for pool->isolated_pages == 0
atomic_long_dec(&pool->isolated_pages);
atomic_long_read(&pool->isolated_pages) == 0
Since we observe the pool->destroying (false) before atomic_long_dec()
for pool->isolated_pages, waking pool->migration_wait up is missed.
Fix this by ensure checking pool->destroying happens after the
atomic_long_dec(&pool->isolated_pages).
Link: https://lkml.kernel.org/r/20210708115027.7557-1-linmiaohe@huawei.com
Fixes: 701d678599 ("mm/zsmalloc.c: fix race condition in zs_destroy_pool")
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Henry Burns <henryburns@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit d9447f768b ]
In es58x_rx_err_msg(), if can->do_set_mode() fails, the function
directly returns without calling netif_rx(skb). This means that the
skb previously allocated by alloc_can_err_skb() is not freed. In other
terms, this is a memory leak.
This patch simply removes the return statement in the error branch and
let the function continue.
Issue was found with GCC -fanalyzer, please follow the link below for
details.
Fixes: 8537257874 ("can: etas_es58x: add core support for ETAS ES58X CAN USB interfaces")
Link: https://lore.kernel.org/all/20211026180740.1953265-1-mailhol.vincent@wanadoo.fr
Signed-off-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 9d6366e743 ]
My previous patch correctly addressed the possible link failure, but as
Jani points out, the dependency is now stricter than it needs to be.
Change it again, to allow DRM_FBDEV_EMULATION to be used when
DRM_KMS_HELPER and FB are both loadable modules and DRM is linked into
the kernel.
As a side-effect, the option is now only visible when at least one DRM
driver makes use of DRM_KMS_HELPER. This is better, because the option
has no effect otherwise.
Fixes: 606b102876 ("drm: fb_helper: fix CONFIG_FB dependency")
Suggested-by: Acked-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Javier Martinez Canillas <javierm@redhat.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20211029120307.1407047-1-arnd@kernel.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 8955c1a329 ]
As I want to test both DEVMAP and DEVMAP_HASH in XDP multicast redirect, I
limited DEVMAP max entries to a small value for performace. When the test
runs after amount of interface creating/deleting tests. The interface index
will exceed the map max entries and xdp_redirect_multi will error out with
"Get interfacesInterface index to large".
Fix this issue by limit the tests in netns and specify the ifindex when
creating interfaces.
Fixes: d232924762 ("selftests/bpf: Add xdp_redirect_multi test")
Reported-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20211027033553.962413-5-liuhangbin@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 452a3e723f ]
Fix a device wakeup power reference counting error introduced by
commit a2d7b2e004 ("ACPI: PM: Fix sharing of wakeup power
resources") because of a coding mistake.
Fixes: a2d7b2e004 ("ACPI: PM: Fix sharing of wakeup power resources")
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit d5fa8592b7 ]
Currently autoloading for SPI devices does not use the DT ID table, it uses
SPI modalises. Supporting OF modalises is going to be difficult if not
impractical, an attempt was made but has been reverted, so ensure that
module autoloading works for this driver by adding a SPI device ID table.
Fixes: 96c8395e21 ("spi: Revert modalias changes")
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Link: https://lore.kernel.org/r/20210924143347.14721-3-broonie@kernel.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit dce9446192 ]
Although we've covered all calls with NULL dma buffer pointer, so far,
there may be still some else in the wild. For catching such a case
more easily, add a WARN_ON_ONCE() in snd_dma_get_ops().
Fixes: 37af81c599 ("ALSA: core: Abstract memory alloc helpers")
Link: https://lore.kernel.org/r/20211105102103.28148-1-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b93c6a911a ]
When I do fuzz test for bonding device interface, I got the following
use-after-free Calltrace:
==================================================================
BUG: KASAN: use-after-free in bond_enslave+0x1521/0x24f0
Read of size 8 at addr ffff88825bc11c00 by task ifenslave/7365
CPU: 5 PID: 7365 Comm: ifenslave Tainted: G E 5.15.0-rc1+ #13
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1 04/01/2014
Call Trace:
dump_stack_lvl+0x6c/0x8b
print_address_description.constprop.0+0x48/0x70
kasan_report.cold+0x82/0xdb
__asan_load8+0x69/0x90
bond_enslave+0x1521/0x24f0
bond_do_ioctl+0x3e0/0x450
dev_ifsioc+0x2ba/0x970
dev_ioctl+0x112/0x710
sock_do_ioctl+0x118/0x1b0
sock_ioctl+0x2e0/0x490
__x64_sys_ioctl+0x118/0x150
do_syscall_64+0x35/0xb0
entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7f19159cf577
Code: b3 66 90 48 8b 05 11 89 2c 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 78
RSP: 002b:00007ffeb3083c78 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 00007ffeb3084bca RCX: 00007f19159cf577
RDX: 00007ffeb3083ce0 RSI: 0000000000008990 RDI: 0000000000000003
RBP: 00007ffeb3084bc4 R08: 0000000000000040 R09: 0000000000000000
R10: 00007ffeb3084bc0 R11: 0000000000000246 R12: 00007ffeb3083ce0
R13: 0000000000000000 R14: 0000000000000000 R15: 00007ffeb3083cb0
Allocated by task 7365:
kasan_save_stack+0x23/0x50
__kasan_kmalloc+0x83/0xa0
kmem_cache_alloc_trace+0x22e/0x470
bond_enslave+0x2e1/0x24f0
bond_do_ioctl+0x3e0/0x450
dev_ifsioc+0x2ba/0x970
dev_ioctl+0x112/0x710
sock_do_ioctl+0x118/0x1b0
sock_ioctl+0x2e0/0x490
__x64_sys_ioctl+0x118/0x150
do_syscall_64+0x35/0xb0
entry_SYSCALL_64_after_hwframe+0x44/0xae
Freed by task 7365:
kasan_save_stack+0x23/0x50
kasan_set_track+0x20/0x30
kasan_set_free_info+0x24/0x40
__kasan_slab_free+0xf2/0x130
kfree+0xd1/0x5c0
slave_kobj_release+0x61/0x90
kobject_put+0x102/0x180
bond_sysfs_slave_add+0x7a/0xa0
bond_enslave+0x11b6/0x24f0
bond_do_ioctl+0x3e0/0x450
dev_ifsioc+0x2ba/0x970
dev_ioctl+0x112/0x710
sock_do_ioctl+0x118/0x1b0
sock_ioctl+0x2e0/0x490
__x64_sys_ioctl+0x118/0x150
do_syscall_64+0x35/0xb0
entry_SYSCALL_64_after_hwframe+0x44/0xae
Last potentially related work creation:
kasan_save_stack+0x23/0x50
kasan_record_aux_stack+0xb7/0xd0
insert_work+0x43/0x190
__queue_work+0x2e3/0x970
delayed_work_timer_fn+0x3e/0x50
call_timer_fn+0x148/0x470
run_timer_softirq+0x8a8/0xc50
__do_softirq+0x107/0x55f
Second to last potentially related work creation:
kasan_save_stack+0x23/0x50
kasan_record_aux_stack+0xb7/0xd0
insert_work+0x43/0x190
__queue_work+0x2e3/0x970
__queue_delayed_work+0x130/0x180
queue_delayed_work_on+0xa7/0xb0
bond_enslave+0xe25/0x24f0
bond_do_ioctl+0x3e0/0x450
dev_ifsioc+0x2ba/0x970
dev_ioctl+0x112/0x710
sock_do_ioctl+0x118/0x1b0
sock_ioctl+0x2e0/0x490
__x64_sys_ioctl+0x118/0x150
do_syscall_64+0x35/0xb0
entry_SYSCALL_64_after_hwframe+0x44/0xae
The buggy address belongs to the object at ffff88825bc11c00
which belongs to the cache kmalloc-1k of size 1024
The buggy address is located 0 bytes inside of
1024-byte region [ffff88825bc11c00, ffff88825bc12000)
The buggy address belongs to the page:
page:ffffea00096f0400 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x25bc10
head:ffffea00096f0400 order:3 compound_mapcount:0 compound_pincount:0
flags: 0x57ff00000010200(slab|head|node=1|zone=2|lastcpupid=0x7ff)
raw: 057ff00000010200 ffffea0009a71c08 ffff888240001968 ffff88810004dbc0
raw: 0000000000000000 00000000000a000a 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected
Memory state around the buggy address:
ffff88825bc11b00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
ffff88825bc11b80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>ffff88825bc11c00: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
^
ffff88825bc11c80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
ffff88825bc11d00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
==================================================================
Put new_slave in bond_sysfs_slave_add() will cause use-after-free problems
when new_slave is accessed in the subsequent error handling process. Since
new_slave will be put in the subsequent error handling process, remove the
unnecessary put to fix it.
In addition, when sysfs_create_file() fails, if some files have been crea-
ted successfully, we need to call sysfs_remove_file() to remove them.
Since there are sysfs_create_files() & sysfs_remove_files() can be used,
use these two functions instead.
Fixes: 7afcaec496 (bonding: use kobject_put instead of _del after kobject_add)
Signed-off-by: Huang Guobin <huangguobin4@huawei.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 0d97950953 ]
The huge page functionality in TTM does not work safely because PUD and
PMD entries do not have a special bit.
get_user_pages_fast() considers any page that passed pmd_huge() as
usable:
if (unlikely(pmd_trans_huge(pmd) || pmd_huge(pmd) ||
pmd_devmap(pmd))) {
And vmf_insert_pfn_pmd_prot() unconditionally sets
entry = pmd_mkhuge(pfn_t_pmd(pfn, prot));
eg on x86 the page will be _PAGE_PRESENT | PAGE_PSE.
As such gup_huge_pmd() will try to deref a struct page:
head = try_grab_compound_head(pmd_page(orig), refs, flags);
and thus crash.
Thomas further notices that the drivers are not expecting the struct page
to be used by anything - in particular the refcount incr above will cause
them to malfunction.
Thus everything about this is not able to fully work correctly considering
GUP_fast. Delete it entirely. It can return someday along with a proper
PMD/PUD_SPECIAL bit in the page table itself to gate GUP_fast.
Fixes: 314b6580ad ("drm/ttm, drm/vmwgfx: Support huge TTM pagefaults")
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Thomas Hellström <thomas.helllstrom@linux.intel.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
[danvet: Update subject per Thomas' &Christian's review]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/0-v2-a44694790652+4ac-ttm_pmd_jgg@nvidia.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit dbea75fe18 ]
Commit a365ab6b9d ("cpufreq: intel_pstate: Implement the
->adjust_perf() callback") caused intel_pstate to use nonzero HWP
desired values in certain usage scenarios, but it did not prevent
them from being leaked into the confugirations in which HWP desired
is expected to be 0.
The failing scenarios are switching the driver from the passive
mode to the active mode and starting a new kernel via kexec() while
intel_pstate is running in the passive mode.
To address this issue, ensure that HWP desired will be cleared on
offline and suspend/shutdown.
Fixes: a365ab6b9d ("cpufreq: intel_pstate: Implement the ->adjust_perf() callback")
Reported-by: Julia Lawall <julia.lawall@inria.fr>
Tested-by: Julia Lawall <julia.lawall@inria.fr>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 5ec0a6fcb6 ]
Host crashes when pci_enable_atomic_ops_to_root() is called for VFs with
virtual buses. The virtual buses added to SR-IOV have bus->self set to NULL
and host crashes due to this.
PID: 4481 TASK: ffff89c6941b0000 CPU: 53 COMMAND: "bash"
...
#3 [ffff9a9481713808] oops_end at ffffffffb9025cd6
#4 [ffff9a9481713828] page_fault_oops at ffffffffb906e417
#5 [ffff9a9481713888] exc_page_fault at ffffffffb9a0ad14
#6 [ffff9a94817138b0] asm_exc_page_fault at ffffffffb9c00ace
[exception RIP: pcie_capability_read_dword+28]
RIP: ffffffffb952fd5c RSP: ffff9a9481713960 RFLAGS: 00010246
RAX: 0000000000000001 RBX: ffff89c6b1096000 RCX: 0000000000000000
RDX: ffff9a9481713990 RSI: 0000000000000024 RDI: 0000000000000000
RBP: 0000000000000080 R8: 0000000000000008 R9: ffff89c64341a2f8
R10: 0000000000000002 R11: 0000000000000000 R12: ffff89c648bab000
R13: 0000000000000000 R14: 0000000000000000 R15: ffff89c648bab0c8
ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
#7 [ffff9a9481713988] pci_enable_atomic_ops_to_root at ffffffffb95359a6
#8 [ffff9a94817139c0] bnxt_qplib_determine_atomics at ffffffffc08c1a33 [bnxt_re]
#9 [ffff9a94817139d0] bnxt_re_dev_init at ffffffffc08ba2d1 [bnxt_re]
Per PCIe r5.0, sec 9.3.5.10, the AtomicOp Requester Enable bit in Device
Control 2 is reserved for VFs. The PF value applies to all associated VFs.
Return -EINVAL if pci_enable_atomic_ops_to_root() is called for a VF.
Link: https://lore.kernel.org/r/1631354585-16597-1-git-send-email-selvin.xavier@broadcom.com
Fixes: 35f5ace5de ("RDMA/bnxt_re: Enable global atomic ops if platform supports")
Fixes: 430a23689d ("PCI: Add pci_enable_atomic_ops_to_root()")
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Andy Gospodarek <gospo@broadcom.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 4ddb85d366 ]
Commit bf9c0538e4 ("ataflop: use a separate gendisk for each media
format") introduced ataflop_probe_lock mutex, but forgot to unlock the
mutex when atari_floppy_init() (i.e. module loading) succeeded. This will
result in double lock deadlock if ataflop_probe() is called. Also,
unregister_blkdev() must not be called from atari_floppy_init() with
ataflop_probe_lock held when atari_floppy_init() failed, for
ataflop_probe() waits for ataflop_probe_lock with major_names_lock held
(i.e. AB-BA deadlock).
__register_blkdev() needs to be called last in order to avoid calling
ataflop_probe() when atari_floppy_init() is about to fail, for memory for
completing already-started ataflop_probe() safely will be released as soon
as atari_floppy_init() released ataflop_probe_lock mutex.
As with commit 8b52d8be86 ("loop: reorder loop_exit"),
unregister_blkdev() needs to be called first in order to avoid calling
ataflop_alloc_disk() from ataflop_probe() after del_gendisk() from
atari_floppy_exit().
By relocating __register_blkdev() / unregister_blkdev() as explained above,
we can remove ataflop_probe_lock mutex, for probe function and __exit
function are serialized by major_names_lock mutex.
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Fixes: bf9c0538e4 ("ataflop: use a separate gendisk for each media format")
Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
Tested-by: Michael Schmitz <schmitzmic@gmail.com>
Link: https://lore.kernel.org/r/20211103230437.1639990-11-mcgrof@kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit deae1138d0 ]
Instead of using two separate code paths for cleaning up an atari disk,
use one. We take the more careful approach to check for *all* disk
types, as is done on exit. The init path didn't have that check as
the alternative disk types are only probed for later, they are not
initialized by default.
Yes, there is a shared tag for all disks.
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
Link: https://lore.kernel.org/r/20210927220302.1073499-14-mcgrof@kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 573effb298 ]
The ataflop assumes del_gendisk() is safe to call, this is only
true because add_disk() does not return a failure, but that will
change soon. And so, before we get to adding error handling for
that case, let's make sure we keep track of which disks actually
get registered. Then we use this to only call del_gendisk for them.
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
Link: https://lore.kernel.org/r/20210927220302.1073499-13-mcgrof@kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 009a789443 ]
The handling of PMIC register reads through writing 0 to address 4
of the OpRegion is wrong. Instead of returning the read value
through the value64, which is a no-op for function == ACPI_WRITE calls,
store the value and then on a subsequent function == ACPI_READ with
address == 3 (the address for the value field of the OpRegion)
return the stored value.
This has been tested on a Xiaomi Mi Pad 2 and makes the ACPI battery dev
there mostly functional (unfortunately there are still other issues).
Here are the SET() / GET() functions of the PMIC ACPI device,
which use this OpRegion, which clearly show the new behavior to
be correct:
OperationRegion (REGS, 0x8F, Zero, 0x50)
Field (REGS, ByteAcc, NoLock, Preserve)
{
CLNT, 8,
SA, 8,
OFF, 8,
VAL, 8,
RWM, 8
}
Method (GET, 3, Serialized)
{
If ((AVBE == One))
{
CLNT = Arg0
SA = Arg1
OFF = Arg2
RWM = Zero
If ((AVBG == One))
{
GPRW = Zero
}
}
Return (VAL) /* \_SB_.PCI0.I2C7.PMI5.VAL_ */
}
Method (SET, 4, Serialized)
{
If ((AVBE == One))
{
CLNT = Arg0
SA = Arg1
OFF = Arg2
VAL = Arg3
RWM = One
If ((AVBG == One))
{
GPRW = One
}
}
}
Fixes: 0afa877a56 ("ACPI / PMIC: intel: add REGS operation region support")
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b385cca473 ]
When a VF is removed and/or reset its Tx queues need to be
stopped from the PF. This is done by calling the ice_dis_vf_qs()
function, which calls ice_vsi_stop_lan_tx_rings(). Currently
ice_dis_vf_qs() is protected by the VF state bit ICE_VF_STATE_QS_ENA.
Unfortunately, this is causing the Tx queues to not be disabled in some
cases and when the VF tries to re-enable/reconfigure its Tx queues over
virtchnl the op is failing. This is because a VF can be reset and/or
removed before the ICE_VF_STATE_QS_ENA bit is set, but the Tx queues
were already configured via ice_vsi_cfg_single_txq() in the
VIRTCHNL_OP_CONFIG_VSI_QUEUES op. However, the ICE_VF_STATE_QS_ENA bit
is set on a successful VIRTCHNL_OP_ENABLE_QUEUES, which will always
happen after the VIRTCHNL_OP_CONFIG_VSI_QUEUES op.
This was causing the following error message when loading the ice
driver, creating VFs, and modifying VF trust in an endless loop:
[35274.192484] ice 0000:88:00.0: Failed to set LAN Tx queue context, error: ICE_ERR_PARAM
[35274.193074] ice 0000:88:00.0: VF 0 failed opcode 6, retval: -5
[35274.193640] iavf 0000:88:01.0: PF returned error -5 (IAVF_ERR_PARAM) to our request 6
Fix this by always calling ice_dis_vf_qs() and silencing the error
message in ice_vsi_stop_tx_ring() since the calling code ignores the
return anyway. Also, all other places that call ice_vsi_stop_tx_ring()
catch the error, so this doesn't affect those flows since there was no
change to the values the function returns.
Other solutions were considered (i.e. tracking which VF queues had been
"started/configured" in VIRTCHNL_OP_CONFIG_VSI_QUEUES, but it seemed
more complicated than it was worth. This solution also brings in the
chance for other unexpected conditions due to invalid state bit checks.
So, the proposed solution seemed like the best option since there is no
harm in failing to stop Tx queues that were never started.
This issue can be seen using the following commands:
for i in {0..50}; do
rmmod ice
modprobe ice
sleep 1
echo 1 > /sys/class/net/ens785f0/device/sriov_numvfs
echo 1 > /sys/class/net/ens785f1/device/sriov_numvfs
ip link set ens785f1 vf 0 trust on
ip link set ens785f0 vf 0 trust on
sleep 2
echo 0 > /sys/class/net/ens785f0/device/sriov_numvfs
echo 0 > /sys/class/net/ens785f1/device/sriov_numvfs
sleep 1
echo 1 > /sys/class/net/ens785f0/device/sriov_numvfs
echo 1 > /sys/class/net/ens785f1/device/sriov_numvfs
ip link set ens785f1 vf 0 trust on
ip link set ens785f0 vf 0 trust on
done
Fixes: 77ca27c417 ("ice: add support for virtchnl_queue_select.[tx|rx]_queues bitmap")
Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit ce572a5b88 ]
VF was not able to change its hardware MAC address in case
the new address was already present in the MAC filter list.
Change the handling of VF add mac request to not return
if requested MAC address is already present on the list
and check if its hardware MAC needs to be updated in this case.
Fixes: ed4c068d46 ("ice: Enable ip link show on the PF to display VF unicast MAC(s)")
Signed-off-by: Sylwester Dziedziuch <sylwesterx.dziedziuch@intel.com>
Tested-by: Tony Brelinski <tony.brelinski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 92f62485b3 ]
Normally it is expected that the dsa_device_ops :: rcv() method finishes
parsing the DSA tag and consumes it, then never looks at it again.
But commit c0bcf53766 ("net: dsa: ocelot: add hardware timestamping
support for Felix") added support for RX timestamping in a very
unconventional way. On this switch, a partial timestamp is available in
the DSA header, but the driver got away with not parsing that timestamp
right away, but instead delayed that parsing for a little longer:
dsa_switch_rcv():
nskb = cpu_dp->rcv(skb, dev); <------------- not here
-> ocelot_rcv()
...
skb = nskb;
skb_push(skb, ETH_HLEN);
skb->pkt_type = PACKET_HOST;
skb->protocol = eth_type_trans(skb, skb->dev);
...
if (dsa_skb_defer_rx_timestamp(p, skb)) <--- but here
-> felix_rxtstamp()
return 0;
When in felix_rxtstamp(), this driver accounted for the fact that
eth_type_trans() happened in the meanwhile, so it got a hold of the
extraction header again by subtracting (ETH_HLEN + OCELOT_TAG_LEN) bytes
from the current skb->data.
This worked for quite some time but was quite fragile from the very
beginning. Not to mention that having DSA tag parsing split in two
different files, under different folders (net/dsa/tag_ocelot.c vs
drivers/net/dsa/ocelot/felix.c) made it quite non-obvious for patches to
come that they might break this.
Finally, the blamed commit does the following: at the end of
ocelot_rcv(), it checks whether the skb payload contains a VLAN header.
If it does, and this port is under a VLAN-aware bridge, that VLAN ID
might not be correct in the sense that the packet might have suffered
VLAN rewriting due to TCAM rules (VCAP IS1). So we consume the VLAN ID
from the skb payload using __skb_vlan_pop(), and take the classified
VLAN ID from the DSA tag, and construct a hwaccel VLAN tag with the
classified VLAN, and the skb payload is VLAN-untagged.
The big problem is that __skb_vlan_pop() does:
memmove(skb->data + VLAN_HLEN, skb->data, 2 * ETH_ALEN);
__skb_pull(skb, VLAN_HLEN);
aka it moves the Ethernet header 4 bytes to the right, and pulls 4 bytes
from the skb headroom (effectively also moving skb->data, by definition).
So for felix_rxtstamp()'s fragile logic, all bets are off now.
Instead of having the "extraction" pointer point to the DSA header,
it actually points to 4 bytes _inside_ the extraction header.
Corollary, the last 4 bytes of the "extraction" header are in fact 4
stale bytes of the destination MAC address from the Ethernet header,
from prior to the __skb_vlan_pop() movement.
So of course, RX timestamps are completely bogus when the system is
configured in this way.
The fix is actually very simple: just don't structure the code like that.
For better or worse, the DSA PTP timestamping API does not offer a
straightforward way for drivers to present their RX timestamps, but
other drivers (sja1105) have established a simple mechanism to carry
their RX timestamp from dsa_device_ops :: rcv() all the way to
dsa_switch_ops :: port_rxtstamp() and even later. That mechanism is to
simply save the partial timestamp to the skb->cb, and complete it later.
Question: why don't we simply populate the skb's struct
skb_shared_hwtstamps from ocelot_rcv(), and bother with this
complication of propagating the timestamp to felix_rxtstamp()?
Answer: dsa_switch_ops :: port_rxtstamp() answers the question whether
PTP packets need sleepable context to retrieve the full RX timestamp.
Currently felix_rxtstamp() answers "no, thanks" to that question, and
calls ocelot_ptp_gettime64() from softirq atomic context. This is
understandable, since Felix VSC9959 is a PCIe memory-mapped switch, so
hardware access does not require sleeping. But the felix driver is
preparing for the introduction of other switches where hardware access
is over a slow bus like SPI or MDIO:
https://lore.kernel.org/lkml/20210814025003.2449143-1-colin.foster@in-advantage.com/
So I would like to keep this code structure, so the rework needed when
that driver will need PTP support will be minimal (answer "yes, I need
deferred context for this skb's RX timestamp", then the partial
timestamp will still be found in the skb->cb.
Fixes: ea440cd2d9 ("net: dsa: tag_ocelot: use VLAN information from tagging header when available")
Reported-by: Po Liu <po.liu@nxp.com>
Cc: Yangbo Lu <yangbo.lu@nxp.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit deab6b1cd9 ]
As explained here:
https://lore.kernel.org/netdev/20210908220834.d7gmtnwrorhharna@skbuf/
DSA tagging protocol drivers cannot depend on symbols exported by switch
drivers, because this creates a circular dependency that breaks module
autoloading.
The tag_ocelot.c file depends on the ocelot_ptp_rew_op() function
exported by the common ocelot switch lib. This function looks at
OCELOT_SKB_CB(skb) and computes how to populate the REW_OP field of the
DSA tag, for PTP timestamping (the command: one-step/two-step, and the
TX timestamp identifier).
None of that requires deep insight into the driver, it is quite
stateless, as it only depends upon the skb->cb. So let's make it a
static inline function and put it in include/linux/dsa/ocelot.h, a
file that despite its name is used by the ocelot switch driver for
populating the injection header too - since commit 40d3f295b5 ("net:
mscc: ocelot: use common tag parsing code with DSA").
With that function declared as static inline, its body is expanded
inside each call site, so the dependency is broken and the DSA tagger
can be built without the switch library, upon which the felix driver
depends.
Fixes: 39e5308b32 ("net: mscc: ocelot: support PTP Sync one-step timestamping")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 563bcbae3b ]
The real_dev of a vlan net_device may be freed after
unregister_vlan_dev(). Access the real_dev continually by
vlan_dev_real_dev() will trigger the UAF problem for the
real_dev like following:
==================================================================
BUG: KASAN: use-after-free in vlan_dev_real_dev+0xf9/0x120
Call Trace:
kasan_report.cold+0x83/0xdf
vlan_dev_real_dev+0xf9/0x120
is_eth_port_of_netdev_filter.part.0+0xb1/0x2c0
is_eth_port_of_netdev_filter+0x28/0x40
ib_enum_roce_netdev+0x1a3/0x300
ib_enum_all_roce_netdevs+0xc7/0x140
netdevice_event_work_handler+0x9d/0x210
...
Freed by task 9288:
kasan_save_stack+0x1b/0x40
kasan_set_track+0x1c/0x30
kasan_set_free_info+0x20/0x30
__kasan_slab_free+0xfc/0x130
slab_free_freelist_hook+0xdd/0x240
kfree+0xe4/0x690
kvfree+0x42/0x50
device_release+0x9f/0x240
kobject_put+0x1c8/0x530
put_device+0x1b/0x30
free_netdev+0x370/0x540
ppp_destroy_interface+0x313/0x3d0
...
Move the put_device(real_dev) to vlan_dev_free(). Ensure
real_dev not be freed before vlan_dev unregistered.
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Reported-by: syzbot+e4df4e1389e28972e955@syzkaller.appspotmail.com
Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 27dff9a9c2 ]
Throughout the OpenRISC kernel port VMA is passed as NULL when flushing
kernel tlb entries. Somehow this was missed when I was testing
c28b27416d ("openrisc: Implement proper SMP tlb flushing") and now the
SMP kernel fails to completely boot.
In OpenRISC VMA is used only to determine which cores need to have their
TLB entries flushed.
This patch updates the logic to flush tlbs on all cores when the VMA is
passed as NULL. Also, we update places VMA is passed as NULL to use
flush_tlb_kernel_range instead. Now, the only place VMA is passed as
NULL is in the implementation of flush_tlb_kernel_range.
Fixes: c28b27416d ("openrisc: Implement proper SMP tlb flushing")
Reported-by: Jan Henrik Weinstock <jan.weinstock@rwth-aachen.de>
Signed-off-by: Stafford Horne <shorne@gmail.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 1aabe578dd ]
ETHTOOL_A_PAUSE_STAT_MAX is the MAX attribute id,
so we need to subtract non-stats and add one to
get a count (IOW -2+1 == -1).
Otherwise we'll see:
ethnl cmd 21: calculated reply length 40, but consumed 52
Fixes: 9a27a33027 ("ethtool: add standard pause stats")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit ca3676f94b ]
When generating the selftests to another folder, the icmp.sh test will
miss as it is not in Makefile, e.g.
make -C tools/testing/selftests/ install \
TARGETS="net" INSTALL_PATH=/tmp/kselftests
Fixes: 7e9838b791 ("selftests/net: Add icmp.sh for testing ICMP dummy address responses")
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit d52bcb47bd ]
This patch allows to use 0 for `coal->rx_coalesce_usecs` param to
disable rx irq coalescing.
Previously we could enable rx irq coalescing via ethtool
(For ex: `ethtool -C eth0 rx-usecs 2000`) but we couldn't disable
it because this part rejects 0 value:
if (!coal->rx_coalesce_usecs)
return -EINVAL;
Fixes: 84da2658a6 ("TI DaVinci EMAC : Implement interrupt pacing functionality.")
Signed-off-by: Maxim Kiselev <bigunclemax@gmail.com>
Reviewed-by: Grygorii Strashko <grygorii.strashko@ti.com>
Link: https://lore.kernel.org/r/20211101152343.4193233-1-bigunclemax@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit e4c4871a73 ]
commit b1a811633f ("block: nbd: add sanity check for first_minor")
checks that 'first_minor' should not be greater than 0xff, which is
wrong. Whitout the commit, the details that when user pass 0x100000,
it ends up create sysfs dir "/sys/block/43:0" are as follows:
nbd_dev_add
disk->first_minor = index << part_shift
-> default part_shift is 5, first_minor is 0x2000000
device_add_disk
ddev->devt = MKDEV(disk->major, disk->first_minor)
-> (0x2b << 20) | (0x2000000) = 0x2b00000
device_add
device_create_sys_dev_entry
format_dev_t
sprintf(buffer, "%u:%u", MAJOR(dev), MINOR(dev));
-> got 43:0
sysfs_create_link -> /sys/block/43:0
By the way, with the wrong fix, when part_shift is the default value,
only 8 ndb devices can be created since 8 << 5 is greater than 0xff.
Since the max bits for 'first_minor' should be the same as what
MKDEV() does, which is 20. Change the upper bound of 'first_minor'
from 0xff to 0xfffff.
Fixes: b1a811633f ("block: nbd: add sanity check for first_minor")
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Link: https://lore.kernel.org/r/20211102015237.2309763-2-yebin10@huawei.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 585a070799 ]
The irqchip uses one domain for all GPIO lines, so the line offset
should be determined w.r.t. the first line of the first port, not the
first line of the triggered port.
Fixes: 0d82fb1127 ("gpio: Add Realtek Otto GPIO support")
Signed-off-by: Sander Vanheule <sander@svanheule.net>
Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 7f98960c04 ]
A successful 'clk_prepare()' call should be balanced by a corresponding
'clk_unprepare()' call in the error handling path of the probe, as already
done in the remove function.
More specifically, 'clk_prepare_enable()' is used, but 'clk_disable()' is
also already called. So just the unprepare step has still to be done.
Update the error handling path accordingly.
Fixes: 75d31c2372 ("i2c: xlr: add support for Sigma Designs controller variant")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 01d29f87fc ]
If we already hold open state on the client, yet the server gives us a
completely different stateid to the one we already hold, then we
currently treat it as if it were an out-of-sequence update, and wait for
5 seconds for other updates to come in.
This commit fixes the behaviour so that we immediately start processing
of the new stateid, and then leave it to the call to
nfs4_test_and_free_stateid() to decide what to do with the old stateid.
Fixes: b4868b44c5 ("NFSv4: Wait for stateid updates after CLOSE/OPEN_DOWNGRADE")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 12b6fcd0ea ]
Currently TMF commands are removed from de_device.dev_tmf_list at the very
end of se_cmd lifecycle. However, se_lun unlinks from se_cmd upon a command
status (response) being queued in transport layer. This means that LUN and
backend device can be deleted in the meantime and a panic will occur:
target_tmr_work()
cmd->se_tfo->queue_tm_rsp(cmd); // send abort_rsp to a wire
transport_lun_remove_cmd(cmd) // unlink se_cmd from se_lun
- // - // - // -
<<<--- lun remove
<<<--- core backend device remove
- // - // - // -
qlt_handle_abts_completion()
tfo->free_mcmd()
transport_generic_free_cmd()
target_put_sess_cmd()
core_tmr_release_req() {
if (dev) { // backend device, can not be null
spin_lock_irqsave(&dev->se_tmr_lock, flags); //<<<--- CRASH
Call Trace:
NIP [c000000000e1683c] _raw_spin_lock_irqsave+0x2c/0xc0
LR [c00800000e433338] core_tmr_release_req+0x40/0xa0 [target_core_mod]
Call Trace:
(unreliable)
0x0
target_put_sess_cmd+0x2a0/0x370 [target_core_mod]
transport_generic_free_cmd+0x6c/0x1b0 [target_core_mod]
tcm_qla2xxx_complete_mcmd+0x28/0x50 [tcm_qla2xxx]
process_one_work+0x2c4/0x5c0
worker_thread+0x88/0x690
For the iSCSI protocol this is easily reproduced:
- Send some SCSI sommand
- Send Abort of that command over iSCSI
- Remove LUN on target
- Send next iSCSI command to acknowledge the Abort_Response
- Target panics
There is no need to keep the command in tmr_list until response completion,
so move the removal from tmr_list from the response completion to the
response queueing when the LUN is unlinked. Move the removal from state
list too as it is a subject to the same race condition.
Link: https://lore.kernel.org/r/20211018135753.15297-1-d.bogdanov@yadro.com
Fixes: c66ac9db8d ("[SCSI] target: Add LIO target core v4.0.0-rc6")
Reviewed-by: Roman Bolshakov <r.bolshakov@yadro.com>
Reviewed-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Dmitry Bogdanov <d.bogdanov@yadro.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 28b7ee33a2 ]
TI AR7 Watchdog Timer is only build for 32bit.
Avoid error like:
In file included from drivers/watchdog/ar7_wdt.c:29:
./arch/mips/include/asm/mach-ar7/ar7.h: In function ‘ar7_is_titan’:
./arch/mips/include/asm/mach-ar7/ar7.h:111:24: error: implicit declaration of function ‘KSEG1ADDR’; did you mean ‘CKSEG1ADDR’? [-Werror=implicit-function-declaration]
111 | return (readl((void *)KSEG1ADDR(AR7_REGS_GPIO + 0x24)) & 0xffff) ==
| ^~~~~~~~~
| CKSEG1ADDR
Fixes: da2a68b3eb ("watchdog: Enable COMPILE_TEST where possible")
Signed-off-by: Jackie Liu <liuyun01@kylinos.cn>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Link: https://lore.kernel.org/r/20210907024904.4127611-1-liu.yun@linux.dev
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 1aaa557b2d ]
'make randconfig' can produce a .config file with
"CONFIG_MEMORY_RESERVE=" (no value) since it has no default.
When a subsequent 'make all' is done, kconfig restarts the config
and prompts for a value for MEMORY_RESERVE. This breaks
scripting/automation where there is no interactive user input.
Add a default value for MEMORY_RESERVE. (Any integer value will
work here for kconfig.)
Fixes a kconfig warning:
.config:214:warning: symbol value '' invalid for MEMORY_RESERVE
* Restart config...
Memory reservation (MiB) (MEMORY_RESERVE) [] (NEW)
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2") # from beginning of git history
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Greg Ungerer <gerg@linux-m68k.org>
Cc: linux-m68k@lists.linux-m68k.org
Signed-off-by: Greg Ungerer <gerg@linux-m68k.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit ce0ee4e6ac ]
Today the sh code allocates memory the first time a process uses
the fpu. If that memory allocation fails, kill the affected task
with force_sig(SIGKILL) rather than do_group_exit(SIGKILL).
Calling do_group_exit from an exception handler can potentially lead
to dead locks as do_group_exit is not designed to be called from
interrupt context. Instead use force_sig(SIGKILL) to kill the
userspace process. Sending signals in general and force_sig in
particular has been tested from interrupt context so there should be
no problems.
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Rich Felker <dalias@libc.org>
Cc: linux-sh@vger.kernel.org
Fixes: 0ea820cf9b ("sh: Move over to dynamically allocated FPU context.")
Link: https://lkml.kernel.org/r/20211020174406.17889-6-ebiederm@xmission.com
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit e7e1e880b1 ]
Before the `callback_result` callback was introduced drivers coded their
invocation to the callback in a similar way to:
if (cb->callback) {
spin_unlock(&dma->lock);
cb->callback(cb->callback_param);
spin_lock(&dma->lock);
}
With the introduction of `callback_result` two helpers where introduced to
transparently handle both types of callbacks. And drivers where updated to
look like this:
if (dmaengine_desc_callback_valid(cb)) {
spin_unlock(&dma->lock);
dmaengine_desc_callback_invoke(cb, ...);
spin_lock(&dma->lock);
}
dmaengine_desc_callback_invoke() correctly handles both `callback_result`
and `callback`. But we forgot to update the dmaengine_desc_callback_valid()
function to check for `callback_result`. As a result DMA descriptors that
use the `callback_result` rather than `callback` don't have their callback
invoked by drivers that follow the pattern above.
Fix this by checking for both `callback` and `callback_result` in
dmaengine_desc_callback_valid().
Fixes: f067025bc6 ("dmaengine: add support to provide error result from a DMA transation")
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Acked-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/20211023134101.28042-1-lars@metafoo.de
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 5648b5e116 ]
On 64bit platforms the MAC header is set to 0xffff on allocation and
also when a helper like skb_unset_mac_header() is called.
dev_parse_header may call skb_mac_header() which assumes valid mac offset:
BUG: KASAN: use-after-free in eth_header_parse+0x75/0x90
Read of size 6 at addr ffff8881075a5c05 by task nf-queue/1364
Call Trace:
memcpy+0x20/0x60
eth_header_parse+0x75/0x90
__nfqnl_enqueue_packet+0x1a61/0x3380
__nf_queue+0x597/0x1300
nf_queue+0xf/0x40
nf_hook_slow+0xed/0x190
nf_hook+0x184/0x440
ip_output+0x1c0/0x2a0
nf_reinject+0x26f/0x700
nfqnl_recv_verdict+0xa16/0x18b0
nfnetlink_rcv_msg+0x506/0xe70
The existing code only works if the skb has a mac header.
Fixes: 2c38de4c1f ("netfilter: fix looped (broad|multi)cast's MAC handling")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 8120bd469f ]
Free the kbuf buffer before returning from the dpaa2_console_read()
function. The variable no longer goes out of scope, leaking the storage
it points to.
Fixes: c93349d8c1 ("soc: fsl: add DPAA2 console support")
Signed-off-by: Robert-Ionut Alexa <robert-ionut.alexa@nxp.com>
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: Li Yang <leoyang.li@nxp.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 840fe25833 ]
As the ht16k33 frame buffer sub-driver does not register an
fb_ops.fb_blank() handler, blanking does not work:
$ echo 1 > /sys/class/graphics/fb0/blank
sh: write error: Invalid argument
Fix this by providing a handler that always returns zero, to make sure
blank events will be sent to the actual device handling the backlight.
Reported-by: Robin van der Gracht <robin@protonic.nl>
Suggested-by: Robin van der Gracht <robin@protonic.nl>
Fixes: 8992da44c6 ("auxdisplay: ht16k33: Driver for LED controller")
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 80f9eb70fd ]
Currently /sys/class/graphics/fb0/bl_curve is not accessible (-ENODEV),
as the driver does not connect the backlight to the frame buffer device.
Fix this moving backlight initialization up, and filling in
fb_info.bl_dev.
Fixes: 8992da44c6 ("auxdisplay: ht16k33: Driver for LED controller")
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Reviewed-by: Robin van der Gracht <robin@protonic.nl>
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit afcb5a811f ]
While writing an empty string to a device attribute is a no-op, and thus
does not need explicit safeguards, the user can still write a single
newline to an attribute file:
echo > .../message
If that happens, img_ascii_lcd_display() trims the newline, yielding an
empty string, and causing an infinite loop in img_ascii_lcd_scroll().
Fix this by adding a check for empty strings. Clear the display in case
one is encountered.
Fixes: 0cad855fbd ("auxdisplay: img-ascii-lcd: driver for simple ASCII LCD displays")
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 133a48abf6 ]
If O_DIRECT bumps the commit_info rpcs_out field, then that could lead
to fsync() hangs. The fix is to ensure that O_DIRECT calls
nfs_commit_end().
Fixes: 723c921e7d ("sched/wait, fs/nfs: Convert wait_on_atomic_t() usage to the new wait_var_event() API")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit fc9e18f9e9 ]
Under the following conditions:
* after rounding up by 4 the number of bytes to transfer (this is
related to the controller's internal constraints),
* if this (rounded) amount of data is situated beyond the end of the
device,
* and only in NV-DDR mode,
the Arasan NAND controller timeouts.
This currently can happen in a particular helper used when picking
software ECC algorithms. Let's prevent this situation by refusing to use
the NV-DDR interface with software engines.
Fixes: 4edde60314 ("mtd: rawnand: arasan: Support NV-DDR interface")
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Link: https://lore.kernel.org/linux-mtd/20211008163640.1753821-1-miquel.raynal@bootlin.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 4caab28a62 ]
The condition register PCI_RCV_INTX is used in irq_mask() and irq_unmask()
callbacks. Accesses to register can occur at the same time without a lock.
Add a lock into each callback to prevent the issue.
And INTX mask and unmask fields in PCL_RCV_INTX register should only be
set/reset for each bit. Clearing by PCL_RCV_INTX_ALL_MASK should be
removed.
INTX status fields in PCL_RCV_INTX register only indicates each INTX
interrupt status, so the handler can't clear by writing 1 to the field.
The status is expected to be cleared by the interrupt origin.
The ack function has no meaning, so should remove it.
Suggested-by: Pali Rohár <pali@kernel.org>
Link: https://lore.kernel.org/r/1631924579-24567-1-git-send-email-hayashi.kunihiko@socionext.com
Fixes: 7e6d5cd88a ("PCI: uniphier: Add UniPhier PCIe host controller support")
Signed-off-by: Kunihiko Hayashi <hayashi.kunihiko@socionext.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Acked-by: Pali Rohár <pali@kernel.org>
Acked-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 78e4d34218 ]
hisi_spi_nor_probe() invokes clk_disable_unprepare() on all paths after
successful call of clk_prepare_enable(). Besides, the clock is enabled by
hispi_spi_nor_prep() and disabled by hispi_spi_nor_unprep(). So at remove
time it is not possible to have the clock enabled. The patch removes
excessive clk_disable_unprepare() from hisi_spi_nor_remove().
Found by Linux Driver Verification project (linuxtesting.org).
Fixes: e523f11141 ("mtd: spi-nor: add hisilicon spi-nor flash controller driver")
Signed-off-by: Evgeny Novikov <novikov@ispras.ru>
Signed-off-by: Tudor Ambarus <tudor.ambarus@microchip.com>
Reviewed-by: Pratyush Yadav <p.yadav@ti.com>
Link: https://lore.kernel.org/r/20210709144529.31379-1-novikov@ispras.ru
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 2667f6b7af ]
I have a ST1633 touch controller which fails to probe due to a timeout
waiting for the controller to become ready. Increasing the minimum
delay to 100ms ensures that the probe sequence completes successfully.
The ST1633 datasheet says nothing about the maximum delay here and the
ST1232 I2C protocol document says "wait until" with no notion of a
timeout.
Since this only runs once during probe, being generous with the timout
seems reasonable and most likely the device will become ready
eventually.
(It may be worth noting that I saw this issue with a PREEMPT_RT patched
kernel which probably has tighter wakeups from usleep_range() than other
preemption models.)
Fixes: f605be6a57 ("Input: st1232 - wait until device is ready before reading resolution")
Signed-off-by: John Keeping <john@metanate.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://lore.kernel.org/r/20210929152609.2421483-1-john@metanate.com
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 4c2b46c824 ]
When op_alloc() returns NULL to new_op, no error return code of
orangefs_revalidate_lookup() is assigned.
To fix this bug, ret is assigned with -ENOMEM in this case.
Fixes: 8bb8aefd5a ("OrangeFS: Change almost all instances of the string PVFS2 to OrangeFS.")
Reported-by: TOTE Robot <oslab@tsinghua.edu.cn>
Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com>
Signed-off-by: Mike Marshall <hubcap@omnibond.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 64a93dbf25 ]
Partially revert commit 2ce209c42c ("NFS: Wait for requests that are
locked on the commit list"), since it can lead to deadlocks between
commit requests and nfs_join_page_group().
For now we should assume that any locked requests on the commit list are
either about to be removed and committed by another task, or the writes
they describe are about to be retransmitted. In either case, we should
not need to worry.
Fixes: 2ce209c42c ("NFS: Wait for requests that are locked on the commit list")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 27ff8187f1 ]
Fix sparse warning:
drivers/opp/of.c:924 _opp_add_static_v2() warn: passing zero to 'ERR_PTR'
For duplicate OPPs 'ret' be set to zero.
Fixes: deac8703da ("PM / OPP: _of_add_opp_table_v2(): increment count only if OPP is added")
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit d419052bc6 ]
Commit 43f5c77bcb ("PCI: aardvark: Fix reporting CRS value") started
using CRSSVE flag for handling CRS responses.
PCI_EXP_RTCTL_CRSSVE flag is stored only in emulated config space buffer
and there is handler for PCI_EXP_RTCTL register. So every read operation
from config space automatically clears CRSSVE flag as it is not defined in
PCI_EXP_RTCTL read handler.
Fix this by reading current CRSSVE bit flag from emulated space buffer and
appending it to PCI_EXP_RTCTL read response.
Link: https://lore.kernel.org/r/20211005180952.6812-5-kabel@kernel.org
Fixes: 43f5c77bcb ("PCI: aardvark: Fix reporting CRS value")
Signed-off-by: Pali Rohár <pali@kernel.org>
Signed-off-by: Marek Behún <kabel@kernel.org>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Reviewed-by: Marek Behún <kabel@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 7be28bd73f ]
drivers/gpu/drm/drm_plane_helper.c: In function 'drm_primary_helper_update':
drivers/gpu/drm/drm_plane_helper.c:113:32: error: 'visible' is used uninitialized [-Werror=uninitialized]
113 | struct drm_plane_state plane_state = {
| ^~~~~~~~~~~
drivers/gpu/drm/drm_plane_helper.c:178:14: note: 'visible' was declared here
178 | bool visible;
| ^~~~~~~
cc1: all warnings being treated as errors
visible is an output, not an input. in practice this use might turn out
OK but it's still UB.
Fixes: df86af9133 ("drm/plane-helper: Add drm_plane_helper_check_state()")
Reviewed-by: Simon Ser <contact@emersion.fr>
Signed-off-by: Alex Xu (Hello71) <alex_y_xu@yahoo.ca>
Signed-off-by: Simon Ser <contact@emersion.fr>
Link: https://patchwork.freedesktop.org/patch/msgid/20211007063706.305984-1-alex_y_xu@yahoo.ca
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit a2915fa062 ]
_nfs4_pnfs_v3/v4_ds_connect do
some work
smp_wmb
ds->ds_clp = clp;
And nfs4_ff_layout_prepare_ds currently does
smp_rmb
if(ds->ds_clp)
...
This patch places the smp_rmb after the if. This ensures that following
reads only happen once nfs4_ff_layout_prepare_ds has checked that data
has been properly initialized.
Fixes: d67ae825a5 ("pnfs/flexfiles: Add the FlexFile Layout Driver")
Signed-off-by: Baptiste Lepers <baptiste.lepers@gmail.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit cec08f452a ]
If the directory changed while we were revalidating the dentry, then
don't update the dentry verifier. There is no value in setting the
verifier to an older value, and we could end up overwriting a more up to
date verifier from a parallel revalidation.
Fixes: efeda80da3 ("NFSv4: Fix revalidation of dentries with delegations")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Tested-by: Benjamin Coddington <bcodding@redhat.com>
Reviewed-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit a6a361c4ca ]
If we want to revalidate the directory, then just mark the change
attribute as invalid.
Fixes: 13c0b082b6 ("NFS: Replace use of NFS_INO_REVAL_PAGECACHE when checking cache validity")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Tested-by: Benjamin Coddington <bcodding@redhat.com>
Reviewed-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit eea413308f ]
Both NFSv3 and NFSv2 generate their change attribute from the ctime
value that was supplied by the server. However the problem is that there
are plenty of servers out there with ctime resolutions of 1ms or worse.
In a modern performance system, this is insufficient when trying to
decide which is the most recent set of attributes when, for instance, a
READ or GETATTR call races with a WRITE or SETATTR.
For this reason, let's revert to labelling the NFSv2/v3 change
attributes as NFS4_CHANGE_TYPE_IS_UNDEFINED. This will ensure we protect
against such races.
Fixes: 7b24dacf08 ("NFS: Another inode revalidation improvement")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Tested-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b8228aea5a ]
The reason for the modification here is that the previous
offset information is incorrect, OFFSET_DEBUGSTAT = 0xE4 is
the correct value.
Fixes: 25708278f8 ("i2c: mediatek: Add i2c support for MediaTek MT8183")
Signed-off-by: Kewei Xu <kewei.xu@mediatek.com>
Reviewed-by: Chen-Yu Tsai <wenst@chromium.org>
Reviewed-by: Qii Wang <qii.wang@mediatek.com>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 5c4c2c8e6f ]
Currently autoloading for SPI devices does not use the DT ID table, it uses
SPI modalises. Supporting OF modalises is going to be difficult if not
impractical, an attempt was made but has been reverted, so ensure that
module autoloading works for this driver by adding a SPI device ID table.
Fixes: 96c8395e21 ("spi: Revert modalias changes")
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20210927134104.38648-1-broonie@kernel.org
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 3109151c47 ]
Currently autoloading for SPI devices does not use the DT ID table, it uses
SPI modalises. Supporting OF modalises is going to be difficult if not
impractical, an attempt was made but has been reverted, so ensure that
module autoloading works for this driver by adding an id_table listing the
SPI IDs for everything.
Fixes: 96c8395e21 ("spi: Revert modalias changes")
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Link: https://lore.kernel.org/r/20210927130240.33693-1-broonie@kernel.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 5f84478e14 ]
Currently autoloading for SPI devices does not use the DT ID table, it uses
SPI modalises. Supporting OF modalises is going to be difficult if not
impractical, an attempt was made but has been reverted, so ensure that
module autoloading works for this driver by adding an id_table listing the
SPI IDs for everything.
Fixes: 96c8395e21 ("spi: Revert modalias changes")
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Link: https://lore.kernel.org/r/20210923194922.53386-4-broonie@kernel.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit da87639d63 ]
Currently autoloading for SPI devices does not use the DT ID table, it uses
SPI modalises. Supporting OF modalises is going to be difficult if not
impractical, an attempt was made but has been reverted, so ensure that
module autoloading works for this driver by adding an id_table listing the
SPI IDs for everything.
Fixes: 96c8395e21 ("spi: Revert modalias changes")
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Link: https://lore.kernel.org/r/20210923194922.53386-3-broonie@kernel.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 8719a17613 ]
Currently autoloading for SPI devices does not use the DT ID table, it uses
SPI modalises. Supporting OF modalises is going to be difficult if not
impractical, an attempt was made but has been reverted, so ensure that
module autoloading works for this driver by adding an id_table listing the
SPI IDs for everything.
Fixes: 96c8395e21 ("spi: Revert modalias changes")
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Link: https://lore.kernel.org/r/20210923194922.53386-2-broonie@kernel.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 9b6e27d01a ]
Dan Carpenter says:
The patch d20c11d86d: "nfsd: Protect session creation and client
confirm using client_lock" from Jul 30, 2014, leads to the following
Smatch static checker warning:
net/sunrpc/addr.c:178 rpc_parse_scope_id()
warn: sleeping in atomic context
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Fixes: d20c11d86d ("nfsd: Protect session creation and client...")
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 46a0dc10fb ]
ebu_nand_probe() read the value of u32 variable "cs" from the device
firmware description and used it as the index for array ebu_host->cs
that can contain MAX_CS (2) elements at most. That could result in
a buffer overflow and various bad consequences later.
Fix the potential buffer overflow by restricting values of "cs" with
MAX_CS in probe.
Found by Linux Driver Verification project (linuxtesting.org).
Fixes: 0b1039f016 ("mtd: rawnand: Add NAND controller support on Intel LGM SoC")
Signed-off-by: Evgeny Novikov <novikov@ispras.ru>
Co-developed-by: Kirill Shilimanov <kirill.shilimanov@huawei.com>
Signed-off-by: Kirill Shilimanov <kirill.shilimanov@huawei.com>
Co-developed-by: Anton Vasilyev <vasilyev@ispras.ru>
Signed-off-by: Anton Vasilyev <vasilyev@ispras.ru>
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Link: https://lore.kernel.org/linux-mtd/20210903082653.16441-1-novikov@ispras.ru
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit d108370c64 ]
clang static analysis reports this representative problem:
label.c:1463:16: warning: Assigned value is garbage or undefined
label->hname = name;
^ ~~~~
In aa_update_label_name(), this the problem block of code
if (aa_label_acntsxprint(&name, ...) == -1)
return res;
On failure, aa_label_acntsxprint() has a more complicated return
that just -1. So check for a negative return.
It was also noted that the aa_label_acntsxprint() main comment refers
to a nonexistent parameter, so clean up the comment.
Fixes: f1bd904175 ("apparmor: add the base fns() for domain labels")
Signed-off-by: Tom Rix <trix@redhat.com>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f1a090f09f ]
If the driver returns a new MR during rereg it has to fill it with the
IOVA from the proper source. If IB_MR_REREG_TRANS is set then the IOVA is
cmd.hca_va, otherwise the IOVA comes from the old MR. mlx5 for example has
two calls inside rereg_mr:
return create_real_mr(new_pd, umem, mr->ibmr.iova,
new_access_flags);
and
return create_real_mr(new_pd, new_umem, iova, new_access_flags);
Unconditionally overwriting the iova in the newly allocated MR will
corrupt the iova if the first path is used.
Remove the redundant initializations from ib_uverbs_rereg_mr().
Fixes: 6e0954b11c ("RDMA/uverbs: Allow drivers to create a new HW object during rereg_mr")
Link: https://lore.kernel.org/r/4b0a31bbc372842613286a10d7a8cbb0ee6069c7.1635400472.git.leonro@nvidia.com
Signed-off-by: Aharon Landau <aharonl@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit cdf10ffe8f ]
When registering the IRQ handler fails, do not just return the error code,
this will free the devm_kzalloc()-ed data struct while leaving the queued
work queued and the registered power_supply registered with both of them
now pointing to free-ed memory, resulting in various kernel crashes
soon afterwards.
Instead properly tear-down things on IRQ handler register errors.
Fixes: 703df6c097 ("power: bq27xxx_battery: Reorganize I2C into a module")
Cc: Andrew F. Davis <afd@ti.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 18b8f5b6fc ]
mips_cm_error_report() extracts the cause and other cause from the error
register using shifts. This works fine for the former, as it is stored
in the top bits, and the shift will thus remove all non-related bits.
However, the latter is stored in the bottom bits, hence thus needs masking
to get rid of non-related bits. Without such masking, using it as an
index into the cm2_causes[] array will lead to an out-of-bounds access,
probably causing a crash.
Fix this by using FIELD_GET() instead. Bite the bullet and convert all
MIPS CM handling to the bitfield API, to improve readability and safety.
Fixes: 3885c2b463 ("MIPS: CM: Add support for reporting CM cache errors")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit d142585bce ]
If CONFIG_CONSOLE_POLL=y, and CONFIG_SERIAL_CPM=m (hence
CONFIG_SERIAL_CPM_CONSOLE=n):
drivers/tty/serial/cpm_uart/cpm_uart_core.c:1109:12: warning: ‘udbg_cpm_getc’ defined but not used [-Wunused-function]
1109 | static int udbg_cpm_getc(void)
| ^~~~~~~~~~~~~
drivers/tty/serial/cpm_uart/cpm_uart_core.c:1095:13: warning: ‘udbg_cpm_putc’ defined but not used [-Wunused-function]
1095 | static void udbg_cpm_putc(char c)
| ^~~~~~~~~~~~~
Fix this by making the udbg definitions depend on
CONFIG_SERIAL_CPM_CONSOLE, in addition to CONFIG_CONSOLE_POLL.
Fixes: a60526097f ("tty: serial: cpm_uart: Add udbg support for enabling xmon")
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Link: https://lore.kernel.org/r/20211027075326.3270785-1-geert@linux-m68k.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 778a0cbef5 ]
The setting from the cirrus,ts-inv property should be applied to the
TIP_SENSE_INV bit, as this is the one that actually affects the jack
detect block. The TS_INV bit only swaps the meaning of the PLUG and
UNPLUG interrupts and should always be 1 for the interrupts to have
the normal meaning.
Due to some misunderstanding the driver had been implemented to
configure the TS_INV bit based on the jack switch polarity. This made
the interrupts behave the correct way around, but left the jack detect
block, button detect and analogue circuits always interpreting an open
switch as unplugged.
The signal chain inside the codec is:
SENSE pin -> TIP_SENSE_INV -> TS_INV -> (invert) -> interrupts
|
v
Jack detect,
button detect and
analog control
As the TIP_SENSE_INV already performs the necessary inversion the
TS_INV bit never needs to change. It must always be 1 to yield the
expected interrupt behaviour.
Some extra confusion has arisen because of the additional invert in the
interrupt path, meaning that a value applied to the TS_INV bit produces
the opposite effect of applying it to the TIP_SENSE_INV bit. The ts-inv
property has therefore always had the opposite effect to what might be
expected (0 = inverted, 1 = not inverted). To maintain the meaning of
the ts-inv property it must be inverted when applied to TIP_SENSE_INV.
Signed-off-by: Richard Fitzgerald <rf@opensource.cirrus.com>
Fixes: 2c394ca796 ("ASoC: Add support for CS42L42 codec")
Link: https://lore.kernel.org/r/20211028140902.11786-3-rf@opensource.cirrus.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit cddcd5472a ]
A user reports functional regression for Mackie Onyx 1640i that the device
generates slow sound with ALSA oxfw driver which supports media clock
recovery. Although the device is based on OXFW971 ASIC, it does not
transfer isochronous packet with own event frequency as expected. The
device seems to adjust event frequency according to events in received
isochronous packets in the beginning of packet streaming. This is
unknown quirk.
This commit fixes the regression to turn the recovery off in driver
side. As a result, nominal frequency is used in duplex packet streaming
between device and driver. For stability of sampling rate in events of
transferred isochronous packet, 4,000 isochronous packets are skipped
in the beginning of packet streaming.
Reference: https://github.com/takaswie/snd-firewire-improve/issues/38
Fixes: 029ffc4294 ("ALSA: oxfw: perform sequence replay for media clock recovery")
Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp>
Link: https://lore.kernel.org/r/20211028130325.45772-1-o-takashi@sakamocchi.jp
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b6cb20fdc2 ]
set_memory_x() calls pte_mkexec() which sets _PAGE_EXEC.
set_memory_nx() calls pte_exprotec() which clears _PAGE_EXEC.
Book3e has 2 bits, UX and SX, which defines the exec rights
resp. for user (PR=1) and for kernel (PR=0).
_PAGE_EXEC is defined as UX only.
An executable kernel page is set with either _PAGE_KERNEL_RWX
or _PAGE_KERNEL_ROX, which both have SX set and UX cleared.
So set_memory_nx() call for an executable kernel page does
nothing because UX is already cleared.
And set_memory_x() on a non-executable kernel page makes it
executable for the user and keeps it non-executable for kernel.
Also, pte_exec() always returns 'false' on kernel pages, because
it checks _PAGE_EXEC which doesn't include SX, so for instance
the W+X check doesn't work.
To fix this:
- change tlb_low_64e.S to use _PAGE_BAP_UX instead of _PAGE_USER
- sets both UX and SX in _PAGE_EXEC so that pte_exec() returns
true whenever one of the two bits is set and pte_exprotect()
clears both bits.
- Define a book3e specific version of pte_mkexec() which sets
either SX or UX based on UR.
Fixes: 1f9ad21c3b ("powerpc/mm: Implement set_memory() routines")
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/c41100f9c144dc5b62e5a751b810190c6b5d42fd.1635226743.git.christophe.leroy@csgroup.eu
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b1b93cb7e7 ]
Commit 26973fa5ac ("powerpc/mm: use pte helpers in generic code")
changed those two functions to use pte helpers to determine which
bits to clear and which bits to set.
This change was based on the assumption that bits to be set/cleared
are always the same and can be determined by applying the pte
manipulation helpers on __pte(0).
But on platforms like book3e, the bits depend on whether the page
is a user page or not.
For the time being it more or less works because of _PAGE_EXEC being
used for user pages only and exec right being set at all time on
kernel page. But following patch will clean that and output of
pte_mkexec() will depend on the page being a user or kernel page.
Instead of trying to make an even more complicated helper where bits
would become dependent on the final pte value, come back to a more
static situation like before commit 26973fa5ac ("powerpc/mm: use
pte helpers in generic code"), by introducing an 8xx specific
version of __ptep_set_access_flags() and ptep_set_wrprotect().
Fixes: 26973fa5ac ("powerpc/mm: use pte helpers in generic code")
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/922bdab3a220781bae2360ff3dd5adb7fe4d34f1.1635226743.git.christophe.leroy@csgroup.eu
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 43775e62c4 ]
The wait_for_completion_timeout function returns 0 if timed out or a
positive value if completed. Hence, "less than zero" comparison always
misses timeouts and doesn't kill the URB as it should, leading to
re-sending it while it is active.
Fixes: 42337b9d4d ("HID: add driver for U2F Zero built-in LED and RNG")
Signed-off-by: Andrej Shadura <andrew.shadura@collabora.co.uk>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b7abf78b7a ]
The previous commit fixed handling of incomplete packets but broke error
handling: offsetof returns an unsigned value (size_t), but when compared
against the signed return value, the return value is interpreted as if
it were unsigned, so negative return values are never less than the
offset.
To make the code easier to read, calculate the minimal packet length
once and separately, and assign it to a signed int variable to eliminate
unsigned math and the need for type casts. It then becomes immediately
obvious how the actual data length is calculated and why the return
value cannot be less than the minimal length.
Fixes: 22d65765f2 ("HID: u2fzero: ignore incomplete packets without data")
Fixes: 42337b9d4d ("HID: add driver for U2F Zero built-in LED and RNG")
Signed-off-by: Andrej Shadura <andrew.shadura@collabora.co.uk>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 88b20f84f0 ]
xilinx_uartps .start_tx() clears TXEMPTY when enabling TXEMPTY to avoid
any previous TXEVENT event asserting the UART interrupt. This clear
operation is done immediately after filling the TX FIFO.
However, if the bytes inserted by cdns_uart_handle_tx() are consumed by
the UART before the TXEMPTY is cleared, the clear operation eats the new
TXEMPTY event as well, causing cdns_uart_isr() to never receive the
TXEMPTY event. If there are bytes still queued in circbuf, TX will get
stuck as they will never get transferred to FIFO (unless new bytes are
queued to circbuf in which case .start_tx() is called again).
While the racy missed TXEMPTY occurs fairly often with short data
sequences (e.g. write 1 byte), in those cases circbuf is usually empty
so no action on TXEMPTY would have been needed anyway. On the other
hand, longer data sequences make the race much more unlikely as UART
takes longer to consume the TX FIFO. Therefore it is rare for this race
to cause visible issues in general.
Fix the race by clearing the TXEMPTY bit in ISR *before* filling the
FIFO.
The TXEMPTY bit in ISR will only get asserted at the exact moment the
TX FIFO *becomes* empty, so clearing the bit before filling FIFO does
not cause an extra immediate assertion even if the FIFO is initially
empty.
This is hard to reproduce directly on a normal system, but inserting
e.g. udelay(200) after cdns_uart_handle_tx(port), setting 4000000 baud,
and then running "dd if=/dev/zero bs=128 of=/dev/ttyPS0 count=50"
reliably reproduces the issue on my ZynqMP test system unless this fix
is applied.
Fixes: 85baf542d5 ("tty: xuartps: support 64 byte FIFO size")
Signed-off-by: Anssi Hannula <anssi.hannula@bitwise.fi>
Link: https://lore.kernel.org/r/20211026102741.2910441-1-anssi.hannula@bitwise.fi
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b475bf0ec4 ]
The FSEL_MASK which selects the refclock is defined incorrectly.
It should be [4:6] not [5:7]. Due to this incorrect definition, the BIT(7)
in USB2_PHY_USB_PHY_HS_PHY_CTRL_COMMON0 is reset which keeps PHY analog
blocks ON during suspend.
Fix this issue by correctly defining the FSEL_MASK.
Fixes: 51e8114f80 ("phy: qcom-snps: Add SNPS USB PHY driver for QCOM based SOCs")
Signed-off-by: Sandeep Maheswaram <quic_c_sanm@quicinc.com>
Link: https://lore.kernel.org/r/1635135575-5668-1-git-send-email-quic_c_sanm@quicinc.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 8d387f61b0 ]
In case of USB_DR_MODE_PERIPHERAL, the OTG clock is disabled at the end of
the probe (it is not the case if USB_DR_MODE_HOST or USB_DR_MODE_OTG).
The clock is then enabled on udc_start.
If dwc2_drd_role_sw_set is called before udc_start (it is the case if the
usb cable is plugged at boot), GOTGCTL and GUSBCFG registers cannot be
read/written, so session cannot be overridden.
To avoid this case, check the ll_hw_enabled value and enable the clock if
it is available, and disable it after the override.
Fixes: 17f934024e ("usb: dwc2: override PHY input signals with usb role switch support")
Acked-by: Minas Harutyunyan <Minas.Harutyunyan@synopsys.com>
Signed-off-by: Amelie Delaunay <amelie.delaunay@foss.st.com>
Link: https://lore.kernel.org/r/20211005095305.66397-3-amelie.delaunay@foss.st.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b2cab2a24f ]
Instead of forcing the role to Device, check the dr_mode configuration.
If the core is Host only, force the mode to Host, this to avoid the
dwc2_force_mode warning:
WARNING: CPU: 1 PID: 21 at drivers/usb/dwc2/core.c:615 dwc2_drd_init+0x104/0x17c
When forcing mode to Host, dwc2_force_mode may sleep the time the host
role is applied. To avoid sleeping while atomic context, move the call
to dwc2_force_mode after spin_unlock_irqrestore. It is safe, as
interrupts are not yet unmasked here.
Fixes: 17f934024e ("usb: dwc2: override PHY input signals with usb role switch support")
Acked-by: Minas Harutyunyan <Minas.Harutyunyan@synopsys.com>
Signed-off-by: Amelie Delaunay <amelie.delaunay@foss.st.com>
Link: https://lore.kernel.org/r/20211005095305.66397-2-amelie.delaunay@foss.st.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit d305c253af ]
A prior patch introduced HBA_NEEDS_CFG_PORT flag logic, but in
lpfc_sli_brdrestart_s3() code path, right after HBA_NEEDS_CFG_PORT is set,
the phba->hba_flag is cleared in lpfc_sli_brdreset().
Fix by calling lpfc_sli_chipset_init() to wait for successful restart of
the HBA in lpfc_host_reset_handler() after lpfc_sli_brdrestart().
lpfc_sli_chipset_init() sets the HBA_NEEDS_CFG_PORT flag so that the
lpfc_sli_hba_setup() routine from lpfc_online() will execute
lpfc_sli_config_port() initialization step when the brdrestart is
successful.
Link: https://lore.kernel.org/r/20211020211417.88754-3-jsmart2021@gmail.com
Fixes: d2f2547efd ("scsi: lpfc: Fix auto sli_mode and its effect on CONFIG_PORT for SLI3")
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b600bd7eb3 ]
With commit ecb010d441 ("iio: imu: adis: Refactor adis_initial_startup")
we are doing a HW or SW reset to the device which means that we'll get
the default state of the data ready pin (which is enabled). Hence there's
no point in disabling the IRQ in the init function. Moreover, this
function is intended to initialize internal data structures and not
really do anything on the device.
As a result of this, some devices were left with the data ready pin enabled
after probe which was not the desired behavior. Thus, we move the call to
'adis_enable_irq()' to the initial startup function where it makes more
sense for it to be.
Note that for devices that cannot mask/unmask the pin, it makes no sense
to call the function at this point since the IRQ should not have been
yet requested. This will be improved in a follow up change.
Fixes: ecb010d441 ("iio: imu: adis: Refactor adis_initial_startup")
Signed-off-by: Nuno Sá <nuno.sa@analog.com>
Link: https://lore.kernel.org/r/20210903141423.517028-2-nuno.sa@analog.com
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 09776d9374 ]
When __iio_buffer_alloc_sysfs_and_mask() failed, 'unwind_idx' should be
set to 'i - 1' to prevent double-free when cleanup resources.
BUG: KASAN: double-free or invalid-free in __iio_buffer_free_sysfs_and_mask+0x32/0xb0 [industrialio]
Call Trace:
kfree+0x117/0x4c0
__iio_buffer_free_sysfs_and_mask+0x32/0xb0 [industrialio]
iio_buffers_alloc_sysfs_and_mask+0x60d/0x1570 [industrialio]
__iio_device_register+0x483/0x1a30 [industrialio]
ina2xx_probe+0x625/0x980 [ina2xx_adc]
Reported-by: Hulk Robot <hulkci@huawei.com>
Fixes: ee708e6baa ("iio: buffer: introduce support for attaching more IIO buffers")
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Reviewed-by: Alexandru Ardelean <ardeleanalex@gmail.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20211013094923.2473-2-andriy.shevchenko@linux.intel.com
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit e3e56c050a ]
The general expectation is that powering on a power-domain should make
the power domain deliver some power, and if a specific performance state
is needed further requests has to be made.
But in contrast with other power-domain implementations (e.g. rpmpd) the
RPMh does not have an interface to enable the power, so the driver has
to vote for a particular corner (performance level) in rpmh_power_on().
But the corner is never initialized, so a typical request to simply
enable the power domain would not actually turn on the hardware. Further
more, when no more clients vote for a performance state (i.e. the
aggregated vote is 0) the power domain would be turned off.
Fix both of these issues by always voting for a corner with non-zero
value, when the power domain is enabled.
The tracking of the lowest non-zero corner is performed to handle the
corner case if there's ever a domain with a non-zero lowest corner, in
which case both rpmh_power_on() and rpmh_rpmhpd_set_performance_state()
would be allowed to use this lowest corner.
Fixes: 279b7e8a62 ("soc: qcom: rpmhpd: Add RPMh power domain driver")
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Reviewed-by: Stephen Boyd <swboyd@chromium.org>
Link: https://lore.kernel.org/r/20211005033732.2284447-1-bjorn.andersson@linaro.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 6e6825801a ]
An I2S frame always has two slots (left and right) even when sending
mono. The right channel (channel 2) of ASP TX will always have the
same bit width as the left channel and will always be on the high
phase of LRCLK.
The previous implementation always passed the field masks for both
channels to snd_soc_component_update_bits() but for mono the written value
only contained the settings for channel 1. The result was that for mono
channel 2 was set to 8-bit (which is an invalid configuration) with both
channels on the low phase of LRCLK.
Signed-off-by: Richard Fitzgerald <rf@opensource.cirrus.com>
Fixes: 585e7079de ("ASoC: cs42l42: Add Capture Support")
Link: https://lore.kernel.org/r/20211015133619.4698-3-rf@opensource.cirrus.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 6f87a74d31 ]
The STM32 SAI subblocks registers offsets are in the range
0x0004 (SAIx_CR1) to 0x0020 (SAIx_DR).
The corresponding range length is 0x20 instead of 0x1c.
Change reg property accordingly.
Fixes: 5afd65c3a0 ("ARM: dts: stm32: add sai support on stm32mp157c")
Signed-off-by: Olivier Moysan <olivier.moysan@foss.st.com>
Signed-off-by: Alexandre Torgue <alexandre.torgue@foss.st.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 2012579b31 ]
The SPI NOR is a bit further away from the SoC on DHCOR than on DHCOM,
which causes additional signal delay. At 108 MHz, this delay triggers
a sporadic issue where the first bit of RX data is not received by the
QSPI controller.
There are two options of addressing this problem, either by using the
DLYB block to compensate the extra delay, or by reducing the QSPI bus
clock frequency. The former requires calibration and that is overly
complex, so opt for the second option.
Fixes: 76045bc457 ("ARM: dts: stm32: Add QSPI NOR on AV96")
Signed-off-by: Marek Vasut <marex@denx.de>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Patrice Chotard <patrice.chotard@foss.st.com>
Cc: Patrick Delaunay <patrick.delaunay@foss.st.com>
Cc: linux-stm32@st-md-mailman.stormreply.com
To: linux-arm-kernel@lists.infradead.org
Signed-off-by: Alexandre Torgue <alexandre.torgue@foss.st.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 8f6aca0e0f ]
On power9 and earlier platforms, the default event used for cyles and
instructions is PM_CYC (0x0001e) and PM_INST_CMPL (0x00002)
respectively. These events use two programmable PMCs and by default will
count irrespective of the run latch state (idle state). But since they
use programmable PMCs, these events can lead to multiplexing with other
events, because there are only 4 programmable PMCs. Hence in power10,
performance monitoring unit (PMU) driver uses performance monitor
counter 5 (PMC5) and performance monitor counter6 (PMC6) for counting
instructions and cycles.
Currently on power10, the event used for cycles is PM_RUN_CYC (0x600F4)
and instructions uses PM_RUN_INST_CMPL (0x500fa). But counting of these
events in idle state is controlled by the CC56RUN bit setting in Monitor
Mode Control Register0 (MMCR0). If the CC56RUN bit is zero, PMC5/6 will
not count when CTRL[RUN] (run latch) is zero. This could lead to missing
some counts if a thread is in idle state during system wide profiling.
To fix it, set the CC56RUN bit in MMCR0 for power10, which makes PMC5
and PMC6 count instructions and cycles regardless of the run latch
state. Since this change make PMC5/6 count as PM_INST_CMPL/PM_CYC,
rename the event code 0x600f4 as PM_CYC instead of PM_RUN_CYC and event
code 0x500fa as PM_INST_CMPL instead of PM_RUN_INST_CMPL. The changes
are only for PMC5/6 event codes and will not affect the behaviour of
PM_RUN_CYC/PM_RUN_INST_CMPL if progammed in other PMC's.
Fixes: a64e697cef ("powerpc/perf: power10 Performance Monitoring support")
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.cm>
Reviewed-by: Madhavan Srinivasan <maddy@linux.ibm.com>
[mpe: Tweak change log wording for style and consistency]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20211007075121.28497-1-atrajeev@linux.vnet.ibm.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 5ca1739748 ]
Right now dyndbg shows up as an unknown parameter if used on boot:
Unknown command line parameters: dyndbg=+p
That's because it is unknown, it doesn't sit in the __param
section, so the processing done to warn users supplying an unknown
parameter doesn't think it is legitimate.
Install a dummy handler to register it. dynamic debug needs to search
the whole command line for modules listed that are currently builtin,
so there's no real work to be done in this callback.
Fixes: 86d1919a4f ("init: print out unknown kernel parameters")
Tested-by: Jim Cromie <jim.cromie@gmail.com>
Signed-off-by: Andrew Halaney <ahalaney@redhat.com>
Signed-off-by: Jason Baron <jbaron@akamai.com>
Link: https://lore.kernel.org/r/1634139622-20667-2-git-send-email-jbaron@akamai.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 2ab1891640 ]
Commit 723de0f917 ("staging: most: remove device from interface
structure") moved registration of driver-provided struct device to
the most subsystem.
Dim2 used to register the same struct device to provide a custom device
attribute. This causes double-registration of the same struct device.
Fix that by moving the custom attribute to driver's dev_groups.
This moves attribute to the platform_device object, which is a better
location for platform-specific attributes anyway.
Fixes: 723de0f917 ("staging: most: remove device from interface structure")
Acked-by: Christian Gromm <christian.gromm@microchip.com>
Signed-off-by: Nikita Yushchenko <nikita.yoush@cogentembedded.com>
Link: https://lore.kernel.org/r/20211011061117.21435-1-nikita.yoush@cogentembedded.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f4875d509a ]
This variable is just a temporary variable, used to do an endian
conversion. The problem is that the last byte is not initialized. After
the conversion is completely done, the last byte is discarded so it doesn't
cause a problem. But static checkers and the KMSan runtime checker can
detect the uninitialized read and will complain about it.
Link: https://lore.kernel.org/r/20211006073242.GA8404@kili
Fixes: 5036f0a0ec ("[SCSI] csiostor: Fix sparse warnings.")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit fda0eb2200 ]
vcpu_is_preempted() can be used outside of preempt-disabled critical
sections, yielding warnings such as:
BUG: using smp_processor_id() in preemptible [00000000] code: systemd-udevd/185
caller is rwsem_spin_on_owner+0x1cc/0x2d0
CPU: 1 PID: 185 Comm: systemd-udevd Not tainted 5.15.0-rc2+ #33
Call Trace:
[c000000012907ac0] [c000000000aa30a8] dump_stack_lvl+0xac/0x108 (unreliable)
[c000000012907b00] [c000000001371f70] check_preemption_disabled+0x150/0x160
[c000000012907b90] [c0000000001e0e8c] rwsem_spin_on_owner+0x1cc/0x2d0
[c000000012907be0] [c0000000001e1408] rwsem_down_write_slowpath+0x478/0x9a0
[c000000012907ca0] [c000000000576cf4] filename_create+0x94/0x1e0
[c000000012907d10] [c00000000057ac08] do_symlinkat+0x68/0x1a0
[c000000012907d70] [c00000000057ae18] sys_symlink+0x58/0x70
[c000000012907da0] [c00000000002e448] system_call_exception+0x198/0x3c0
[c000000012907e10] [c00000000000c54c] system_call_common+0xec/0x250
The result of vcpu_is_preempted() is always used speculatively, and the
function does not access per-cpu resources in a (Linux) preempt-unsafe way.
Use raw_smp_processor_id() to avoid such warnings, adding explanatory
comments.
Fixes: ca3f969dcb ("powerpc/paravirt: Use is_kvm_guest() in vcpu_is_preempted()")
Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Reviewed-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210928214147.312412-3-nathanl@linux.ibm.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f2719b26ae ]
While investigating a lockup at startup on Powerbook 3400C, it was
identified that the fbdev driver generates alignment exception at
startup:
--- interrupt: 600 at memset+0x60/0xc0
NIP: c0021414 LR: c03fc49c CTR: 00007fff
REGS: ca021c10 TRAP: 0600 Tainted: G W (5.14.2-pmac-00727-g12a41fa69492)
MSR: 00009032 <EE,ME,IR,DR,RI> CR: 44008442 XER: 20000100
DAR: cab80020 DSISR: 00017c07
GPR00: 00000007 ca021cd0 c14412e0 cab80000 00000000 00100000 cab8001c 00000004
GPR08: 00100000 00007fff 00000000 00000000 84008442 00000000 c0006fb4 00000000
GPR16: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00100000
GPR24: 00000000 81800000 00000320 c15fa400 c14d1878 00000000 c14d1800 c094e19c
NIP [c0021414] memset+0x60/0xc0
LR [c03fc49c] chipsfb_pci_init+0x160/0x580
--- interrupt: 600
[ca021cd0] [c03fc46c] chipsfb_pci_init+0x130/0x580 (unreliable)
[ca021d20] [c03a3a70] pci_device_probe+0xf8/0x1b8
[ca021d50] [c043d584] really_probe.part.0+0xac/0x388
[ca021d70] [c043d914] __driver_probe_device+0xb4/0x170
[ca021d90] [c043da18] driver_probe_device+0x48/0x144
[ca021dc0] [c043e318] __driver_attach+0x11c/0x1c4
[ca021de0] [c043ad30] bus_for_each_dev+0x88/0xf0
[ca021e10] [c043c724] bus_add_driver+0x190/0x22c
[ca021e40] [c043ee94] driver_register+0x9c/0x170
[ca021e60] [c0006c28] do_one_initcall+0x54/0x1ec
[ca021ed0] [c08246e4] kernel_init_freeable+0x1c0/0x270
[ca021f10] [c0006fdc] kernel_init+0x28/0x11c
[ca021f30] [c0017148] ret_from_kernel_thread+0x14/0x1c
Instruction dump:
7d4601a4 39490777 7d4701a4 39490888 7d4801a4 39490999 7d4901a4 39290aaa
7d2a01a4 4c00012c 4bfffe88 0fe00000 <4bfffe80> 9421fff0 38210010 48001970
This is due to 'dcbz' instruction being used on non-cached memory.
'dcbz' instruction is used by memset() to zeroize a complete
cacheline at once, and memset() is not expected to be used on non
cached memory.
When performing a 'sparse' check on fbdev driver, it also appears
that the use of memset() is unexpected:
drivers/video/fbdev/chipsfb.c:334:17: warning: incorrect type in argument 1 (different address spaces)
drivers/video/fbdev/chipsfb.c:334:17: expected void *
drivers/video/fbdev/chipsfb.c:334:17: got char [noderef] __iomem *screen_base
drivers/video/fbdev/chipsfb.c:334:15: warning: memset with byte count of 1048576
Use fb_memset() instead of memset(). fb_memset() is defined as
memset_io() for powerpc.
Fixes: 8c8709334c ("[PATCH] ppc32: Remove CONFIG_PMAC_PBOOK")
Reported-by: Stan Johnson <userm57@yahoo.com>
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/884a54f1e5cb774c1d9b4db780209bee5d4f6718.1631712563.git.christophe.leroy@csgroup.eu
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 4ed2f3545c ]
The error handling code of fsl_ifc_ctrl_probe is problematic. When
fsl_ifc_ctrl_init fails or request_irq of fsl_ifc_ctrl_dev->irq fails,
it forgets to free the irq and nand_irq. Meanwhile, if request_irq of
fsl_ifc_ctrl_dev->nand_irq fails, it will still free nand_irq even if
the request_irq is not successful.
Fix this by refactoring the error handling code.
Fixes: d2ae2e20fb ("driver/memory:Move Freescale IFC driver to a common driver")
Signed-off-by: Dongliang Mu <mudongliangabcd@gmail.com>
Link: https://lore.kernel.org/r/20210925151434.8170-1-mudongliangabcd@gmail.com
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 986b509470 ]
If an error occurs after a successful tegra_powergate_enable_clocks()
call, it must be undone by a tegra_powergate_disable_clocks() call, as
already done in the below and above error handling paths of this function.
Update the 'goto' to branch at the correct place of the error handling
path.
Fixes: a38045121b ("soc/tegra: pmc: Add generic PM domain support")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Reviewed-by: Jon Hunter <jonathanh@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 03748d4e00 ]
Currently autoloading for SPI devices does not use the DT ID table, it uses
SPI modalises. Supporting OF modalises is going to be difficult if not
impractical, an attempt was made but has been reverted, so ensure that
module autoloading works for this driver by adding SPI IDs for parts that
only have a compatible listed.
Fixes: 96c8395e21 ("spi: Revert modalias changes")
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20210927134153.12739-1-broonie@kernel.org
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 884ea75d79 ]
Fix typo in pinctrl. It did only work because the bootloader
seems to have initialized it.
Fixes: ee32711195 ("ARM: dts: omap3-gta04: Define and use bma180 irq pin")
Signed-off-by: Andreas Kemnade <andreas@kemnade.info>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit df0a181494 ]
I got memory leak as follows:
unreferenced object 0xffff88801f0b2200 (size 64):
comm "i2c-lis2hh12-21", pid 5455, jiffies 4294944606 (age 15.224s)
hex dump (first 32 bytes):
72 65 67 75 6c 61 74 6f 72 3a 72 65 67 75 6c 61 regulator:regula
74 6f 72 2e 30 2d 2d 69 32 63 3a 31 2d 30 30 31 tor.0--i2c:1-001
backtrace:
[<00000000bf5b0c3b>] __kmalloc_track_caller+0x19f/0x3a0
[<0000000050da42d9>] kvasprintf+0xb5/0x150
[<000000004bbbed13>] kvasprintf_const+0x60/0x190
[<00000000cdac7480>] kobject_set_name_vargs+0x56/0x150
[<00000000bf83f8e8>] dev_set_name+0xc0/0x100
[<00000000cc1cf7e3>] device_link_add+0x6b4/0x17c0
[<000000009db9faed>] _regulator_get+0x297/0x680
[<00000000845e7f2b>] _devm_regulator_get+0x5b/0xe0
[<000000003958ee25>] st_sensors_power_enable+0x71/0x1b0 [st_sensors]
[<000000005f450f52>] st_accel_i2c_probe+0xd9/0x150 [st_accel_i2c]
[<00000000b5f2ab33>] i2c_device_probe+0x4d8/0xbe0
[<0000000070fb977b>] really_probe+0x299/0xc30
[<0000000088e226ce>] __driver_probe_device+0x357/0x500
[<00000000c21dda32>] driver_probe_device+0x4e/0x140
[<000000004e650441>] __device_attach_driver+0x257/0x340
[<00000000cf1891b8>] bus_for_each_drv+0x166/0x1e0
When device_register() returns an error, the name allocated in dev_set_name()
will be leaked, the put_device() should be used instead of kfree() to give up
the device reference, then the name will be freed in kobject_cleanup() and the
references of consumer and supplier will be decreased in device_link_release_fn().
Fixes: 287905e68d ("driver core: Expose device link details in sysfs")
Reported-by: Hulk Robot <hulkci@huawei.com>
Reviewed-by: Saravana Kannan <saravanak@google.com>
Reviewed-by: Rafael J. Wysocki <rafael@kernel.org>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Link: https://lore.kernel.org/r/20210930085714.2057460-1-yangyingliang@huawei.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit e7dcc514a4 ]
IRQ polling thread calls ISR after enable_irq() to handle any missed I/O
completion. The atomic flag "in_used" was added to have the synchronization
between the IRQ polling thread and the interrupt context. There is a bug
around it leading to a race condition.
Below is the sequence:
- IRQ polling thread accesses ISR, fetches the reply descriptor.
- Real interrupt arrives and pre-empts polling thread (enable_irq() is
already called).
- Interrupt context picks the same reply descriptor as fetched by polling
thread, processes it, and exits.
- Polling thread resumes and processes the descriptor which is already
processed by interrupt thread leads to kernel crash.
Setting the "in_used" flag before fetching the reply descriptor ensures
synchronized access to ISR.
Link: https://www.spinics.net/lists/linux-scsi/msg159440.html
Link: https://lore.kernel.org/r/20210929124022.24605-2-sumit.saxena@broadcom.com
Fixes: 9bedd36e91 ("scsi: megaraid_sas: Handle missing interrupts while re-enabling IRQs")
Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit c4ca3871e2 ]
The commit f87e7f2589 ("ALSA: hda - Improved position reporting on
SKL+") changed the PCM position report for SKL+ chips to use DPIB, but
according to Pierre, DPIB is no best choice for the accurate position
reports and it often reports too early. The recommended method is
rather the classical position buffer.
This patch makes the PCM position reporting on SKL+ back to the
position buffer again.
Fixes: f87e7f2589 ("ALSA: hda - Improved position reporting on SKL+")
Suggested-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Link: https://lore.kernel.org/r/20210929072934.6809-3-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 46243b85b0 ]
The position reporting on Intel Skylake and later chips via
azx_get_pos_skl() contains a udelay(20) call for the capture streams.
A call for this alone doesn't sound too harmful. However, as the
pointer PCM ops is one of the hottest path in the PCM operations --
especially for the timer-scheduled operations like PulseAudio -- such
a delay hogs CPU usage significantly in the total performance.
The code there was taken from the original code in ASoC SST Skylake
driver blindly. The udelay() is a workaround for the case where the
reported position is behind the period boundary at the timing
triggered from interrupts; applications often expect that the full
data is available for the whole period when returned (and also that's
the definition of the ALSA PCM period).
OTOH, HD-audio (legacy) driver has already some workarounds for the
delayed position reporting due to its relatively large FIFO, such as
the BDL position adjustment and the delayed period-elapsed call in the
work. That said, the udelay() is almost superfluous for HD-audio
driver unlike SST, and we can drop the udelay().
Though, the current code doesn't guarantee the full period readiness
as mentioned in the above, but rather it checks the wallclock and
detects the unexpected jump. That's one missing piece, and the drop
of udelay() needs a bit more sanity checks for the delayed handling.
This patch implements those: the drop of udelay() call in
azx_get_pos_skl() and the more proper check of hwptr in
azx_position_ok(). The latter change is applied only for the case
where the stream is running in the normal mode without
no_period_wakeup flag. When no_period_wakeup is set, it essentially
ignores the period handling and rather concentrates only on the
current position; which implies that we don't need to care about the
period boundary at all.
Fixes: f87e7f2589 ("ALSA: hda - Improved position reporting on SKL+")
Reported-by: Jens Axboe <axboe@kernel.dk>
Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Link: https://lore.kernel.org/r/20210929072934.6809-2-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 06e620345d ]
When calling arch_sync_dma, we need to pass it the memory that's
actually being used for dma. When using swiotlb bounce buffers, this is
the bounce buffer. Move arch_sync_dma into the __iommu_dma_map_swiotlb
helper, so it can use the bounce buffer address if necessary.
Now that iommu_dma_map_sg delegates to a function which takes care of
architectural syncing in the untrusted device case, the call to
iommu_dma_sync_sg_for_device can be moved so it only occurs for trusted
devices. Doing the sync for untrusted devices before mapping never
really worked, since it needs to be able to target swiotlb buffers.
This also moves the architectural sync to before the call to
__iommu_dma_map, to guarantee that untrusted devices can't see stale
data they shouldn't see.
Fixes: 82612d66d5 ("iommu: Allow the iommu/dma api to use bounce buffers")
Signed-off-by: David Stevens <stevensd@chromium.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/20210929023300.335969-3-stevensd@google.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 483de2b44c ]
While removing the size from the "reg" properties in pm8916.dtsi,
commit bd6429e810 ("ARM64: dts: qcom: Remove size elements from
pmic reg properties") mistakenly also removed the second register
address for the rtc@6000 device. That one did not represent the size
of the register region but actually the address of the second "alarm"
register region of the rtc@6000 device.
Now there are "reg-names" for two "reg" elements, but there is actually
only one "reg" listed.
Since the DT schema for "qcom,pm8941-rtc" only expects one "reg"
element anyway, just drop the "reg-names" entirely to fix this.
Fixes: bd6429e810 ("ARM64: dts: qcom: Remove size elements from pmic reg properties")
Signed-off-by: Stephan Gerhold <stephan@gerhold.net>
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Link: https://lore.kernel.org/r/20210928112945.25310-1-stephan@gerhold.net
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f13efafc1a ]
clang-14 notices that a comparison is never true when
CONFIG_PHYS_ADDR_T_64BIT is disabled:
drivers/iommu/mtk_iommu.c:553:34: error: result of comparison of constant 5368709120 with expression of type 'phys_addr_t' (aka 'unsigned int') is always false [-Werror,-Wtautological-constant-out-of-range-compare]
if (dom->data->enable_4GB && pa >= MTK_IOMMU_4GB_MODE_REMAP_BASE)
~~ ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Add an explicit check for the type of the variable to skip the check
and the warning in that case.
Fixes: b4dad40e4f ("iommu/mediatek: Adjust the PA for the 4GB Mode")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Yong Wu <yong.wu@mediatek.com>
Link: https://lore.kernel.org/r/20210927121857.941160-1-arnd@kernel.org
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 8199a0b31e ]
At the moment, playing audio on Secondary MI2S will just end up getting
stuck, without actually playing any audio. This happens because the wrong
bit clock is configured when playing audio on Secondary MI2S.
The PRI_I2S_CLK (better name: SPKR_I2S_CLK) is used by the SPKR audio mux
block that provides both Primary and Secondary MI2S.
The SEC_I2S_CLK (better name: MIC_I2S_CLK) is used by the MIC audio mux
block that provides Tertiary MI2S. Quaternary MI2S is also part of the
MIC audio mux but has its own clock (AUX_I2S_CLK).
This means that (quite confusingly) the SEC_I2S_CLK is not actually
used for Secondary MI2S as the name would suggest. Secondary MI2S
needs to have the same clock as Primary MI2S configured.
Fix the clock list for the lpass node in the device tree and add
a comment to clarify this confusing naming. With these changes,
audio can be played correctly on Secondary MI2S.
Cc: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Fixes: 3761a3618f ("arm64: dts: qcom: add lpass node")
Tested-by: Vincent Knecht <vincent.knecht@mailoo.org>
Signed-off-by: Stephan Gerhold <stephan@gerhold.net>
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Link: https://lore.kernel.org/r/20210816181810.2242-1-stephan@gerhold.net
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit c48a14dca2 ]
In jfs_mount, when diMount(ipaimap2) fails, it goes to errout35. However,
the following code does not free ipaimap2 allocated by diReadSpecial.
Fix this by refactoring the error handling code of jfs_mount. To be
specific, modify the lable name and free ipaimap2 when the above error
ocurrs.
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Dongliang Mu <mudongliangabcd@gmail.com>
Signed-off-by: Dave Kleikamp <dave.kleikamp@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 7f3b3c2bfa ]
mach/loongson64 fails to build when the FPU support is disabled:
arch/mips/loongson64/cop2-ex.c:45:15: error: implicit declaration of function ‘__is_fpu_owner’; did you mean ‘is_fpu_owner’? [-Werror=implicit-function-declaration]
arch/mips/loongson64/cop2-ex.c:98:30: error: ‘struct thread_struct’ has no member named ‘fpu’
arch/mips/loongson64/cop2-ex.c:99:30: error: ‘struct thread_struct’ has no member named ‘fpu’
arch/mips/loongson64/cop2-ex.c:131:43: error: ‘struct thread_struct’ has no member named ‘fpu’
arch/mips/loongson64/cop2-ex.c:137:38: error: ‘struct thread_struct’ has no member named ‘fpu’
arch/mips/loongson64/cop2-ex.c:203:30: error: ‘struct thread_struct’ has no member named ‘fpu’
arch/mips/loongson64/cop2-ex.c:219:30: error: ‘struct thread_struct’ has no member named ‘fpu’
arch/mips/loongson64/cop2-ex.c:283:38: error: ‘struct thread_struct’ has no member named ‘fpu’
arch/mips/loongson64/cop2-ex.c:301:38: error: ‘struct thread_struct’ has no member named ‘fpu’
Fixes: ef2f826c8f ("MIPS: Loongson-3: Enable the COP2 usage")
Suggested-by: Huacai Chen <chenhuacai@kernel.org>
Reviewed-by: Huacai Chen <chenhuacai@kernel.org>
Reported-by: k2ci robot <kernel-bot@kylinos.cn>
Signed-off-by: Jackie Liu <liuyun01@kylinos.cn>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 82ea7d411d ]
The sc7180's dynamic-power-coefficient violates the device tree bindings.
The bindings (arm/cpus.yaml) say that the units for the
dynamic-power-coefficient are supposed to be "uW/MHz/V^2". The ones for
sc7180 aren't this. Qualcomm arbitrarily picked 100 for the "little" CPUs
and then picked a number for the big CPU based on this.
At the time, there was a giant dicussion about this. Apparently Qualcomm
Engineers were instructed not to share the actual numbers here. As part
of the discussion, I pointed out [1] that these numbers shouldn't really
be secret since once a device is shipping anyone can just run a script
and produce them. This patch is the result of running the script I posted
in that discussion on sc7180-trogdor-coachz, which is currently available
for purchase by consumers.
[1] https://lore.kernel.org/r/CAD=FV=U1FP0e3_AVHpauUUZtD-5X3XCwh5aT9fH_8S_FFML2Uw@mail.gmail.com/
I ran the script four times, measuring little, big, little, big. I used
the 64-bit version of dhrystone 2.2 in my test. I got these results:
576 kHz, 596 mV, 20 mW, 88 Cx
768 kHz, 596 mV, 32 mW, 122 Cx
1017 kHz, 660 mV, 45 mW, 97 Cx
1248 kHz, 720 mV, 87 mW, 139 Cx
1324 kHz, 756 mV, 109 mW, 148 Cx
1516 kHz, 828 mV, 150 mW, 148 Cx
1612 kHz, 884 mV, 182 mW, 147 Cx
1708 kHz, 884 mV, 192 mW, 146 Cx
1804 kHz, 884 mV, 207 mW, 149 Cx
Your dynamic-power-coefficient for cpu 0: 132
825 kHz, 596 mV, 142 mW, 401 Cx
979 kHz, 628 mV, 183 mW, 427 Cx
1113 kHz, 656 mV, 224 mW, 433 Cx
1267 kHz, 688 mV, 282 mW, 449 Cx
1555 kHz, 812 mV, 475 mW, 450 Cx
1708 kHz, 828 mV, 566 mW, 478 Cx
1843 kHz, 884 mV, 692 mW, 476 Cx
1900 kHz, 884 mV, 722 mW, 482 Cx
1996 kHz, 916 mV, 814 mW, 482 Cx
2112 kHz, 916 mV, 862 mW, 483 Cx
2208 kHz, 916 mV, 962 mW, 521 Cx
2323 kHz, 940 mV, 1060 mW, 517 Cx
2400 kHz, 956 mV, 1133 mW, 518 Cx
Your dynamic-power-coefficient for cpu 6: 471
576 kHz, 596 mV, 26 mW, 103 Cx
768 kHz, 596 mV, 40 mW, 147 Cx
1017 kHz, 660 mV, 54 mW, 114 Cx
1248 kHz, 720 mV, 97 mW, 151 Cx
1324 kHz, 756 mV, 113 mW, 150 Cx
1516 kHz, 828 mV, 154 mW, 148 Cx
1612 kHz, 884 mV, 194 mW, 155 Cx
1708 kHz, 884 mV, 203 mW, 152 Cx
1804 kHz, 884 mV, 219 mW, 155 Cx
Your dynamic-power-coefficient for cpu 0: 142
825 kHz, 596 mV, 148 mW, 530 Cx
979 kHz, 628 mV, 189 mW, 475 Cx
1113 kHz, 656 mV, 230 mW, 461 Cx
1267 kHz, 688 mV, 287 mW, 466 Cx
1555 kHz, 812 mV, 469 mW, 445 Cx
1708 kHz, 828 mV, 567 mW, 480 Cx
1843 kHz, 884 mV, 699 mW, 482 Cx
1900 kHz, 884 mV, 719 mW, 480 Cx
1996 kHz, 916 mV, 814 mW, 484 Cx
2112 kHz, 916 mV, 861 mW, 483 Cx
2208 kHz, 916 mV, 963 mW, 522 Cx
2323 kHz, 940 mV, 1063 mW, 520 Cx
2400 kHz, 956 mV, 1135 mW, 519 Cx
Your dynamic-power-coefficient for cpu 6: 489
As you can see, the calculations aren't perfectly consistent but
roughly you could say about 480 for big and 137 for little.
The ratio between these numbers isn't quite the same as the ratio
between the two numbers that Qualcomm used. Perhaps this is because
Qualcomm measured something slightly different than the 64-bit version
of dhrystone 2.2 or perhaps it's because they fudged these numbers a
bit (and fudged the capacity-dmips-mhz). As per discussion [2], let's
use the numbers I came up with and also un-fudge
capacity-dmips-mhz. While unfudging capacity-dmips-mhz, let's scale it
so that bigs are 1024 which seems to be the common practice.
In general these numbers don't need to be perfectly exact. In fact,
they can't be since the CPU power depends a lot on what's being run on
the CPU and the big/little CPUs are each more or less efficient in
different operations. Historically running the 32-bit vs. 64-bit
versions of dhrystone produced notably different numbers, though I
didn't test this time.
We also need to scale all of the sustainable-power numbers by the same
amount. I scale ones related to the big CPUs by the adjustment I made
to the big dynamic-power-coefficient and the ones related to the
little CPUs by the adjustment I made to the little
dynamic-power-coefficient.
[2] https://lore.kernel.org/r/0a865b6e-be34-6371-f9f2-9913ee1c5608@codeaurora.org/
Fixes: 71f873169a ("arm64: dts: qcom: sc7180: Add dynamic CPU power coefficients")
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Link: https://lore.kernel.org/r/20210902145127.v2.1.I049b30065f3c715234b6303f55d72c059c8625eb@changeid
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b3e9431854 ]
On resume we can get a warning at kernel/time/timekeeping.c:824 for
timekeeping_suspended.
Let's fix this by adding separate functions for sysc_poll_reset_sysstatus()
and sysc_poll_reset_sysconfig() and have the new functions handle also
timekeeping_suspended.
If iopoll at some point supports timekeeping_suspended, we can just drop
the custom handling from these functions.
Fixes: d46f9fbec7 ("bus: ti-sysc: Use optional clocks on for enable and wait for softreset bit")
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 8bb8429290 ]
commit 3276d9f53c ("arm64: dts: ti: k3-j7200-main: Add PCIe device
tree node") incorrectly added PCIe bus numbers from 0 to 15 (copy-paste
from J721E node). Enable all the supported bus numbers from 0 to 255
defined in PCIe spec here.
Fixes: 3276d9f53c ("arm64: dts: ti: k3-j7200-main: Add PCIe device tree node")
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
Reviewed-by: Aswath Govindraju <a-govindraju@ti.com>
Signed-off-by: Nishanth Menon <nm@ti.com>
Link: https://lore.kernel.org/r/20210915055358.19997-5-kishon@ti.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 5f46633565 ]
commit 4e5833884f ("arm64: dts: ti: k3-j721e-main: Add PCIe device
tree nodes") restricted PCIe bus numbers from 0 to 15 (due to SMMU
restriction in J721E). However since SMMU is not enabled, allow the full
supported bus numbers from 0 to 255.
Fixes: 4e5833884f ("arm64: dts: ti: k3-j721e-main: Add PCIe device tree nodes")
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
Reviewed-by: Aswath Govindraju <a-govindraju@ti.com>
Signed-off-by: Nishanth Menon <nm@ti.com>
Link: https://lore.kernel.org/r/20210915055358.19997-3-kishon@ti.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 932b4610f5 ]
As can be seen in RK3328's TRM the register range for the GPU is
0xff300000 to 0xff330000.
It would (and does in vendor kernel) overlap with the registers of
the HEVC encoder (node/driver do not exist yet in upstream kernel).
See already existing h265e_mmu node.
Fixes: 752fbc0c8d ("arm64: dts: rockchip: add rk3328 mali gpu node")
Signed-off-by: Alex Bee <knaerzche@gmail.com>
Link: https://lore.kernel.org/r/20210623115926.164861-1-knaerzche@gmail.com
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b27a40534e ]
Commit 1f02beff22 ("scsi: pm80xx: Remove global lock from outbound queue
processing") introduced a lock per outbound queue. Prior to that change the
driver was using a global lock for all outbound queues.
While processing the I/O responses and events the driver takes the outbound
queue spinlock and is supposed to release it in pm8001_ccb_task_free_done()
before calling command done(). Since the older code was using a global
lock, pm8001_ccb_task_free_done() was releasing the global spin lock. The
change that split the lock per outbound queue did not consider this and
pm8001_ccb_task_free_done() was still releasing the global lock.
Link: https://lore.kernel.org/r/20210906170404.5682-3-Ajish.Koshy@microchip.com
Fixes: 1f02beff22 ("scsi: pm80xx: Remove global lock from outbound queue processing")
Acked-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Ajish Koshy <Ajish.Koshy@microchip.com>
Signed-off-by: Viswas G <Viswas.G@microchip.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 6c38c39ab2 ]
According to the binding the correct clock name is "refclk".
Fixes: 2961f69f15 ("arm64: dts: broadcom: add BCM4908 and Asus GT-AC5300 early DTS files")
Signed-off-by: Rafał Miłecki <rafal@milecki.pl>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 9f0b3e0cc0 ]
Up until commit ea7e586bdd ("iio: st_sensors: move regulator retrieveal
to core") only the ST pressure driver seems to have had any regulator
disable. After that commit, the regulator handling was moved into the
common st_sensors logic.
In all instances of this regulator handling, the regulators were disabled
before unregistering the IIO device.
This can cause issues where the device would be powered down and still be
available to userspace, allowing it to send invalid/garbage data.
This change moves the st_sensors_power_disable() after the common probe
functions. These common probe functions also handle unregistering the IIO
device.
Fixes: 774487611c ("iio: pressure-core: st: Provide support for the Vdd power supply")
Fixes: ea7e586bdd ("iio: st_sensors: move regulator retrieveal to core")
Cc: Lee Jones <lee.jones@linaro.org>
Cc: Denis CIOCCA <denis.ciocca@st.com>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: Alexandru Ardelean <aardelean@deviqon.com>
Link: https://lore.kernel.org/r/20210823112204.243255-2-aardelean@deviqon.com
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 9d02214f83 ]
This patch is to fix an issue that the ethernet link doesn't come up
when using ip link set down/up:
[ 11.428114] meson8b-dwmac ff3f0000.ethernet eth0: Link is Down
[ 14.428595] meson8b-dwmac ff3f0000.ethernet eth0: PHY [0.0:00] driver [RTL8211F Gigabit Ethernet] (irq=31)
[ 14.428610] meson8b-dwmac ff3f0000.ethernet: Failed to reset the dma
[ 14.428974] meson8b-dwmac ff3f0000.ethernet eth0: stmmac_hw_setup: DMA engine initialization failed
[ 14.711185] meson8b-dwmac ff3f0000.ethernet eth0: stmmac_open: Hw setup failed
This fix refers to two commits applied for ODROID-N2 (G12B).
commit 658e4129bb ("arm64: dts: meson: g12b: odroid-n2: add the Ethernet PHY reset line")
commit 1c7412530d ("arm64: dts: meson: g12b: odroid-n2: fix PHY deassert timing requirements")
Fixes: 88d537bc92 ("arm64: dts: meson: convert meson-sm1-odroid-c4 to dtsi")
Signed-off-by: Dongjin Kim <tobetter@gmail.com>
Reviewed-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
[narmstrong: added fixes tag and typo in commit log]
Signed-off-by: Neil Armstrong <narmstrong@baylibre.com>
Link: https://lore.kernel.org/r/YScKYFWlYymgGw3l@anyang-linuxfactory-or-kr
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b9979db834 ]
Before this fix:
166: (b5) if r2 <= 0x1 goto pc+22
from 166 to 189: R2=invP(id=1,umax_value=1,var_off=(0x0; 0xffffffff))
After this fix:
166: (b5) if r2 <= 0x1 goto pc+22
from 166 to 189: R2=invP(id=1,umax_value=1,var_off=(0x0; 0x1))
While processing BPF_JLE the reg_set_min_max() would set true_reg->umax_value = 1
and call __reg_combine_64_into_32(true_reg).
Without the fix it would not pass the condition:
if (__reg64_bound_u32(reg->umin_value) && __reg64_bound_u32(reg->umax_value))
since umin_value == 0 at this point.
Before commit 10bf4e8316 the umin was incorrectly ingored.
The commit 10bf4e8316 fixed the correctness issue, but pessimized
propagation of 64-bit min max into 32-bit min max and corresponding var_off.
Fixes: 10bf4e8316 ("bpf: Fix propagation of 32 bit unsigned bounds from 64 bit bounds")
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20211101222153.78759-1-alexei.starovoitov@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 81c49d39ae ]
In account_guest_time in kernel/sched/cputime.c guest time is
attributed to both CPUTIME_NICE and CPUTIME_USER in addition to
CPUTIME_GUEST_NICE and CPUTIME_GUEST respectively. Therefore, adding
both to calculate usage results in double counting any guest time at
the rootcg.
Fixes: 936f2a70f2 ("cgroup: add cpu.stat file to root cgroup")
Signed-off-by: Dan Schatzberg <schatzberg.dan@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 7303524e04 ]
If sockmap enable strparser, there are lose offset info in
sk_psock_skb_ingress(). If the length determined by parse_msg function is not
skb->len, the skb will be converted to sk_msg multiple times, and userspace
app will get the data multiple times.
Fix this by get the offset and length from strp_msg. And as Cong suggested,
add one bit in skb->_sk_redir to distinguish enable or disable strparser.
Fixes: 604326b41a ("bpf, sockmap: convert to generic sk_msg interface")
Signed-off-by: Liu Jian <liujian56@huawei.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Cong Wang <cong.wang@bytedance.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20211029141216.211899-1-liujian56@huawei.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 6b278c0cb3 ]
If we get CRQ_INIT, we set errno to -EIO and first call complete() to
notify the waiter. Then we try to schedule a FAILOVER reset. If this
occurs while adapter is in PROBING state, ibmvnic_reset() changes the
error code to EAGAIN and returns without scheduling the FAILOVER. The
purpose of setting error code to EAGAIN is to ask the waiter to retry.
But due to the earlier complete() call, the waiter may already have seen
the -EIO response and decided not to retry. This can cause intermittent
failures when bringing up ibmvnic adapters during boot, specially in
in kexec/kdump kernels.
Defer the complete() call until after scheduling the reset.
Also streamline the error code to EAGAIN. Don't see why we need EIO
sometimes. All 3 callers of ibmvnic_reset_init() can handle EAGAIN.
Fixes: 17c8705838 ("ibmvnic: Return error code if init interrupted by transport event")
Reported-by: Vaishnavi Bhat <vaish123@in.ibm.com>
Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com>
Reviewed-by: Dany Madden <drt@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 6e20d00158 ]
Soon after registering a CRQ it is possible that we get a fail over or
maybe a CRQ_INIT from the VIOS while interrupts were disabled.
Look for any such CRQs after enabling interrupts.
Otherwise we can intermittently fail to bring up ibmvnic adapters during
boot, specially in kexec/kdump kernels.
Fixes: 032c5e8284 ("Driver for IBM System i/p VNIC protocol")
Reported-by: Vaishnavi Bhat <vaish123@in.ibm.com>
Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com>
Reviewed-by: Dany Madden <drt@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 8878e46fcf ]
If adapter's resetting bit is on, discard the packet but don't stop the
transmit queue - instead leave that to the reset code. With this change,
it is possible that we may get several calls to ibmvnic_xmit() that simply
discard packets and return.
But if we stop the queue here, we might end up doing so just after
__ibmvnic_open() started the queues (during a hard/soft reset) and before
the ->resetting bit was cleared. If that happens, there will be no one to
restart queue and transmissions will be blocked indefinitely.
This can cause a TIMEOUT reset and with auto priority failover enabled,
an unnecessary FAILOVER reset to less favored backing device and then a
FAILOVER back to the most favored backing device. If we hit the window
repeatedly, we can get stuck in a loop of TIMEOUT, FAILOVER, FAILOVER
resets leaving the adapter unusable for extended periods of time.
Fixes: 7f5b030830 ("ibmvnic: Free skb's in cases of failure in transmit")
Reported-by: Abdul Haleem <abdhalee@in.ibm.com>
Reported-by: Vaishnavi Bhat <vaish123@in.ibm.com>
Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com>
Reviewed-by: Dany Madden <drt@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 42dcfd850e ]
Commit c6af0c227a ("ip: support SO_MARK cmsg")
added propagation of SO_MARK from cmsg to skb->mark.
For IPv4 and raw sockets the mark also affects route
lookup, but in case of IPv6 the flow info is
initialized before cmsg is parsed.
Fixes: c6af0c227a ("ip: support SO_MARK cmsg")
Reported-and-tested-by: Xintong Hu <huxintong@fb.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: David Ahern <dsahern@kernel.org>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 68b6dea802 ]
These three events can race when pcrypt is used multiple times in a
template ("pcrypt(pcrypt(...))"):
1. [taskA] The caller makes the crypto request via crypto_aead_encrypt()
2. [kworkerB] padata serializes the inner pcrypt request
3. [kworkerC] padata serializes the outer pcrypt request
3 might finish before the call to crypto_aead_encrypt() returns in 1,
resulting in two possible issues.
First, a use-after-free of the crypto request's memory when, for
example, taskA writes to the outer pcrypt request's padata->info in
pcrypt_aead_enc() after kworkerC completes the request.
Second, the outer pcrypt request overwrites the inner pcrypt request's
return code with -EINPROGRESS, making a successful request appear to
fail. For instance, kworkerB writes the outer pcrypt request's
padata->info in pcrypt_aead_done() and then taskA overwrites it
in pcrypt_aead_enc().
Avoid both situations by delaying the write of padata->info until after
the inner crypto request's return code is checked. This prevents the
use-after-free by not touching the crypto request's memory after the
next-inner crypto request is made, and stops padata->info from being
overwritten.
Fixes: 5068c7a883 ("crypto: pcrypt - Add pcrypt crypto parallelization wrapper")
Reported-by: syzbot+b187b77c8474f9648fae@syzkaller.appspotmail.com
Signed-off-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 34d7ecb3d4 ]
When I fixed IGMPv3/MLDv2 to use the bridge's multicast_membership_interval
value which is chosen by user-space instead of calculating it based on
multicast_query_interval and multicast_query_response_interval I forgot
to update the selftests relying on that behaviour. Now we have to
manually set the expected GMI value to perform the tests correctly and get
proper results (similar to IGMPv2 behaviour).
Fixes: fac3cb82a5 ("net: bridge: mcast: use multicast_membership_interval for IGMPv3")
Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 829e050eea ]
Function br_get_link_af_size_filtered() calls br_cfm_{,peer}_mep_count()
that return a count. When BRIDGE_CFM is not enabled these functions
simply return -EOPNOTSUPP but do not modify count parameter and
calling function then works with uninitialized variables.
Modify these inline functions to return zero in count parameter.
Fixes: b6d0425b81 ("bridge: cfm: Netlink Notifications.")
Cc: Henrik Bjoernlund <henrik.bjoernlund@microchip.com>
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Acked-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit fd8d9731bc ]
mvneta does not support asymetric pause modes, and it flags this by the
lack of AsymPause in the supported field. When setting pause modes, we
check that pause->rx_pause == pause->tx_pause, but only when pause
autoneg is enabled. When pause autoneg is disabled, we still allow
pause->rx_pause != pause->tx_pause, which is incorrect when the MAC
does not support asymetric pause, and causes mvneta to issue a warning.
Fix this by removing the test for pause->autoneg, so we always check
that pause->rx_pause == pause->tx_pause for network devices that do not
support AsymPause.
Fixes: 9525ae8395 ("phylink: add phylink infrastructure")
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit daf182d360 ]
For each rate change command submission, the FW has to do a phy
power off sequence internally. For this to happen correctly, the
PLL re-initialization control setting has to be turned off before
sending mailbox commands and re-enabled once the command submission
is complete.
Without the PLL control setting, the link up takes longer time in a
fixed phy configuration.
Fixes: 47f164deab ("amd-xgbe: Add PCI device support")
Co-developed-by: Sudheesh Mavila <sudheesh.mavila@amd.com>
Signed-off-by: Sudheesh Mavila <sudheesh.mavila@amd.com>
Signed-off-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com>
Acked-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 75cf662c64 ]
sctp_transport_pl_toobig() supposes to return true only if there's
pathmtu update, so that in sctp_icmp_frag_needed() it would call
sctp_assoc_sync_pmtu() and sctp_retransmit(). This patch is to fix
these return places in sctp_transport_pl_toobig().
Fixes: 8369640831 ("sctp: do state transition when receiving an icmp TOOBIG packet")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit cc4665ca64 ]
sctp_transport_pl_hlen() is called to calculate the outer header length
for PL. However, as the Figure in rfc8899#section-4.4:
Any additional
headers .--- MPS -----.
| | |
v v v
+------------------------------+
| IP | ** | PL | protocol data |
+------------------------------+
<----- PLPMTU ----->
<---------- PMTU -------------->
Outer header are IP + Any additional headers, which doesn't include
Packetization Layer itself header, namely sctphdr, whereas sctphdr
is counted by __sctp_mtu_payload().
The incorrect calculation caused the link pathmtu to be set larger
than expected by t->pl.pmtu + sctp_transport_pl_hlen(). This patch
is to fix it by subtracting sctphdr len in sctp_transport_pl_hlen().
Fixes: d9e2e410ae ("sctp: add the constants/variables and states and some APIs for transport")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit c6ea04ea69 ]
sctp_transport_pl_update() is called when transport update its dst and
pathmtu, instead of stopping the PLPMTUD probe timer, PLPMTUD should
start over and reset the probe timer. Otherwise, the PLPMTUD service
would stop.
Fixes: 92548ec2f1 ("sctp: add the probe timer in transport for PLPMTUD")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 40171248bb ]
Currently when PLPMTUD enters Error state, transport pathmtu will be set
to MIN_PLPMTU(512) while probe is continuing with BASE_PLPMTU(1200). It
will cause pathmtu to stay in a very small value, even if the real pmtu
is some value like 1000.
RFC8899 doesn't clearly say how to set the value in Error state. But one
possibility could be keep using BASE_PLPMTU for the real pmtu, but allow
to do IP fragmentation when it's in Error state.
As it says in rfc8899#section-5.4:
Some paths could be unable to sustain packets of the BASE_PLPMTU
size. The Error State could be implemented to provide robustness to
such paths. This allows fallback to a smaller than desired PLPMTU
rather than suffer connectivity failure. This could utilize methods
such as endpoint IP fragmentation to enable the PL sender to
communicate using packets smaller than the BASE_PLPMTU.
This patch is to set pmtu to BASE_PLPMTU instead of MIN_PLPMTU for Error
state in sctp_transport_pl_send/toobig(), and set packet ipfragok for
non-probe packets when it's in Error state.
Fixes: 1dc68c1945 ("sctp: do state transition when PROBE_COUNT == MAX_PROBES on HB send path")
Reported-by: Ying Xu <yinxu@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit c3fc706e94 ]
Similar to the fix in commit:
e31eec77e4 ("bpf: selftests: Fix fd cleanup in get_branch_snapshot")
We use designated initializer to set fds to -1 without breaking on
future changes to MAX_SERVER constant denoting the array size.
The particular close(0) occurs on non-reuseport tests, so it can be seen
with -n 115/{2,3} but not 115/4. This can cause problems with future
tests if they depend on BTF fd never being acquired as fd 0, breaking
internal libbpf assumptions.
Fixes: 0ab5539f85 ("selftests/bpf: Tests for BPF_SK_LOOKUP attach point")
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20211028063501.2239335-8-memxor@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit a5c5d8d50e ]
amdgpu_fence_driver_sw_fini() should be executed before
amdgpu_device_ip_fini(), otherwise fence driver resource
won't be properly freed as adev->rings have been tore down.
Fixes: 72c8c97b15 ("drm/amdgpu: Split amdgpu_device_fini into early and late")
Signed-off-by: Lang Yu <lang.yu@amd.com>
Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit d707f812bb ]
The channel scan list must be updated before triggering a hardware scan
so that firmware takes into account the regulatory info for each single
channel such as active/passive config, power, DFS, etc... Without this
the firmware uses its own internal default channel configuration, which
is not aligned with mac80211 regulatory rules, and misses several
channels (e.g. 144).
Fixes: 2f3bef4b24 ("wcn36xx: Add hardware scan offload support")
Signed-off-by: Loic Poulain <loic.poulain@linaro.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/1635175328-25642-1-git-send-email-loic.poulain@linaro.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 113f304dbc ]
The firmware is offering features such as ARP offload, for which
firmware crafts its own (QoS)packets without waking up the host.
Point is that the sequence numbers generated by the firmware are
not in sync with the host mac80211 layer and can cause packets
such as firmware ARP reponses to be dropped by the AP (too old SN).
To fix this we need to let the firmware manages the sequence
numbers by its own (except for QoS null frames). There is a SN
counter for each QoS queue and one global/baseline counter for
Non-QoS.
Fixes: 84aff52e4f ("wcn36xx: Use sequence number allocated by mac80211")
Signed-off-by: Loic Poulain <loic.poulain@linaro.org>
Tested-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/1635150336-18736-1-git-send-email-loic.poulain@linaro.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 9bfe38e064 ]
This is essentially exactly following the dma_wmb()/dma_rmb() usage
instructions in Documentation/memory-barriers.txt.
The theoretical races here are:
1. DXE (the DMA Transfer Engine in the Wi-Fi subsystem) seeing the
dxe->ctrl & WCN36xx_DXE_CTRL_VLD write before the dxe->dst_addr_l
write, thus performing DMA into the wrong address.
2. CPU reading dxe->dst_addr_l before DXE unsets dxe->ctrl &
WCN36xx_DXE_CTRL_VLD. This should generally be harmless since DXE
doesn't write dxe->dst_addr_l (no risk of freeing the wrong skb).
Fixes: 8e84c25821 ("wcn36xx: mac80211 driver for Qualcomm WCN3660/WCN3680 hardware")
Signed-off-by: Benjamin Li <benl@squareup.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20211023001528.3077822-1-benl@squareup.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 85f517b294 ]
If handle_sske cannot set the storage key, because there is no
page table entry or no present large page entry, it calls
fixup_user_fault.
However, currently, if the call succeeds, handle_sske returns
-EAGAIN, without having set the storage key.
Instead, retry by continue'ing the loop without incrementing the
address.
The same issue in handle_pfmf was fixed by
a11bdb1a6b ("KVM: s390: Fix pfmf and conditional skey emulation").
Fixes: bd096f6443 ("KVM: s390: Add skey emulation fault handling")
Signed-off-by: Janis Schoetterl-Glausch <scgl@linux.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Link: https://lore.kernel.org/r/20211022152648.26536-1-scgl@linux.ibm.com
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f76fbbbb50 ]
Use the actual return value instead of always -1 if register_kretprobe()
failed.
E.g. without this patch:
# insmod samples/kprobes/kretprobe_example.ko func=no_such_func
insmod: ERROR: could not insert module samples/kprobes/kretprobe_example.ko: Operation not permitted
With this patch:
# insmod samples/kprobes/kretprobe_example.ko func=no_such_func
insmod: ERROR: could not insert module samples/kprobes/kretprobe_example.ko: Unknown symbol in module
Link: https://lkml.kernel.org/r/1635213091-24387-2-git-send-email-yangtiezhu@loongson.cn
Fixes: 804defea1c ("Kprobes: move kprobe examples to samples/")
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit c72bcf0ab8 ]
Fix a problem in active mode that cpu->pstate.turbo_freq is initialized
only if HWP-to-frequency scaling factor is refined.
In passive mode, this problem is not exposed, because
cpu->pstate.turbo_freq is set again, later in
intel_cpufreq_cpu_init()->intel_pstate_get_hwp_cap().
Fixes: eb3693f052 ("cpufreq: intel_pstate: hybrid: CPU-specific scaling factor")
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit cf12e6f912 ]
v1: Implement a more general statement as recommended by Eric Dumazet. The
sequence number will be advanced, so this check will fix the FIN case and
other cases.
A customer reported sockets stuck in the CLOSING state. A Vmcore revealed that
the write_queue was not empty as determined by tcp_write_queue_empty() but the
sk_buff containing the FIN flag had been freed and the socket was zombied in
that state. Corresponding pcaps show no FIN from the Linux kernel on the wire.
Some instrumentation was added to the kernel and it was found that there is a
timing window where tcp_sendmsg() can run after tcp_send_fin().
tcp_sendmsg() will hit an error, for example:
1269 ▹ if (sk->sk_err || (sk->sk_shutdown & SEND_SHUTDOWN))↩
1270 ▹ ▹ goto do_error;↩
tcp_remove_empty_skb() will then free the FIN sk_buff as "skb->len == 0". The
TCP socket is now wedged in the FIN-WAIT-1 state because the FIN is never sent.
If the other side sends a FIN packet the socket will transition to CLOSING and
remain that way until the system is rebooted.
Fix this by checking for the FIN flag in the sk_buff and don't free it if that
is the case. Testing confirmed that fixed the issue.
Fixes: fdfc5c8594 ("tcp: remove empty skb from write queue in error cases")
Signed-off-by: Jon Maxwell <jmaxwell37@gmail.com>
Reported-by: Monir Zouaoui <Monir.Zouaoui@mail.schwarz>
Reported-by: Simon Stier <simon.stier@mail.schwarz>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 45f2bebc80 ]
__BYTE_ORDER is supposed to be defined by a libc, and __BYTE_ORDER__ -
by a compiler. bpf_core_read.h checks __BYTE_ORDER == __LITTLE_ENDIAN,
which is true if neither are defined, leading to incorrect behavior on
big-endian hosts if libc headers are not included, which is often the
case.
Fixes: ee26dade0e ("libbpf: Add support for relocatable bitfields")
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211026010831.748682-2-iii@linux.ibm.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 7eba41fe8c ]
In commit c46ed2281b ("tpm_tis_spi: add missing SPI device ID entries")
we added SPI IDs for all the DT aliases to handle the fact that we always
use SPI modaliases to load modules even when probed via DT however the
mentioned commit missed that the SPI and OF device ID entries did not
match and were different and so DT nodes with compatible
"tcg,tpm_tis-spi" will not match. Add an extra ID for tpm_tis-spi
rather than just fix the existing one since what's currently there is
going to be better for anyone actually using SPI IDs to instantiate.
Fixes: c46ed2281b ("tpm_tis_spi: add missing SPI device ID entries")
Fixes: 96c8395e21 ("spi: Revert modalias changes")
Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
Reviewed-by: Javier Martinez Canillas <javierm@redhat.com>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 79ca6f74da ]
The Atmel TPM 1.2 chips crash with error
`tpm_try_transmit: send(): error -62` since kernel 4.14.
It is observed from the kernel log after running `tpm_sealdata -z`.
The error thrown from the command is as follows
```
$ tpm_sealdata -z
Tspi_Key_LoadKey failed: 0x00001087 - layer=tddl,
code=0087 (135), I/O error
```
The issue was reproduced with the following Atmel TPM chip:
```
$ tpm_version
T0 TPM 1.2 Version Info:
Chip Version: 1.2.66.1
Spec Level: 2
Errata Revision: 3
TPM Vendor ID: ATML
TPM Version: 01010000
Manufacturer Info: 41544d4c
```
The root cause of the issue is due to the TPM calls to msleep()
were replaced with usleep_range() [1], which reduces
the actual timeout. Via experiments, it is observed that
the original msleep(5) actually sleeps for 15ms.
Because of a known timeout issue in Atmel TPM 1.2 chip,
the shorter timeout than 15ms can cause the error described above.
A few further changes in kernel 4.16 [2] and 4.18 [3, 4] further
reduced the timeout to less than 1ms. With experiments,
the problematic timeout in the latest kernel is the one
for `wait_for_tpm_stat`.
To fix it, the patch reverts the timeout of `wait_for_tpm_stat`
to 15ms for all Atmel TPM 1.2 chips, but leave it untouched
for Ateml TPM 2.0 chip, and chips from other vendors.
As explained above, the chosen 15ms timeout is
the actual timeout before this issue introduced,
thus the old value is used here.
Particularly, TPM_ATML_TIMEOUT_WAIT_STAT_MIN is set to 14700us,
TPM_ATML_TIMEOUT_WAIT_STAT_MIN is set to 15000us according to
the existing TPM_TIMEOUT_RANGE_US (300us).
The fixed has been tested in the system with the affected Atmel chip
with no issues observed after boot up.
References:
[1] 9f3fc7bcdd tpm: replace msleep() with usleep_range() in TPM
1.2/2.0 generic drivers
[2] cf151a9a44 tpm: reduce tpm polling delay in tpm_tis_core
[3] 59f5a6b07f tpm: reduce poll sleep time in tpm_transmit()
[4] 424eaf910c tpm: reduce polling time to usecs for even finer
granularity
Fixes: 9f3fc7bcdd ("tpm: replace msleep() with usleep_range() in TPM 1.2/2.0 generic drivers")
Link: https://patchwork.kernel.org/project/linux-integrity/patch/20200926223150.109645-1-hao.wu@rubrik.com/
Signed-off-by: Hao Wu <hao.wu@rubrik.com>
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit d28e4dff08 ]
As it turns out, my earlier patch in commit 86d46fdaa1 (block:
ataflop: fix breakage introduced at blk-mq refactoring) was
incomplete. This patch fixes any remaining issues found during
more testing and code review.
Requests exceeding 4 k are handled in 4k segments but
__blk_mq_end_request() is never called on these (still
sectors outstanding on the request). With redo_fd_request()
removed, there is no provision to kick off processing of the
next segment, causing requests exceeding 4k to hang. (By
setting /sys/block/fd0/queue/max_sectors_k <= 4 as workaround,
this behaviour can be avoided).
Instead of reintroducing redo_fd_request(), requeue the remainder
of the request by calling blk_mq_requeue_request() on incomplete
requests (i.e. when blk_update_request() still returns true), and
rely on the block layer to queue the residual as new request.
Both error handling and formatting needs to release the
ST-DMA lock, so call finish_fdc() on these (this was previously
handled by redo_fd_request()). finish_fdc() may be called
legitimately without the ST-DMA lock held - make sure we only
release the lock if we actually held it. In a similar way,
early exit due to errors in ataflop_queue_rq() must release
the lock.
After minor errors, fd_error sets up to recalibrate the drive
but never re-runs the current operation (another task handled by
redo_fd_request() before). Call do_fd_action() to get the next
steps (seek, retry read/write) underway.
Signed-off-by: Michael Schmitz <schmitzmic@gmail.com>
Fixes: 6ec3938cff (ataflop: convert to blk-mq)
CC: linux-block@vger.kernel.org
Link: https://lore.kernel.org/r/20211024002013.9332-1-schmitzmic@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 6f8c8bf4c7 ]
Commit 9af7c32cec ("ath10k: add target IRAM recovery feature support")
introduced a new firmware feature flag ATH10K_FW_FEATURE_IRAM_RECOVERY. But
this caused ath10k_pci module load to fail if ATH10K_FW_CRASH_DUMP_RAM_DATA bit
was not enabled in the ath10k coredump_mask module parameter:
[ 2209.328190] ath10k_pci 0000:02:00.0: qca9984/qca9994 hw1.0 target 0x01000000 chip_id 0x00000000 sub 168c:cafe
[ 2209.434414] ath10k_pci 0000:02:00.0: kconfig debug 1 debugfs 1 tracing 1 dfs 1 testmode 1
[ 2209.547191] ath10k_pci 0000:02:00.0: firmware ver 10.4-3.9.0.2-00099 api 5 features no-p2p,mfp,peer-flow-ctrl,btcoex-param,allows-mesh-bcast,no-ps,peer-fixed-rate,iram-recovery crc32 cbade90a
[ 2210.896485] ath10k_pci 0000:02:00.0: board_file api 1 bmi_id 0:1 crc32 a040efc2
[ 2213.603339] ath10k_pci 0000:02:00.0: failed to copy target iram contents: -12
[ 2213.839027] ath10k_pci 0000:02:00.0: could not init core (-12)
[ 2213.933910] ath10k_pci 0000:02:00.0: could not probe fw (-12)
And by default coredump_mask does not have ATH10K_FW_CRASH_DUMP_RAM_DATA
enabled so anyone using a firmware with iram-recovery feature would fail. To my
knowledge only QCA9984 firmwares starting from release 10.4-3.9.0.2-00099
enabled the feature.
The reason for regression was that ath10k_core_copy_target_iram() used
ath10k_coredump_get_mem_layout() to get the memory layout, but when
ATH10K_FW_CRASH_DUMP_RAM_DATA was disabled it would get just NULL and bail out
with an error.
While looking at all this I noticed another bug: if CONFIG_DEV_COREDUMP is
disabled but the firmware has iram-recovery enabled the module load fails with
similar error messages. I fixed that by returning 0 from
ath10k_core_copy_target_iram() when _ath10k_coredump_get_mem_layout() returns
NULL.
Tested-on: QCA9984 hw2.0 PCI 10.4-3.9.0.2-00139
Fixes: 9af7c32cec ("ath10k: add target IRAM recovery feature support")
Signed-off-by: Abinaya Kalaiselvan <akalaise@codeaurora.org>
Signed-off-by: Jouni Malinen <jouni@codeaurora.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20211020075054.23061-1-kvalo@codeaurora.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit c2e6df3eaa ]
pgd_page_vaddr() returns an 'unsigned long' address, causing a warning
with the memcpy() call in kasan_init():
arch/arm/mm/kasan_init.c: In function 'kasan_init':
include/asm-generic/pgtable-nop4d.h:44:50: error: passing argument 2 of '__memcpy' makes pointer from integer without a cast [-Werror=int-conversion]
44 | #define pgd_page_vaddr(pgd) ((unsigned long)(p4d_pgtable((p4d_t){ pgd })))
| ~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
| |
| long unsigned int
arch/arm/include/asm/string.h:58:45: note: in definition of macro 'memcpy'
58 | #define memcpy(dst, src, len) __memcpy(dst, src, len)
| ^~~
arch/arm/mm/kasan_init.c:229:16: note: in expansion of macro 'pgd_page_vaddr'
229 | pgd_page_vaddr(*pgd_offset_k(KASAN_SHADOW_START)),
| ^~~~~~~~~~~~~~
arch/arm/include/asm/string.h:21:47: note: expected 'const void *' but argument is of type 'long unsigned int'
21 | extern void *__memcpy(void *dest, const void *src, __kernel_size_t n);
| ~~~~~~~~~~~~^~~
Avoid this by adding an explicit typecast.
Link: https://lore.kernel.org/all/CACRpkdb3DMvof3-xdtss0Pc6KM36pJA-iy=WhvtNVnsDpeJ24Q@mail.gmail.com/
Fixes: 5615f69bc2 ("ARM: 9016/2: Initialize the mapping of KASan shadow memory")
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 232deb3f95 ]
At present, when either of ds->ops->port_fdb_del() or ds->ops->port_mdb_del()
return a non-zero error code, we attempt to save the day and keep the
data structure associated with that switchdev object, as the deletion
procedure did not complete.
However, the way in which we do this is suspicious to the checker in
lib/refcount.c, who thinks it is buggy to increment a refcount that
became zero, and that this is indicative of a use-after-free.
Fixes: 161ca59d39 ("net: dsa: reference count the MDB entries at the cross-chip notifier level")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit c65b52d02f ]
As bcm6345_l1_irq_handle() is a chained irqchip handler, it will be
invoked within the context of the root irqchip handler, which must have
entered IRQ context already.
When bcm6345_l1_irq_handle() calls arch/mips's do_IRQ() , this will nest
another call to irq_enter(), and the resulting nested increment to
`rcu_data.dynticks_nmi_nesting` will cause rcu_is_cpu_rrupt_from_idle()
to fail to identify wakeups from idle, resulting in failure to preempt,
and RCU stalls.
Chained irqchip handlers must invoke IRQ handlers by way of thee core
irqchip code, i.e. generic_handle_irq() or generic_handle_domain_irq()
and should not call do_IRQ(), which is intended only for root irqchip
handlers.
Fix bcm6345_l1_irq_handle() by calling generic_handle_irq() directly.
Fixes: c7c42ec2ba ("irqchips/bmips: Add bcm6345-l1 interrupt controller")
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Acked-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit d4074324b0 ]
If kvm_s390_pv_destroy_cpu is called more than once, we risk calling
free_page on a random page, since the sidad field is aliased with the
gbea, which is not guaranteed to be zero.
This can happen, for example, if userspace calls the KVM_PV_DISABLE
IOCTL, and it fails, and then userspace calls the same IOCTL again.
This scenario is only possible if KVM has some serious bug or if the
hardware is broken.
The solution is to simply return successfully immediately if the vCPU
was already non secure.
Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Fixes: 19e1227768 ("KVM: S390: protvirt: Introduce instruction data area bounce buffer")
Reviewed-by: Janosch Frank <frankja@linux.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Message-Id: <20210920132502.36111-3-imbrenda@linux.ibm.com>
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 46c22ffd27 ]
We should not walk/touch page tables outside of VMA boundaries when
holding only the mmap sem in read mode. Evil user space can modify the
VMA layout just before this function runs and e.g., trigger races with
page table removal code since commit dd2283f260 ("mm: mmap: zap pages
with read mmap_sem in munmap").
find_vma() does not check if the address is >= the VMA start address;
use vma_lookup() instead.
Fixes: 214d9bbcd3 ("s390/mm: provide memory management functions for protected KVM guests")
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Acked-by: Heiko Carstens <hca@linux.ibm.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Link: https://lore.kernel.org/r/20210909162248.14969-6-david@redhat.com
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 949f5c1244 ]
There are multiple things broken about our storage key handling
functions:
1. We should not walk/touch page tables outside of VMA boundaries when
holding only the mmap sem in read mode. Evil user space can modify the
VMA layout just before this function runs and e.g., trigger races with
page table removal code since commit dd2283f260 ("mm: mmap: zap pages
with read mmap_sem in munmap"). gfn_to_hva() will only translate using
KVM memory regions, but won't validate the VMA.
2. We should not allocate page tables outside of VMA boundaries: if
evil user space decides to map hugetlbfs to these ranges, bad things
will happen because we suddenly have PTE or PMD page tables where we
shouldn't have them.
3. We don't handle large PUDs that might suddenly appeared inside our page
table hierarchy.
Don't manually allocate page tables, properly validate that we have VMA and
bail out on pud_large().
All callers of page table handling functions, except
get_guest_storage_key(), call fixup_user_fault() in case they
receive an -EFAULT and retry; this will allocate the necessary page tables
if required.
To keep get_guest_storage_key() working as expected and not requiring
kvm_s390_get_skeys() to call fixup_user_fault() distinguish between
"there is simply no page table or huge page yet and the key is assumed
to be 0" and "this is a fault to be reported".
Although commit 637ff9efe5 ("s390/mm: Add huge pmd storage key handling")
introduced most of the affected code, it was actually already broken
before when using get_locked_pte() without any VMA checks.
Note: Ever since commit 637ff9efe5 ("s390/mm: Add huge pmd storage key
handling") we can no longer set a guest storage key (for example from
QEMU during VM live migration) without actually resolving a fault.
Although we would have created most page tables, we would choke on the
!pmd_present(), requiring a call to fixup_user_fault(). I would
have thought that this is problematic in combination with postcopy life
migration ... but nobody noticed and this patch doesn't change the
situation. So maybe it's just fine.
Fixes: 9fcf93b5de ("KVM: S390: Create helper function get_guest_storage_key")
Fixes: 24d5dd0208 ("s390/kvm: Provide function for setting the guest storage key")
Fixes: a7e19ab55f ("KVM: s390: handle missing storage-key facility")
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Acked-by: Heiko Carstens <hca@linux.ibm.com>
Link: https://lore.kernel.org/r/20210909162248.14969-5-david@redhat.com
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit fe3d100240 ]
We should not walk/touch page tables outside of VMA boundaries when
holding only the mmap sem in read mode. Evil user space can modify the
VMA layout just before this function runs and e.g., trigger races with
page table removal code since commit dd2283f260 ("mm: mmap: zap pages
with read mmap_sem in munmap"). gfn_to_hva() will only translate using
KVM memory regions, but won't validate the VMA.
Further, we should not allocate page tables outside of VMA boundaries: if
evil user space decides to map hugetlbfs to these ranges, bad things will
happen because we suddenly have PTE or PMD page tables where we
shouldn't have them.
Similarly, we have to check if we suddenly find a hugetlbfs VMA, before
calling get_locked_pte().
Fixes: 2d42f94773 ("s390/kvm: Add PGSTE manipulation functions")
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Acked-by: Heiko Carstens <hca@linux.ibm.com>
Link: https://lore.kernel.org/r/20210909162248.14969-4-david@redhat.com
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b159f94c86 ]
... otherwise we will try unlocking a spinlock that was never locked via a
garbage pointer.
At the time we reach this code path, we usually successfully looked up
a PGSTE already; however, evil user space could have manipulated the VMA
layout in the meantime and triggered removal of the page table.
Fixes: 1e133ab296 ("s390/mm: split arch/s390/mm/pgtable.c")
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Acked-by: Heiko Carstens <hca@linux.ibm.com>
Link: https://lore.kernel.org/r/20210909162248.14969-3-david@redhat.com
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 2d8fb8f391 ]
We should not walk/touch page tables outside of VMA boundaries when
holding only the mmap sem in read mode. Evil user space can modify the
VMA layout just before this function runs and e.g., trigger races with
page table removal code since commit dd2283f260 ("mm: mmap: zap pages
with read mmap_sem in munmap"). The pure prescence in our guest_to_host
radix tree does not imply that there is a VMA.
Further, we should not allocate page tables (via get_locked_pte()) outside
of VMA boundaries: if evil user space decides to map hugetlbfs to these
ranges, bad things will happen because we suddenly have PTE or PMD page
tables where we shouldn't have them.
Similarly, we have to check if we suddenly find a hugetlbfs VMA, before
calling get_locked_pte().
Note that gmap_discard() is different:
zap_page_range()->unmap_single_vma() makes sure to stay within VMA
boundaries.
Fixes: b31288fa83 ("s390/kvm: support collaborative memory management")
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Acked-by: Heiko Carstens <hca@linux.ibm.com>
Link: https://lore.kernel.org/r/20210909162248.14969-2-david@redhat.com
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 753453afac ]
commit 7f4b792031 ("mt76: mt7615: add ibss support") introduced IBSS
and commit f4ec7fdf7f ("mt76: mt7615: enable support for mesh")
meshpoint support.
Both used in the "get_omac_idx"-function:
if (~mask & BIT(HW_BSSID_0))
return HW_BSSID_0;
With commit d8d59f66d1 ("mt76: mt7615: support 16 interfaces") the
ibss and meshpoint mode should "prefer hw bssid slot 1-3". However,
with that change the ibss or meshpoint mode will not send any beacon on
the mt7622 wifi anymore. Devices were still able to exchange data but
only if a bssid already existed. Two mt7622 devices will never be able
to communicate.
This commits reverts the preferation of slot 1-3 for ibss and
meshpoint. Only NL80211_IFTYPE_STATION will still prefer slot 1-3.
Tested on Banana Pi R64.
Fixes: d8d59f66d1 ("mt76: mt7615: support 16 interfaces")
Signed-off-by: Nick Hainke <vincent@systemli.org>
Acked-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20211007225725.2615-1-vincent@systemli.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 5245dafe3d ]
btf_header's str_off+str_len or type_off+type_len can overflow as they
are u32s. This will lead to bypassing the sanity checks during BTF
parsing, resulting in crashes afterwards. Fix by using 64-bit signed
integers for comparison.
Fixes: d812362450 ("libbpf: Fix BTF data layout checks and allow empty BTF")
Reported-by: Evgeny Vereshchagin <evvers@ya.ru>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211023003157.726961-1-andrii@kernel.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit e89ef634f8 ]
Bpftool creates a new JSON object for writing program metadata in plain
text mode, regardless of metadata being present or not. Then this writer
is freed if any metadata has been found and printed, but it leaks
otherwise. We cannot destroy the object unconditionally, because the
destructor prints an undesirable line break. Instead, make sure the
writer is created only after we have found program metadata to print.
Found with valgrind.
Fixes: aff52e685e ("bpftool: Support dumping metadata")
Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211022094743.11052-1-quentin@isovalent.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit ed290e1c20 ]
Though gcc conveniently compiles a simple memset to "rep stos," clang
prefers to call the libc version of memset. If a test is dynamically
linked, the libc memset isn't available in L1 (nor is the PLT or the
GOT, for that matter). Even if the test is statically linked, the libc
memset may choose to use some CPU features, like AVX, which may not be
enabled in L1. Note that __builtin_memset doesn't solve the problem,
because (a) the compiler is free to call memset anyway, and (b)
__builtin_memset may also choose to use features like AVX, which may
not be available in L1.
To avoid a myriad of problems, use an explicit "rep stos" to clear the
VMCB in generic_svm_setup(), which is called both from L0 and L1.
Reported-by: Ricardo Koller <ricarkol@google.com>
Signed-off-by: Jim Mattson <jmattson@google.com>
Fixes: 20ba262f86 ("selftests: KVM: AMD Nested test infrastructure")
Message-Id: <20210930003649.4026553-1-jmattson@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 3ae88f676a ]
Commit ad6d66bcac ("crypto: tcrypt - include 1420 byte blocks in aead and skcipher benchmarks")
mentions:
> power-of-2 block size. So let's add 1420 bytes explicitly, and round
> it up to the next blocksize multiple of the algo in question if it
> does not support 1420 byte blocks.
but misses updating skcipher multi-buffer tests.
Fix this by using the proper (rounded) input size.
Fixes: ad6d66bcac ("crypto: tcrypt - include 1420 byte blocks in aead and skcipher benchmarks")
Signed-off-by: Horia Geantă <horia.geanta@nxp.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 01de5fcd8b ]
When building the kernel with sparse enabled 'C=1' the following
warnings shows up:
kernel/power/swap.c:390:29: warning: incorrect type in assignment (different base types)
kernel/power/swap.c:390:29: expected int ret
kernel/power/swap.c:390:29: got restricted blk_status_t
This is due to function hib_wait_io() returns a 'blk_status_t' which is
a bitwise u8. Commit 5416da01ff ("PM: hibernate: Remove
blk_status_to_errno in hib_wait_io") seemed to have mixed up the return
type. However, the 4e4cbee93d ("block: switch bios to blk_status_t")
actually broke the behaviour by returning the wrong type.
Rework so function hib_wait_io() returns a 'int' instead of
'blk_status_t' and make sure to call function
blk_status_to_errno(hb->error)' when returning from function
hib_wait_io() a int gets returned.
Fixes: 4e4cbee93d ("block: switch bios to blk_status_t")
Fixes: 5416da01ff ("PM: hibernate: Remove blk_status_to_errno in hib_wait_io")
Signed-off-by: Anders Roxell <anders.roxell@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 0974812200 ]
In case that icdoff is not zero or mandatory keyed sgls are not
supported by the NVMe/RDMA target, we'll go to error flow but we'll
return 0 to the caller. Fix it by returning an appropriate error code.
Fixes: c66e2998c8 ("nvme-rdma: centralize controller setup sequence")
Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 2641b62d2f ]
Some Micrel KSZ8041NL PHY chips exhibit continuous RX errors after using
the power down mode bit (0.11). If the PHY is taken out of power down
mode in a certain temperature range, the PHY enters a weird state which
leads to continuously reporting RX errors. In that state, the MAC is not
able to receive or send any Ethernet frames and the activity LED is
constantly blinking. Since Linux is using the suspend callback when the
interface is taken down, ending up in that state can easily happen
during a normal startup.
Micrel confirmed the issue in errata DS80000700A [*], caused by abnormal
clock recovery when using power down mode. Even the latest revision (A4,
Revision ID 0x1513) seems to suffer that problem, and according to the
errata is not going to be fixed.
Remove the suspend/resume callback to avoid using the power down mode
completely.
[*] https://ww1.microchip.com/downloads/en/DeviceDoc/80000700A.pdf
Fixes: 1a5465f5d6 ("phy/micrel: Add suspend/resume support to Micrel PHYs")
Signed-off-by: Stefan Agner <stefan@agner.ch>
Acked-by: Marcel Ziswiler <marcel.ziswiler@toradex.com>
Signed-off-by: Francesco Dolcini <francesco.dolcini@toradex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit cd4bc63de7 ]
Coverity complains of a possible dereference of a null return value.
5. returned_null: kzalloc returns NULL. [show details]
6. var_assigned: Assigning: si_data = NULL return value from kzalloc.
488 si_data = kzalloc(data_size, __GFP_DMA | GFP_KERNEL);
489 cbd.length = cpu_to_le16(data_size);
490
491 dma = dma_map_single(&priv->si->pdev->dev, si_data,
492 data_size, DMA_FROM_DEVICE);
While this kzalloc() is unlikely to fail, I did notice that the function
returned without unmapping si_data.
Fix this by refactoring the error paths and checking for kzalloc()
failure.
Fixes: 888ae5a395 ("net: enetc: add tc flower psfp offload driver")
Cc: Claudiu Manoil <claudiu.manoil@nxp.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: netdev@vger.kernel.org
Cc: linux-kernel@vger.kernel.org (open list)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit cc8a8bc374 ]
While looking at on-air packets using Wireshark, I noticed we're never
setting the initiator bit when sending DELBA requests to the AP: While
we set the bit on our del_ba_param_set bitmask, we forget to actually
copy that bitmask over to the command struct, which means we never
actually set the initiator bit.
Fix that and copy the bitmask over to the host_cmd_ds_11n_delba command
struct.
Fixes: 5e6e3a92b9 ("wireless: mwifiex: initial commit for Marvell mwifiex driver")
Signed-off-by: Jonas Dreßler <verdre@v0yd.nl>
Acked-by: Pali Rohár <pali@kernel.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20211016153244.24353-5-verdre@v0yd.nl
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 515e7184bd ]
When fail to init coex module, free 'common' and 'adapter' directly, but
common->tx_thread which will access 'common' and 'adapter' is running at
the same time. That will trigger the UAF bug.
==================================================================
BUG: KASAN: use-after-free in rsi_tx_scheduler_thread+0x50f/0x520 [rsi_91x]
Read of size 8 at addr ffff8880076dc000 by task Tx-Thread/124777
CPU: 0 PID: 124777 Comm: Tx-Thread Not tainted 5.15.0-rc5+ #19
Call Trace:
dump_stack_lvl+0xe2/0x152
print_address_description.constprop.0+0x21/0x140
? rsi_tx_scheduler_thread+0x50f/0x520
kasan_report.cold+0x7f/0x11b
? rsi_tx_scheduler_thread+0x50f/0x520
rsi_tx_scheduler_thread+0x50f/0x520
...
Freed by task 111873:
kasan_save_stack+0x1b/0x40
kasan_set_track+0x1c/0x30
kasan_set_free_info+0x20/0x30
__kasan_slab_free+0x109/0x140
kfree+0x117/0x4c0
rsi_91x_init+0x741/0x8a0 [rsi_91x]
rsi_probe+0x9f/0x1750 [rsi_usb]
Stop thread before free 'common' and 'adapter' to fix it.
Fixes: 2108df3c4b ("rsi: add coex support")
Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20211015040335.1021546-1-william.xuanziyang@huawei.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit afa0370f3a ]
Fix tag len error for sta_rec_wtbl, which causes fw parsing error for
the tags placed behind it.
Fixes: e57b790146 ("mt76: add mac80211 driver for MT7915 PCIe-based chipsets")
Signed-off-by: Shayne Chen <shayne.chen@mediatek.com>
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 0bb4e9187e ]
Without this change, garbage is seen in the hwmon name and sensors output
for mt7615 is garbled.
Fixes: 109e505ad9 ("mt76: mt7615: add thermal sensor device support")
Signed-off-by: Ryder Lee <ryder.lee@mediatek.com>
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 0ae3ff5684 ]
Without this change, garbage is seen in the hwmon name and sensors output
for mt7915 is garbled. It appears that the hwmon logic does not make a
copy of the incoming string, but instead just copies a char* and expects
it to never go away.
Fixes: 33fe9c639c ("mt76: mt7915: add thermal sensor device support")
Signed-off-by: Ben Greear <greearb@candelatech.com>
Signed-off-by: Ryder Lee <ryder.lee@mediatek.com>
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 99b8e19599 ]
According to the firmware usage, OFDM rates should fill out bit 6 - 13
while CCK rates should fill out bit 0 - 3 in legacy field of RA info to
make the rate adaption runs propertly. Otherwise, a unicast frame might be
picking up the unsupported rate to send out.
Fixes: 1c099ab447 ("mt76: mt7921: add MCU support")
Reported-by: Joshua Emele <jemele@chromium.org>
Co-developed-by: YN Chen <YN.Chen@mediatek.com>
Signed-off-by: YN Chen <YN.Chen@mediatek.com>
Signed-off-by: Sean Wang <sean.wang@mediatek.com>
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f6e1f59885 ]
Introduce mt76_register_debugfs_fops routine in order to
define per-driver regs file operations and make sure the
device is up before reading or writing its registers
Fixes: 1d8efc741d ("mt76: mt7921: introduce Runtime PM support")
Fixes: de5ff3c9d1 ("mt76: mt7615: introduce pm_power_save delayed work")
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 781f62960c ]
Update the proper firmware programming sequence to fix GTK rekey
offload failure on WPA mixed mode.
In the mt76_connac_mcu_key_iter,
gtk_tlv->proto should be only set up on pairwise key
and gtk_tlk->group_cipher should be only set up on the group key.
Otherwise, those parameters required by firmware would be set
incorrectly to cause GTK rekey offload failure on WPA mixed mode
and then disconnection follows.
Fixes: b47e21e75c ("mt76: mt7615: add gtk rekey offload support")
Co-developed-by: Sean Wang <sean.wang@mediatek.com>
Signed-off-by: Sean Wang <sean.wang@mediatek.com>
Signed-off-by: Leon Yen <Leon.Yen@mediatek.com>
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit a23f80aa9c ]
The dma would be broken after rmmod flow. There are two different
cases causing this issue.
1. dma access without privilege.
2. hw access sequence borken by another context.
This patch handle both cases to avoid hw crash.
Fixes: 2b9ea5a8cf ("mt76: mt7921: add mt7921_dma_cleanup in mt7921_unregister_device")
Signed-off-by: Deren Wu <deren.wu@mediatek.com>
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 47f1c08db7 ]
The bit fields of tx rate idx should be 6 bits, otherwise it might be
incorrect in HT mode.
For VHT/HE rates, only 4 bits are actually used by rate idx, the other
2 bits are used for other functions.
Fixes: c31d94af18 ("mt76: mt7915: fix tx rate related fields in tx descriptor")
Signed-off-by: Shayne Chen <shayne.chen@mediatek.com>
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 82a980f82a ]
If total eeprom size is divisible by per-page size, the i in for loop
will exceed max page index, which happens in our newer chipset.
Fixes: 26f18380e6 ("mt76: mt7915: add support for flash mode")
Signed-off-by: Bo Jiao <bo.jiao@mediatek.com>
Signed-off-by: Shayne Chen <shayne.chen@mediatek.com>
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit cd3f387371 ]
The acceptable event report should inlcude original CMD-ID. Otherwise,
drop unexpected result from fw.
Fixes: 1c099ab447 ("mt76: mt7921: add MCU support")
Signed-off-by: Jimmy Hu <Jimmy.Hu@mediatek.com>
Signed-off-by: Deren Wu <deren.wu@mediatek.com>
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit c33edef520 ]
Fix the following sparse warning in mt76x02_mac_write_txwi and
mt76x02_mac_tx_rate_val routines:
drivers/net/wireless/mediatek/mt76/mt76x02_mac.c:237:19:
warning: restricted __le16 degrades to intege
warning: cast from restricted __le16
drivers/net/wireless/mediatek/mt76/mt76x02_mac.c:383:28:
warning: incorrect type in assignment (different base types)
expected restricted __le16 [usertype] rate
got unsigned long
Fixes: db9f11d343 ("mt76: store wcid tx rate info in one u32 reduce locking")
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit adedbc643f ]
drivers/net/wireless/mediatek/mt76/mt7915/mcu.c:114:10: error: implicit
conversion from enumeration type 'enum mt76_cipher_type' to different
enumeration type 'enum mcu_cipher_type' [-Werror,-Wenum-conversion]
return MT_CIPHER_NONE;
~~~~~~ ^~~~~~~~~~~~~~
drivers/net/wireless/mediatek/mt76/mt7921/mcu.c:114:10: error: implicit
conversion from enumeration type 'enum mt76_cipher_type' to different
enumeration type 'enum mcu_cipher_type' [-Werror,-Wenum-conversion]
return MT_CIPHER_NONE;
~~~~~~ ^~~~~~~~~~~~~~
Fixes: c368362c36 ("mt76: fix iv and CCMP header insertion")
Signed-off-by: Sean Wang <sean.wang@mediatek.com>
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit d741abeafa ]
The mistaken structure is introduced since we added the GTK rekey offload
to mt7663. The patch fixes mt76_connac_gtk_rekey_tlv structure according
to the MT7663 and MT7921 firmware we have submitted into
linux-firmware.git.
Fixes: b47e21e75c ("mt76: mt7615: add gtk rekey offload support")
Signed-off-by: Sean Wang <sean.wang@mediatek.com>
Signed-off-by: Leon Yen <Leon.Yen@mediatek.com>
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 3924715ffe ]
Zero out all the unused members of "req" so that we don't disclose
stack information.
Fixes: 495184ac91 ("mt76: mt7915: add support for applying pre-calibration data")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 08b3c8da87 ]
Fix the following sparse warning in mt7915_mac_add_txs_skb routine:
drivers/net/wireless/mediatek/mt76/mt7915/mac.c:1235:29:
warning: cast to restricted __le32
drivers/net/wireless/mediatek/mt76/mt7915/mac.c:1235:23:
warning: restricted __le32 degrades to integer
Fixes: 3de4cb1756 ("mt76: mt7915: add support for tx status reporting")
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit a2d7b2e004 ]
If an ACPI wakeup power resource is shared between multiple devices,
it may not be managed correctly.
Suppose, for example, that two devices, A and B, share a wakeup power
resource P whose wakeup_enabled flag is 0 initially. Next, suppose
that wakeup power is enabled for A and B, in this order, and disabled
for B. When wakeup power is enabled for A, P will be turned on and
its wakeup_enabled flag will be set. Next, when wakeup power is
enabled for B, P will not be touched, because its wakeup_enabled flag
is set. Now, when wakeup power is disabled for B, P will be turned
off which is incorrect, because A will still need P in order to signal
wakeup.
Moreover, if wakeup power is enabled for A and then disabled for B,
the latter will cause P to be turned off incorrectly (it will be still
needed by A), because acpi_disable_wakeup_device_power() is allowed
to manipulate power resources when the wakeup.prepare_count counter
of the given device is 0.
While the first issue could be addressed by changing the
wakeup_enabled power resource flag into a counter, addressing the
second one requires modifying acpi_disable_wakeup_device_power() to
do nothing when the target device's wakeup.prepare_count reference
counter is zero and that would cause the new counter to be redundant.
Namely, if acpi_disable_wakeup_device_power() is modified as per the
above, every change of the new counter following a wakeup.prepare_count
change would be reflected by the analogous change of the main reference
counter of the given power resource.
Accordingly, modify acpi_disable_wakeup_device_power() to do nothing
when the target device's wakeup.prepare_count reference counter is
zero and drop the power resource wakeup_enabled flag altogether.
While at it, ensure that all of the power resources that can be
turned off will be turned off when disabling device wakeup due to
a power resource manipulation error, to prevent energy from being
wasted.
Fixes: b5d667eb39 ("ACPI / PM: Take unusual configurations of power resources into account")
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 7a63296d6f ]
If an ACPI power resource is found to be "on" during the
initialization of the list of wakeup power resources of a device,
it is reference counted and its wakeup_enabled flag is set, which is
problematic if the deivce in question is the only user of the given
power resource, it is never runtime-suspended and it is not allowed
to wake up the system from sleep, because in that case the given
power resource will stay "on" until the system reboots and energy
will be wasted.
It is better to simply turn off wakeup power resources that are "on"
during the initialization unless their reference counters are not
zero, because that may be the only opportunity to prevent them from
staying in the "on" state all the time.
Fixes: b5d667eb39 ("ACPI / PM: Take unusual configurations of power resources into account")
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit fd96e35ea7 ]
A new warning in clang points out a use of bitwise OR with boolean
expressions in this driver:
drivers/platform/x86/thinkpad_acpi.c:9061:11: error: use of bitwise '|' with boolean operands [-Werror,-Wbitwise-instead-of-logical]
else if ((strlencmp(cmd, "level disengaged") == 0) |
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
||
drivers/platform/x86/thinkpad_acpi.c:9061:11: note: cast one or both operands to int to silence this warning
1 error generated.
This should clearly be a logical OR so change it to fix the warning.
Fixes: fe98a52ce7 ("ACPI: thinkpad-acpi: add sysfs support to fan subdriver")
Link: https://github.com/ClangBuiltLinux/linux/issues/1476
Reported-by: Tor Vic <torvic9@mailbox.org>
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Link: https://lore.kernel.org/r/20211018182537.2316800-1-nathan@kernel.org
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 86d46fdaa1 ]
Refactoring of the Atari floppy driver when converting to blk-mq
has broken the state machine in not-so-subtle ways:
finish_fdc() must be called when operations on the floppy device
have completed. This is crucial in order to relase the ST-DMA
lock, which protects against concurrent access to the ST-DMA
controller by other drivers (some DMA related, most just related
to device register access - broken beyond compare, I know).
When rewriting the driver's old do_request() function, the fact
that finish_fdc() was called only when all queued requests had
completed appears to have been overlooked. Instead, the new
request function calls finish_fdc() immediately after the last
request has been queued. finish_fdc() executes a dummy seek after
most requests, and this overwrites the state machine's interrupt
hander that was set up to wait for completion of the read/write
request just prior. To make matters worse, finish_fdc() is called
before device interrupts are re-enabled, making certain that the
read/write interupt is missed.
Shifting the finish_fdc() call into the read/write request
completion handler ensures the driver waits for the request to
actually complete. With a queue depth of 2, we won't see long
request sequences, so calling finish_fdc() unconditionally just
adds a little overhead for the dummy seeks, and keeps the code
simple.
While we're at it, kill ataflop_commit_rqs() which does nothing
but run finish_fdc() unconditionally, again likely wiping out an
in-flight request.
Signed-off-by: Michael Schmitz <schmitzmic@gmail.com>
Fixes: 6ec3938cff ("ataflop: convert to blk-mq")
CC: linux-block@vger.kernel.org
CC: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Link: https://lore.kernel.org/r/20211019061321.26425-1-schmitzmic@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 6cb67bea94 ]
Prevent the use of page table macros and types from 2 conflicting
places. This fixes multiple build errors and warnings, e.g.:
../arch/x86/include/asm/pgtable_64_types.h:21:34: error: conflicting types for ‘pte_t’
typedef struct { pteval_t pte; } pte_t;
^~~~~
In file included from ../include/linux/mm_types_task.h:16:0,
from ../include/linux/mm_types.h:5,
from ../include/linux/buildid.h:5,
from ../include/linux/module.h:14,
from ../drivers/media/pci/ivtv/ivtv-driver.h:40,
from ../drivers/media/pci/ivtv/ivtvfb.c:29:
../arch/um/include/asm/page.h:57:39: note: previous declaration of ‘pte_t’ was here
typedef struct { unsigned long pte; } pte_t;
../arch/x86/include/asm/pgtable_types.h:284:43: error: expected ‘)’ before ‘prot’
static inline pgprot_t pgprot_nx(pgprot_t prot)
^
../include/linux/pgtable.h:914:26: note: in definition of macro ‘pgprot_nx’
#define pgprot_nx(prot) (prot)
^~~~
In file included from ../arch/x86/include/asm/memtype.h:6:0,
from ../drivers/media/pci/ivtv/ivtvfb.c:40:
../arch/x86/include/asm/pgtable_types.h:288:0: warning: "pgprot_nx" redefined
#define pgprot_nx pgprot_nx
../arch/x86/include/asm/page_types.h:11:0: warning: "PAGE_SIZE" redefined
#define PAGE_SIZE (_AC(1,UL) << PAGE_SHIFT)
In file included from ../include/linux/mm_types_task.h:16:0,
from ../include/linux/mm_types.h:5,
from ../include/linux/buildid.h:5,
from ../include/linux/module.h:14,
from ../drivers/media/pci/ivtv/ivtv-driver.h:40,
from ../drivers/media/pci/ivtv/ivtvfb.c:29:
../arch/um/include/asm/page.h:14:0: note: this is the location of the previous definition
#define PAGE_SIZE (_AC(1, UL) << PAGE_SHIFT)
Fixes: 68f5d3f3b6 ("um: add PCI over virtio emulation driver")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: Andy Walls <awalls@md.metrocast.net>
Cc: linux-um@lists.infradead.org
Cc: Richard Weinberger <richard@nod.at>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 24bcbe1cc6 ]
sk_stream_kill_queues() can be called on close when there are
still outstanding skbs to transmit. Those skbs may try to queue
notifications to the error queue (e.g. timestamps).
If sk_stream_kill_queues() purges the queue without taking
its lock the queue may get corrupted, and skbs leaked.
This shows up as a warning about an rmem leak:
WARNING: CPU: 24 PID: 0 at net/ipv4/af_inet.c:154 inet_sock_destruct+0x...
The leak is always a multiple of 0x300 bytes (the value is in
%rax on my builds, so RAX: 0000000000000300). 0x300 is truesize of
an empty sk_buff. Indeed if we dump the socket state at the time
of the warning the sk_error_queue is often (but not always)
corrupted. The ->next pointer points back at the list head,
but not the ->prev pointer. Indeed we can find the leaked skb
by scanning the kernel memory for something that looks like
an skb with ->sk = socket in question, and ->truesize = 0x300.
The contents of ->cb[] of the skb confirms the suspicion that
it is indeed a timestamp notification (as generated in
__skb_complete_tx_timestamp()).
Removing purging of sk_error_queue should be okay, since
inet_sock_destruct() does it again once all socket refs
are gone. Eric suggests this may cause sockets that go
thru disconnect() to maintain notifications from the
previous incarnations of the socket, but that should be
okay since the race was there anyway, and disconnect()
is not exactly dependable.
Thanks to Jonathan Lemon and Omar Sandoval for help at various
stages of tracing the issue.
Fixes: cb9eff0978 ("net: new user space API for time stamping of incoming and outgoing packets")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 2203bd0e5c ]
The msm_gem_new_impl() function cleans up after itself so there is no
need to call drm_gem_object_put(). Conceptually, it does not make sense
to call a kref_put() function until after the reference counting has
been initialized which happens immediately after this call in the
drm_gem_(private_)object_init() functions.
In the msm_gem_import() function the "obj" pointer is uninitialized, so
it will lead to a crash.
Fixes: 05b849111c ("drm/msm: prime support")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Link: https://lore.kernel.org/r/20211013081315.GG6010@kili
Signed-off-by: Rob Clark <robdclark@chromium.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 0a5c26712f ]
When device_register() return failed, program will goto out_kfree_type
to release 'cdev->device' by put_device(). That will call thermal_release()
to free 'cdev'. But the follow-up processes access 'cdev' continually.
That trggers the UAF bug.
====================================================================
BUG: KASAN: use-after-free in __thermal_cooling_device_register+0x75b/0xa90
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014
Call Trace:
dump_stack_lvl+0xe2/0x152
print_address_description.constprop.0+0x21/0x140
? __thermal_cooling_device_register+0x75b/0xa90
kasan_report.cold+0x7f/0x11b
? __thermal_cooling_device_register+0x75b/0xa90
__thermal_cooling_device_register+0x75b/0xa90
? memset+0x20/0x40
? __sanitizer_cov_trace_pc+0x1d/0x50
? __devres_alloc_node+0x130/0x180
devm_thermal_of_cooling_device_register+0x67/0xf0
max6650_probe.cold+0x557/0x6aa
......
Freed by task 258:
kasan_save_stack+0x1b/0x40
kasan_set_track+0x1c/0x30
kasan_set_free_info+0x20/0x30
__kasan_slab_free+0x109/0x140
kfree+0x117/0x4c0
thermal_release+0xa0/0x110
device_release+0xa7/0x240
kobject_put+0x1ce/0x540
put_device+0x20/0x30
__thermal_cooling_device_register+0x731/0xa90
devm_thermal_of_cooling_device_register+0x67/0xf0
max6650_probe.cold+0x557/0x6aa [max6650]
Do not use 'cdev' again after put_device() to fix the problem like doing
in thermal_zone_device_register().
[dlezcano]: as requested by Rafael, change the affectation into two statements.
Fixes: 5848376181 ("thermal/drivers/core: Use a char pointer for the cooling device name")
Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com>
Reported-by: kernel test robot <lkp@intel.com>
Link: https://lore.kernel.org/r/20211015024504.947520-1-william.xuanziyang@huawei.com
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 06f6e365e2 ]
Currently, in case of aead fallback, no associated data info is set in the
fallback request. To fix this, call aead_request_set_ad() to pass the assoclen.
Fixes: 6f03f0e8b6 ("crypto: octeontx2 - register with linux crypto framework")
Signed-off-by: Ovidiu Panait <ovidiu.panait@windriver.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 19757cebf0 ]
Use of percpu_counter structure to track count of orphaned
sockets is causing problems on modern hosts with 256 cpus
or more.
Stefan Bach reported a serious spinlock contention in real workloads,
that I was able to reproduce with a netfilter rule dropping
incoming FIN packets.
53.56% server [kernel.kallsyms] [k] queued_spin_lock_slowpath
|
---queued_spin_lock_slowpath
|
--53.51%--_raw_spin_lock_irqsave
|
--53.51%--__percpu_counter_sum
tcp_check_oom
|
|--39.03%--__tcp_close
| tcp_close
| inet_release
| inet6_release
| sock_close
| __fput
| ____fput
| task_work_run
| exit_to_usermode_loop
| do_syscall_64
| entry_SYSCALL_64_after_hwframe
| __GI___libc_close
|
--14.48%--tcp_out_of_resources
tcp_write_timeout
tcp_retransmit_timer
tcp_write_timer_handler
tcp_write_timer
call_timer_fn
expire_timers
__run_timers
run_timer_softirq
__softirqentry_text_start
As explained in commit cf86a086a1 ("net/dst: use a smaller percpu_counter
batch for dst entries accounting"), default batch size is too big
for the default value of tcp_max_orphans (262144).
But even if we reduce batch sizes, there would still be cases
where the estimated count of orphans is beyond the limit,
and where tcp_too_many_orphans() has to call the expensive
percpu_counter_sum_positive().
One solution is to use plain per-cpu counters, and have
a timer to periodically refresh this cache.
Updating this cache every 100ms seems about right, tcp pressure
state is not radically changing over shorter periods.
percpu_counter was nice 15 years ago while hosts had less
than 16 cpus, not anymore by current standards.
v2: Fix the build issue for CONFIG_CRYPTO_DEV_CHELSIO_TLS=m,
reported by kernel test robot <lkp@intel.com>
Remove unused socket argument from tcp_too_many_orphans()
Fixes: dd24c00191 ("net: Use a percpu_counter for orphan_count")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Stefan Bach <sfb@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit bc9bbb8173 ]
Currently, the kernel CONFIG_UNWINDER_ORC option is enabled by default
on x86, but the implementation of get_wchan() is still based on the frame
pointer unwinder, so the /proc/<pid>/wchan usually returned 0 regardless
of whether the task <pid> is running.
Reimplement get_wchan() by calling stack_trace_save_tsk(), which is
adapted to the ORC and frame pointer unwinders.
Fixes: ee9f8fce99 ("x86/unwind: Add the ORC unwinder")
Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20211008111626.271115116@infradead.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit a3d708925f ]
On i386, when builtin (not a loadable module), the winbond-840 driver
inspects boot_cpu_data to see what CPU family it is running on, and
then acts on that data. The "family" struct member (x86) does not exist
when running on UML, so prevent that test and do the default action.
Prevents this build error on UML + i386:
../drivers/net/ethernet/dec/tulip/winbond-840.c: In function ‘init_registers’:
../drivers/net/ethernet/dec/tulip/winbond-840.c:882:19: error: ‘struct cpuinfo_um’ has no member named ‘x86’
if (boot_cpu_data.x86 <= 4) {
Fixes: 68f5d3f3b6 ("um: add PCI over virtio emulation driver")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: linux-um@lists.infradead.org
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Link: https://lore.kernel.org/r/20211014050606.7288-1-rdunlap@infradead.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit cd2621d07d ]
On i386, when builtin (not a loadable module), the fealnx driver
inspects boot_cpu_data to see what CPU family it is running on, and
then acts on that data. The "family" struct member (x86) does not exist
when running on UML, so prevent that test and do the default action.
Prevents this build error on UML + i386:
../drivers/net/ethernet/fealnx.c: In function ‘netdev_open’:
../drivers/net/ethernet/fealnx.c:861:19: error: ‘struct cpuinfo_um’ has no member named ‘x86’
Fixes: 68f5d3f3b6 ("um: add PCI over virtio emulation driver")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: linux-um@lists.infradead.org
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Link: https://lore.kernel.org/r/20211014050500.5620-1-rdunlap@infradead.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 4ef0c5c6b5 ]
There is a small race between copy_process() and sched_fork()
where child->sched_task_group point to an already freed pointer.
parent doing fork() | someone moving the parent
| to another cgroup
-------------------------------+-------------------------------
copy_process()
+ dup_task_struct()<1>
parent move to another cgroup,
and free the old cgroup. <2>
+ sched_fork()
+ __set_task_cpu()<3>
+ task_fork_fair()
+ sched_slice()<4>
In the worst case, this bug can lead to "use-after-free" and
cause panic as shown above:
(1) parent copy its sched_task_group to child at <1>;
(2) someone move the parent to another cgroup and free the old
cgroup at <2>;
(3) the sched_task_group and cfs_rq that belong to the old cgroup
will be accessed at <3> and <4>, which cause a panic:
[] BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
[] PGD 8000001fa0a86067 P4D 8000001fa0a86067 PUD 2029955067 PMD 0
[] Oops: 0000 [#1] SMP PTI
[] CPU: 7 PID: 648398 Comm: ebizzy Kdump: loaded Tainted: G OE --------- - - 4.18.0.x86_64+ #1
[] RIP: 0010:sched_slice+0x84/0xc0
[] Call Trace:
[] task_fork_fair+0x81/0x120
[] sched_fork+0x132/0x240
[] copy_process.part.5+0x675/0x20e0
[] ? __handle_mm_fault+0x63f/0x690
[] _do_fork+0xcd/0x3b0
[] do_syscall_64+0x5d/0x1d0
[] entry_SYSCALL_64_after_hwframe+0x65/0xca
[] RIP: 0033:0x7f04418cd7e1
Between cgroup_can_fork() and cgroup_post_fork(), the cgroup
membership and thus sched_task_group can't change. So update child's
sched_task_group at sched_post_fork() and move task_fork() and
__set_task_cpu() (where accees the sched_task_group) from sched_fork()
to sched_post_fork().
Fixes: 8323f26ce3 ("sched: Fix race in task_group")
Signed-off-by: Zhang Qiao <zhangqiao22@huawei.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Tejun Heo <tj@kernel.org>
Link: https://lkml.kernel.org/r/20210915064030.2231-1-zhangqiao22@huawei.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 0a491167fe ]
Most of the txpower for the ath10k firmware is stored as twicepower (0.5 dB
steps). This isn't the case for max_antenna_gain - which is still expected
by the firmware as dB.
The firmware is converting it from dB to the internal (twicepower)
representation when it calculates the limits of a channel. This can be seen
in tpc_stats when configuring "12" as max_antenna_gain. Instead of the
expected 12 (6 dB), the tpc_stats shows 24 (12 dB).
Tested on QCA9888 and IPQ4019 with firmware 10.4-3.5.3-00057.
Fixes: 02256930d9 ("ath10k: use proper tx power unit")
Signed-off-by: Sven Eckelmann <seckelmann@datto.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20190611172131.6064-1-sven@narfation.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit ada61aa0b1 ]
I got memory leak as follows when doing fault injection test:
unreferenced object 0xffff888102740438 (size 8):
comm "27", pid 859, jiffies 4295031351 (age 143.992s)
hex dump (first 8 bytes):
68 77 6d 6f 6e 30 00 00 hwmon0..
backtrace:
[<00000000544b5996>] __kmalloc_track_caller+0x1a6/0x300
[<00000000df0d62b9>] kvasprintf+0xad/0x140
[<00000000d3d2a3da>] kvasprintf_const+0x62/0x190
[<000000005f8f0f29>] kobject_set_name_vargs+0x56/0x140
[<00000000b739e4b9>] dev_set_name+0xb0/0xe0
[<0000000095b69c25>] __hwmon_device_register+0xf19/0x1e50 [hwmon]
[<00000000a7e65b52>] hwmon_device_register_with_info+0xcb/0x110 [hwmon]
[<000000006f181e86>] devm_hwmon_device_register_with_info+0x85/0x100 [hwmon]
[<0000000081bdc567>] tmp421_probe+0x2d2/0x465 [tmp421]
[<00000000502cc3f8>] i2c_device_probe+0x4e1/0xbb0
[<00000000f90bda3b>] really_probe+0x285/0xc30
[<000000007eac7b77>] __driver_probe_device+0x35f/0x4f0
[<000000004953d43d>] driver_probe_device+0x4f/0x140
[<000000002ada2d41>] __device_attach_driver+0x24c/0x330
[<00000000b3977977>] bus_for_each_drv+0x15d/0x1e0
[<000000005bf2a8e3>] __device_attach+0x267/0x410
When device_register() returns an error, the name allocated in
dev_set_name() will be leaked, the put_device() should be used
instead of calling hwmon_dev_release() to give up the device
reference, then the name will be freed in kobject_cleanup().
Reported-by: Hulk Robot <hulkci@huawei.com>
Fixes: bab2243ce1 ("hwmon: Introduce hwmon_device_register_with_groups")
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Link: https://lore.kernel.org/r/20211012112758.2681084-1-yangyingliang@huawei.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit e4400bbf5b ]
The NTF_EXT_LEARNED neigh flag is usually propagated back to user space
upon dump of the neighbor table. However, when used in combination with
NTF_USE flag this is not the case despite exempting the entry from the
garbage collector. This results in inconsistent state since entries are
typically marked in neigh->flags with NTF_EXT_LEARNED, but here they are
not. Fix it by propagating the creation flag to ___neigh_create().
Before fix:
# ./ip/ip n replace 192.168.178.30 dev enp5s0 use extern_learn
# ./ip/ip n
192.168.178.30 dev enp5s0 lladdr f4:8c:50:5e:71:9a REACHABLE
[...]
After fix:
# ./ip/ip n replace 192.168.178.30 dev enp5s0 use extern_learn
# ./ip/ip n
192.168.178.30 dev enp5s0 lladdr f4:8c:50:5e:71:9a extern_learn REACHABLE
[...]
Fixes: 9ce33e4653 ("neighbour: support for NTF_EXT_LEARNED flag")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Roopa Prabhu <roopa@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit beae4a6258 ]
The "msh" pointer is device managed, meaning that memstick_alloc_host()
calls device_initialize() on it. That means that it can't be free
using kfree() but must instead be freed with memstick_free_host().
Otherwise it leads to a tiny memory leak of device resources.
Fixes: 60fdd931d5 ("memstick: add support for JMicron jmb38x MemoryStick host controller")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Link: https://lore.kernel.org/r/20211011123912.GD15188@kili
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 4853396f03 ]
clang-14 complains about a sanity check that always passes when the
page size is 64KB or larger:
drivers/memstick/core/ms_block.c:1739:21: error: result of comparison of constant 65536 with expression of type 'unsigned short' is always false [-Werror,-Wtautological-constant-out-of-range-compare]
if (msb->page_size > PAGE_SIZE) {
~~~~~~~~~~~~~~ ^ ~~~~~~~~~
This is fine, it will still work on all architectures, so just shut
up that warning with a cast.
Fixes: 0ab30494bc ("memstick: add support for legacy memorysticks")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20210927094520.696665-1-arnd@kernel.org
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit d806e334d0 ]
We need to restore context in a specified order with HCTL set in two
phases. This is similar to what omap_hsmmc_context_restore() is doing.
Otherwise SDIO can stop working on resume.
And for PM runtime and SDIO cards, we need to also save SYSCTL, IE and
ISE.
This should not be a problem currently, and these patches can be applied
whenever suitable.
Fixes: ee0f309263 ("mmc: sdhci-omap: Add Support for Suspend/Resume")
Signed-off-by: Tony Lindgren <tony@atomide.com>
Link: https://lore.kernel.org/r/20210921110029.21944-3-tony@atomide.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 8e0e7bd38b ]
If sdhci-omap is configured for an unused device instance and the device
is not set as disabled, we can get a NULL pointer dereference:
Unable to handle kernel NULL pointer dereference at virtual address
00000045
...
(regulator_set_voltage) from [<c07d7008>] (mmc_regulator_set_ocr+0x44/0xd0)
(mmc_regulator_set_ocr) from [<c07e2d80>] (sdhci_set_ios+0xa4/0x490)
(sdhci_set_ios) from [<c07ea690>] (sdhci_omap_set_ios+0x124/0x160)
(sdhci_omap_set_ios) from [<c07c8e94>] (mmc_power_up.part.0+0x3c/0x154)
(mmc_power_up.part.0) from [<c07c9d20>] (mmc_start_host+0x88/0x9c)
(mmc_start_host) from [<c07cad34>] (mmc_add_host+0x58/0x7c)
(mmc_add_host) from [<c07e2574>] (__sdhci_add_host+0xf0/0x22c)
(__sdhci_add_host) from [<c07eaf68>] (sdhci_omap_probe+0x318/0x72c)
(sdhci_omap_probe) from [<c06a39d8>] (platform_probe+0x58/0xb8)
AFAIK we are not seeing this with the devices configured in the mainline
kernel but this can cause issues for folks bringing up their boards.
Fixes: 7d326930d3 ("mmc: sdhci-omap: Add OMAP SDHCI driver")
Signed-off-by: Tony Lindgren <tony@atomide.com>
Link: https://lore.kernel.org/r/20210921110029.21944-2-tony@atomide.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 87a7f321bb ]
Don't always reset the driver on a TX timeout. Attempt to
recover by kicking the queue in case an IRQ was missed.
Fixes: 9e5f7d26a4 ("gve: Add workqueue and reset support")
Signed-off-by: John Fraker <jfraker@google.com>
Signed-off-by: David Awogbemila <awogbemila@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 9b793db5fc ]
The problem is that "channel" is an unsigned int, when it's less 5 the
value of "channel - 5" is not a negative number as one would expect but
is very high positive value instead.
This means that "start" becomes a very high positive value. The result
of that is that we never enter the "for (i = start; i <= end; i++) {"
loop. Instead of storing the result from b43legacy_radio_aci_detect()
it just uses zero.
Fixes: ef1a628d83 ("b43: Implement dynamic PHY API")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Michael Büsch <m@bues.ch>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20211006073621.GE8404@kili
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit c1c8380b03 ]
The problem is that "channel" is an unsigned int, when it's less 5 the
value of "channel - 5" is not a negative number as one would expect but
is very high positive value instead.
This means that "start" becomes a very high positive value. The result
of that is that we never enter the "for (i = start; i <= end; i++) {"
loop. Instead of storing the result from b43legacy_radio_aci_detect()
it just uses zero.
Fixes: 75388acd0c ("[B43LEGACY]: add mac80211-based driver for legacy BCM43xx devices")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Michael Büsch <m@bues.ch>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20211006073542.GD8404@kili
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 50f742dd91 ]
By default, writes to the extended attributes security.ima will be
allowed even if the hash algorithm used for the xattr is not compiled
in the kernel (which does not make sense because the kernel would not
be able to appraise that file as it lacks support for validating the
hash).
Prevent and audit writes to the security.ima xattr if the hash algorithm
used in the new value is not available in the current kernel.
Signed-off-by: THOBY Simon <Simon.THOBY@viveris.fr>
Reviewed-by: Lakshmi Ramasubramanian <nramas@linux.microsoft.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b6f5f0c8f7 ]
Currently mtk_rng_runtime_suspend/resume is called for both runtime pm
and system sleep operations.
This is wrong as these should only be runtime ops as the name already
suggests. Currently freezing the system will lead to a call to
mtk_rng_runtime_suspend even if the device currently isn't active. This
leads to a clock warning because it is disabled/unprepared although it
isn't enabled/prepared currently.
This patch fixes this by only setting the runtime pm ops and forces to
call the runtime pm ops from the system sleep ops as well if active but
not otherwise.
Fixes: 81d2b34508 ("hwrng: mtk - add runtime PM support")
Signed-off-by: Markus Schneider-Pargmann <msp@baylibre.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 18fcba469b ]
Upon receiving a PFVF message, check if the interrupt bit is set in the
message. If it is not, that means that the interrupt was probably
triggered by a collision. In this case, disregard the message and
re-enable the interrupts.
Fixes: ed8ccaef52 ("crypto: qat - Add support for SRIOV")
Signed-off-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com>
Reviewed-by: Marco Chiappero <marco.chiappero@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 9b768e8a39 ]
Detect a PFVF collision between the local and the remote function by
checking if the message on the PFVF CSR has been overwritten.
This is done after the remote function confirms that the message has
been received, by clearing the interrupt bit, or the maximum number of
attempts (ADF_IOV_MSG_ACK_MAX_RETRY) to check the CSR has been exceeded.
Fixes: ed8ccaef52 ("crypto: qat - Add support for SRIOV")
Signed-off-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com>
Co-developed-by: Marco Chiappero <marco.chiappero@intel.com>
Signed-off-by: Marco Chiappero <marco.chiappero@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit cfd6fb45cf ]
clang points out inconsistencies in the FIELD_PREP() invocation in
this driver that result from the 'mask' being a 32-bit value:
drivers/crypto/ccree/cc_driver.c:117:18: error: result of comparison of constant 18446744073709551615 with expression of type 'u32' (aka 'unsigned int') is always false [-Werror,-Wtautological-constant-out-of-range-compare]
cache_params |= FIELD_PREP(mask, val);
^~~~~~~~~~~~~~~~~~~~~
include/linux/bitfield.h:94:3: note: expanded from macro 'FIELD_PREP'
__BF_FIELD_CHECK(_mask, 0ULL, _val, "FIELD_PREP: "); \
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
include/linux/bitfield.h:52:28: note: expanded from macro '__BF_FIELD_CHECK'
BUILD_BUG_ON_MSG((_mask) > (typeof(_reg))~0ull, \
~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
This does not happen in other places that just pass a constant here.
Work around the warnings by widening the type of the temporary variable.
Fixes: 05c2a70591 ("crypto: ccree - rework cache parameters handling")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Gilad ben-Yossef <gilad@benyossef.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 69a10678e2 ]
mn88443x_cmn_power_on() did not handle possible errors of
clk_prepare_enable() and always finished successfully so that its caller
mn88443x_probe() did not care about failed preparing/enabling of clocks
as well.
Add missed error handling in both mn88443x_cmn_power_on() and
mn88443x_probe(). This required to change the return value of the former
from "void" to "int".
Found by Linux Driver Verification project (linuxtesting.org).
Fixes: 0f408ce894 ("media: dvb-frontends: add Socionext MN88443x ISDB-S/T demodulator driver")
Signed-off-by: Evgeny Novikov <novikov@ispras.ru>
Co-developed-by: Kirill Shilimanov <kirill.shilimanov@huawei.com>
Signed-off-by: Kirill Shilimanov <kirill.shilimanov@huawei.com>
Signed-off-by: Sean Young <sean@mess.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 1444232152 ]
In existing video driver implementation vpp frequency calculation in
calculate_inst_freq() is always zero because the value of vpp_freq_per_mb
is always zero for decoder.
Fixed this by correcting the calculating the vpp frequency calculation for
decoder.
Fixes: 3cfe5815ce ("media: venus: Enable low power setting for encoder")
Signed-off-by: Mansur Alisha Shaik <mansur@codeaurora.org>
Signed-off-by: Stanimir Varbanov <stanimir.varbanov@linaro.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 7b1394892d ]
Relax this condition to make add and update commands idempotent for sets
with no timeout. The eval function already checks if the set element
timeout is available and updates it if the update command is used.
Fixes: 22fe54d5fe ("netfilter: nf_tables: add support for dynamic set updates")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 7663ad9a5d ]
RCU managed to grow a few noinstr violations:
vmlinux.o: warning: objtool: rcu_dynticks_eqs_enter()+0x0: call to rcu_dynticks_task_trace_enter() leaves .noinstr.text section
vmlinux.o: warning: objtool: rcu_dynticks_eqs_exit()+0xe: call to rcu_dynticks_task_trace_exit() leaves .noinstr.text section
Fix them by adding __always_inline to the relevant trivial functions.
Also replace the noinstr with __always_inline for the existing
rcu_dynticks_task_*() functions since noinstr would force noinline
them, even when empty, which seems silly.
Fixes: 7d0c9c50c5 ("rcu-tasks: Avoid IPIing userspace/idle tasks if kernel is so built")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 9f4873fb6a ]
AMD Rome systems and later support interleaving between three identical
ranks within a channel.
Check for this mode by counting the number of enabled chip selects and
comparing their masks. If there are exactly three enabled chip selects
and their masks are identical, then three rank interleaving is enabled.
The size of a rank is determined from its mask value. However, three
rank interleaving doesn't follow the method of swapping an interleave
bit with the most significant bit. Rather, the interleave bit is flipped
and the most significant bit remains the same. There is only a single
interleave bit in this case.
Account for this when determining the chip select size by keeping the
most significant bit at its original value and ignoring any zero bits.
This will return a full bitmask in [MSB:1].
Fixes: e53a3b267f ("EDAC/amd64: Find Chip Select memory size using Address Mask")
Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20211005154419.2060504-1-yazen.ghannam@amd.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f96b467583 ]
Use get_unaligned() instead of memcpy() to access potentially unaligned
memory, which, when accessed through a pointer, leads to undefined
behavior. get_unaligned() describes much better what is happening there
anyway even if memcpy() does the job.
In addition, since perf tool builds with -Werror, it would fire with:
util/intel-pt-decoder/../../../arch/x86/lib/insn.c: In function '__insn_get_emulate_prefix':
tools/include/../include/asm-generic/unaligned.h:10:15: error: packed attribute is unnecessary [-Werror=packed]
10 | const struct { type x; } __packed *__pptr = (typeof(__pptr))(ptr); \
because -Werror=packed would complain if the packed attribute would have
no effect on the layout of the structure.
In this case, that is intentional so disable the warning only for that
compilation unit.
That part is Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
No functional changes.
Fixes: 5ba1071f75 ("x86/insn, tools/x86: Fix undefined behavior due to potential unaligned accesses")
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Tested-by: Stephen Rothwell <sfr@canb.auug.org.au>
Link: https://lkml.kernel.org/r/YVSsIkj9Z29TyUjE@zn.tnic
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit aa1a43262a ]
Currently, a debug message is printed if an inefficient state is detected
in the Energy Model. Unfortunately, it won't detect if the first state is
inefficient or if two successive states are. Fix this behavior.
Fixes: 27871f7a8a (PM: Introduce an Energy Model management framework)
Signed-off-by: Vincent Donnefort <vincent.donnefort@arm.com>
Reviewed-by: Quentin Perret <qperret@google.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Reviewed-by: Matthias Kaehlcke <mka@chromium.org>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 4925642d54 ]
In tests with two Lima boards from 8devices (QCA4531 based) on OpenWrt
19.07 we could force a silent restart of a device with no serial
output when we were sending a high amount of UDP traffic (iperf3 at 80
MBit/s in both directions from external hosts, saturating the wifi and
causing a load of about 4.5 to 6) and were then triggering an
ath9k_queue_reset().
Further debugging showed that the restart was caused by the ath79
watchdog. With disabled watchdog we could observe that the device was
constantly going into ath_isr() interrupt handler and was returning
early after the ATH_OP_HW_RESET flag test, without clearing any
interrupts. Even though ath9k_queue_reset() calls
ath9k_hw_kill_interrupts().
With JTAG we could observe the following race condition:
1) ath9k_queue_reset()
...
-> ath9k_hw_kill_interrupts()
-> set_bit(ATH_OP_HW_RESET, &common->op_flags);
...
<- returns
2) ath9k_tasklet()
...
-> ath9k_hw_resume_interrupts()
...
<- returns
3) loops around:
...
handle_int()
-> ath_isr()
...
-> if (test_bit(ATH_OP_HW_RESET,
&common->op_flags))
return IRQ_HANDLED;
x) ath_reset_internal():
=> never reached <=
And in ath_isr() we would typically see the following interrupts /
interrupt causes:
* status: 0x00111030 or 0x00110030
* async_cause: 2 (AR_INTR_MAC_IPQ)
* sync_cause: 0
So the ath9k_tasklet() reenables the ath9k interrupts
through ath9k_hw_resume_interrupts() which ath9k_queue_reset() had just
disabled. And ath_isr() then keeps firing because it returns IRQ_HANDLED
without actually clearing the interrupt.
To fix this IRQ storm also clear/disable the interrupts again when we
are in reset state.
Cc: Sven Eckelmann <sven@narfation.org>
Cc: Simon Wunderlich <sw@simonwunderlich.de>
Cc: Linus Lüssing <linus.luessing@c0d3.blue>
Fixes: 872b5d814f ("ath9k: do not access hardware on IRQs during reset")
Signed-off-by: Linus Lüssing <ll@simonwunderlich.de>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20210914192515.9273-3-linus.luessing@c0d3.blue
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 747ff7d3d7 ]
When rebooting on sc7180 Trogdor devices I see the following crash from
the wifi driver.
ath10k_snoc 18800000.wifi: firmware crashed! (guid 83493570-29a2-4e98-a83e-70048c47669c)
This is because a modem stop event looks just like a firmware crash to
the driver, the qmi connection is closed in both cases. Use the qcom ssr
notifier block to stop treating the qmi connection close event as a
firmware crash signal when the modem hasn't actually crashed. See
ath10k_qmi_event_server_exit() for more details.
This silences the crash message seen during every reboot.
Fixes: 3f14b73c38 ("ath10k: Enable MSA region dump support for WCN3990")
Cc: Youghandhar Chintala <youghand@codeaurora.org>
Cc: Abhishek Kumar <kuabhs@chromium.org>
Cc: Steev Klimaszewski <steev@kali.org>
Cc: Matthias Kaehlcke <mka@chromium.org>
Cc: Rakesh Pillai <pillair@codeaurora.org>
Signed-off-by: Stephen Boyd <swboyd@chromium.org>
Reviewed-by: Rakesh Pillai <pillair@codeaurora.org>
Tested-By: Youghandhar Chintala <youghand@codeaurora.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20210922233341.182624-1-swboyd@chromium.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 51fa3b70d2 ]
The call to ops->suspend for the dev->dev_next case can currently
trigger a call on a null function pointer if ops->suspend is null.
Skip over the use of function ops->suspend if it is null.
Addresses-Coverity: ("Dereference after null check")
Fixes: be7fd3c3a8 ("media: em28xx: Hauppauge DualHD second tuner functionality")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit e5f5a66c9a ]
Commit c343bf1ba5 ("cpuidle: Fix three reference count leaks")
fixes the cleanup of kobjects; however, it removes kfree() calls
altogether, leading to memory leaks.
Fix those and also defer the initialization of dev->kobj_dev until
after the error check, so that we do not end up with a dangling
pointer.
Fixes: c343bf1ba5 ("cpuidle: Fix three reference count leaks")
Signed-off-by: Anel Orazgaliyeva <anelkz@amazon.de>
Suggested-by: Aman Priyadarshi <apeureka@amazon.de>
[ rjw: Subject edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 38aa192a05 ]
The ecc.c file started out as part of the ECDH algorithm but got
moved out into a standalone module later. It does not build without
CRYPTO_DEFAULT_RNG, so now that other modules are using it as well we
can run into this link error:
aarch64-linux-ld: ecc.c:(.text+0xfc8): undefined reference to `crypto_default_rng'
aarch64-linux-ld: ecc.c:(.text+0xff4): undefined reference to `crypto_put_default_rng'
Move the 'select CRYPTO_DEFAULT_RNG' statement into the correct symbol.
Fixes: 0d7a78643f ("crypto: ecrdsa - add EC-RDSA (GOST 34.10) algorithm")
Fixes: 4e6602916b ("crypto: ecdsa - Add support for ECDSA signature verification")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Stefan Berger <stefanb@linux.ibm.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 8f7262cd66 ]
debugfs_create_file() takes a pointer argument that can be used during
file operation callbacks (accessible via i_private in the inode
structure). An obvious requirement is for the pointer to refer to
valid memory when used.
When creating the debugfs file to dynamically enable / disable
kprobes, a pointer to local variable is passed to
debugfs_create_file(); which will go out of scope when the init
function returns. The reason this hasn't triggered random memory
corruption is because the pointer is not accessed during the debugfs
file callbacks.
Since the enabled state is managed by the kprobes_all_disabled global
variable, the local variable is not needed. Fix the incorrect (and
unnecessary) usage of local variable during debugfs_file_create() by
passing NULL instead.
Link: https://lkml.kernel.org/r/163163031686.489837.4476867635937014973.stgit@devnote2
Fixes: bf8f6e5b3e ("Kprobes: The ON/OFF knob thru debugfs")
Signed-off-by: Punit Agrawal <punitagrawal@gmail.com>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit d7f26849ed ]
The runtime enabling of the ISPCK (internally clocks the pipeline inside
the ISC) has to be done after the pm_runtime for the ISC dev has been
started.
After the commit by Mauro:
the ISC failed to probe with the error:
atmel-sama5d2-isc f0008000.isc: failed to enable ispck: -13
atmel-sama5d2-isc: probe of f0008000.isc failed with error -13
This is because the enabling of the ispck is done too early in the probe,
and the PM runtime returns invalid request.
Thus, moved this clock enabling after pm_runtime_idle is called.
The ISPCK is required only for sama5d2 type of ISC.
Thus, add a bool inside the isc struct that is platform dependent.
For the sama7g5-isc, the enabling of the ISPCK is wrong and does not make
sense. Removed it from the sama7g5 probe. In sama7g5-isc, there is only
one clock, the MCK, which also clocks the internal pipeline of the ISC.
Adapted the clk_prepare and clk_unprepare to request the runtime PM
for both clocks (MCK and ISPCK) in case of sama5d2-isc, and the single
clock (MCK) in case of sama7g5-isc.
Fixes: dd97908ee3 ("media: atmel: properly get pm_runtime")
Signed-off-by: Eugen Hristev <eugen.hristev@microchip.com>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 7266dda2f1 ]
Currently a call to snd_card_new that fails will set card with a NULL
pointer, this causes a null pointer dereference on the error cleanup
path when card it passed to snd_card_free. Fix this by adding a new
error exit path that does not call snd_card_free and exiting via this
new path.
Addresses-Coverity: ("Explicit null dereference")
Fixes: 9e44d63246 ("[media] cx23885: Add ALSA support")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 42bb98e420 ]
The "card" string only holds 31 characters (and the terminating NUL).
In order to avoid truncation, use a shorter card description instead of
the current result, "Trident TVMaster TM5600/6000/60".
Suggested-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Fixes: e28f49b0b2 ("V4L/DVB: tm6000: fix some info messages")
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 2908249f38 ]
The "card" string only holds 31 characters (and the terminating NUL).
In order to avoid truncation, use a shorter card description instead of
the current result, "Silicon Labs Si470x FM Radio Re".
Suggested-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Fixes: 78656acdcf ("V4L/DVB (7038): USB radio driver for Silicon Labs Si470x FM Radio Receivers")
Fixes: cc35bbddfe ("V4L/DVB (12416): radio-si470x: add i2c driver for si470x")
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit dfadec236a ]
The "card" string only holds 31 characters (and the terminating NUL).
In order to avoid truncation, use a shorter card description instead of
the current result, "Texas Instruments Wl1273 FM Rad".
Suggested-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Fixes: 87d1a50ce4 ("[media] V4L2: WL1273 FM Radio: TI WL1273 FM radio driver")
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 8ed8528346 ]
Previously it was possible, but a recent fix for uninitialized
`ret` variable broke this behavior.
v4l2_fh_is_singular_file() check is there just to determine
whether the power needs to be enabled, and it's not a failure
if it returns false.
Fixes: ba9139116b ("media: sun6i-csi: add a missing return code")
Signed-off-by: Ondrej Jirman <megous@megous.com>
Reviewed-by: Jernej Skrabec <jernej.skrabec@gmail.com>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit e4625044d6 ]
Fix the build errors reported by the kernel test robot by
selecting V4L2_ASYNC:
mips-linux-ld: drivers/media/i2c/ths8200.o: in function `ths8200_remove':
ths8200.c:(.text+0x1ec): undefined reference to `v4l2_async_unregister_subdev'
mips-linux-ld: drivers/media/i2c/ths8200.o: in function `ths8200_probe':
ths8200.c:(.text+0x404): undefined reference to `v4l2_async_register_subdev'
Fixes: ed29f89497 ("media: i2c: ths8200: support asynchronous probing")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reported-by: kernel test robot <lkp@intel.com>
Reviewed-by: Lad Prabhakar <prabhakar.csengg@gmail.com>
Acked-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 5c47dc6657 ]
A successful 'mxc_jpeg_attach_pm_domains()' call should be balanced by a
corresponding 'mxc_jpeg_detach_pm_domains()' call in the error handling
path of the probe, as already done in the remove function.
Update the error handling path accordingly.
Fixes: 2db16c6ed7 ("media: imx-jpeg: Add V4L2 driver for i.MX8 JPEG Encoder/Decoder")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 2143ad413c ]
A successful 'clk_prepare()' call should be balanced by a corresponding
'clk_unprepare()' call in the error handling path of the probe, as already
done in the remove function.
Update the error handling path accordingly.
Fixes: 3003a180ef ("[media] VPU: mediatek: support Mediatek VPU")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Reviewed-by: Houlong Wei <houlong.wei@mediatek.com>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 48d219f9cc ]
Static analysis reports this representative problem
tda1997x.c:1939: warning: 7th function call argument is an uninitialized
value
The 7th argument is buffer[0], which is set in the earlier call to
io_readn(). When io_readn() call to io_read() fails with the first
read, buffer[0] is not set and 0 is returned and stored in len.
The later call to hdmi_infoframe_unpack()'s size parameter is the
static size of buffer, always 40, so a short read is not caught
in hdmi_infoframe_unpacks()'s checking. The variable len should be
used instead.
Zero initialize buffer to 0 so it is in a known start state.
Fixes: 9ac0038db9 ("media: i2c: Add TDA1997x HDMI receiver driver")
Signed-off-by: Tom Rix <trix@redhat.com>
Reviewed-by: Tim Harvey <tharvey@gateworks.com>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 065a7c66bd ]
In case vb2ops_venc_start_streaming fails, the error value
is overwritten by the ret value of pm_runtime_put which might
be 0. Fix it.
Fixes: 985c73693f (" media: mtk-vcodec: Separating mtk encoder driver")
Signed-off-by: Dafna Hirschfeld <dafna.hirschfeld@collabora.com>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit c87ed93574 ]
If the driver does not implement s_ctrl, but it does implement
s_ext_ctrls, we convert the call.
When that happens we have also to convert back the response from
s_ext_ctrls.
Fixes v4l2_compliance:
Control ioctls (Input 0):
fail: v4l2-test-controls.cpp(411): returned control value out of range
fail: v4l2-test-controls.cpp(507): invalid control 00980900
test VIDIOC_G/S_CTRL: FAIL
Fixes: 35ea11ff84 ("V4L/DVB (8430): videodev: move some functions from v4l2-dev.h to v4l2-common.h or v4l2-ioctl.h")
Reviewed-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Ricardo Ribalda <ribalda@chromium.org>
Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit d170b0ea17 ]
Obtain the clock frequency by reading the clock-frequency property if
there's no clock.
Fixes: 9fda25332c ("media: i2c: imx258: get clock from device properties and enable it via runtime PM")
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 36b9d695aa ]
ttusb_dec_send_command() invokes mutex_lock_interruptible() that can
fail but then it releases the non-acquired mutex. The patch fixes that.
Found by Linux Driver Verification project (linuxtesting.org).
Fixes: dba328bab4 ("media: ttusb-dec: cleanup an error handling logic")
Signed-off-by: Evgeny Novikov <novikov@ispras.ru>
Signed-off-by: Sean Young <sean@mess.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 11b982e950 ]
Currently the null pointer check on dvb_spi->vcc_supply is inverted and
this leads to only null values of the dvb_spi->vcc_supply being passed
to the call of regulator_disable causing null pointer dereferences.
Fix this by only calling regulator_disable if dvb_spi->vcc_supply is
not null.
Addresses-Coverity: ("Dereference after null check")
Fixes: dcb0145821 ("media: cxd2880-spi: Fix an error handling path")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Sean Young <sean@mess.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 4b9e3e8af4 ]
There is likely a typo here. To be consistent, we should compare
'fmt.height' with 'ctx->out.pix_fmt.height', not 'ctx->out.pix_fmt.width'.
Instead of fixing the test, just remove it and copy 'fmt' unconditionally.
Fixes: 59a635327c ("media: meson: Add M2M driver for the Amlogic GE2D Accelerator Unit")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Acked-by: Neil Armstrong <narmstrong@baylibre.com>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit e68ac00827 ]
When the loader indicates an internal error (result of a checked bpf
system call), it returns the result in attr.test.retval. However, tests
that rely on ASSERT_OK_PTR on NULL (returned from light skeleton) may
miss that NULL denotes an error if errno is set to 0. This would result
in skel pointer being NULL, while ASSERT_OK_PTR returning 1, leading to
a SEGV on dereference of skel, because libbpf_get_error relies on the
assumption that errno is always set in case of error for ptr == NULL.
In particular, this was observed for the ksyms_module test. When
executed using `./test_progs -t ksyms`, prior tests manipulated errno
and the test didn't crash when it failed at ksyms_module load, while
using `./test_progs -t ksyms_module` crashed due to errno being
untouched.
Fixes: 6723474373 (libbpf: Generate loader program out of BPF ELF file.)
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210927145941.1383001-11-memxor@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 335aea75b0 ]
The overflow check in amdgpu_bo_list_create() causes a warning with
clang-14 on 64-bit architectures, since the limit can never be
exceeded.
drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c:74:18: error: result of comparison of constant 256204778801521549 with expression of type 'unsigned int' is always false [-Werror,-Wtautological-constant-out-of-range-compare]
if (num_entries > (SIZE_MAX - sizeof(struct amdgpu_bo_list))
~~~~~~~~~~~ ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The check remains useful for 32-bit architectures, so just avoid the
warning by using size_t as the type for the count.
Fixes: 920990cb08 ("drm/amdgpu: allocate the bo_list array after the list")
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 019edd01d1 ]
On a i.MX-based board with a QCA9377 Wifi chip, the following errors
are seen after launching the 'hostapd' application:
hostapd /etc/wifi.conf
Configuration file: /etc/wifi.conf
wlan0: interface state UNINITIALIZED->COUNTRY_UPDATE
NOHZ tick-stop error: Non-RCU local softirq work is pending, handler #08!!!
Using interface wlan0 with hwaddr 00:1f:7b:31:04:a0 and ssid "thessid"
IPv6: ADDRCONF(NETDEV_CHANGE): wlan0: link becomes ready
wlan0: interface state COUNTRY_UPDATE->ENABLED
wlan0: AP-ENABLED
NOHZ tick-stop error: Non-RCU local softirq work is pending, handler #08!!!
NOHZ tick-stop error: Non-RCU local softirq work is pending, handler #08!!!
NOHZ tick-stop error: Non-RCU local softirq work is pending, handler #08!!!
NOHZ tick-stop error: Non-RCU local softirq work is pending, handler #08!!!
...
Fix this problem by adding the BH locking around napi-schedule(),
in the same way it was done in commit e63052a5dd ("mlx5e: add
add missing BH locking around napi_schdule()").
Its commit log provides the following explanation:
"It's not correct to call napi_schedule() in pure process
context. Because we use __raise_softirq_irqoff() we require
callers to be in a context which will eventually lead to
softirq handling (hardirq, bh disabled, etc.).
With code as is users will see:
NOHZ tick-stop error: Non-RCU local softirq work is pending, handler #08!!!
"
Fixes: cfee8793a7 ("ath10k: enable napi on RX path for sdio")
Signed-off-by: Fabio Estevam <festevam@denx.de>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20210824144339.2796122-1-festevam@denx.de
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit e6dfbc3ba9 ]
When receiving a beacon or probe response, we should update the
boottime_ns field which is the timestamp the frame was received at.
(cf mac80211.h)
This fixes a scanning issue with Android since it relies on this
timestamp to determine when the AP has been seen for the last time
(via the nl80211 BSS_LAST_SEEN_BOOTTIME parameter).
Fixes: 5e3dd157d7 ("ath10k: mac80211 driver for Qualcomm Atheros 802.11ac CQA98xx devices")
Signed-off-by: Loic Poulain <loic.poulain@linaro.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/1629811733-7927-1-git-send-email-loic.poulain@linaro.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 1e0083bd07 ]
The use of dma_unmap_addr()/dma_unmap_len() in the driver causes
multiple warnings when these macros are defined as empty, e.g.
in an ARCH=i386 allmodconfig build:
drivers/net/ethernet/google/gve/gve_tx_dqo.c: In function 'gve_tx_add_skb_no_copy_dqo':
drivers/net/ethernet/google/gve/gve_tx_dqo.c:494:40: error: unused variable 'buf' [-Werror=unused-variable]
494 | struct gve_tx_dma_buf *buf =
This is not how the NEED_DMA_MAP_STATE macros are meant to work,
as they rely on never using local variables or a temporary structure
like gve_tx_dma_buf.
Remote the gve_tx_dma_buf definition and open-code the contents
in all places to avoid the warning. This causes some rather long
lines but otherwise ends up making the driver slightly smaller.
Fixes: a57e5de476 ("gve: DQO: Add TX path")
Link: https://lore.kernel.org/netdev/20210723231957.1113800-1-bcf@google.com/
Link: https://lore.kernel.org/netdev/20210721151100.2042139-1-arnd@kernel.org/
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 1db2b0d0a3 ]
Whenever ath11k is bootup with a user country already set, cfg80211
notifies this country info to ath11k soon after registration, where the
notification is sent to the firmware for fetching the rules of this user
country input.
Multiple race conditions could be seen in this scenario where a new
request is either lost as pointed in [1] or a new regd overwrites the
default regd provided by the firmware during bootup. Note that, the
default regd is used for intersection purpose and hence it should not be
overwritten.
The main reason as pointed by [1] is the usage of ATH11K_FLAG_REGISTERED
flag which is updated after completion of core registration, whereas the
reg notification from cfg80211 and wmi events for the corresponding
request can happen much before that. Since the ATH11K_FLAG_REGISTERED is
currently used to determine if the event containing reg rules belong to
default regd or for user request, there is a possibility of the default
regd getting overwritten.
Since the default reg rules will be received only once per pdev on
firmware load, the above flag based check can be replaced with a check
to see if default_regd is already set, so that we can now always update
the new_regd. Also if the new_regd is set, this will be always used to
update the reg rules for the registered phy.
[1] https://patchwork.kernel.org/project/linux-wireless/patch/1829665.1PRlr7bOQj@ripper/
Tested-on: IPQ8074 hw2.0 AHB WLAN.HK.2.4.0.1-01460-QCAHKSWPL_SILICONZ-1
Fixes: d5c65159f2 ("ath11k: driver for Qualcomm IEEE 802.11ax devices")
Signed-off-by: Sriram R <srirrama@codeaurora.org>
Signed-off-by: Jouni Malinen <jouni@codeaurora.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20210721212029.142388-4-jouni@codeaurora.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit aadf7c81a0 ]
The ath11k_dbring_bufs_replenish() and ath11k_dbring_fill_bufs()
take a "gfp" parameter but they since they take spinlocks, the
allocations they do have to be atomic. This causes a bug because
ath11k_dbring_buf_setup passes GFP_KERNEL for the gfp flags.
The fix is to use GFP_ATOMIC and remove the unused parameters.
Fixes: bd6478559e ("ath11k: Add direct buffer ring support")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20210812070434.GE31863@kili
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 5f5f12f5d4 ]
The max VLAN number with non-4K VLAN activated is 15, and the
range is 0..15. Not 16.
The impact should be low since we by default have 4K VLAN and
thus have 4095 VLANs to play with in this switch. There will
not be a problem unless the code is rewritten to only use
16 VLANs.
Fixes: d8652956cf ("net: dsa: realtek-smi: Add Realtek SMI driver")
Cc: Mauri Sandberg <sandberg@mailfence.com>
Cc: DENG Qingfang <dqfext@gmail.com>
Cc: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Alvin Šipraga <alsi@bang-olufsen.dk>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit acde891c24 ]
Directly using _usecs_to_jiffies() might be unsafe, so it's
better to use usecs_to_jiffies() instead.
Because we can see that the result of _usecs_to_jiffies()
could be larger than MAX_JIFFY_OFFSET values without the
check of the input.
Fixes: c410bf0193 ("Fix the excessive initial retransmission timeout")
Signed-off-by: Jiasheng Jiang <jiasheng@iscas.ac.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit e6a54d6f22 ]
devlink is a software interface that doesn't depend on any hardware
capabilities. The failure in SW means memory issues, wrong parameters,
programmer error e.t.c.
Like any other such interface in the kernel, the returned status of
devlink APIs should be checked and propagated further and not ignored.
Fixes: 755f982bb1 ("qed/qede: make devlink survive recovery")
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit e624c70e11 ]
devlink is a software interface that doesn't depend on any hardware
capabilities. The failure in SW means memory issues, wrong parameters,
programmer error e.t.c.
Like any other such interface in the kernel, the returned status of
devlink APIs should be checked and propagated further and not ignored.
Fixes: 4ab0c6a8ff ("bnxt_en: add support to enable VF-representors")
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f20311cc9c ]
On newer CAAM versions, not all accelerators are disabled if the SoC is
a non-E variant. While the driver checks most of the modules for
availability, there is one - PKHA - which sticks out. On non-E variants
it is still reported as available, that is the number of instances is
non-zero, but it has limited functionality. In particular it doesn't
support encryption and decryption, but just signing and verifying. This
is indicated by a bit in the PKHA_MISC field. Take this bit into account
if we are checking for availability.
This will the following error:
[ 8.167817] caam_jr 8020000.jr: 20000b0f: CCB: desc idx 11: : Invalid CHA selected.
Tested on an NXP LS1028A (non-E) SoC.
Fixes: d239b10d4c ("crypto: caam - add register map changes cf. Era 10")
Signed-off-by: Michael Walle <michael@walle.cc>
Reviewed-by: Horia Geantă <horia.geanta@nxp.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 6effad8abe ]
adev->rmmio is set to be NULL in amdgpu_device_unmap_mmio to prevent
access after pci_remove, however, in SRIOV case, amdgpu_virt_release_full_gpu
will still use adev->rmmio for access after amdgpu_device_unmap_mmio.
The patch is to move such SRIOV calling earlier to fini_early stage.
Fixes: 07775fc138 ("drm/amdgpu: Unmap all MMIO mappings")
Cc: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Leslie Shi <Yuliang.Shi@amd.com>
Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 3e5f2d90c2 ]
bdev->evt_skb will get freed in the normal path and one error path
of mtk_hci_wmt_sync, while the other error paths do not free it,
which may cause a memleak. This bug is suggested by a static analysis
tool, please advise.
Fixes: e0b67035a9 ("Bluetooth: mediatek: update the common setup between MT7622 and other devices")
Signed-off-by: Dinghao Liu <dinghao.liu@zju.edu.cn>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 3c719fed0f ]
When the BSS reference holds a valid reference, it is not freed. The 'if'
condition is wrong. Instead of the 'if (bss)' check, the 'if (!bss)' check
is used.
The issue is solved by removing the unnecessary 'if' check because
cfg80211_put_bss() already performs the NULL validation.
Fixes: 6cd4fa5ab6 ("staging: wilc1000: make use of cfg80211_inform_bss_frame()")
Signed-off-by: Ajay Singh <ajay.kathat@microchip.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20210916164902.74629-3-ajay.kathat@microchip.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 701668d3bf ]
We have been tracking a strange bug with Antenna Diversity Switching (ADS)
on wcn3680b for a while.
ADS is configured like this:
A. Via a firmware configuration table baked into the NV area.
1. Defines if ADS is enabled.
2. Defines which GPIOs are connected to which antenna enable pin.
3. Defines which antenna/GPIO is primary and which is secondary.
B. WCN36XX_CFG_VAL(ANTENNA_DIVERSITY, N)
N is a bitmask of available antenna.
Setting N to 3 indicates a bitmask of enabled antenna (1 | 2).
Obviously then we can set N to 1 or N to 2 to fix to a particular
antenna and disable antenna diversity.
C. WCN36XX_CFG_VAL(ASD_PROBE_INTERVAL, XX)
XX is the number of beacons between each antenna RSSI check.
Setting this value to 50 means, every 50 received beacons, run the
ADS algorithm.
D. WCN36XX_CFG_VAL(ASD_TRIGGER_THRESHOLD, YY)
YY is a two's complement integer which specifies the RSSI decibel
threshold below which ADS will run.
We default to -60db here, meaning a measured RSSI <= -60db will
trigger an ADS probe.
E. WCN36XX_CFG_VAL(ASD_RTT_RSSI_HYST_THRESHOLD, Z)
Z is a hysteresis value, indicating a delta which the RSSI must
exceed for the antenna switch to be valid.
For example if HYST_THRESHOLD == 3 AntennaId1-RSSI == -60db and
AntennaId-2-RSSI == -58db then firmware will not switch antenna.
The threshold needs to be -57db or better to satisfy the criteria.
F. A firmware feature bit also exists ANTENNA_DIVERSITY_SELECTION.
This feature bit is used by the firmware to report if
ANTENNA_DIVERSITY_SELECTION is supported. The host is not required to
toggle this bit to enable or disable ADS.
ADS works like this:
A. Every XX beacons the firmware switches to or remains on the primary
antenna.
B. The firmware then sends a Request-To-Send (RTS) packet to the AP.
C. The firmware waits for a Clear-To-Send (CTS) response from the AP.
D. The firmware then notes the received RSSI on the CTS packet.
E. The firmware then repeats steps A-D on the secondary antenna.
F. Subsequently if the RSSI on the measured antenna is better than
ASD_TRIGGER_THRESHOLD + the active antenna's RSSI then the
measured antenna becomes the active antenna.
G. If RSSI rises past ASD_TRIGGER_THRESHOLD then ADS doesn't run at
all even if there is a substantially better RSSI on the alternative
antenna.
What we have been observing is that the RTS packet is being sent but the
MAC address is a byte-swapped version of the target MAC. The ADS/RTS MAC is
corrupted only when the link is encrypted, if the AP is open the RTS MAC is
correct. Similarly if we configure the firmware to an RTS/CTS sequence for
regular data - the transmitted RTS MAC is correctly formatted.
Internally the wcn36xx firmware uses the indexes in the SMD commands to
populate and extract data from specific entries in an STA lookup table. The
AP's MAC appears a number of times in different indexes within this lookup
table, so the MAC address extracted for the data-transmit RTS and the MAC
address extracted for the ADS/RTS packet are not the same STA table index.
Our analysis indicates the relevant firmware STA table index is
"bssSelfStaIdx".
There is an STA populate function responsible for formatting the MAC
address of the bssSelfStaIdx including byte-swapping the MAC address.
Its clear then that the required STA populate command did not run for
bssSelfStaIdx.
So taking a look at the sequence of SMD commands sent to the firmware we
see the following downstream when moving from an unencrypted to encrypted
BSS setup.
- WLAN_HAL_CONFIG_BSS_REQ
- WLAN_HAL_CONFIG_STA_REQ
- WLAN_HAL_SET_STAKEY_REQ
Upstream in wcn36xx we have
- WLAN_HAL_CONFIG_BSS_REQ
- WLAN_HAL_SET_STAKEY_REQ
The solution then is to add the missing WLAN_HAL_CONFIG_STA_REQ between
WLAN_HAL_CONFIG_BSS_REQ and WLAN_HAL_SET_STAKEY_REQ.
No surprise WLAN_HAL_CONFIG_STA_REQ is the routine responsible for
populating the STA lookup table in the firmware and once done the MAC sent
by the ADS routine is in the correct byte-order.
This bug is apparent with ADS but it is also the case that any other
firmware routine that depends on the "bssSelfStaIdx" would retrieve
malformed data on an encrypted link.
Fixes: 3e977c5c52 ("wcn36xx: Define wcn3680 specific firmware parameters")
Signed-off-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org>
Tested-by: Benjamin Li <benl@squareup.com>
Reviewed-by: Loic Poulain <loic.poulain@linaro.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20210909144428.2564650-2-bryan.odonoghue@linaro.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 7ee285395b ]
It was found that the following warning was displayed when remounting
controllers from cgroup v2 to v1:
[ 8042.997778] WARNING: CPU: 88 PID: 80682 at kernel/cgroup/cgroup.c:3130 cgroup_apply_control_disable+0x158/0x190
:
[ 8043.091109] RIP: 0010:cgroup_apply_control_disable+0x158/0x190
[ 8043.096946] Code: ff f6 45 54 01 74 39 48 8d 7d 10 48 c7 c6 e0 46 5a a4 e8 7b 67 33 00 e9 41 ff ff ff 49 8b 84 24 e8 01 00 00 0f b7 40 08 eb 95 <0f> 0b e9 5f ff ff ff 48 83 c4 08 5b 5d 41 5c 41 5d 41 5e 41 5f c3
[ 8043.115692] RSP: 0018:ffffba8a47c23d28 EFLAGS: 00010202
[ 8043.120916] RAX: 0000000000000036 RBX: ffffffffa624ce40 RCX: 000000000000181a
[ 8043.128047] RDX: ffffffffa63c43e0 RSI: ffffffffa63c43e0 RDI: ffff9d7284ee1000
[ 8043.135180] RBP: ffff9d72874c5800 R08: ffffffffa624b090 R09: 0000000000000004
[ 8043.142314] R10: ffffffffa624b080 R11: 0000000000002000 R12: ffff9d7284ee1000
[ 8043.149447] R13: ffff9d7284ee1000 R14: ffffffffa624ce70 R15: ffffffffa6269e20
[ 8043.156576] FS: 00007f7747cff740(0000) GS:ffff9d7a5fc00000(0000) knlGS:0000000000000000
[ 8043.164663] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 8043.170409] CR2: 00007f7747e96680 CR3: 0000000887d60001 CR4: 00000000007706e0
[ 8043.177539] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 8043.184673] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 8043.191804] PKRU: 55555554
[ 8043.194517] Call Trace:
[ 8043.196970] rebind_subsystems+0x18c/0x470
[ 8043.201070] cgroup_setup_root+0x16c/0x2f0
[ 8043.205177] cgroup1_root_to_use+0x204/0x2a0
[ 8043.209456] cgroup1_get_tree+0x3e/0x120
[ 8043.213384] vfs_get_tree+0x22/0xb0
[ 8043.216883] do_new_mount+0x176/0x2d0
[ 8043.220550] __x64_sys_mount+0x103/0x140
[ 8043.224474] do_syscall_64+0x38/0x90
[ 8043.228063] entry_SYSCALL_64_after_hwframe+0x44/0xae
It was caused by the fact that rebind_subsystem() disables
controllers to be rebound one by one. If more than one disabled
controllers are originally from the default hierarchy, it means that
cgroup_apply_control_disable() will be called multiple times for the
same default hierarchy. A controller may be killed by css_kill() in
the first round. In the second round, the killed controller may not be
completely dead yet leading to the warning.
To avoid this problem, we collect all the ssid's of controllers that
needed to be disabled from the default hierarchy and then disable them
in one go instead of one by one.
Fixes: 334c3679ec ("cgroup: reimplement rebind_subsystems() using cgroup_apply_control() and friends")
Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f3bc07eba4 ]
Currently it66121_probe returns -EPROBE_DEFER if the there is no remote
endpoint found in the device tree which doesn't seem helpful, since this
is not going to change later and it is never checked if the next bridge
has been initialized yet. It will fail in that case later while doing
drm_bridge_attach for the next bridge in it66121_bridge_attach.
Since the bindings documentation for it66121 bridge driver states
there has to be a remote endpoint defined, its safe to return -EINVAL
in that case.
This additonally adds a check, if the remote endpoint is enabled and
returns -EPROBE_DEFER, if the remote bridge hasn't been initialized
(yet).
Fixes: 988156dc2f ("drm: bridge: add it66121 driver")
Signed-off-by: Alex Bee <knaerzche@gmail.com>
Signed-off-by: Robert Foss <robert.foss@linaro.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20210918140420.231346-1-knaerzche@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit cbcca2e396 ]
Dan Carpenter points out that we have a code path that permits a NULL
netdev pointer to be passed to netif_carrier_off(), which will cause
a kernel oops. In any case, we need to set pl->old_link_state to false
to have the desired effect when there is no netdev present.
Fixes: f97493657c ("net: phylink: add suspend/resume support")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit aed0826b0c ]
The key_domain member in struct net only exists if we define CONFIG_KEYS.
So we should add the define when we used key_domain.
Fixes: 9b24261051 ("keys: Network namespace domain tag")
Signed-off-by: Yajun Deng <yajun.deng@linux.dev>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 2507003a1d ]
lock_is_held_type(, 1) detects acquired read locks. It only recognized
locks acquired with lock_acquire_shared(). Read locks acquired with
lock_acquire_shared_recursive() are not recognized because a `2' is
stored as the read value.
Rework the check to additionally recognise lock's read value one and two
as a read held lock.
Fixes: e918188611 ("locking: More accurate annotations for read_lock()")
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Boqun Feng <boqun.feng@gmail.com>
Acked-by: Waiman Long <longman@redhat.com>
Link: https://lkml.kernel.org/r/20210903084001.lblecrvz4esl4mrr@linutronix.de
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit e4f8681911 ]
The hardware sets the TMUWCF bit back to 0 when the TMU write
combiner flush completes so we should be checking for that instead
of the L2TFLS bit.
v2 (Melissa Wen):
- Add Signed-off-by and Fixes tags.
- Change the error message for the timeout to be more clear.
Fixes spurious Vulkan CTS failures in:
dEQP-VK.binding_model.descriptorset_random.*
Fixes: d223f98f02 ("drm/v3d: Add support for compute shader dispatch.")
Signed-off-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Melissa Wen <mwen@igalia.com>
Signed-off-by: Melissa Wen <melissa.srw@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210915100507.3945-1-itoral@igalia.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 9af9dcf11b ]
The asm_cpu_bringup_and_idle() function is required to push the return
value on the stack in order to make ORC happy, but the only reason
objtool doesn't complain is because of a happy accident.
The thing is that asm_cpu_bringup_and_idle() doesn't return, so
validate_branch() never terminates and falls through to the next
function, which in the normal case is the hypercall_page. And that, as
it happens, is 4095 NOPs and a RET.
Make asm_cpu_bringup_and_idle() terminate on it's own, by making the
function it calls as a dead-end. This way we no longer rely on what
code happens to come after.
Fixes: c3881eb58d ("x86/xen: Make the secondary CPU idle tasks reliable")
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Link: https://lore.kernel.org/r/20210624095147.693801717@infradead.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 5ad74d39c5 ]
The current definition of 2W burst length is invalid.
This patch fixes it. Current downstream DEU driver doesn't
use DMA. An incorrect burst length value doesn't cause any
errors. This patch also adds other burst length values.
Fixes: dfec1a827d ("MIPS: Lantiq: Add DMA support")
Signed-off-by: Aleksander Jan Bajkowski <olek2@wp.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f0b2b2df54 ]
The sync_sched_exp_online_cleanup() checks to see if RCU needs
an expedited quiescent state from the incoming CPU, sending it
an IPI if so. Before sending IPI, it checks whether expedited
qs need has been already requested for the incoming CPU, by
checking rcu_data.cpu_no_qs.b.exp for the current cpu, on which
sync_sched_exp_online_cleanup() is running. This works for the
case where incoming CPU is same as self. However, for the case
where incoming CPU is different from self, expedited request
won't get marked, which can potentially delay reporting of
expedited quiescent state for the incoming CPU.
Fixes: e015a34112 ("rcu: Avoid self-IPI in sync_sched_exp_online_cleanup()")
Signed-off-by: Neeraj Upadhyay <neeraju@codeaurora.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 03e601f48b ]
If libbpf encounters an ELF file that has been stripped of its symbol
table, it will crash in bpf_object__add_programs() when trying to
dereference the obj->efile.symbols pointer.
Fix this by erroring out of bpf_object__elf_collect() if it is not able
able to find the symbol table.
v2:
- Move check into bpf_object__elf_collect() and add nice error message
Fixes: 6245947c1b ("libbpf: Allow gaps in BPF program sections to support overriden weak functions")
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210901114812.204720-1-toke@redhat.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 49d8a56064 ]
Before freeing struct sco_conn, all delayed timeout work should be
cancelled. Otherwise, sco_sock_timeout could potentially use the
sco_conn after it has been freed.
Additionally, sco_conn.timeout_work should be initialized when the
connection is allocated, not when the channel is added. This is
because an sco_conn can create channels with multiple sockets over its
lifetime, which happens if sockets are released but the connection
isn't deleted.
Fixes: ba316be1b6 ("Bluetooth: schedule SCO timeouts with delayed_work")
Signed-off-by: Desmond Cheong Zhi Xi <desmondcheongzx@gmail.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 3a5f3d61de ]
These two arrays are populated with data read from the I2C device
through regmap_read(), and the data is then compared with hardcoded
vendor/product ID values of supported chips.
However, the return value of regmap_read() was never checked. This is
fine, as long as the two arrays are zero-initialized, so that we don't
compare the vendor/product IDs against whatever garbage is left on the
stack.
Address this issue by zero-initializing these two arrays.
Signed-off-by: Paul Cercueil <paul@crapouillou.net>
Fixes: 988156dc2f ("drm: bridge: add it66121 driver")
Reviewed-by: Neil Armstrong <narmstrong@baylibre.com>
Signed-off-by: Robert Foss <robert.foss@linaro.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20210827163956.27517-1-paul@crapouillou.net
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 55285e21f0 ]
Atm the EFI FB platform driver gets a runtime PM reference for the
associated GFX PCI device during probing the EFI FB platform device and
releases it only when the platform device gets unbound.
When fbcon switches to the FB provided by the PCI device's driver (for
instance i915/drmfb), the EFI FB will get only unregistered without the
EFI FB platform device getting unbound, keeping the runtime PM reference
acquired during the platform device probing. This reference will prevent
the PCI driver from runtime suspending the device.
Fix this by releasing the RPM reference from the EFI FB's destroy hook,
called when the FB gets unregistered.
While at it assert that pm_runtime_get_sync() didn't fail.
v2:
- Move pm_runtime_get_sync() before register_framebuffer() to avoid its
race wrt. efifb_destroy()->pm_runtime_put(). (Daniel)
- Assert that pm_runtime_get_sync() didn't fail.
- Clarify commit message wrt. platform/PCI device/driver and driver
removal vs. device unbinding.
Fixes: a6c0fd3d5a ("efifb: Ensure graphics device for efifb stays at PCI D0")
Cc: Kai-Heng Feng <kai.heng.feng@canonical.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> (v1)
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210809133146.2478382-1-imre.deak@intel.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b7b1d02fc4 ]
The internal stream state sets the timeout to 120 seconds 2 seconds
after the creation of the flow, attach this internal stream state to the
IPS_ASSURED flag for consistent event reporting.
Before this patch:
[NEW] udp 17 30 src=10.246.11.13 dst=216.239.35.0 sport=37282 dport=123 [UNREPLIED] src=216.239.35.0 dst=10.246.11.13 sport=123 dport=37282
[UPDATE] udp 17 30 src=10.246.11.13 dst=216.239.35.0 sport=37282 dport=123 src=216.239.35.0 dst=10.246.11.13 sport=123 dport=37282
[UPDATE] udp 17 30 src=10.246.11.13 dst=216.239.35.0 sport=37282 dport=123 src=216.239.35.0 dst=10.246.11.13 sport=123 dport=37282 [ASSURED]
[DESTROY] udp 17 src=10.246.11.13 dst=216.239.35.0 sport=37282 dport=123 src=216.239.35.0 dst=10.246.11.13 sport=123 dport=37282 [ASSURED]
Note IPS_ASSURED for the flow not yet in the internal stream state.
after this update:
[NEW] udp 17 30 src=10.246.11.13 dst=216.239.35.0 sport=37282 dport=123 [UNREPLIED] src=216.239.35.0 dst=10.246.11.13 sport=123 dport=37282
[UPDATE] udp 17 30 src=10.246.11.13 dst=216.239.35.0 sport=37282 dport=123 src=216.239.35.0 dst=10.246.11.13 sport=123 dport=37282
[UPDATE] udp 17 120 src=10.246.11.13 dst=216.239.35.0 sport=37282 dport=123 src=216.239.35.0 dst=10.246.11.13 sport=123 dport=37282 [ASSURED]
[DESTROY] udp 17 src=10.246.11.13 dst=216.239.35.0 sport=37282 dport=123 src=216.239.35.0 dst=10.246.11.13 sport=123 dport=37282 [ASSURED]
Before this patch, short-lived UDP flows never entered IPS_ASSURED, so
they were already candidate flow to be deleted by early_drop under
stress.
Before this patch, IPS_ASSURED is set on regardless the internal stream
state, attach this internal stream state to IPS_ASSURED.
packet #1 (original direction) enters NEW state
packet #2 (reply direction) enters ESTABLISHED state, sets on IPS_SEEN_REPLY
paclet #3 (any direction) sets on IPS_ASSURED (if 2 seconds since the
creation has passed by).
Reported-by: Maciej Żenczykowski <zenczykowski@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 66e29fcda1 ]
With idle polling, IPIs are not sent when a CPU idle, but queued
and run later from do_idle(). The default kgdb_call_nmi_hook()
implementation gets the pointer to struct pt_regs from get_irq_reqs(),
which doesn't work in that case because it was not called from the
IPI interrupt handler. Fix it by defining our own kgdb_roundup()
function which sents an IPI_ENTER_KGDB. When that IPI is received
on the target CPU kgdb_nmicallback() is called.
Signed-off-by: Sven Schnelle <svens@stackframe.org>
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 8e0ba125c2 ]
With 64 bit kernels unwind_special() is not working because
it compares the pc to the address of the function descriptor.
Add a helper function that compares pc with the dereferenced
address. This fixes all of the backtraces on my c8000. Without
this changes, a lot of backtraces are missing in kdb or the
show-all-tasks command from /proc/sysrq-trigger.
Signed-off-by: Sven Schnelle <svens@stackframe.org>
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 9cc2fa4f4a ]
The function end_of_stack() returns a pointer to the last entry of a
stack. For architectures like parisc where the stack grows upwards
return the pointer to the highest address in the stack.
Without this change I faced a crash on parisc, because the stackleak
functionality wrote STACKLEAK_POISON to the lowest address and thus
overwrote the first 4 bytes of the task_struct which included the
TIF_FLAGS.
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 2de71ee153 ]
This patch fixes the encoding for INST_RETIRED.PREC_DIST as published by Intel
(download.01.org/perfmon/) for Icelake. The official encoding
is event code 0x00 umask 0x1, a change from Skylake where it was code 0xc0
umask 0x1.
With this patch applied it is possible to run:
$ perf record -a -e cpu/event=0x00,umask=0x1/pp .....
Whereas before this would fail.
To avoid problems with tools which may use the old code, we maintain the old
encoding for Icelake.
Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20211014001214.2680534-1-eranian@google.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f35dcaa0a8 ]
close_range() test type conflicts with close_range() library call in
x86_64-linux-gnu/bits/unistd_ext.h. Fix it by changing the name to
core_close_range().
gcc -g -I../../../../usr/include/ close_range_test.c -o ../tools/testing/selftests/core/close_range_test
In file included from close_range_test.c:16:
close_range_test.c:57:6: error: conflicting types for ‘close_range’; have ‘void(struct __test_metadata *)’
57 | TEST(close_range)
| ^~~~~~~~~~~
../kselftest_harness.h:181:21: note: in definition of macro ‘__TEST_IMPL’
181 | static void test_name(struct __test_metadata *_metadata); \
| ^~~~~~~~~
close_range_test.c:57:1: note: in expansion of macro ‘TEST’
57 | TEST(close_range)
| ^~~~
In file included from /usr/include/unistd.h:1204,
from close_range_test.c:13:
/usr/include/x86_64-linux-gnu/bits/unistd_ext.h:56:12: note: previous declaration of ‘close_range’ with type ‘int(unsigned int, unsigned int, int)’
56 | extern int close_range (unsigned int __fd, unsigned int __max_fd,
| ^~~~~~~~~~~
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 285f68afa8 ]
The following issue is observed with CONFIG_DEBUG_PREEMPT when KVM loads:
KVM: vmx: using Hyper-V Enlightened VMCS
BUG: using smp_processor_id() in preemptible [00000000] code: systemd-udevd/488
caller is set_hv_tscchange_cb+0x16/0x80
CPU: 1 PID: 488 Comm: systemd-udevd Not tainted 5.15.0-rc5+ #396
Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS Hyper-V UEFI Release v4.0 12/17/2019
Call Trace:
dump_stack_lvl+0x6a/0x9a
check_preemption_disabled+0xde/0xe0
? kvm_gen_update_masterclock+0xd0/0xd0 [kvm]
set_hv_tscchange_cb+0x16/0x80
kvm_arch_init+0x23f/0x290 [kvm]
kvm_init+0x30/0x310 [kvm]
vmx_init+0xaf/0x134 [kvm_intel]
...
set_hv_tscchange_cb() can get preempted in between acquiring
smp_processor_id() and writing to HV_X64_MSR_REENLIGHTENMENT_CONTROL. This
is not an issue by itself: HV_X64_MSR_REENLIGHTENMENT_CONTROL is a
partition-wide MSR and it doesn't matter which particular CPU will be
used to receive reenlightenment notifications. The only real problem can
(in theory) be observed if the CPU whose id was acquired with
smp_processor_id() goes offline before we manage to write to the MSR,
the logic in hv_cpu_die() won't be able to reassign it correctly.
Reported-by: Michael Kelley <mikelley@microsoft.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Link: https://lore.kernel.org/r/20211012155005.1613352-1-vkuznets@redhat.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 43ea9bd84f ]
Firmware link offload monitoring can be made to work in 3/4 cases by
switching on firmware feature bit WLANACTIVE_OFFLOAD
- Secure power-save on
- Secure power-save off
- Open power-save on
However, with an open AP if we switch off power-saving - thus never
entering Beacon Mode Power Save - BMPS, firmware never forwards loss
of beacon upwards.
We had hoped that WLANACTIVE_OFFLOAD and some fixes for sequence numbers
would unblock this but, it hasn't and further investigation is required.
Its possible to have a complete set of Secure power-save on/off and Open
power-save on/off provided we use Linux' link monitoring mechanism.
While we debug the Open AP failure we need to fix upstream.
This reverts commit c973fdad79f6eaf247d48b5fc77733e989eb01e1.
Signed-off-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20211025093037.3966022-2-bryan.odonoghue@linaro.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit df0697801d ]
If the system is resumed because of an incoming packet, the wcn36xx RX
interrupts is fired before actual resuming of the wireless/mac80211
stack, causing any received packets to be simply dropped. E.g. a ping
request causes a system resume, but is dropped and so never forwarded
to the IP stack.
This change fixes that, disabling DMA interrupts on suspend to no pass
packets until mac80211 is resumed and ready to handle them.
Note that it's not incompatible with RX irq wake.
Signed-off-by: Loic Poulain <loic.poulain@linaro.org>
Reviewed-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/1635150496-19290-1-git-send-email-loic.poulain@linaro.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 8a27ca3947 ]
For packets originating from hardware scan, the channel and band is
included in the buffer descriptor (bd->rf_band & bd->rx_ch).
For 2Ghz band the channel value is directly reported in the 4-bit
rx_ch field. For 5Ghz band, the rx_ch field contains a mapping
index (given the 4-bit limitation).
The reserved0 value field is also used to extend 4-bit mapping to
5-bit mapping to support more than 16 5Ghz channels.
This change adds correct reporting of the frequency/band, that is
used in scan mechanism. And is required for 5Ghz hardware scan
support.
Signed-off-by: Loic Poulain <loic.poulain@linaro.org>
Tested-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/1634554678-7993-1-git-send-email-loic.poulain@linaro.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 8ef9dc0f14 ]
We got the following lockdep splat while running fstests (specifically
btrfs/003 and btrfs/020 in a row) with the new rc. This was uncovered
by 87579e9b7d ("loop: use worker per cgroup instead of kworker") which
converted loop to using workqueues, which comes with lockdep
annotations that don't exist with kworkers. The lockdep splat is as
follows:
WARNING: possible circular locking dependency detected
5.14.0-rc2-custom+ #34 Not tainted
------------------------------------------------------
losetup/156417 is trying to acquire lock:
ffff9c7645b02d38 ((wq_completion)loop0){+.+.}-{0:0}, at: flush_workqueue+0x84/0x600
but task is already holding lock:
ffff9c7647395468 (&lo->lo_mutex){+.+.}-{3:3}, at: __loop_clr_fd+0x41/0x650 [loop]
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #5 (&lo->lo_mutex){+.+.}-{3:3}:
__mutex_lock+0xba/0x7c0
lo_open+0x28/0x60 [loop]
blkdev_get_whole+0x28/0xf0
blkdev_get_by_dev.part.0+0x168/0x3c0
blkdev_open+0xd2/0xe0
do_dentry_open+0x163/0x3a0
path_openat+0x74d/0xa40
do_filp_open+0x9c/0x140
do_sys_openat2+0xb1/0x170
__x64_sys_openat+0x54/0x90
do_syscall_64+0x3b/0x90
entry_SYSCALL_64_after_hwframe+0x44/0xae
-> #4 (&disk->open_mutex){+.+.}-{3:3}:
__mutex_lock+0xba/0x7c0
blkdev_get_by_dev.part.0+0xd1/0x3c0
blkdev_get_by_path+0xc0/0xd0
btrfs_scan_one_device+0x52/0x1f0 [btrfs]
btrfs_control_ioctl+0xac/0x170 [btrfs]
__x64_sys_ioctl+0x83/0xb0
do_syscall_64+0x3b/0x90
entry_SYSCALL_64_after_hwframe+0x44/0xae
-> #3 (uuid_mutex){+.+.}-{3:3}:
__mutex_lock+0xba/0x7c0
btrfs_rm_device+0x48/0x6a0 [btrfs]
btrfs_ioctl+0x2d1c/0x3110 [btrfs]
__x64_sys_ioctl+0x83/0xb0
do_syscall_64+0x3b/0x90
entry_SYSCALL_64_after_hwframe+0x44/0xae
-> #2 (sb_writers#11){.+.+}-{0:0}:
lo_write_bvec+0x112/0x290 [loop]
loop_process_work+0x25f/0xcb0 [loop]
process_one_work+0x28f/0x5d0
worker_thread+0x55/0x3c0
kthread+0x140/0x170
ret_from_fork+0x22/0x30
-> #1 ((work_completion)(&lo->rootcg_work)){+.+.}-{0:0}:
process_one_work+0x266/0x5d0
worker_thread+0x55/0x3c0
kthread+0x140/0x170
ret_from_fork+0x22/0x30
-> #0 ((wq_completion)loop0){+.+.}-{0:0}:
__lock_acquire+0x1130/0x1dc0
lock_acquire+0xf5/0x320
flush_workqueue+0xae/0x600
drain_workqueue+0xa0/0x110
destroy_workqueue+0x36/0x250
__loop_clr_fd+0x9a/0x650 [loop]
lo_ioctl+0x29d/0x780 [loop]
block_ioctl+0x3f/0x50
__x64_sys_ioctl+0x83/0xb0
do_syscall_64+0x3b/0x90
entry_SYSCALL_64_after_hwframe+0x44/0xae
other info that might help us debug this:
Chain exists of:
(wq_completion)loop0 --> &disk->open_mutex --> &lo->lo_mutex
Possible unsafe locking scenario:
CPU0 CPU1
---- ----
lock(&lo->lo_mutex);
lock(&disk->open_mutex);
lock(&lo->lo_mutex);
lock((wq_completion)loop0);
*** DEADLOCK ***
1 lock held by losetup/156417:
#0: ffff9c7647395468 (&lo->lo_mutex){+.+.}-{3:3}, at: __loop_clr_fd+0x41/0x650 [loop]
stack backtrace:
CPU: 8 PID: 156417 Comm: losetup Not tainted 5.14.0-rc2-custom+ #34
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
Call Trace:
dump_stack_lvl+0x57/0x72
check_noncircular+0x10a/0x120
__lock_acquire+0x1130/0x1dc0
lock_acquire+0xf5/0x320
? flush_workqueue+0x84/0x600
flush_workqueue+0xae/0x600
? flush_workqueue+0x84/0x600
drain_workqueue+0xa0/0x110
destroy_workqueue+0x36/0x250
__loop_clr_fd+0x9a/0x650 [loop]
lo_ioctl+0x29d/0x780 [loop]
? __lock_acquire+0x3a0/0x1dc0
? update_dl_rq_load_avg+0x152/0x360
? lock_is_held_type+0xa5/0x120
? find_held_lock.constprop.0+0x2b/0x80
block_ioctl+0x3f/0x50
__x64_sys_ioctl+0x83/0xb0
do_syscall_64+0x3b/0x90
entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7f645884de6b
Usually the uuid_mutex exists to protect the fs_devices that map
together all of the devices that match a specific uuid. In rm_device
we're messing with the uuid of a device, so it makes sense to protect
that here.
However in doing that it pulls in a whole host of lockdep dependencies,
as we call mnt_may_write() on the sb before we grab the uuid_mutex, thus
we end up with the dependency chain under the uuid_mutex being added
under the normal sb write dependency chain, which causes problems with
loop devices.
We don't need the uuid mutex here however. If we call
btrfs_scan_one_device() before we scratch the super block we will find
the fs_devices and not find the device itself and return EBUSY because
the fs_devices is open. If we call it after the scratch happens it will
not appear to be a valid btrfs file system.
We do not need to worry about other fs_devices modifying operations here
because we're protected by the exclusive operations locking.
So drop the uuid_mutex here in order to fix the lockdep splat.
A more detailed explanation from the discussion:
We are worried about rm and scan racing with each other, before this
change we'll zero the device out under the UUID mutex so when scan does
run it'll make sure that it can go through the whole device scan thing
without rm messing with us.
We aren't worried if the scratch happens first, because the result is we
don't think this is a btrfs device and we bail out.
The only case we are concerned with is we scratch _after_ scan is able
to read the superblock and gets a seemingly valid super block, so lets
consider this case.
Scan will call device_list_add() with the device we're removing. We'll
call find_fsid_with_metadata_uuid() and get our fs_devices for this
UUID. At this point we lock the fs_devices->device_list_mutex. This is
what protects us in this case, but we have two cases here.
1. We aren't to the device removal part of the RM. We found our device,
and device name matches our path, we go down and we set total_devices
to our super number of devices, which doesn't affect anything because
we haven't done the remove yet.
2. We are past the device removal part, which is protected by the
device_list_mutex. Scan doesn't find the device, it goes down and
does the
if (fs_devices->opened)
return -EBUSY;
check and we bail out.
Nothing about this situation is ideal, but the lockdep splat is real,
and the fix is safe, tho admittedly a bit scary looking.
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: David Sterba <dsterba@suse.com>
[ copy more from the discussion ]
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 44bee215f7 ]
Fix a warning reported by smatch that ret could be returned without
initialized. The dedupe operations are supposed to to return 0 for a 0
length range but the caller does not pass olen == 0. To keep this
behaviour and also fix the warning initialize ret to 0.
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Sidong Yang <realwakka@gmail.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 3d730ee686 ]
Let GK45 not go into BIOS for determining the AC power state.
The BIOS wrongly returns 0, so hardcode the power state to 1.
The mini PC GK45 by Besstar Tech Lld. (aka Kodlix) just runs
off AC. It does not include any batteries. Nevertheless BIOS
reports AC off:
root@kodlix:/usr/src/linux# cat /sys/class/power_supply/ADP1/online
0
root@kodlix:/usr/src/linux# modprobe acpi_dbg
root@kodlix:/usr/src/linux# tools/power/acpi/acpidbg
- find _PSR
\_SB.PCI0.SBRG.H_EC.ADP1._PSR Method 000000009283cee8 001 Args 0 Len 001C Aml 00000000f54e5f67
- execute \_SB.PCI0.SBRG.H_EC.ADP1._PSR
Evaluating \_SB.PCI0.SBRG.H_EC.ADP1._PSR
Evaluation of \_SB.PCI0.SBRG.H_EC.ADP1._PSR returned object 00000000dc08c187, external buffer length 18
[Integer] = 0000000000000000
that should be
[Integer] = 0000000000000001
Signed-off-by: Stefan Schaeckeler <schaecsn@gmx.net>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 8c9c296adf ]
The VRF driver invokes netfilter for output+postrouting hooks so that users
can create rules that check for 'oif $vrf' rather than lower device name.
This is a problem when NAT rules are configured.
To avoid any conntrack involvement in round 1, tag skbs as 'untracked'
to prevent conntrack from picking them up.
This gets cleared before the packet gets handed to the ip stack so
conntrack will be active on the second iteration.
One remaining issue is that a rule like
output ... oif $vrfname notrack
won't propagate to the second round because we can't tell
'notrack set via ruleset' and 'notrack set by vrf driver' apart.
However, this isn't a regression: the 'notrack' removal happens
instead of unconditional nf_reset_ct().
I'd also like to avoid leaking more vrf specific conditionals into the
netfilter infra.
For ingress, conntrack has already been done before the packet makes it
to the vrf driver, with this patch egress does connection tracking with
lower/physical device as well.
Signed-off-by: Florian Westphal <fw@strlen.de>
Acked-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 345dac33f5 ]
When configuring the kernel for big-endian, we set either BE-8 or BE-32
based on the CPU architecture level. Until linux-4.4, we did not have
any ARMv7-M platform allowing big-endian builds, but now i.MX/Vybrid
is in that category, adn we get a build error because of this:
arch/arm/kernel/module-plts.c: In function 'get_module_plt':
arch/arm/kernel/module-plts.c:60:46: error: implicit declaration of function '__opcode_to_mem_thumb32' [-Werror=implicit-function-declaration]
This comes down to picking the wrong default, ARMv7-M uses BE8
like ARMv7-A does. Changing the default gets the kernel to compile
and presumably works.
https://lore.kernel.org/all/1455804123-2526139-2-git-send-email-arnd@arndb.de/
Tested-by: Vladimir Murzin <vladimir.murzin@arm.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 7427f3bb49 ]
So far, glock_hash_walk took a reference on each glock it iterated over, and it
was the examiner's responsibility to drop those references. Dropping the final
reference to a glock can sleep and the examiners are called in a RCU critical
section with spin locks held, so examiners that didn't need the extra reference
had to drop it asynchronously via gfs2_glock_queue_put or similar. This wasn't
done correctly in thaw_glock which did call gfs2_glock_put, and not at all in
dump_glock_func.
Change glock_hash_walk to not take glock references at all. That way, the
examiners that don't need them won't have to bother with slow asynchronous
puts, and the examiners that do need references can take them themselves.
Reported-by: Alexander Aring <aahringo@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 486408d690 ]
In gfs2_inode_lookup and gfs2_create_inode, we're calling
gfs2_cancel_delete_work which currently cancels any remote delete work
(delete_work_func) synchronously. This means that if the work is
currently running, it will wait for it to finish. We're doing this to
pevent a previous instance of an inode from having any influence on the
next instance.
However, delete_work_func uses gfs2_inode_lookup internally, and we can
end up in a deadlock when delete_work_func gets interrupted at the wrong
time. For example,
(1) An inode's iopen glock has delete work queued, but the inode
itself has been evicted from the inode cache.
(2) The delete work is preempted before reaching gfs2_inode_lookup.
(3) Another process recreates the inode (gfs2_create_inode). It tries
to cancel any outstanding delete work, which blocks waiting for
the ongoing delete work to finish.
(4) The delete work calls gfs2_inode_lookup, which blocks waiting for
gfs2_create_inode to instantiate and unlock the new inode =>
deadlock.
It turns out that when the delete work notices that its inode has been
re-instantiated, it will do nothing. This means that it's safe to
cancel the delete work asynchronously. This prevents the kind of
deadlock described above.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 61e18ce734 ]
When addr_gen_mode is set to IN6_ADDR_GEN_MODE_NONE, the link-local addr
should not be generated. But it isn't the case for GRE (as well as GRE6)
and SIT tunnels. Make it so that tunnels consider the addr_gen_mode,
especially for IN6_ADDR_GEN_MODE_NONE.
Do this in add_v4_addrs() to cover both GRE and SIT only if the addr
scope is link.
Signed-off-by: Stephen Suryaputra <ssuryaextr@gmail.com>
Acked-by: Antonio Quartulli <a@unstable.cc>
Link: https://lore.kernel.org/r/20211020200618.467342-1-ssuryaextr@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b3ea5d56f2 ]
Currently the stacktrace on clang compiled arm kernel uses the 'lr'
register to find the first frame address from pt_regs. However, that
is wrong after calling another function, because the 'lr' register
is used by 'bl' instruction and never be recovered.
As same as gcc arm kernel, directly use the frame pointer (r11) of
the pt_regs to find the first frame address.
Note that this fixes kretprobe stacktrace issue only with
CONFIG_UNWINDER_FRAME_POINTER=y. For the CONFIG_UNWINDER_ARM,
we need another fix.
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit c3867ab592 ]
get_warnings_count() does fclose() using File * returned from popen().
Fix it to call pclose() as it should.
tools/testing/selftests/kvm/x86_64/mmio_warning_test
x86_64/mmio_warning_test.c: In function ‘get_warnings_count’:
x86_64/mmio_warning_test.c:87:9: warning: ‘fclose’ called on pointer returned from a mismatched allocation function [-Wmismatched-dealloc]
87 | fclose(f);
| ^~~~~~~~~
x86_64/mmio_warning_test.c:84:13: note: returned from ‘popen’
84 | f = popen("dmesg | grep \"WARNING:\" | wc -l", "r");
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 14831fad73 ]
When running the following command without arm-linux-gnueabi-gcc in
one's $PATH, the following warning is observed:
$ ARCH=arm64 CROSS_COMPILE_COMPAT=arm-linux-gnueabi- make -j72 LLVM=1 mrproper
make[1]: arm-linux-gnueabi-gcc: No such file or directory
This is because KCONFIG is not run for mrproper, so CONFIG_CC_IS_CLANG
is not set, and we end up eagerly evaluating various variables that try
to invoke CC_COMPAT.
This is a similar problem to what was observed in
commit dc960bfeed ("h8300: suppress error messages for 'make clean'")
Reported-by: Lucas Henneman <henneman@google.com>
Suggested-by: Masahiro Yamada <masahiroy@kernel.org>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Link: https://lore.kernel.org/r/20211019223646.1146945-4-ndesaulniers@google.com
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 2b81a5f015 ]
When reading the partition table on initial scan hits an I/O error the
I/O will hang with the scan_mutex held:
[<0>] do_read_cache_page+0x49b/0x790
[<0>] read_part_sector+0x39/0xe0
[<0>] read_lba+0xf9/0x1d0
[<0>] efi_partition+0xf1/0x7f0
[<0>] bdev_disk_changed+0x1ee/0x550
[<0>] blkdev_get_whole+0x81/0x90
[<0>] blkdev_get_by_dev+0x128/0x2e0
[<0>] device_add_disk+0x377/0x3c0
[<0>] nvme_mpath_set_live+0x130/0x1b0 [nvme_core]
[<0>] nvme_mpath_add_disk+0x150/0x160 [nvme_core]
[<0>] nvme_alloc_ns+0x417/0x950 [nvme_core]
[<0>] nvme_validate_or_alloc_ns+0xe9/0x1e0 [nvme_core]
[<0>] nvme_scan_work+0x168/0x310 [nvme_core]
[<0>] process_one_work+0x231/0x420
and trying to delete the controller will deadlock as it tries to grab
the scan mutex:
[<0>] nvme_mpath_clear_ctrl_paths+0x25/0x80 [nvme_core]
[<0>] nvme_remove_namespaces+0x31/0xf0 [nvme_core]
[<0>] nvme_do_delete_ctrl+0x4b/0x80 [nvme_core]
As we're now properly ordering the namespace list there is no need to
hold the scan_mutex in nvme_mpath_clear_ctrl_paths() anymore.
And we always need to kick the requeue list as the path will be marked
as unusable and I/O will be requeued _without_ a current path.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 2351ead99c ]
When removing a port, all its controllers are being removed, but there
are queues on the port that doesn't belong to any controller (during
connection time). This causes a use-after-free bug for any command
that dereferences req->port (like in nvmet_alloc_ctrl). Those queues
should be destroyed before freeing the port via configfs. Destroy
the remaining queues after the accept_work was cancelled guarantees
that no new queue will be created.
Signed-off-by: Israel Rukshin <israelr@nvidia.com>
Reviewed-by: Max Gurtovoy <mgurtovoy@nvidia.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit fcf73a804c ]
When removing a port, all its controllers are being removed, but there
are queues on the port that doesn't belong to any controller (during
connection time). This causes a use-after-free bug for any command
that dereferences req->port (like in nvmet_alloc_ctrl). Those queues
should be destroyed before freeing the port via configfs. Destroy the
remaining queues after the RDMA-CM was destroyed guarantees that no
new queue will be created.
Signed-off-by: Israel Rukshin <israelr@nvidia.com>
Reviewed-by: Max Gurtovoy <mgurtovoy@nvidia.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit e3e19dcc4c ]
When a port is removed through configfs, any connected controllers
are starting teardown flow asynchronously and can still send commands.
This causes a use-after-free bug for any command that dereferences
req->port (like in nvmet_parse_io_cmd).
To fix this, wait for all the teardown scheduled works to complete
(like release_work at rdma/tcp drivers). This ensures there are no
active controllers when the port is eventually removed.
Signed-off-by: Israel Rukshin <israelr@nvidia.com>
Reviewed-by: Max Gurtovoy <mgurtovoy@nvidia.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 1ecda6393d ]
The mailbox is initialized after the interrupt handler is installed. As
the firmware is loaded and started even later, it should not happen that
the interrupt occurs without the mailbox being initialized.
As the Linux Driver Verification project (linuxtesting.org) keeps
reporting this as an error, add a check to ignore interrupts before the
mailbox is initialized to fix this potential null pointer dereference.
Reported-by: Yuri Savinykh <s02190703@gse.cs.msu.ru>
Reported-by: Nadezda Lutovinova <lutovinova@ispras.ru>
Signed-off-by: Michael Tretter <m.tretter@pengutronix.de>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 037057a5a9 ]
This check is meant to catch cases where a requeue is attempted on a
request that is still inserted. It's never really been useful to catch any
misuse, and now it's actively wrong. Outside of that, this should not be a
BUG_ON() to begin with.
Remove the check as it's now causing active harm, as requeue off the plug
path will trigger it even though the request state is just fine.
Reported-by: Yi Zhang <yi.zhang@redhat.com>
Link: https://lore.kernel.org/linux-block/CAHj4cs80zAUc2grnCZ015-2Rvd-=gXRfB_dFKy=RTm+wRo09HQ@mail.gmail.com/
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 7ce1bb83a1 ]
If CONFIG_CFI_CLANG=y, attempting to read an event histogram will cause
the kernel to panic due to failed CFI check.
1. echo 'hist:keys=common_pid' >> events/sched/sched_switch/trigger
2. cat events/sched/sched_switch/hist
3. kernel panics on attempting to read hist
This happens because the sort() function expects a generic
int (*)(const void *, const void *) pointer for the compare function.
To prevent this CFI failure, change tracing map cmp_entries_* function
signatures to match this.
Also, fix the build error reported by the kernel test robot [1].
[1] https://lore.kernel.org/r/202110141140.zzi4dRh4-lkp@intel.com/
Link: https://lkml.kernel.org/r/20211014045217.3265162-1-kaleshsingh@google.com
Signed-off-by: Kalesh Singh <kaleshsingh@google.com>
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit d25302e465 ]
Some unfriendly component, such as dpdk, write the same mask to
unbound kworker cpumask again and again. Every time it write to
this interface some work is queue to cpu, even though the mask
is same with the original mask.
So, fix it by return success and do nothing if the cpumask is
equal with the old one.
Signed-off-by: Mengen Sun <mengensun@tencent.com>
Signed-off-by: Menglong Dong <imagedong@tencent.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 4f8d7abaa4 ]
This might matter, for example, if the underlying type of enum xz_check
was a signed char. In such a case the validation wouldn't have caught an
unsupported header. I don't know if this problem can occur in the kernel
on any arch but it's still good to fix it because some people might copy
the XZ code to their own projects from Linux instead of the upstream
XZ Embedded repository.
This change may increase the code size by a few bytes. An alternative
would have been to use an unsigned int instead of enum xz_check but
using an enumeration looks cleaner.
Link: https://lore.kernel.org/r/20211010213145.17462-3-xiang@kernel.org
Signed-off-by: Lasse Collin <lasse.collin@tukaani.org>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 83d3c4f22a ]
With valid files, the safety margin described in lib/decompress_unxz.c
ensures that these buffers cannot overlap. But if the uncompressed size
of the input is larger than the caller thought, which is possible when
the input file is invalid/corrupt, the buffers can overlap. Obviously
the result will then be garbage (and usually the decoder will return
an error too) but no other harm will happen when such an over-run occurs.
This change only affects uncompressed LZMA2 chunks and so this
should have no effect on performance.
Link: https://lore.kernel.org/r/20211010213145.17462-2-xiang@kernel.org
Signed-off-by: Lasse Collin <lasse.collin@tukaani.org>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 8b9e2291e3 ]
When the in memory flag is changed, we need to persist the change in the
rdev superblock flags. This is needed for "writemostly" and "failfast".
Reviewed-by: Li Feng <fengli@smartx.com>
Signed-off-by: Xiao Ni <xni@redhat.com>
Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 662167e59d ]
platform_device_unregister() should only be called when
a respective platform_device_register() is called. However
the floppy driver currently allows failures when registring
a drive and a bail out could easily cause an invalid call
to platform_device_unregister() where it was not intended.
Fix this by adding a bool to keep track of when the platform
device was registered for a drive.
This does not fix any known panic / bug. This issue was found
through code inspection while preparing the driver to use the
up and coming support for device_add_disk() error handling.
From what I can tell from code inspection, chances of this
ever happening should be insanely small, perhaps OOM.
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
Link: https://lore.kernel.org/r/20210927220302.1073499-5-mcgrof@kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit ba0ffdd8ce ]
Particularly for NVMe with efficient deferred submission for many
requests, there are nice benefits to be seen by bumping the default max
plug count from 16 to 32. This is especially true for virtualized setups,
where the submit part is more expensive. But can be noticed even on
native hardware.
Reduce the multiple queue factor from 4 to 2, since we're changing the
default size.
While changing it, move the defines into the block layer private header.
These aren't values that anyone outside of the block layer uses, or
should use.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit d012f9189f ]
The function can loop and lock the system if for whatever reason the bit
for the target sensor is NEVER valid. This is the case if a sensor is
disabled by the factory and the valid bit is never reported as actually
valid. Add a timeout check and exit if a timeout occurs. As this is
a very rare condition, handle the timeout only if the first read fails.
While at it also rework the function to improve readability and convert
to poll_timeout generic macro.
Signed-off-by: Ansuel Smith <ansuelsmth@gmail.com>
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Link: https://lore.kernel.org/r/20211007172859.583-1-ansuelsmth@gmail.com
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 67ca5159db ]
It seems reasonable to fine-tune only some of the skew values when using
one of the rgmii-*id PHY modes, and even when all skew values are
specified, using the correct ID PHY mode makes sense for documentation
purposes. Such a configuration also appears in the binding docs in
Documentation/devicetree/bindings/net/micrel-ksz90x1.txt, so the driver
should not warn about it.
Signed-off-by: Matthias Schiffer <matthias.schiffer@ew.tq-group.com>
Link: https://lore.kernel.org/r/20211012103402.21438-1-matthias.schiffer@ew.tq-group.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 1f3b22e4eb ]
[Why&How]
When system boots in headless mode, connecting a 4k display creates a
null pointer dereference due to hubp for a certain plane being null.
Add a condition to check for null hubp before dereferencing it.
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit c10383e8dd ]
On some systems the ACPI namespace contains device objects that are
not used in certain configurations of the system. If they start off
in the D0 power state configuration, they will stay in it until the
system reboots, because of the lack of any mechanism possibly causing
their configuration to change. If that happens, they may prevent
some power resources from being turned off or generally they may
prevent the platform from getting into the deepest low-power states
thus causing some energy to be wasted.
Address this issue by changing the configuration of unused ACPI
device objects to the D3cold power state one after carrying out
the ACPI-based enumeration of devices.
BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=214091
Link: https://lore.kernel.org/linux-acpi/20211007205126.11769-1-mario.limonciello@amd.com/
Reported-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Tested-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 2835f327bd ]
Some buggy firmware and/or brand new batteries can support a charge that's
slightly over the reported design capacity. In such cases, the kernel will
report to userspace that the charging state of the battery is "Unknown",
when in reality the battery charge is "Full", at least from the design
capacity point of view. Make the fallback condition accepts capacities
over the designed capacity so userspace knows that is full.
Signed-off-by: André Almeida <andrealmeid@collabora.com>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Sebastian Reichel <sebastian.reichel@collabora.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 814a66741b ]
Both iov_iter_get_pages and iov_iter_get_pages_alloc return the number
of bytes of the iovec they could get the pages for. When they cannot
get any pages, they're supposed to return 0, but when the start of the
iovec isn't page aligned, the calculation goes wrong and they return a
negative value. Fix both functions.
In addition, change iov_iter_get_pages_alloc to return NULL in that case
to prevent resource leaks.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 8105c2abbf ]
The issue happens in several error handling paths on two refcounted
object related to the object "host" (dma_chan_rx, dma_chan_tx). In
these paths, the function forgets to decrement one or both objects'
reference count increased earlier by dma_request_chan(), causing
reference count leaks.
Fix it by balancing the refcounts of both objects in some error
handling paths. In correspondence with the changes in moxart_probe(),
IS_ERR() is replaced with IS_ERR_OR_NULL() in moxart_remove() as well.
Signed-off-by: Xin Xiong <xiongx18@fudan.edu.cn>
Signed-off-by: Xiyu Yang <xiyuyang19@fudan.edu.cn>
Signed-off-by: Xin Tan <tanxin.ctf@gmail.com>
Link: https://lore.kernel.org/r/20211009041918.28419-1-xiongx18@fudan.edu.cn
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 4b6012a783 ]
kzalloc() is used to allocate memory for cd->detectors, and if it fails,
channel_detector_exit() behind the label fail will be called:
channel_detector_exit(dpd, cd);
In channel_detector_exit(), cd->detectors is dereferenced through:
struct pri_detector *de = cd->detectors[i];
To fix this possible null-pointer dereference, check cd->detectors before
the for loop to dereference cd->detectors.
Reported-by: TOTE Robot <oslab@tsinghua.edu.cn>
Signed-off-by: Tuo Li <islituo@gmail.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20210805153854.154066-1-islituo@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 21ccc9cd72 ]
When building the files in the tracefs file system, do not by default set
any permissions for OTH (other). This will make it easier for admins who
want to define a group for accessing tracefs and not having to first
disable all the permission bits for "other" in the file system.
As tracing can leak sensitive information, it should never by default
allowing all users access. An admin can still set the permission bits for
others to have access, which may be useful for creating a honeypot and
seeing who takes advantage of it and roots the machine.
Link: https://lkml.kernel.org/r/20210818153038.864149276@goodmis.org
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 49d67e4457 ]
The tracefs file system is by default mounted such that only root user can
access it. But there are legitimate reasons to create a group and allow
those added to the group to have access to tracing. By changing the
permissions of the tracefs mount point to allow access, it will allow
group access to the tracefs directory.
There should not be any real reason to allow all access to the tracefs
directory as it contains sensitive information. Have the default
permission of directories being created not have any OTH (other) bits set,
such that an admin that wants to give permission to a group has to first
disable all OTH bits in the file system.
Link: https://lkml.kernel.org/r/20210818153038.664127804@goodmis.org
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit ec6abe831a ]
This fix the deadlock with the BO reservations during SVM_BO evictions
while allocations in VRAM are concurrently performed. More specific,
while the ttm waits for the fence to be signaled (ttm_bo_wait), it
already has the BO reserved. In parallel, the restore worker might be
running, prefetching memory to VRAM. This also requires to reserve the
BO, but blocks the mmap semaphore first. The deadlock happens when the
SVM_BO eviction worker kicks in and waits for the mmap semaphore held
in restore worker. Preventing signal the fence back, causing the
deadlock until the ttm times out.
We don't need to hold the BO reservation anymore during validation
and mapping. Now the physical addresses are taken from hmm_range_fault.
We also take migrate_mutex to prevent range migration while
validate_and_map update GPU page table.
Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Philip Yang <philip.yang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 146e5e7333 ]
Due to deadlocks in the networking subsystem spotted 12 years ago[1],
a workaround was put in place[2] to avoid taking the rtnl lock when it
was not available and restarting the syscall (back to VFS, letting
userspace spin). The following construction is found a lot in the net
sysfs and sysctl code:
if (!rtnl_trylock())
return restart_syscall();
This can be problematic when multiple userspace threads use such
interfaces in a short period, making them to spin a lot. This happens
for example when adding and moving virtual interfaces: userspace
programs listening on events, such as systemd-udevd and NetworkManager,
do trigger actions reading files in sysfs. It gets worse when a lot of
virtual interfaces are created concurrently, say when creating
containers at boot time.
Returning early without hitting the above pattern when the syscall will
fail eventually does make things better. While it is not a fix for the
issue, it does ease things.
[1] https://lore.kernel.org/netdev/49A4D5D5.5090602@trash.net/https://lore.kernel.org/netdev/m14oyhis31.fsf@fess.ebiederm.org/
and https://lore.kernel.org/netdev/20090226084924.16cb3e08@nehalam/
[2] Rightfully, those deadlocks are *hard* to solve.
Signed-off-by: Antoine Tenart <atenart@kernel.org>
Reviewed-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit ea2b9a3371 ]
bus_info field had a different value for the media entity and the video
device.
Fixes v4l2-compliance:
v4l2-compliance.cpp(637): media bus_info 'PCI:0000:00:05.0' differs from
V4L2 bus_info 'PCI:viewfinder'
Reviewed-by: Bingbu Cao <bingbu.cao@intel.com>
Signed-off-by: Ricardo Ribalda <ribalda@chromium.org>
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 553481e380 ]
For a try_fmt call, the node noes not need to be enabled.
Fixes v4l2-compliance
fail: v4l2-test-formats.cpp(717): Video Output Multiplanar is valid, but
no TRY_FMT was implemented
test VIDIOC_TRY_FMT: FAIL
Signed-off-by: Ricardo Ribalda <ribalda@chromium.org>
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 2ae0aa4758 ]
Keeping devlink port inside VSI data structure causes some issues.
Since VF VSI is released during reset that means that we have to
unregister devlink port and register it again every time reset is
triggered. With the new changes in devlink API it
might cause deadlock issues. After calling
devlink_port_register/devlink_port_unregister devlink API is going to
lock rtnl_mutex. It's an issue when VF reset is triggered in netlink
operation context (like setting VF MAC address or VLAN),
because rtnl_lock is already taken by netlink. Another call of
rtnl_lock from devlink API results in dead-lock.
By moving devlink port to PF/VF we avoid creating/destroying it
during reset. Since this patch, devlink ports are created during
ice_probe, destroyed during ice_remove for PF and created during
ice_repr_add, destroyed during ice_repr_rem for VF.
Signed-off-by: Wojciech Drewek <wojciech.drewek@intel.com>
Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 1517176906 ]
When applying the policy min/max limits, the requested frequency is
simply clamped to not be out of range. It means, however, if one of the
boundaries isn't an available frequency, the frequency resolution can
return a value out of those limits, depending on the relation used.
e.g. freq{0,1,2} being available frequencies.
freq0 policy->min freq1 policy->max freq2
| | | | |
17kHz 18kHz 19kHz 20kHz 21kHz
__resolve_freq(21kHz, CPUFREQ_RELATION_L) -> 21kHz (out of bounds)
__resolve_freq(17kHz, CPUFREQ_RELATION_H) -> 17kHz (out of bounds)
If, during the policy init, we resolve the requested min/max to existing
frequencies, we ensure that any CPUFREQ_RELATION_* would resolve to a
frequency which is inside the policy min/max range.
Making the policy limits rigid helps to introduce the inefficient
frequencies support. Resolving an inefficient frequency to an efficient
one should not transgress policy->max (which can be set for thermal
reason) and having a value we can trust simplify this comparison.
Signed-off-by: Vincent Donnefort <vincent.donnefort@arm.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit d3c4b6f64a ]
ACPICA commit 0762982923f95eb652cf7ded27356b247c9774de
During wakeup from system-wide sleep states, acpi_get_sleep_type_data()
is called and it tries to get memory from the slab allocator in order
to evaluate a control method, but if KFENCE is enabled in the kernel,
the memory allocation attempt causes an IRQ work to be queued and a
self-IPI to be sent to the CPU running the code which requires the
memory controller to be ready, so if that happens too early in the
wakeup path, it doesn't work.
Prevent that from taking place by calling acpi_get_sleep_type_data()
for S0 upfront, when preparing to enter a given sleep state, and
saving the data obtained by it for later use during system wakeup.
BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=214271
Reported-by: Reik Keutterling <spielkind@gmail.com>
Tested-by: Reik Keutterling <spielkind@gmail.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 1c36432b27 ]
Previously, 'make -C sched run_tests' will block forever when it occurs
something wrong where the *selftests framework* is waiting for its child
processes to exit.
[root@iaas-rpma sched]# ./cs_prctl_test
## Create a thread/process/process group hiearchy
Not a core sched system
tid=74985, / tgid=74985 / pgid=74985: ffffffffffffffff
Not a core sched system
tid=74986, / tgid=74986 / pgid=74985: ffffffffffffffff
Not a core sched system
tid=74988, / tgid=74986 / pgid=74985: ffffffffffffffff
Not a core sched system
tid=74989, / tgid=74986 / pgid=74985: ffffffffffffffff
Not a core sched system
tid=74990, / tgid=74986 / pgid=74985: ffffffffffffffff
Not a core sched system
tid=74987, / tgid=74987 / pgid=74985: ffffffffffffffff
Not a core sched system
tid=74991, / tgid=74987 / pgid=74985: ffffffffffffffff
Not a core sched system
tid=74992, / tgid=74987 / pgid=74985: ffffffffffffffff
Not a core sched system
tid=74993, / tgid=74987 / pgid=74985: ffffffffffffffff
Not a core sched system
(268) FAILED: get_cs_cookie(0) == 0
## Set a cookie on entire process group
-1 = prctl(62, 1, 0, 2, 0)
core_sched create failed -- PGID: Invalid argument
(cs_prctl_test.c:272) -
[root@iaas-rpma sched]# ps
PID TTY TIME CMD
4605 pts/2 00:00:00 bash
74986 pts/2 00:00:00 cs_prctl_test
74987 pts/2 00:00:00 cs_prctl_test
74999 pts/2 00:00:00 ps
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Chris Hyser <chris.hyser@oracle.com>
Link: https://lore.kernel.org/r/20210902024333.75983-1-lizhijian@cn.fujitsu.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit a130e8fbc7 ]
/proc/uptime reports idle time by reading the CPUTIME_IDLE field from
the per-cpu kcpustats. However, on NO_HZ systems, idle time is not
continually updated on idle cpus, leading this value to appear
incorrectly small.
/proc/stat performs an accounting update when reading idle time; we
can use the same approach for uptime.
With this patch, /proc/stat and /proc/uptime now agree on idle time.
Additionally, the following shows idle time tick up consistently on an
idle machine:
(while true; do cat /proc/uptime; sleep 1; done) | awk '{print $2-prev; prev=$2}'
Reported-by: Luigi Rizzo <lrizzo@google.com>
Signed-off-by: Josh Don <joshdon@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lkml.kernel.org/r/20210827165438.3280779-1-joshdon@google.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b36eb5e7b7 ]
Don't do kfree or other risky things when oops_in_progress is set.
It's easy enough to avoid doing them
Signed-off-by: Corey Minyard <cminyard@mvista.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit fc41665498 ]
If rcsi2_code_to_fmt() return NULL, then null pointer dereference occurs
in the next cycle. That should not be possible now but adding checking
protects from future bugs.
The patch adds checking if format is NULL.
Found by Linux Driver Verification project (linuxtesting.org).
Signed-off-by: Nadezda Lutovinova <lutovinova@ispras.ru>
Reviewed-by: Jacopo Mondi <jacopo@jmondi.org>
Reviewed-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 49c3eb3036 ]
The Cyberbook T116 tablet contains quite generic names in the sys_vendor
and product_name DMI strings, without this patch brcmfmac will try to load:
"brcmfmac43455-sdio.Default string-Default string.txt" as nvram file which
is way too generic.
The nvram file shipped on the factory Android image contains the exact
same settings as those used on the AcePC T8 mini PC, so point the new
DMI nvram filename quirk to the acepc-t8 nvram file.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20210928160633.96928-1-hdegoede@redhat.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit c15b5fc054 ]
When CONFIG_PRINTK is not set, the CMPXCHG_BUGCHECK() macro calls
_printk(), but _printk() is a static inline function, not available
as an extern.
Since the purpose of the macro is to print the BUGCHECK info,
make this config option depend on PRINTK.
Fixes multiple occurrences of this build error:
../include/linux/printk.h:208:5: error: static declaration of '_printk' follows non-static declaration
208 | int _printk(const char *s, ...)
| ^~~~~~~
In file included from ../arch/ia64/include/asm/cmpxchg.h:5,
../arch/ia64/include/uapi/asm/cmpxchg.h:146:28: note: previous declaration of '_printk' with type 'int(const char *, ...)'
146 | extern int _printk(const char *fmt, ...);
Cc: linux-ia64@vger.kernel.org
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Chris Down <chris@chrisdown.name>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit a5991c4e94 ]
When adding an internal scratch buffer to improve buffer handling when
stopping it was also erroneously used when syncing at capture start.
This led to that the first three buffers captured were always dropped
as they were captured in the scratch buffer instead of in a buffer
provided by the user.
Allow the hardware to be given user provided buffers when preparing for
capture in the stopped state. This still allows the driver to sync with
the hardware and always completes the buffers to user-space in the
correct order as no buffers are completed before the sync is complete.
This change improves the driver as buffers are completed and given to
the user three frames earlier than before.
The change also fixes a warning produced by v4l2-compliance,
warn: v4l2-test-buffers.cpp(448): got sequence number 3, expected 0
[hverkuil: fixed some typos in the Subject and the log message]
Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 6d0d779b21 ]
Some tools like v4l2-compliance let users select a media device based
on the bus_info string which can be quite convenient. Use a unique
string for that.
This also fixes the following v4l2-compliance warning:
warn: v4l2-test-media.cpp(52): empty bus_info
Signed-off-by: Martin Kepplinger <martin.kepplinger@puri.sm>
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit a4b83deb3e ]
With the new DMA API we need an extension of the videobuf2 API.
Previously, videobuf2 core would set the non-coherent DMA bit
in the vb2_queue dma_attr field (if user-space would pass a
corresponding memory hint); the vb2 core then would pass the
vb2_queue dma_attrs to the vb2 allocators. The vb2 allocator
would use the queue's dma_attr and the DMA API would allocate
either coherent or non-coherent memory.
But we cannot do this anymore, since there is no corresponding DMA
attr flag and, hence, there is no way for the allocator to become
aware of what type of allocation user-space has requested. So we
need to pass more context from videobuf2 core to the allocators.
Fix this by changing the call_ptr_memop() macro to pass the
vb2 pointer to the corresponding op callbacks.
Signed-off-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit cdfaf4752e ]
If of_device_get_match_data() return NULL,
then null pointer dereference occurs in s5p_mfc_init_pm().
The patch adds checking if dev->variant is NULL.
Found by Linux Driver Verification project (linuxtesting.org).
Signed-off-by: Nadezda Lutovinova <lutovinova@ispras.ru>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 8515965e5e ]
The variable pdev is assigned to dev->plat_dev, and dev->plat_dev is
checked in:
if (!dev->plat_dev)
This indicates both dev->plat_dev and pdev can be NULL. If so, the
function dev_err() is called to print error information.
dev_err(&pdev->dev, "No platform data specified\n");
However, &pdev->dev is an illegal address, and it is dereferenced in
dev_err().
To fix this possible null-pointer dereference, replace dev_err() with
mfc_err().
Reported-by: TOTE Robot <oslab@tsinghua.edu.cn>
Signed-off-by: Tuo Li <islituo@gmail.com>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 76e21bb8be ]
vidtv_bridge_remove() releases and cleans up everything except for dvb
itself. The patch adds this missed release.
Found by Linux Driver Verification project (linuxtesting.org).
Signed-off-by: Evgeny Novikov <novikov@ispras.ru>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit e3f60e7e1a ]
All the entities must have a unique name. We can have a descriptive and
unique name by appending the function and the entity->id.
This is even resilent to multi chain devices.
Fixes v4l2-compliance:
Media Controller ioctls:
fail: v4l2-test-media.cpp(205): v2_entity_names_set.find(key) != v2_entity_names_set.end()
test MEDIA_IOC_G_TOPOLOGY: FAIL
fail: v4l2-test-media.cpp(394): num_data_links != num_links
test MEDIA_IOC_ENUM_ENTITIES/LINKS: FAIL
Signed-off-by: Ricardo Ribalda <ribalda@chromium.org>
Reviewed-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit ffccdde5f0 ]
The device is doing something unexpected with the control. Either because
the protocol is not properly implemented or there has been a HW error.
Fixes v4l2-compliance:
Control ioctls (Input 0):
fail: v4l2-test-controls.cpp(448): s_ctrl returned an error (22)
test VIDIOC_G/S_CTRL: FAIL
fail: v4l2-test-controls.cpp(698): s_ext_ctrls returned an error (22)
test VIDIOC_G/S/TRY_EXT_CTRLS: FAIL
Reviewed-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Ricardo Ribalda <ribalda@chromium.org>
Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 548fa43a58 ]
At the moment of enabling irq handling:
1922 ret = devm_request_threaded_irq(&pdev->dev, irq, dcmi_irq_callback,
1923 dcmi_irq_thread, IRQF_ONESHOT,
1924 dev_name(&pdev->dev), dcmi);
there is still uninitialized field sd_format of struct stm32_dcmi *dcmi.
If an interrupt occurs in the interval between the installation of the
interrupt handler and the initialization of this field, NULL pointer
dereference happens.
This field is dereferenced in the handler function without any check:
457 if (dcmi->sd_format->fourcc == V4L2_PIX_FMT_JPEG &&
458 dcmi->misr & IT_FRAME) {
The patch moves interrupt handler installation
after initialization of the sd_format field that happens in
dcmi_graph_notify_complete() via dcmi_set_default_fmt().
Found by Linux Driver Verification project (linuxtesting.org).
Signed-off-by: Dmitriy Ulitin <ulitin@ispras.ru>
Signed-off-by: Alexey Khoroshilov <khoroshilov@ispras.ru>
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit e16f5e39ac ]
There were several issues with handling errors in lm3554_probe():
- Probe did not set the error code when v4l2_ctrl_handler_init() failed.
- It intermixed gotos for handling errors of v4l2_ctrl_handler_init()
and media_entity_pads_init().
- It did not set the error code for failures of v4l2_ctrl_new_custom().
- Probe did not free resources in case of failures of
atomisp_register_i2c_module().
The patch fixes all these issues.
Found by Linux Driver Verification project (linuxtesting.org).
Link: https://lore.kernel.org/linux-media/20210810162943.19852-1-novikov@ispras.ru
Signed-off-by: Evgeny Novikov <novikov@ispras.ru>
Reviewed-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 0961ba6dd2 ]
To prevent corrupted frames after starting and stopping the sensor its
datasheet specifies a specific pause sequence to follow:
Stopping:
Set Pause_Restart Bit -> Set Restart Bit -> Set Chip_Enable Off
Restarting:
Set Chip_Enable On -> Clear Pause_Restart Bit
The Restart Bit is cleared automatically and must not be cleared
manually as this would cause undefined behavior.
Signed-off-by: Dirk Bender <d.bender@phytec.de>
Signed-off-by: Stefan Riedmueller <s.riedmueller@phytec.de>
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit ffd2f89ad0 ]
Whenever the interface is brought up/down then set_rx_mode
function is called by the stack which enables promisc/allmulti
MCAM entries. But there are cases when driver brings
interface down and then up such as while changing number
of channels. In these cases promisc/allmulti MCAM entries
are left disabled as set_rx_mode callback is not called.
This patch enables these MCAM entries in all such cases.
Signed-off-by: Rakesh Babu <rsaladi2@marvell.com>
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 86a03dad0f ]
For fragmented packets, ath11k reassembles each fragment as a normal
packet and then reinjects it into HW ring. In this case, the DMA
direction should be DMA_TO_DEVICE, not DMA_FROM_DEVICE, otherwise
invalid payload will be reinjected to HW and then delivered to host.
What is more, since arbitrary memory could be allocated to the frame, we
don't know what kind of data is contained in the buffer reinjected.
Thus, as a bad result, private info may be leaked.
Note that this issue is only found on Intel platform.
Tested-on: QCA6390 hw2.0 PCI WLAN.HST.1.0.1-01740-QCAHSTSWPLZ_V2_TO_X86-1
Signed-off-by: Baochen Qiang <bqiang@codeaurora.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20210916064617.20006-1-bqiang@codeaurora.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 441b3b5911 ]
When wlan interface is up, 11d scan is sent to the firmware, and the
firmware needs to spend couple of seconds to complete the 11d scan. If
immediately a normal scan from user space arrives to ath11k, then the
normal scan request is also sent to the firmware, but the scan started
event will be reported to ath11k until the 11d scan complete. When timed
out for the scan started in ath11k, ath11k stops the normal scan and the
firmware reports WMI_SCAN_EVENT_DEQUEUED to ath11k for the normal scan.
ath11k has no handler for the event and then timed out for the scan
completed in ath11k_scan_stop(), and ath11k prints the following error
message.
[ 1491.604750] ath11k_pci 0000:02:00.0: failed to receive scan abort comple: timed out
[ 1491.604756] ath11k_pci 0000:02:00.0: failed to stop scan: -110
[ 1491.604758] ath11k_pci 0000:02:00.0: failed to start hw scan: -110
Add a handler for WMI_SCAN_EVENT_DEQUEUED and then complete the scan to
get rid of the above error message.
Tested-on: WCN6855 hw2.0 PCI WLAN.HSP.1.1-01720.1-QCAHSPSWPL_V1_V2_SILICONZ_LITE-1
Signed-off-by: Wen Gong <wgong@codeaurora.org>
Signed-off-by: Jouni Malinen <jouni@codeaurora.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20210914164226.38843-1-jouni@codeaurora.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 69a0fcf8a9 ]
During firmware recovery, the default reg rules which are
received via WMI_REG_CHAN_LIST_CC_EVENT can overwrite
the currently configured user regd.
See below snap for example,
root@OpenWrt:/# iw reg get | grep country
country FR: DFS-ETSI
country FR: DFS-ETSI
country FR: DFS-ETSI
country FR: DFS-ETSI
root@OpenWrt:/# echo assert > /sys/kernel/debug/ath11k/ipq8074\ hw2.0/simulate_f
w_crash
<snip>
[ 5290.471696] ath11k c000000.wifi1: pdev 1 successfully recovered
root@OpenWrt:/# iw reg get | grep country
country FR: DFS-ETSI
country US: DFS-FCC
country US: DFS-FCC
country US: DFS-FCC
In the above, the user configured country 'FR' is overwritten
when the rules of default country 'US' are received and updated during
recovery. Hence avoid processing of these rules in general
during firmware recovery as they have been already applied during
driver registration or after last set user country is configured.
This scenario applies for both AP and STA devices basically because
cfg80211 is not aware of the recovery and only the driver recovers, but
changing or resetting of the reg domain during recovery is not needed so
as to continue with the configured regdomain currently in use.
Tested-on: IPQ8074 hw2.0 AHB WLAN.HK.2.4.0.1-01460-QCAHKSWPL_SILICONZ-1
Signed-off-by: Sriram R <srirrama@codeaurora.org>
Signed-off-by: Jouni Malinen <jouni@codeaurora.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20210721212029.142388-3-jouni@codeaurora.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b69c99463d ]
The purpose of this test is to verify that after a short activity passes,
the reported time is reasonable: not zero (which could be reported by
mistake), and not something outrageous (which would be indicative of an
issue in used units).
However, the idle time is reported in units of clock_t, or hundredths of
second. If the initial sequence of commands is very quick, it is possible
that the idle time is reported as just flat-out zero. When this test was
recently enabled in our nightly regression, we started seeing spurious
failures for exactly this reason.
Therefore buffer the delay leading up to the test with a sleep, to make
sure there is no legitimate way of reporting 0.
Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 7f595d6a6c ]
fscrypt currently requires a 512-bit master key when AES-256-XTS is
used, since AES-256-XTS keys are 512-bit and fscrypt requires that the
master key be at least as long any key that will be derived from it.
However, this is overly strict because AES-256-XTS doesn't actually have
a 512-bit security strength, but rather 256-bit. The fact that XTS
takes twice the expected key size is a quirk of the XTS mode. It is
sufficient to use 256 bits of entropy for AES-256-XTS, provided that it
is first properly expanded into a 512-bit key, which HKDF-SHA512 does.
Therefore, relax the check of the master key size to use the security
strength of the derived key rather than the size of the derived key
(except for v1 encryption policies, which don't use HKDF).
Besides making things more flexible for userspace, this is needed in
order for the use of a KDF which only takes a 256-bit key to be
introduced into the fscrypt key hierarchy. This will happen with
hardware-wrapped keys support, as all known hardware which supports that
feature uses an SP800-108 KDF using AES-256-CMAC, so the wrapped keys
are wrapped 256-bit AES keys. Moreover, there is interest in fscrypt
supporting the same type of AES-256-CMAC based KDF in software as an
alternative to HKDF-SHA512. There is no security problem with such
features, so fix the key length check to work properly with them.
Reviewed-by: Paul Crowley <paulcrowley@google.com>
Link: https://lore.kernel.org/r/20210921030303.5598-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 5fa6863ba6 ]
Currently for SPI devices we use the spi_device_id for module autoloading
even on systems using device tree, meaning that listing a compatible string
in the of_match_table isn't enough to have the module for a SPI driver
autoloaded.
We attempted to fix this by generating OF based modaliases for devices
instantiated from DT in 3ce6c9e261 ("spi: add of_device_uevent_modalias
support") but this meant we no longer reported spi_device_id based aliases
which broke drivers such as spi-nor which don't list all the compatible
strings they support directly for DT, and in at least that case it's not
super practical to do so given the very large number of compatibles
needed, much larger than the number spi_device_ids due to vendor strings.
As a result fell back to using spi_device_id based modalises.
Try to close the gap by printing a warning when a SPI driver has a DT
compatible that won't be matched as a SPI device ID with the goal of having
drivers provide both. Given fallback compatibles this check is going to be
excessive but it should be robust which is probably more important here.
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20210921192149.50740-1-broonie@kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit c606008b70 ]
When creating a new virtual interface in mwifiex_add_virtual_intf(), we
update our internal driver states like bss_type, bss_priority, bss_role
and bss_mode to reflect the mode the firmware will be set to.
When switching virtual interface mode using
mwifiex_init_new_priv_params() though, we currently only update bss_mode
and bss_role. In order for the interface mode switch to actually work,
we also need to update bss_type to its proper value, so do that.
This fixes a crash of the firmware (because the driver tries to execute
commands that are invalid in AP mode) when switching from station mode
to AP mode.
Signed-off-by: Jonas Dreßler <verdre@v0yd.nl>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20210914195909.36035-9-verdre@v0yd.nl
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit c2e9666cdf ]
We currently handle changing from the P2P to the STATION virtual
interface type slightly different than changing from P2P to ADHOC: When
changing to STATION, we don't send the SET_BSS_MODE command. We do send
that command on all other type-changes though, and it probably makes
sense to send the command since after all we just changed our BSS_MODE.
Looking at prior changes to this part of the code, it seems that this is
simply a leftover from old refactorings.
Since sending the SET_BSS_MODE command is the only difference between
mwifiex_change_vif_to_sta_adhoc() and the current code, we can now use
mwifiex_change_vif_to_sta_adhoc() for both switching to ADHOC and
STATION interface type.
This does not fix any particular bug and just "looked right", so there's
a small chance it might be a regression.
Signed-off-by: Jonas Dreßler <verdre@v0yd.nl>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20210914195909.36035-4-verdre@v0yd.nl
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 44b979fa30 ]
Current code has an explicit check for hitting the task stack guard;
but overflowing any of the other stacks will get you a non-descript
general #DF warning.
Improve matters by using get_stack_info_noinstr() to detetrmine if and
which stack guard page got hit, enabling a better stack warning.
In specific, Michael Wang reported what turned out to be an NMI
exception stack overflow, which is now clearly reported as such:
[] BUG: NMI stack guard page was hit at 0000000085fd977b (stack is 000000003a55b09e..00000000d8cce1a5)
Reported-by: Michael Wang <yun.wang@linux.alibaba.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Michael Wang <yun.wang@linux.alibaba.com>
Link: https://lkml.kernel.org/r/YUTE/NuqnaWbST8n@hirez.programming.kicks-ass.net
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit a2d3cbc80d ]
In the code for xts_crypt(), we check for the err value returned by
skcipher_walk_virt() and return from the function if it is non zero.
However, skcipher_walk_virt() can set walk.nbytes to 0, which would cause
us to call kernel_fpu_begin(), and then skip the kernel_fpu_end() call.
This patch checks for the walk.nbytes value instead, and returns if
walk.nbytes is 0. This prevents us from calling kernel_fpu_begin() in
the first place and also covers the case of having a non zero err value
returned from skcipher_walk_virt().
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Shreyansh Chouhan <chouhan.shreyansh630@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit feab5bb8f1 ]
pdev_id in structure 'wmi_pdev_bss_chan_info_event' is wrongly placed
at the beginning. This causes invalid values in survey dump. Hence, align
the structure with the firmware.
Note: The firmware releases follow this order since the feature was
implemented. Also, it is not changing across the branches including
QCA6390.
Tested-on: IPQ8074 hw2.0 AHB WLAN.HK.2.1.0.1-01228-QCAHKSWPL_SILICONZ-1
Signed-off-by: Ritesh Singh <ritesi@codeaurora.org>
Signed-off-by: Seevalamuthu Mariappan <seevalam@codeaurora.org>
Signed-off-by: Jouni Malinen <jouni@codeaurora.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20210720214922.118078-3-jouni@codeaurora.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 0db7c32ad3 ]
Early in debugging, it made some sense to differentiate the first
iteration from subsequent iterations, but now this just causes confusion.
This commit therefore moves the "set_tasks_gp_state(rtp, RTGS_WAIT_CBS)"
statement to the beginning of the "for" loop in rcu_tasks_kthread().
Reported-by: Neeraj Upadhyay <neeraju@codeaurora.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 1e080f1775 ]
mq / mqprio make the default child qdiscs visible. They only do
so for the qdiscs which are within real_num_tx_queues when the
device is registered. Depending on order of calls in the driver,
or if user space changes config via ethtool -L the number of
qdiscs visible under tc qdisc show will differ from the number
of queues. This is confusing to users and potentially to system
configuration scripts which try to make sure qdiscs have the
right parameters.
Add a new Qdisc_ops callback and make relevant qdiscs TTRT.
Note that this uncovers the "shortcut" created by
commit 1f27cde313 ("net: sched: use pfifo_fast for non real queues")
The default child qdiscs beyond initial real_num_tx are always
pfifo_fast, no matter what the sysfs setting is. Fixing this
gets a little tricky because we'd need to keep a reference
on whatever the default qdisc was at the time of creation.
In practice this is likely an non-issue the qdiscs likely have
to be configured to non-default settings, so whatever user space
is doing such configuration can replace the pfifos... now that
it will see them.
Reported-by: Matthew Massey <matthewmassey@fb.com>
Reviewed-by: Dave Taht <dave.taht@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 5ca9ce2ba4 ]
Different SoCs have a different number of channels, e.g .:
* amazon-se has 10 channels,
* danube+ar9 have 20 channels,
* vr9 has 28 channels,
* ar10 has 24 channels.
We can read the ID register and, depending on the reported
number of channels, reset the appropriate number of channels.
Signed-off-by: Aleksander Jan Bajkowski <olek2@wp.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit c12aa581f6 ]
Reading the DMA registers immediately after the reset causes
Data Bus Error. Adding a small delay fixes this issue.
Signed-off-by: Aleksander Jan Bajkowski <olek2@wp.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 71921a9606 ]
rcutorture is generating some nesting scenarios that are not compatible on PREEMPT_RT.
For example:
preempt_disable();
rcu_read_lock_bh();
preempt_enable();
rcu_read_unlock_bh();
The problem here is that on PREEMPT_RT the bottom halves have to be
disabled and enabled in preemptible context.
Reorder locking: start with BH locking and continue with then with
disabling preemption or interrupts. In the unlocking do it reverse by
first enabling interrupts and preemption and BH at the very end.
Ensure that on PREEMPT_RT BH locking remains unchanged if in
non-preemptible context.
Link: https://lkml.kernel.org/r/20190911165729.11178-6-swood@redhat.com
Link: https://lkml.kernel.org/r/20210819182035.GF4126399@paulmck-ThinkPad-P17-Gen-1
Signed-off-by: Scott Wood <swood@redhat.com>
[bigeasy: Drop ATOM_BH, make it only about changing BH in atomic
context. Allow enabling RCU in IRQ-off section. Reword commit message.]
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 99c23da0ee ]
The sco_send_frame() also takes lock_sock() during memcpy_from_msg()
call that may be endlessly blocked by a task with userfaultd
technique, and this will result in a hung task watchdog trigger.
Just like the similar fix for hci_sock_sendmsg() in commit
92c685dc5de0 ("Bluetooth: reorganize functions..."), this patch moves
the memcpy_from_msg() out of lock_sock() for addressing the hang.
This should be the last piece for fixing CVE-2021-3640 after a few
already queued fixes.
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 820a2ab23d ]
2 improvements to the Lenovo Ideapad D330 panel-orientation quirks:
1. Some versions of the Lenovo Ideapad D330 have a DMI_PRODUCT_NAME of
"81H3" and others have "81MD". Testing has shown that the "81MD" also has
a 90 degree mounted panel. Drop the DMI_PRODUCT_NAME from the existing
quirk so that the existing quirk matches both variants.
2. Some of the Lenovo Ideapad D330 models have a HD (800x1280) screen
instead of a FHD (1200x1920) screen (both are mounted right-side-up) add
a second Lenovo Ideapad D330 quirk for the HD version.
Changes in v2:
- Add a new quirk for Lenovo Ideapad D330 models with a HD screen instead
of a FHD screen
Link: https://github.com/systemd/systemd/pull/18884
Acked-by: Simon Ser <contact@emersion.fr>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210530110428.12994-2-hdegoede@redhat.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f492283b15 ]
It is expected from the clients to follow the below steps on an imported
dmabuf fd:
a) dmabuf = dma_buf_get(fd) // Get the dmabuf from fd
b) dma_buf_attach(dmabuf); // Clients attach to the dmabuf
o Here the kernel does some slab allocations, say for
dma_buf_attachment and may be some other slab allocation in the
dmabuf->ops->attach().
c) Client may need to do dma_buf_map_attachment().
d) Accordingly dma_buf_unmap_attachment() should be called.
e) dma_buf_detach () // Clients detach to the dmabuf.
o Here the slab allocations made in b) are freed.
f) dma_buf_put(dmabuf) // Can free the dmabuf if it is the last
reference.
Now say an erroneous client failed at step c) above thus it directly
called dma_buf_put(), step f) above. Considering that it may be the last
reference to the dmabuf, buffer will be freed with pending attachments
left to the dmabuf which can show up as the 'memory leak'. This should
at least be reported as the WARN().
Signed-off-by: Charan Teja Reddy <charante@codeaurora.org>
Reviewed-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1627043468-16381-1-git-send-email-charante@codeaurora.org
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit c87761db21 upstream.
In current code, the devres group for aggregate master is left open
after call to component_master_add_*(). This leads to problems when the
master does further managed allocations on its own. When any
participating driver calls component_del(), this leads to immediate
release of resources.
This came up when investigating a page fault occurring with i915 DRM
driver unbind with 5.15-rc1 kernel. The following sequence occurs:
i915_pci_remove()
-> intel_display_driver_unregister()
-> i915_audio_component_cleanup()
-> component_del()
-> component.c:take_down_master()
-> hdac_component_master_unbind() [via master->ops->unbind()]
-> devres_release_group(master->parent, NULL)
With older kernels this has not caused issues, but with audio driver
moving to use managed interfaces for more of its allocations, this no
longer works. Devres log shows following to occur:
component_master_add_with_match()
[ 126.886032] snd_hda_intel 0000:00:1f.3: DEVRES ADD 00000000323ccdc5 devm_component_match_release (24 bytes)
[ 126.886045] snd_hda_intel 0000:00:1f.3: DEVRES ADD 00000000865cdb29 grp< (0 bytes)
[ 126.886049] snd_hda_intel 0000:00:1f.3: DEVRES ADD 000000001b480725 grp< (0 bytes)
audio driver completes its PCI probe()
[ 126.892238] snd_hda_intel 0000:00:1f.3: DEVRES ADD 000000001b480725 pcim_iomap_release (48 bytes)
component_del() called() at DRM/i915 unbind()
[ 137.579422] i915 0000:00:02.0: DEVRES REL 00000000ef44c293 grp< (0 bytes)
[ 137.579445] snd_hda_intel 0000:00:1f.3: DEVRES REL 00000000865cdb29 grp< (0 bytes)
[ 137.579458] snd_hda_intel 0000:00:1f.3: DEVRES REL 000000001b480725 pcim_iomap_release (48 bytes)
So the "devres_release_group(master->parent, NULL)" ends up freeing the
pcim_iomap allocation. Upon next runtime resume, the audio driver will
cause a page fault as the iomap alloc was released without the driver
knowing about it.
Fix this issue by using the "struct master" pointer as identifier for
the devres group, and by closing the devres group after
the master->ops->bind() call is done. This allows devres allocations
done by the driver acting as master to be isolated from the binding state
of the aggregate driver. This modifies the logic originally introduced in
commit 9e1ccb4a77 ("drivers/base: fix devres handling for master device")
Fixes: 9e1ccb4a77 ("drivers/base: fix devres handling for master device")
Cc: stable@vger.kernel.org
Acked-by: Imre Deak <imre.deak@intel.com>
Acked-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Kai Vehmanen <kai.vehmanen@linux.intel.com>
BugLink: https://gitlab.freedesktop.org/drm/intel/-/issues/4136
Link: https://lore.kernel.org/r/20211013161345.3755341-1-kai.vehmanen@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0cf48167b8 upstream.
The gauge requires us to clear the status bits manually for some alerts
to be properly dismissed. Previously the IRQ was configured to react only
on falling edge, which wasn't technically correct (the ALRT line is active
low), but it had a happy side-effect of preventing interrupt storms
on uncleared alerts from happening.
Fixes: 7fbf6b731b ("power: supply: max17042: Do not enforce (incorrect) interrupt trigger type")
Cc: <stable@vger.kernel.org>
Signed-off-by: Sebastian Krzyszkowiak <sebastian.krzyszkowiak@puri.sm>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9aaa81c336 upstream.
Chipidea core was calling the interrupt handler from non-IRQ context
with interrupts enabled, something which can lead to a deadlock if
there's an actual interrupt trying to take a lock that's already held
(e.g. the controller lock in udc_irq()).
Add a wrapper that can be used to fake interrupts instead of calling the
handler directly.
Fixes: 3ecb3e09b0 ("usb: chipidea: Use extcon framework for VBUS and ID detect")
Fixes: 876d4e1e82 ("usb: chipidea: core: add wakeup support for extcon")
Cc: Peter Chen <peter.chen@kernel.org>
Cc: stable@vger.kernel.org # 4.4
Signed-off-by: Johan Hovold <johan@kernel.org>
Link: https://lore.kernel.org/r/20211021083447.20078-1-johan@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 26df977a90 upstream.
The bindings file for this driver is defining the property as 'reg' but
the driver was reading it with the 'num' name. The bindings actually had
the 'num' property when added in
commit ea52c21268 ("dt-bindings: iio: dac: Add docs for AD5770R DAC")
and then changed it to 'reg' in
commit 2cf3818f18 ("dt-bindings: iio: dac: AD5570R fix bindings errors").
However, both these commits landed in v5.7 so the assumption is
that either 'num' is not being used or if it is, the validations were not
done.
Anyways, if someone comes back yelling about this, we might just support
both of the properties in the future. Not ideal, but that's life...
Fixes: 2cf3818f18 ("dt-bindings: iio: dac: AD5570R fix bindings errors")
Signed-off-by: Nuno Sá <nuno.sa@analog.com>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Link: https://lore.kernel.org/r/20210818080525.62790-1-nuno.sa@analog.com
Cc: Stable@vger.kernel.org
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 558df982d4 upstream.
On success i2c_master_send() returns the number of bytes written. The
call from iio_write_channel_info(), however, expects the return value to
be zero on success.
This bug causes incorrect consumption of the sysfs buffer in
iio_write_channel_info(). When writing more than two characters to
out_voltage0_raw, the ad5446 write handler is called multiple times
causing unexpected behavior.
Fixes: 3ec36a2cf0 ("iio:ad5446: Add support for I2C based DACs")
Signed-off-by: Pekka Korpinen <pekka.korpinen@iki.fi>
Link: https://lore.kernel.org/r/20210929185755.2384-1-pekka.korpinen@iki.fi
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4ac0536f88 upstream.
With commit 506c1da44f ("cifs: use the expiry output of dns_query to
schedule next resolution") and after triggering the first reconnect,
the next async dns resolution of tcp server's hostname would be
scheduled based on dns_resolver's key expiry default, which happens to
default to 5s on most systems that use key.dns_resolver for upcall.
As per key.dns_resolver.conf(5):
default_ttl=<number>
The number of seconds to set as the expiration on a cached
record. This will be overridden if the program manages to re-
trieve TTL information along with the addresses (if, for exam-
ple, it accesses the DNS directly). The default is 5 seconds.
The value must be in the range 1 to INT_MAX.
Make the next async dns resolution no shorter than 120s as we do not
want to be upcalling too often.
Cc: stable@vger.kernel.org
Fixes: 506c1da44f ("cifs: use the expiry output of dns_query to schedule next resolution")
Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Reviewed-by: Shyam Prasad N <sprasad@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7be3248f31 upstream.
We generally rely on a bunch of factors to differentiate between servers.
For example, IP address, port etc.
For certain server types (like Azure), it is important to make sure
that the server hostname matches too, even if the both hostnames currently
resolve to the same IP address.
Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
Cc: stable@vger.kernel.org
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9bf3d20331 upstream.
The block number in the quota tree on disk should be smaller than the
v2_disk_dqinfo.dqi_blocks. If the quota file was corrupted, we may be
allocating an 'allocated' block and that would lead to a loop in a tree,
which will probably trigger oops later. This patch adds a check for the
block number in the quota tree to prevent such potential issue.
Link: https://lore.kernel.org/r/20211008093821.1001186-2-yi.zhang@huawei.com
Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
Cc: stable@kernel.org
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 84e1b4045d upstream.
Aardvark controller has something like config space of a Root Port
available at offset 0x0 of internal registers - these registers are used
for implementation of the emulated bridge.
The default value of Class Code of this bridge corresponds to a RAID Mass
storage controller, though. (This is probably intended for when the
controller is used as Endpoint.)
Change the Class Code to correspond to a PCI Bridge.
Add comment explaining this change.
Link: https://lore.kernel.org/r/20211028185659.20329-6-kabel@kernel.org
Fixes: 8a3ebd8de3 ("PCI: aardvark: Implement emulated root PCI bridge config space")
Signed-off-by: Pali Rohár <pali@kernel.org>
Signed-off-by: Marek Behún <kabel@kernel.org>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 771153fc88 upstream.
From very vague, ambiguous and incomplete information from Marvell we
deduced that the 32-bit Aardvark register at address 0x4
(PCIE_CORE_CMD_STATUS_REG), which is not documented for Root Complex mode
in the Functional Specification (only for Endpoint mode), controls two
16-bit PCIe registers: Command Register and Status Registers of PCIe Root
Port.
This means that bit 2 controls bus mastering and forwarding of memory and
I/O requests in the upstream direction. According to PCI specifications
bits [0:2] of Command Register, this should be by default disabled on
reset. So explicitly disable these bits at early setup of the Aardvark
driver.
Remove code which unconditionally enables all 3 bits and let kernel code
(via pci_set_master() function) to handle bus mastering of Root PCIe
Bridge via emulated PCI_COMMAND on emulated bridge.
Link: https://lore.kernel.org/r/20211028185659.20329-5-kabel@kernel.org
Fixes: 8a3ebd8de3 ("PCI: aardvark: Implement emulated root PCI bridge config space")
Signed-off-by: Pali Rohár <pali@kernel.org>
Signed-off-by: Marek Behún <kabel@kernel.org>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: stable@vger.kernel.org # b2a56469d5 ("PCI: aardvark: Add FIXME comment for PCIE_CORE_CMD_STATUS_REG access")
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 46ef6090db upstream.
Commit 366697018c ("PCI: aardvark: Add PHY support") introduced
configuration of PCIe Reference clock via PCIE_CORE_REF_CLK_REG register,
but did it incorrectly.
PCIe Reference clock differential pair is routed from system board to
endpoint card, so on CPU side it has output direction. Therefore it is
required to enable transmitting and disable receiving.
Default configuration according to Armada 3700 Functional Specifications is
enabled receiver part and disabled transmitter.
We need this change because otherwise PCIe Reference clock is configured to
some undefined state when differential pair is used for both transmitting
and receiving.
Fix this by disabling receiver part.
Link: https://lore.kernel.org/r/20211005180952.6812-6-kabel@kernel.org
Fixes: 366697018c ("PCI: aardvark: Add PHY support")
Signed-off-by: Pali Rohár <pali@kernel.org>
Signed-off-by: Marek Behún <kabel@kernel.org>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Reviewed-by: Marek Behún <kabel@kernel.org>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 661c399a65 upstream.
Current implementation of advk_pcie_link_up() is wrong as it marks also
link disabled or hot reset states as link up.
Fix it by marking link up only to those states which are defined in PCIe
Base specification 3.0, Table 4-14: Link Status Mapped to the LTSSM.
To simplify implementation, Define macros for every LTSSM state which
aardvark hardware can return in CFG_REG register.
Fix also checking for link training according to the same Table 4-14.
Define a new function advk_pcie_link_training() for this purpose.
Link: https://lore.kernel.org/r/20211005180952.6812-13-kabel@kernel.org
Fixes: 8c39d71036 ("PCI: aardvark: Add Aardvark PCI host controller driver")
Signed-off-by: Pali Rohár <pali@kernel.org>
Signed-off-by: Marek Behún <kabel@kernel.org>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Reviewed-by: Marek Behún <kabel@kernel.org>
Cc: stable@vger.kernel.org
Cc: Remi Pommarel <repk@triplefau.lt>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a7ca6d7fa3 upstream.
The PCIE_ISR1_REG says which interrupts are currently set / active,
including those which are masked.
The driver currently reads this register and looks if some unmasked
interrupts are active, and if not, it clears status bits of _all_
interrupts, including the masked ones.
This is incorrect, since, for example, some drivers may poll these bits.
Remove this clearing, and also remove this early return statement
completely, since it does not change functionality in any way.
Link: https://lore.kernel.org/r/20211005180952.6812-7-kabel@kernel.org
Fixes: 8c39d71036 ("PCI: aardvark: Add Aardvark PCI host controller driver")
Signed-off-by: Pali Rohár <pali@kernel.org>
Signed-off-by: Marek Behún <kabel@kernel.org>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Reviewed-by: Marek Behún <kabel@kernel.org>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7a41ae80bd upstream.
The pci_bridge_emul_conf_write() function correctly clears W1C bits in
cfgspace cache, but it does not inform the underlying implementation
about the clear request: the .write_op() method is given the value with
these bits cleared.
This is wrong if the .write_op() needs to know which bits were requested
to be cleared.
Fix the value to be passed into the .write_op() method to have requested
W1C bits set, so that it can clear them.
Both pci-bridge-emul users (mvebu and aardvark) are compatible with this
change.
Link: https://lore.kernel.org/r/20211028185659.20329-2-kabel@kernel.org
Fixes: 23a5fba4d9 ("PCI: Introduce PCI bridge emulated config space common logic")
Signed-off-by: Pali Rohár <pali@kernel.org>
Signed-off-by: Marek Behún <kabel@kernel.org>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: stable@vger.kernel.org
Cc: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9a25440376 upstream.
Example for triggering use after free in a overlay on ext4 setup:
aio_read
ovl_read_iter
vfs_iter_read
ext4_file_read_iter
ext4_dio_read_iter
iomap_dio_rw -> -EIOCBQUEUED
/*
* Here IO is completed in a separate thread,
* ovl_aio_cleanup_handler() frees aio_req which has iocb embedded
*/
file_accessed(iocb->ki_filp); /**BOOM**/
Fix by introducing a refcount in ovl_aio_req similarly to aio_kiocb. This
guarantees that iocb is only freed after vfs_read/write_iter() returns on
underlying fs.
Fixes: 2406a307ac ("ovl: implement async IO routines")
Signed-off-by: yangerkun <yangerkun@huawei.com>
Link: https://lore.kernel.org/r/20210930032228.3199690-3-yangerkun@huawei.com/
Cc: <stable@vger.kernel.org> # v5.6
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 40fdea0284 upstream.
When running as PVH or HVM guest with actual memory < max memory the
hypervisor is using "populate on demand" in order to allow the guest
to balloon down from its maximum memory size. For this to work
correctly the guest must not touch more memory pages than its target
memory size as otherwise the PoD cache will be exhausted and the guest
is crashed as a result of that.
In extreme cases ballooning down might not be finished today before
the init process is started, which can consume lots of memory.
In order to avoid random boot crashes in such cases, add a late init
call to wait for ballooning down having finished for PVH/HVM guests.
Warn on console if initial ballooning fails, panic() after stalling
for more than 3 minutes per default. Add a module parameter for
changing this timeout.
[boris: replaced pr_info() with pr_notice()]
Cc: <stable@vger.kernel.org>
Reported-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Link: https://lore.kernel.org/r/20211102091944.17487-1-jgross@suse.com
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7444d706be upstream.
The driver no longer depends on this option, but it fails to
build if it's disabled because the skb->tc_skip_classify is
hidden behind an #ifdef:
drivers/net/ifb.c:81:8: error: no member named 'tc_skip_classify' in 'struct sk_buff'
skb->tc_skip_classify = 1;
Use the same #ifdef around the assignment.
Fixes: 046178e726 ("ifb: Depend on netfilter alternatively to tc")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 027b57170b upstream.
Since commit edc6afc549 ("tty: switch to ktermios and new framework")
termios speed is no longer stored only in c_cflag member but also in new
additional c_ispeed and c_ospeed members. If BOTHER flag is set in c_cflag
then termios speed is stored only in these new members.
Therefore to correctly restore termios speed it is required to store also
ispeed and ospeed members, not only cflag member.
In case only cflag member with BOTHER flag is restored then functions
tty_termios_baud_rate() and tty_termios_input_baud_rate() returns baudrate
stored in c_ospeed / c_ispeed member, which is zero as it was not restored
too. If reported baudrate is invalid (e.g. zero) then serial core functions
report fallback baudrate value 9600. So it means that in this case original
baudrate is lost and kernel changes it to value 9600.
Simple reproducer of this issue is to boot kernel with following command
line argument: "console=ttyXXX,86400" (where ttyXXX is the device name).
For speed 86400 there is no Bnnn constant and therefore kernel has to
represent this speed via BOTHER c_cflag. Which means that speed is stored
only in c_ospeed and c_ispeed members, not in c_cflag anymore.
If bootloader correctly configures serial device to speed 86400 then kernel
prints boot log to early console at speed speed 86400 without any issue.
But after kernel starts initializing real console device ttyXXX then speed
is changed to fallback value 9600 because information about speed was lost.
This patch fixes above issue by storing and restoring also ispeed and
ospeed members, which are required for BOTHER flag.
Fixes: edc6afc549 ("[PATCH] tty: switch to ktermios and new framework")
Cc: stable@vger.kernel.org
Signed-off-by: Pali Rohár <pali@kernel.org>
Link: https://lore.kernel.org/r/20211002130900.9518-1-pali@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 51d1579466 upstream.
The resetting of the entire ring buffer use to simply go through and reset
each individual CPU buffer that had its own protection and synchronization.
But this was very slow, due to performing a synchronization for each CPU.
The code was reshuffled to do one disabling of all CPU buffers, followed
by a single RCU synchronization, and then the resetting of each of the CPU
buffers. But unfortunately, the mutex that prevented multiple occurrences
of resetting the buffer was not moved to the upper function, and there is
nothing to protect from it.
Take the ring buffer mutex around the global reset.
Cc: stable@vger.kernel.org
Fixes: b23d7a5f4a ("ring-buffer: speed up buffer resets by avoiding synchronize_rcu for each CPU")
Reported-by: "Tzvetomir Stoyanov (VMware)" <tz.stoyanov@gmail.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 67f4b9969c upstream.
Always check vmcs01's MSR bitmap when merging L0 and L1 bitmaps for L2,
and always update the relevant bits in vmcs02. This fixes two distinct,
but intertwined bugs related to dynamic MSR bitmap modifications.
The first issue is that KVM fails to enable MSR interception in vmcs02
for the FS/GS base MSRs if L1 first runs L2 with interception disabled,
and later enables interception.
The second issue is that KVM fails to honor userspace MSR filtering when
preparing vmcs02.
Fix both issues simultaneous as fixing only one of the issues (doesn't
matter which) would create a mess that no one should have to bisect.
Fixing only the first bug would exacerbate the MSR filtering issue as
userspace would see inconsistent behavior depending on the whims of L1.
Fixing only the second bug (MSR filtering) effectively requires fixing
the first, as the nVMX code only knows how to transition vmcs02's
bitmap from 1->0.
Move the various accessor/mutators that are currently buried in vmx.c
into vmx.h so that they can be shared by the nested code.
Fixes: 1a155254ff ("KVM: x86: Introduce MSR filtering")
Fixes: d69129b4e4 ("KVM: nVMX: Disable intercept for FS/GS base MSRs in vmcs02 when possible")
Cc: stable@vger.kernel.org
Cc: Alexander Graf <graf@amazon.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20211109013047.2041518-3-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7dfbc624eb upstream.
Check the current VMCS controls to determine if an MSR write will be
intercepted due to MSR bitmaps being disabled. In the nested VMX case,
KVM will disable MSR bitmaps in vmcs02 if they're disabled in vmcs12 or
if KVM can't map L1's bitmaps for whatever reason.
Note, the bad behavior is relatively benign in the current code base as
KVM sets all bits in vmcs02's MSR bitmap by default, clears bits if and
only if L0 KVM also disables interception of an MSR, and only uses the
buggy helper for MSR_IA32_SPEC_CTRL. Because KVM explicitly tests WRMSR
before disabling interception of MSR_IA32_SPEC_CTRL, the flawed check
will only result in KVM reading MSR_IA32_SPEC_CTRL from hardware when it
isn't strictly necessary.
Tag the fix for stable in case a future fix wants to use
msr_write_intercepted(), in which case a buggy implementation in older
kernels could prove subtly problematic.
Fixes: d28b387fb7 ("KVM/VMX: Allow direct access to MSR_IA32_SPEC_CTRL")
Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20211109013047.2041518-2-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7e2175ebd6 upstream.
In commit b043138246 ("x86/KVM: Make sure KVM_VCPU_FLUSH_TLB flag is
not missed") we switched to using a gfn_to_pfn_cache for accessing the
guest steal time structure in order to allow for an atomic xchg of the
preempted field. This has a couple of problems.
Firstly, kvm_map_gfn() doesn't work at all for IOMEM pages when the
atomic flag is set, which it is in kvm_steal_time_set_preempted(). So a
guest vCPU using an IOMEM page for its steal time would never have its
preempted field set.
Secondly, the gfn_to_pfn_cache is not invalidated in all cases where it
should have been. There are two stages to the GFN->PFN conversion;
first the GFN is converted to a userspace HVA, and then that HVA is
looked up in the process page tables to find the underlying host PFN.
Correct invalidation of the latter would require being hooked up to the
MMU notifiers, but that doesn't happen---so it just keeps mapping and
unmapping the *wrong* PFN after the userspace page tables change.
In the !IOMEM case at least the stale page *is* pinned all the time it's
cached, so it won't be freed and reused by anyone else while still
receiving the steal time updates. The map/unmap dance only takes care
of the KVM administrivia such as marking the page dirty.
Until the gfn_to_pfn cache handles the remapping automatically by
integrating with the MMU notifiers, we might as well not get a
kernel mapping of it, and use the perfectly serviceable userspace HVA
that we already have. We just need to implement the atomic xchg on
the userspace address with appropriate exception handling, which is
fairly trivial.
Cc: stable@vger.kernel.org
Fixes: b043138246 ("x86/KVM: Make sure KVM_VCPU_FLUSH_TLB flag is not missed")
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Message-Id: <3645b9b889dac6438394194bb5586a46b68d581f.camel@infradead.org>
[I didn't entirely agree with David's assessment of the
usefulness of the gfn_to_pfn cache, and integrated the outcome
of the discussion in the above commit message. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8bb084119f upstream.
Since ARMv8.0 the upper 32 bits of ESR_ELx have been RES0, and recently
some of the upper bits gained a meaning and can be non-zero. For
example, when FEAT_LS64 is implemented, ESR_ELx[36:32] contain ISS2,
which for an ST64BV or ST64BV0 can be non-zero. This can be seen in ARM
DDI 0487G.b, page D13-3145, section D13.2.37.
Generally, we must not rely on RES0 bit remaining zero in future, and
when extracting ESR_ELx.EC we must mask out all other bits.
All C code uses the ESR_ELx_EC() macro, which masks out the irrelevant
bits, and therefore no alterations are required to C code to avoid
consuming irrelevant bits.
In a couple of places the KVM assembly extracts ESR_ELx.EC using LSR on
an X register, and so could in theory consume previously RES0 bits. In
both cases this is for comparison with EC values ESR_ELx_EC_HVC32 and
ESR_ELx_EC_HVC64, for which the upper bits of ESR_ELx must currently be
zero, but this could change in future.
This patch adjusts the KVM vectors to use UBFX rather than LSR to
extract ESR_ELx.EC, ensuring these are robust to future additions to
ESR_ELx.
Cc: stable@vger.kernel.org
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Alexandru Elisei <alexandru.elisei@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Acked-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20211103110545.4613-1-mark.rutland@arm.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 19833c40d0 upstream.
I got the double free report:
BUG: KASAN: double-free or invalid-free in kfree+0xce/0x390
iio_device_unregister_sysfs+0x108/0x13b [industrialio]
iio_dev_release+0x9e/0x10e [industrialio]
device_release+0xa5/0x240
If __iio_device_register() fails, iio_dev_opaque->groups will be freed
in error path in iio_device_unregister_sysfs(), then iio_dev_release()
will call iio_device_unregister_sysfs() again, it causes double free.
Set iio_dev_opaque->groups to NULL when it's freed to fix this double free.
Not this is a local work around for a more general mess around life time
management that will get cleaned up and should make this handling
unnecesarry.
Fixes: 32f171724e ("iio: core: rework iio device group creation")
Reported-by: Hulk Robot <hulkci@huawei.com>
Reviewed-by: Alexandru Ardelean <ardeleanalex@gmail.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Link: https://lore.kernel.org/r/20211013030532.956133-1-yangyingliang@huawei.com
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 223a3b8283 upstream.
On Galaxy S3 (i9300/i9305), which has the max17047 fuel gauge and no
current sense resistor (rsns), the RepSOC register does not provide an
accurate state of charge value. The reported value is wrong, and does
not change over time. VFSOC however, which uses the voltage fuel gauge
to determine the state of charge, always shows an accurate value.
For devices without current sense, VFSOC is already used for the
soc-alert (0x0003 is written to MiscCFG register), so with this change
the source of the alert and the PROP_CAPACITY value match.
Fixes: 359ab9f5b1 ("power_supply: Add MAX17042 Fuel Gauge Driver")
Cc: <stable@vger.kernel.org>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
Suggested-by: Wolfgang Wiedmeyer <wolfgit@wiedmeyer.de>
Signed-off-by: Henrik Grimler <henrik@grimler.se>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b4ebddd654 upstream.
Following the introduction of the generic ECC engine infrastructure, it
was necessary to reorganize the code and move the ECC configuration in
the ->attach_chip() hook. Failing to do that properly lead to a first
series of fixes supposed to stabilize the situation. Unfortunately, this
only fixed the use of software ECC engines, preventing any other kind of
engine to be used, including on-die ones.
It is now time to (finally) fix the situation by ensuring that we still
provide a default (eg. software ECC) but will still support different
ECC engines such as on-die ECC engines if properly described in the
device tree.
There are no changes needed on the core side in order to do this, but we
just need to leverage the logic there which allows:
1- a subsystem default (set to Host engines in the raw NAND world)
2- a driver specific default (here set to software ECC engines)
3- any type of engine requested by the user (ie. described in the DT)
As the raw NAND subsystem has not yet been fully converted to the ECC
engine infrastructure, in order to provide a default ECC engine for this
driver we need to set chip->ecc.engine_type *before* calling
nand_scan(). During the initialization step, the core will consider this
entry as the default engine for this driver. This value may of course
be overloaded by the user if the usual DT properties are provided.
Fixes: b36bf0a0fe ("mtd: rawnand: socrates: Move the ECC initialization to ->attach_chip()")
Cc: stable@vger.kernel.org
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Link: https://lore.kernel.org/linux-mtd/20210928222258.199726-9-miquel.raynal@bootlin.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit dc7e5940aa upstream.
In orininal code, use 2 function spin_lock() and local_irq_save() to
protect the critical zone. But when enable the kernel debug config,
there are below inconsistent lock state detected.
================================
WARNING: inconsistent lock state
5.10.63-yocto-standard #1 Not tainted
--------------------------------
inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
lock_torture_wr/226 [HC0[0]:SC1[5]:HE1:SE0] takes:
ffff002005b2dd80 (&p->access_spinlock){+.?.}-{3:3}, at: qbman_swp_enqueue_multiple_mem_back+0x44/0x270
{SOFTIRQ-ON-W} state was registered at:
lock_acquire.part.0+0xf8/0x250
lock_acquire+0x68/0x84
_raw_spin_lock+0x68/0x90
qbman_swp_enqueue_multiple_mem_back+0x44/0x270
......
cryptomgr_test+0x38/0x60
kthread+0x158/0x164
ret_from_fork+0x10/0x38
irq event stamp: 4498
hardirqs last enabled at (4498): [<ffff800010fcf980>] _raw_spin_unlock_irqrestore+0x90/0xb0
hardirqs last disabled at (4497): [<ffff800010fcffc4>] _raw_spin_lock_irqsave+0xd4/0xe0
softirqs last enabled at (4458): [<ffff8000100108c4>] __do_softirq+0x674/0x724
softirqs last disabled at (4465): [<ffff80001005b2a4>] __irq_exit_rcu+0x190/0x19c
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0
----
lock(&p->access_spinlock);
<Interrupt>
lock(&p->access_spinlock);
*** DEADLOCK ***
So, in order to avoid deadlock, use the combined functions
spin_lock_irqsave/spin_unlock_irqrestore() to protect critical zone.
Fixes: 3b2abda7d2 ("soc: fsl: dpio: Replace QMAN array mode with ring mode enqueue")
Cc: stable@vger.kernel.org
Signed-off-by: Meng Li <Meng.Li@windriver.com>
Signed-off-by: Li Yang <leoyang.li@nxp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e775eb9fc2 upstream.
When enable debug kernel configs,there will be calltrace as below:
BUG: using smp_processor_id() in preemptible [00000000] code: swapper/0/1
caller is debug_smp_processor_id+0x20/0x30
CPU: 6 PID: 1 Comm: swapper/0 Not tainted 5.10.63-yocto-standard #1
Hardware name: NXP Layerscape LX2160ARDB (DT)
Call trace:
dump_backtrace+0x0/0x1a0
show_stack+0x24/0x30
dump_stack+0xf0/0x13c
check_preemption_disabled+0x100/0x110
debug_smp_processor_id+0x20/0x30
dpaa2_io_query_fq_count+0xdc/0x154
dpaa2_eth_stop+0x144/0x314
__dev_close_many+0xdc/0x160
__dev_change_flags+0xe8/0x220
dev_change_flags+0x30/0x70
ic_close_devs+0x50/0x78
ip_auto_config+0xed0/0xf10
do_one_initcall+0xac/0x460
kernel_init_freeable+0x30c/0x378
kernel_init+0x20/0x128
ret_from_fork+0x10/0x38
Based on comment in the context, it doesn't matter whether
preemption is disable or not. So, replace smp_processor_id()
with raw_smp_processor_id() to avoid above call trace.
Fixes: c89105c9b3 ("staging: fsl-mc: Move DPIO from staging to drivers/soc/fsl")
Cc: stable@vger.kernel.org
Signed-off-by: Meng Li <Meng.Li@windriver.com>
Signed-off-by: Li Yang <leoyang.li@nxp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e37ef6dcdb upstream.
Commit 93618e344a ("soc: samsung: exynos-pmu: instantiate clkout
driver as MFD") adds a "devm_mfd_add_devices" call in the exynos-pmu
driver which depends on CONFIG_MFD_CORE. If no driver selects that
config, the build will fail if CONFIG_EXYNOS_PMU is enabled with the
following error:
drivers/soc/samsung/exynos-pmu.c:137: undefined reference to `devm_mfd_add_devices'
Fix this by making CONFIG_EXYNOS_PMU select CONFIG_MFD_CORE.
Fixes: 93618e344a ("soc: samsung: exynos-pmu: instantiate clkout driver as MFD")
Cc: <stable@vger.kernel.org>
Signed-off-by: David Virag <virag.david003@gmail.com>
Link: https://lore.kernel.org/r/20210909222812.108614-1-virag.david003@gmail.com
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 95bf9d646c upstream.
When an instruction to save or restore a register from the stack fails
in _save_fp_context or _restore_fp_context return with -EFAULT. This
change was made to r2300_fpu.S[1] but it looks like it got lost with
the introduction of EX2[2]. This is also what the other implementation
of _save_fp_context and _restore_fp_context in r4k_fpu.S does, and
what is needed for the callers to be able to handle the error.
Furthermore calling do_exit(SIGSEGV) from bad_stack is wrong because
it does not terminate the entire process it just terminates a single
thread.
As the changed code was the only caller of arch/mips/kernel/syscall.c:bad_stack
remove the problematic and now unused helper function.
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Maciej Rozycki <macro@orcam.me.uk>
Cc: linux-mips@vger.kernel.org
[1] 35938a00ba ("MIPS: Fix ISA I FP sigcontext access violation handling")
[2] f92722dc45 ("MIPS: Correct MIPS I FP sigcontext layout")
Cc: stable@vger.kernel.org
Fixes: f92722dc45 ("MIPS: Correct MIPS I FP sigcontext layout")
Acked-by: Maciej W. Rozycki <macro@orcam.me.uk>
Acked-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Link: https://lkml.kernel.org/r/20211020174406.17889-5-ebiederm@xmission.com
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit fff53a551d upstream.
This patch fixes 2 problems:
[1] The output warning logs and data loss when performing
mount/umount then remount the device with jffs2 format.
[2] The access width of SMWDR[0:1]/SMRDR[0:1] register is wrong.
This is the sample warning logs when performing mount/umount then
remount the device with jffs2 format:
jffs2: jffs2_scan_inode_node(): CRC failed on node at 0x031c51d4:
Read 0x00034e00, calculated 0xadb272a7
The reason for issue [1] is that the writing data seems to
get messed up.
Data is only completed when the number of bytes is divisible by 4.
If you only have 3 bytes of data left to write, 1 garbage byte
is inserted after the end of the write stream.
If you only have 2 bytes of data left to write, 2 bytes of '00'
are added into the write stream.
If you only have 1 byte of data left to write, 2 bytes of '00'
are added into the write stream. 1 garbage byte is inserted after
the end of the write stream.
To solve problem [1], data must be written continuously in serial
and the write stream ends when data is out.
Following HW manual 62.2.15, access to SMWDR0 register should be
in the same size as the transfer size specified in the SPIDE[3:0]
bits in the manual mode enable setting register (SMENR).
Be sure to access from address 0.
So, in 16-bit transfer (SPIDE[3:0]=b'1100), SMWDR0 should be
accessed by 16-bit width.
Similar to SMWDR1, SMDDR0/1 registers.
In current code, SMWDR0 register is accessed by regmap_write()
that only set up to do 32-bit width.
To solve problem [2], data must be written 16-bit or 8-bit when
transferring 1-byte or 2-byte.
Fixes: ca7d8b980b ("memory: add Renesas RPC-IF driver")
Cc: <stable@vger.kernel.org>
Signed-off-by: Duc Nguyen <duc.nguyen.ub@renesas.com>
[wsa: refactored to use regmap only via reg_read/reg_write]
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Tested-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Link: https://lore.kernel.org/r/20210922091007.5516-1-wsa+renesas@sang-engineering.com
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7d613f9f72 upstream.
The existence of sigkill_pending is a little silly as it is
functionally a duplicate of fatal_signal_pending that is used in
exactly one place.
Checking for pending fatal signals and returning early in ptrace_stop
is actively harmful. It casues the ptrace_stop called by
ptrace_signal to return early before setting current->exit_code.
Later when ptrace_signal reads the signal number from
current->exit_code is undefined, making it unpredictable what will
happen.
Instead rely on the fact that schedule will not sleep if there is a
pending signal that can awaken a task.
Removing the explict sigkill_pending test fixes fixes ptrace_signal
when ptrace_stop does not stop because current->exit_code is always
set to to signr.
Cc: stable@vger.kernel.org
Fixes: 3d749b9e67 ("ptrace: simplify ptrace_stop()->sigkill_pending() path")
Fixes: 1a669c2f16 ("Add arch_ptrace_stop")
Link: https://lkml.kernel.org/r/87pmsyx29t.fsf@disp2133
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b515d09705 upstream.
P2P client mode was only working the first time.
On subsequent connection attempts the group was successfully created but
no data was sent (no transmitted data packets were seen with a sniffer).
The reason for this was that the hardware was being configured in fixed
rate mode with rate RSI_RATE_1 (1Mbps) which is not valid in the 5GHz band.
In P2P mode wpa_supplicant uses NL80211_CMD_SET_TX_BITRATE_MASK to disallow
the 11b rates in the 2.4GHz band which updated common->fixedrate_mask.
rsi_set_min_rate() then used the fixedrate_mask to calculate the minimum
allowed rate, or 0xffff = auto if none was found.
However that calculation did not account for the different rate sets
allowed in the different bands leading to the error.
Fixing set_min_rate() would result in 6Mb/s being used all the time
which is not what we want either.
The reason the problem did not occur on the first connection is that
rsi_mac80211_set_rate_mask() only updated the fixedrate_mask for
the *current* band. When it was called that was still 2.4GHz as the
switch is done later. So the when set_min_rate() was subsequently
called after the switch to 5GHz it still had a mask of zero, leading
to defaulting to auto mode.
Fix this by differentiating the case of a single rate being
requested, in which case the hardware will be used in fixed rate
mode with just that rate, and multiple rates being requested,
in which case we remain in auto mode but the firmware rate selection
algorithm is configured with a restricted set of rates.
Fixes: dad0d04fa7 ("rsi: Add RS9113 wireless driver")
Signed-off-by: Martin Fuzzey <martin.fuzzey@flowbird.group>
CC: stable@vger.kernel.org
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/1630337206-12410-4-git-send-email-martin.fuzzey@flowbird.group
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 99ac601882 upstream.
My previous patch checked if encryption should be enabled by directly
checking info->control.hw_key (like the downstream driver).
However that missed that the control and driver_info members of
struct ieee80211_tx_info are union fields.
Due to this when rsi_core_xmit() updates fields in "tx_params"
(driver_info) it can overwrite the control.hw_key, causing the result
of the later test to be incorrect.
With the current structure layout the first byte of control.hw_key is
overlayed with the vap_id so, since we only test if control.hw_key is
NULL / non NULL, a non zero vap_id will incorrectly enable encryption.
In basic STA and AP modes the vap_id is always zero so it works but in
P2P client mode a second VIF is created causing vap_id to be non zero
and hence encryption to be enabled before keys have been set.
Fix this by extracting the key presence flag to a new field in the driver
private tx_params structure and populating it first.
Fixes: 314538041b ("rsi: fix AP mode with WPA failure due to encrypted EAPOL")
Signed-off-by: Martin Fuzzey <martin.fuzzey@flowbird.group>
CC: stable@vger.kernel.org
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/1630337206-12410-3-git-send-email-martin.fuzzey@flowbird.group
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9b14ed6e11 upstream.
When BT coexistence is enabled (eg oper mode 13, which is the default)
the initialisation on startup sometimes silently fails.
In a normal initialisation we see
usb 1-1.3: Product: Wireless USB Network Module
usb 1-1.3: Manufacturer: Redpine Signals, Inc.
usb 1-1.3: SerialNumber: 000000000001
rsi_91x: rsi_probe: Initialized os intf ops
rsi_91x: rsi_load_9116_firmware: Loading chunk 0
rsi_91x: rsi_load_9116_firmware: Loading chunk 1
rsi_91x: rsi_load_9116_firmware: Loading chunk 2
rsi_91x: Max Stations Allowed = 1
But sometimes the last log is missing and the wlan net device is
not created.
Running a userspace loop that resets the hardware via a GPIO shows the
problem occurring ~5/100 resets.
The problem does not occur in oper mode 1 (wifi only).
Adding logs shows that the initialisation state machine requests a MAC
reset via rsi_send_reset_mac() but the firmware does not reply, leading
to the initialisation sequence being incomplete.
Fix this by delaying attaching the BT adapter until the wifi
initialisation has completed.
With this applied I have done > 300 reset loops with no errors.
Fixes: 716b840c76 ("rsi: handle BT traffic in driver")
Signed-off-by: Martin Fuzzey <martin.fuzzey@flowbird.group>
CC: stable@vger.kernel.org
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/1630337206-12410-2-git-send-email-martin.fuzzey@flowbird.group
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f971a85439 upstream.
Checking if DMA is enabled should be done via the
ata_dma_enabled helper function, since the init state
0xff indicates disabled.
This meant that ATA_CMD_READ_LOG_DMA_EXT was used and probed
for before DMA was enabled, which caused hangs for some combinations
of controllers and devices.
It might also have caused it to be incorrectly disabled as broken,
but there have been no reports of that.
Cc: stable@vger.kernel.org
BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=195895
Signed-off-by: Reimar Döffinger <Reimar.Doeffinger@gmx.de>
Tested-by: Paul Menzel <pmenzel@molgen.mpg.de>
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8e3e59c31f upstream.
It seems that the PCIe+USB firmware (latest version 15.68.19.p21) of the
88W8897 card sometimes ignores or misses when we try to wake it up by
writing to the firmware status register. This leads to the firmware
wakeup timeout expiring and the driver resetting the card because we
assume the firmware has hung up or crashed.
Turns out that the firmware actually didn't hang up, but simply "missed"
our wakeup request and didn't send us an interrupt with an AWAKE event.
Trying again to read the firmware status register after a short timeout
usually makes the firmware wake up as expected, so add a small retry
loop to mwifiex_pm_wakeup_card() that looks at the interrupt status to
check whether the card woke up.
The number of tries and timeout lengths for this were determined
experimentally: The firmware usually takes about 500 us to wake up
after we attempt to read the status register. In some cases where the
firmware is very busy (for example while doing a bluetooth scan) it
might even miss our requests for multiple milliseconds, which is why
after 15 tries the waiting time gets increased to 10 ms. The maximum
number of tries it took to wake the firmware when testing this was
around 20, so a maximum number of 50 tries should give us plenty of
safety margin.
Here's a reproducer for those firmware wakeup failures I've found:
1) Make sure wifi powersaving is enabled (iw dev wlp1s0 set power_save on)
2) Connect to any wifi network (makes firmware go into wifi powersaving
mode, not deep sleep)
3) Make sure bluetooth is turned off (to ensure the firmware actually
enters powersave mode and doesn't keep the radio active doing bluetooth
stuff)
4) To confirm that wifi powersaving is entered ping a device on the LAN,
pings should be a few ms higher than without powersaving
5) Run "while true; do iwconfig; sleep 0.0001; done", this wakes and
suspends the firmware extremely often
6) Wait until things explode, for me it consistently takes <5 minutes
BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=109681
Cc: stable@vger.kernel.org
Signed-off-by: Jonas Dreßler <verdre@v0yd.nl>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20211011133224.15561-3-verdre@v0yd.nl
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e5f4eb8223 upstream.
On the 88W8897 PCIe+USB card the firmware randomly crashes after setting
the TX ring write pointer. The issue is present in the latest firmware
version 15.68.19.p21 of the PCIe+USB card.
Those firmware crashes can be worked around by reading any PCI register
of the card after setting that register, so read the PCI_VENDOR_ID
register here. The reason this works is probably because we keep the bus
from entering an ASPM state for a bit longer, because that's what causes
the cards firmware to crash.
This fixes a bug where during RX/TX traffic and with ASPM L1 substates
enabled (the specific substates where the issue happens appear to be
platform dependent), the firmware crashes and eventually a command
timeout appears in the logs.
BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=109681
Cc: stable@vger.kernel.org
Signed-off-by: Jonas Dreßler <verdre@v0yd.nl>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20211011133224.15561-2-verdre@v0yd.nl
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 928265e360 upstream.
There is no reason to allow "syscore" devices to runtime-suspend
during system-wide PM transitions, because they are subject to the
same possible failure modes as any other devices in that respect.
Accordingly, change device_prepare() and device_complete() to call
pm_runtime_get_noresume() and pm_runtime_put(), respectively, for
"syscore" devices too.
Fixes: 057d51a126 ("Merge branch 'pm-sleep'")
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: 3.10+ <stable@vger.kernel.org> # 3.10+
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d3fd2c95c1 upstream.
We observe unexpected connection drops with some APs due to
non-acked mac80211 generated null data frames (keep-alive).
After debugging and capture, we noticed that null frames are
submitted at standard data bitrate and that the given APs are
in trouble with that.
After setting the null frame bitrate to control bitrate, all
null frames are acked as expected and connection is maintained.
Not sure if it's a requirement of the specification, but it seems
the right thing to do anyway, null frames are mostly used for control
purpose (power-saving, keep-alive...), and submitting them with
a slower/simpler bitrate/modulation is more robust.
Cc: stable@vger.kernel.org
Fixes: 512b191d96 ("wcn36xx: Fix TX data path")
Signed-off-by: Loic Poulain <loic.poulain@linaro.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/1634560399-15290-1-git-send-email-loic.poulain@linaro.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a9e79b116c upstream.
This change fix the TX ack mechanism in various ways:
- For NO_ACK tagged packets, we don't need to wait for TX_ACK indication
and so are not subject to the single packet ack limitation. So we don't
have to stop the tx queue, and can call the tx status callback as soon
as DMA transfer has completed.
- Fix skb ownership/reference. Only start status indication timeout
once the DMA transfer has been completed. This avoids the skb to be
both referenced in the DMA tx ring and by the tx_ack_skb pointer,
preventing any use-after-free or double-free.
- This adds a sanity (paranoia?) check on the skb tx ack pointer.
- Resume TX queue if TX status tagged packet TX fails.
Cc: stable@vger.kernel.org
Fixes: fdf21cc371 ("wcn36xx: Add TX ack support")
Signed-off-by: Loic Poulain <loic.poulain@linaro.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/1634567281-28997-1-git-send-email-loic.poulain@linaro.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit dc0fd0acb6 upstream.
Until now, we have only ever seen the REG-category registry being used
on devices addressed with target ID 2. In fact, we have only ever seen
Surface Aggregator Module (SAM) HID devices with target ID 2. For those
devices, the registry also has to be addressed with target ID 2.
Some devices, like the new Surface Laptop Studio, however, address their
HID devices on target ID 1. As a result of this, any target ID 2
commands time out. This includes event management commands addressed to
the target ID 2 REG-category registry. For these devices, the registry
has to be addressed via target ID 1 instead.
We therefore assume that the target ID of the registry to be used
depends on the target ID of the respective device. Implement this
accordingly.
Note that we currently allow the surface HID driver to only load against
devices with target ID 2, so these timeouts are not happening (yet).
This is just a preparation step before we allow the driver to load
against all target IDs.
Cc: stable@vger.kernel.org # 5.14+
Signed-off-by: Maximilian Luz <luzmaximilian@gmail.com>
Acked-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Link: https://lore.kernel.org/r/20211021130904.862610-3-luzmaximilian@gmail.com
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b5cd1fd604 upstream.
When clearing all existing pending tx slots, mt76_tx_complete_skb needs to
be used to free the skbs, to ensure that they are cleared from the status
list as well.
Cc: stable@vger.kernel.org
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4f042e4019 upstream.
Add support for the Surface Laptop Studio.
In contrast to previous Surface Laptop models, this one has its HID
devices attached to target ID 1 (instead of 2). It also has a couple
more of them, including a new notifier for when the pen is stashed /
taken out of its place, a "Sys Control" device, and two other
unidentified HID devices with unknown usages.
Battery and performance profile interfaces remain the same.
Cc: stable@vger.kernel.org # 5.14+
Signed-off-by: Maximilian Luz <luzmaximilian@gmail.com>
Link: https://lore.kernel.org/r/20211021130904.862610-2-luzmaximilian@gmail.com
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 046178e726 upstream.
IFB originally depended on NET_CLS_ACT for traffic redirection.
But since v4.5, that may be achieved with NFT_FWD_NETDEV as well.
Fixes: 39e6dea28a ("netfilter: nf_tables: add forward expression to the netdev family")
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Cc: <stable@vger.kernel.org> # v4.5+: bcfabee1af: netfilter: nft_fwd_netdev: allow to redirect to ifb via ingress
Cc: <stable@vger.kernel.org> # v4.5+
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a006acb931 upstream.
Add the missing endpoint max-packet sanity check to probe() to avoid
division by zero in ath10k_usb_hif_tx_sg() in case a malicious device
has broken descriptors (or when doing descriptor fuzz testing).
Note that USB core will reject URBs submitted for endpoints with zero
wMaxPacketSize but that drivers doing packet-size calculations still
need to handle this (cf. commit 2548288b4f ("USB: Fix: Don't skip
endpoint descriptors with maxpacket=0")).
Fixes: 4db66499df ("ath10k: add initial USB support")
Cc: stable@vger.kernel.org # 4.14
Cc: Erik Stromdahl <erik.stromdahl@gmail.com>
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20211027080819.6675-2-johan@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c1b9ca365d upstream.
Add the missing endpoint max-packet sanity check to probe() to avoid
division by zero in ath10k_usb_hif_tx_sg() in case a malicious device
has broken descriptors (or when doing descriptor fuzz testing).
Note that USB core will reject URBs submitted for endpoints with zero
wMaxPacketSize but that drivers doing packet-size calculations still
need to handle this (cf. commit 2548288b4f ("USB: Fix: Don't skip
endpoint descriptors with maxpacket=0")).
Fixes: 9cbee35868 ("ath6kl: add full USB support")
Cc: stable@vger.kernel.org # 3.5
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20211027080819.6675-3-johan@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 89f8765a11 upstream.
Add the missing endpoint sanity checks to probe() to avoid division by
zero in mwifiex_write_data_sync() in case a malicious device has broken
descriptors (or when doing descriptor fuzz testing).
Only add checks for the firmware-download boot stage, which require both
command endpoints, for now. The driver looks like it will handle a
missing endpoint during normal operation without oopsing, albeit not
very gracefully as it will try to submit URBs to the default pipe and
fail.
Note that USB core will reject URBs submitted for endpoints with zero
wMaxPacketSize but that drivers doing packet-size calculations still
need to handle this (cf. commit 2548288b4f ("USB: Fix: Don't skip
endpoint descriptors with maxpacket=0")).
Fixes: 4daffe3543 ("mwifiex: add support for Marvell USB8797 chipset")
Cc: stable@vger.kernel.org # 3.5
Cc: Amitkumar Karwar <akarwar@marvell.com>
Signed-off-by: Johan Hovold <johan@kernel.org>
Reviewed-by: Brian Norris <briannorris@chromium.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20211027080819.6675-4-johan@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b16bef60a9 upstream.
The driver and its bindings, before commit 04f9f068a6 ("regulator:
s5m8767: Modify parsing method of the voltage table of buck2/3/4") were
requiring to provide at least one safe/default voltage for DVS registers
if DVS GPIO is not being enabled.
IOW, if s5m8767,pmic-buck2-uses-gpio-dvs is missing, the
s5m8767,pmic-buck2-dvs-voltage should still be present and contain one
voltage.
This requirement was coming from driver behavior matching this condition
(none of DVS GPIO is enabled): it was always initializing the DVS
selector pins to 0 and keeping the DVS enable setting at reset value
(enabled). Therefore if none of DVS GPIO is enabled in devicetree,
driver was configuring the first DVS voltage for buck[234].
Mentioned commit 04f9f068a6 ("regulator: s5m8767: Modify parsing
method of the voltage table of buck2/3/4") broke it because DVS voltage
won't be parsed from devicetree if DVS GPIO is not enabled. After the
change, driver will configure bucks to use the register reset value as
voltage which might have unpleasant effects.
Fix this by relaxing the bindings constrain: if DVS GPIO is not enabled
in devicetree (therefore DVS voltage is also not parsed), explicitly
disable it.
Cc: <stable@vger.kernel.org>
Fixes: 04f9f068a6 ("regulator: s5m8767: Modify parsing method of the voltage table of buck2/3/4")
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
Acked-by: Rob Herring <robh@kernel.org>
Message-Id: <20211008113723.134648-2-krzysztof.kozlowski@canonical.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit db05ddf7f3 upstream.
You will get two decrements when the messages on a panic are sent, not
one, since commit 2033f68589 ("ipmi: Free receive messages when in an
oops") was added, but the watchdog code had a bug where it didn't set
the value properly.
Reported-by: Anton Lundin <glance@acc.umu.se>
Cc: <Stable@vger.kernel.org> # v5.4+
Fixes: 2033f68589 ("ipmi: Free receive messages when in an oops")
Signed-off-by: Corey Minyard <cminyard@mvista.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit cbfcd13be5 upstream.
Current code contains a lot of racy patterns when converting an
ocontext's context structure to an SID. This is being done in a "lazy"
fashion, such that the SID is looked up in the SID table only when it's
first needed and then cached in the "sid" field of the ocontext
structure. However, this is done without any locking or memory barriers
and is thus unsafe.
Between commits 24ed7fdae6 ("selinux: use separate table for initial
SID lookup") and 66f8e2f03c ("selinux: sidtab reverse lookup hash
table"), this race condition lead to an actual observable bug, because a
pointer to the shared sid field was passed directly to
sidtab_context_to_sid(), which was using this location to also store an
intermediate value, which could have been read by other threads and
interpreted as an SID. In practice this caused e.g. new mounts to get a
wrong (seemingly random) filesystem context, leading to strange denials.
This bug has been spotted in the wild at least twice, see [1] and [2].
Fix the race condition by making all the racy functions use a common
helper that ensures the ocontext::sid accesses are made safely using the
appropriate SMP constructs.
Note that security_netif_sid() was populating the sid field of both
contexts stored in the ocontext, but only the first one was actually
used. The SELinux wiki's documentation on the "netifcon" policy
statement [3] suggests that using only the first context is intentional.
I kept only the handling of the first context here, as there is really
no point in doing the SID lookup for the unused one.
I wasn't able to reproduce the bug mentioned above on any kernel that
includes commit 66f8e2f03c, even though it has been reported that the
issue occurs with that commit, too, just less frequently. Thus, I wasn't
able to verify that this patch fixes the issue, but it makes sense to
avoid the race condition regardless.
[1] https://github.com/containers/container-selinux/issues/89
[2] https://lists.fedoraproject.org/archives/list/selinux@lists.fedoraproject.org/thread/6DMTAMHIOAOEMUAVTULJD45JZU7IBAFM/
[3] https://selinuxproject.org/page/NetworkStatements#netifcon
Cc: stable@vger.kernel.org
Cc: Xinjie Zheng <xinjie@google.com>
Reported-by: Sujithra Periasamy <sujithra@google.com>
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 235cee1624 upstream.
Commit 112665286d ("KVM: PPC: Book3S HV: Context tracking exit guest
context before enabling irqs") moved guest_exit() into the interrupt
protected area to avoid wrong context warning (or worse). The problem is
that tick-based time accounting has not yet been updated at this point
(because it depends on the timer interrupt firing), so the guest time
gets incorrectly accounted to system time.
To fix the problem, follow the x86 fix in commit 1604571401 ("Defer
vtime accounting 'til after IRQ handling"), and allow host IRQs to run
before accounting the guest exit time.
In the case vtime accounting is enabled, this is not required because TB
is used directly for accounting.
Before this patch, with CONFIG_TICK_CPU_ACCOUNTING=y in the host and a
guest running a kernel compile, the 'guest' fields of /proc/stat are
stuck at zero. With the patch they can be observed increasing roughly as
expected.
Fixes: e233d54d4d ("KVM: booke: use __kvm_guest_exit")
Fixes: 112665286d ("KVM: PPC: Book3S HV: Context tracking exit guest context before enabling irqs")
Cc: stable@vger.kernel.org # 5.12+
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
[np: only required for tick accounting, add Book3E fix, tweak changelog]
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20211027142150.3711582-1-npiggin@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5c78a5e7aa upstream.
In open_ctree() in btrfs_check_rw_degradable() [1], we check each block
group individually if at least the minimum number of devices is available
for that profile. If all the devices are available, then we don't have to
check degradable.
[1]
open_ctree()
::
3559 if (!sb_rdonly(sb) && !btrfs_check_rw_degradable(fs_info, NULL)) {
Also before calling btrfs_check_rw_degradable() in open_ctee() at the
line number shown below [2] we call btrfs_read_chunk_tree() and down to
add_missing_dev() to record number of missing devices.
[2]
open_ctree()
::
3454 ret = btrfs_read_chunk_tree(fs_info);
btrfs_read_chunk_tree()
read_one_chunk() / read_one_dev()
add_missing_dev()
So, check if there is any missing device before btrfs_check_rw_degradable()
in open_ctree().
Also, with this the mount command could save ~16ms.[3] in the most
common case, that is no device is missing.
[3]
1) * 16934.96 us | btrfs_check_rw_degradable [btrfs]();
CC: stable@vger.kernel.org # 4.19+
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 10adb1152d upstream.
At replay_dir_deletes(), if find_dir_range() returns an error we break out
of the main while loop and then assign a value of 0 (success) to the 'ret'
variable, resulting in completely ignoring that an error happened. Fix
that by jumping to the 'out' label when find_dir_range() returns an error
(negative value).
CC: stable@vger.kernel.org # 4.4+
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5d03dbebba upstream.
Reported bug: https://github.com/kdave/btrfs-progs/issues/389
There's a problem with scrub reporting aborted status but returning
error code 0, on a filesystem with missing and readded device.
Roughly these steps:
- mkfs -d raid1 dev1 dev2
- fill with data
- unmount
- make dev1 disappear
- mount -o degraded
- copy more data
- make dev1 appear again
Running scrub afterwards reports that the command was aborted, but the
system log message says the exit code was 0.
It seems that the cause of the error is decrementing
fs_devices->missing_devices but not clearing device->dev_state. Every
time we umount filesystem, it would call close_ctree, And it would
eventually involve btrfs_close_one_device to close the device, but it
only decrements fs_devices->missing_devices but does not clear the
device BTRFS_DEV_STATE_MISSING bit. Worse, this bug will cause Integer
Overflow, because every time umount, fs_devices->missing_devices will
decrease. If fs_devices->missing_devices value hit 0, it would overflow.
With added debugging:
loop1: detected capacity change from 0 to 20971520
BTRFS: device fsid 56ad51f1-5523-463b-8547-c19486c51ebb devid 1 transid 21 /dev/loop1 scanned by systemd-udevd (2311)
loop2: detected capacity change from 0 to 20971520
BTRFS: device fsid 56ad51f1-5523-463b-8547-c19486c51ebb devid 2 transid 17 /dev/loop2 scanned by systemd-udevd (2313)
BTRFS info (device loop1): flagging fs with big metadata feature
BTRFS info (device loop1): allowing degraded mounts
BTRFS info (device loop1): using free space tree
BTRFS info (device loop1): has skinny extents
BTRFS info (device loop1): before clear_missing.00000000f706684d /dev/loop1 0
BTRFS warning (device loop1): devid 2 uuid 6635ac31-56dd-4852-873b-c60f5e2d53d2 is missing
BTRFS info (device loop1): before clear_missing.0000000000000000 /dev/loop2 1
BTRFS info (device loop1): flagging fs with big metadata feature
BTRFS info (device loop1): allowing degraded mounts
BTRFS info (device loop1): using free space tree
BTRFS info (device loop1): has skinny extents
BTRFS info (device loop1): before clear_missing.00000000f706684d /dev/loop1 0
BTRFS warning (device loop1): devid 2 uuid 6635ac31-56dd-4852-873b-c60f5e2d53d2 is missing
BTRFS info (device loop1): before clear_missing.0000000000000000 /dev/loop2 0
BTRFS info (device loop1): flagging fs with big metadata feature
BTRFS info (device loop1): allowing degraded mounts
BTRFS info (device loop1): using free space tree
BTRFS info (device loop1): has skinny extents
BTRFS info (device loop1): before clear_missing.00000000f706684d /dev/loop1 18446744073709551615
BTRFS warning (device loop1): devid 2 uuid 6635ac31-56dd-4852-873b-c60f5e2d53d2 is missing
BTRFS info (device loop1): before clear_missing.0000000000000000 /dev/loop2 18446744073709551615
If fs_devices->missing_devices is 0, next time it would be 18446744073709551615
After apply this patch, the fs_devices->missing_devices seems to be
right:
$ truncate -s 10g test1
$ truncate -s 10g test2
$ losetup /dev/loop1 test1
$ losetup /dev/loop2 test2
$ mkfs.btrfs -draid1 -mraid1 /dev/loop1 /dev/loop2 -f
$ losetup -d /dev/loop2
$ mount -o degraded /dev/loop1 /mnt/1
$ umount /mnt/1
$ mount -o degraded /dev/loop1 /mnt/1
$ umount /mnt/1
$ mount -o degraded /dev/loop1 /mnt/1
$ umount /mnt/1
$ dmesg
loop1: detected capacity change from 0 to 20971520
loop2: detected capacity change from 0 to 20971520
BTRFS: device fsid 15aa1203-98d3-4a66-bcae-ca82f629c2cd devid 1 transid 5 /dev/loop1 scanned by mkfs.btrfs (1863)
BTRFS: device fsid 15aa1203-98d3-4a66-bcae-ca82f629c2cd devid 2 transid 5 /dev/loop2 scanned by mkfs.btrfs (1863)
BTRFS info (device loop1): flagging fs with big metadata feature
BTRFS info (device loop1): allowing degraded mounts
BTRFS info (device loop1): disk space caching is enabled
BTRFS info (device loop1): has skinny extents
BTRFS info (device loop1): before clear_missing.00000000975bd577 /dev/loop1 0
BTRFS warning (device loop1): devid 2 uuid 8b333791-0b3f-4f57-b449-1c1ab6b51f38 is missing
BTRFS info (device loop1): before clear_missing.0000000000000000 /dev/loop2 1
BTRFS info (device loop1): checking UUID tree
BTRFS info (device loop1): flagging fs with big metadata feature
BTRFS info (device loop1): allowing degraded mounts
BTRFS info (device loop1): disk space caching is enabled
BTRFS info (device loop1): has skinny extents
BTRFS info (device loop1): before clear_missing.00000000975bd577 /dev/loop1 0
BTRFS warning (device loop1): devid 2 uuid 8b333791-0b3f-4f57-b449-1c1ab6b51f38 is missing
BTRFS info (device loop1): before clear_missing.0000000000000000 /dev/loop2 1
BTRFS info (device loop1): flagging fs with big metadata feature
BTRFS info (device loop1): allowing degraded mounts
BTRFS info (device loop1): disk space caching is enabled
BTRFS info (device loop1): has skinny extents
BTRFS info (device loop1): before clear_missing.00000000975bd577 /dev/loop1 0
BTRFS warning (device loop1): devid 2 uuid 8b333791-0b3f-4f57-b449-1c1ab6b51f38 is missing
BTRFS info (device loop1): before clear_missing.0000000000000000 /dev/loop2 1
CC: stable@vger.kernel.org # 4.19+
Signed-off-by: Li Zhang <zhanglikernel@gmail.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit c4a146c7cf ]
The value of llc_testlink_time is set to the value stored in
net->ipv4.sysctl_tcp_keepalive_time when linkgroup init. The value of
sysctl_tcp_keepalive_time is already jiffies, so we don't need to
multiply by HZ, which would cause smc_link->llc_testlink_time overflow,
and test_link send flood.
Signed-off-by: Tony Lu <tonylu@linux.alibaba.com>
Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Reviewed-by: Wen Gu <guwen@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 90a881fc35 ]
MTU change is refused whenever the value of new MTU is bigger than
the max packet bytes that fits in NFP Cluster Target Memory (CTM).
However, an eBPF program doesn't always need to access the whole
packet data.
The maximum direct packet access (DPA) offset has always been
caculated by verifier and stored in the max_pkt_offset field of prog
aux data.
Signed-off-by: Yu Xiao <yu.xiao@corigine.com>
Reviewed-by: Yinjun Zhang <yinjun.zhang@corigine.com>
Reviewed-by: Niklas Soderlund <niklas.soderlund@corigine.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 9159f10240 ]
The netif_device_detach() conditionally stops all tx queues if the queues
are running. There is no need to call netif_tx_stop_all_queues() again.
Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f29da4088f ]
Currently, the workqueue of hclge/hclgevf is executed on
the CPU that initiates scheduling requests by default. In
stress scenarios, the CPU may be busy and workqueue scheduling
is completed after a long period of time. To avoid this
situation and implement proper scheduling, use the WQ_UNBOUND
mode instead. In this way, the workqueue can be performed on
a relatively idle CPU.
Signed-off-by: Yufeng Mo <moyufeng@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 86aeda32b8 ]
Pass the correct length to nvmet_tcp_verify_hdgst, which is the pdu
header length. This fixes a wrong behaviour where header digest
verification passes although the digest is wrong.
Signed-off-by: Amit Engel <amit.engel@dell.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 9586e67b91 ]
When dispatching a zone append write request to a SCSI zoned block device,
if the target zone of the request is already locked, the device driver will
return BLK_STS_ZONE_RESOURCE and the request will be pushed back to the
hctx dipatch queue. The queue will be marked as RESTART in
dd_finish_request() and restarted in __blk_mq_free_request(). However, this
restart applies to the hctx of the completed request. If the requeued
request is on a different hctx, dispatch will no be retried until another
request is submitted or the next periodic queue run triggers, leading to up
to 30 seconds latency for the requeued request.
Fix this problem by scheduling a queue restart similarly to the
BLK_STS_RESOURCE case or when we cannot get the budget.
Also, consolidate the checks into the "need_resource" variable to simplify
the condition.
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Niklas Cassel <Niklas.Cassel@wdc.com>
Link: https://lore.kernel.org/r/20211026165127.4151055-1-naohiro.aota@wdc.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 9122a70a63 ]
During a testing of an user-space application which transmits UDP
multicast datagrams and utilizes multicast routing to send the UDP
datagrams out of defined network interfaces, I've found a multicast
router does not fill-in UDP checksum into locally produced, looped-back
and forwarded UDP datagrams, if an original output NIC the datagrams
are sent to has UDP TX checksum offload enabled.
The datagrams are sent malformed out of the NIC the datagrams have been
forwarded to.
It is because:
1. If TX checksum offload is enabled on the output NIC, UDP checksum
is not calculated by kernel and is not filled into skb data.
2. dev_loopback_xmit(), which is called solely by
ip_mc_finish_output(), sets skb->ip_summed = CHECKSUM_UNNECESSARY
unconditionally.
3. Since 35fc92a9 ("[NET]: Allow forwarding of ip_summed except
CHECKSUM_COMPLETE"), the ip_summed value is preserved during
forwarding.
4. If ip_summed != CHECKSUM_PARTIAL, checksum is not calculated during
a packet egress.
The minimum fix in dev_loopback_xmit():
1. Preserves skb->ip_summed CHECKSUM_PARTIAL. This is the
case when the original output NIC has TX checksum offload enabled.
The effects are:
a) If the forwarding destination interface supports TX checksum
offloading, the NIC driver is responsible to fill-in the
checksum.
b) If the forwarding destination interface does NOT support TX
checksum offloading, checksums are filled-in by kernel before
skb is submitted to the NIC driver.
c) For local delivery, checksum validation is skipped as in the
case of CHECKSUM_UNNECESSARY, thanks to skb_csum_unnecessary().
2. Translates ip_summed CHECKSUM_NONE to CHECKSUM_UNNECESSARY. It
means, for CHECKSUM_NONE, the behavior is unmodified and is there
to skip a looped-back packet local delivery checksum validation.
Signed-off-by: Cyril Strejc <cyril.strejc@skoda.cz>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 926245c7d2 ]
page_frag_free() won't completely release the memory
allocated for the commands, the cache page must be explicitly
freed by calling __page_frag_cache_drain().
This bug can be easily reproduced by repeatedly
executing the following command on the initiator:
$echo 1 > /sys/devices/virtual/nvme-fabrics/ctl/nvme0/reset_controller
Signed-off-by: Maurizio Lombardi <mlombard@redhat.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: John Meneghini <jmeneghi@redhat.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 042b2046d0 ]
The tx queues are not stopped during the live migration. As a result, the
ndo_start_xmit() may access netfront_info->queues which is freed by
talk_to_netback()->xennet_destroy_queues().
This patch is to netif_device_detach() at the beginning of xen-netfront
resuming, and netif_device_attach() at the end of resuming.
CPU A CPU B
talk_to_netback()
-> if (info->queues)
xennet_destroy_queues(info);
to free netfront_info->queues
xennet_start_xmit()
to access netfront_info->queues
-> err = xennet_create_queues(info, &num_queues);
The idea is borrowed from virtio-net.
Cc: Joe Jin <joe.jin@oracle.com>
Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f09f6dfef8 ]
The spi-altera driver has two flavors: platform and dfl. I'm seeing
a case where I have both device types in the same machine, and they
are conflicting on the SPI ID:
... kernel: couldn't get idr
... kernel: WARNING: CPU: 28 PID: 912 at drivers/spi/spi.c:2920 spi_register_controller.cold+0x84/0xc0a
Both the platform and dfl drivers use the parent's driver ID as the SPI
ID. In the error case, the parent devices are dfl_dev.4 and
subdev_spi_altera.4.auto. When the second spi-master is created, the
failure occurs because the SPI ID of 4 has already been allocated.
Change the ID allocation to dynamic (by initializing bus_num to -1) to
avoid duplicate SPI IDs.
Signed-off-by: Russ Weight <russell.h.weight@intel.com>
Link: https://lore.kernel.org/r/20211019002401.24041-1-russell.h.weight@intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 162079f2dc ]
The Winbond MMC driver fails to build on ARCH=m68k so prevent
that build config. Silences these build errors:
../drivers/mmc/host/wbsd.c: In function 'wbsd_request_end':
../drivers/mmc/host/wbsd.c:212:28: error: implicit declaration of function 'claim_dma_lock' [-Werror=implicit-function-declaration]
212 | dmaflags = claim_dma_lock();
../drivers/mmc/host/wbsd.c:215:17: error: implicit declaration of function 'release_dma_lock'; did you mean 'release_task'? [-Werror=implicit-function-declaration]
215 | release_dma_lock(dmaflags);
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Pierre Ossman <pierre@ossman.eu>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20211017175949.23838-1-rdunlap@infradead.org
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 55dd7e0590 ]
Commit bbc4d71d63 ("net: phy: realtek: fix rtl8211e rx/tx delay
config") sets the RX/TX delay according to the phy-mode property in the
device tree. For the A20-olinuxino-lime2 board this is "rgmii", which is the
wrong setting.
Following the example of a900cac375 ("ARM: dts: sun7i: a20: bananapro:
Fix ethernet phy-mode") the phy-mode is changed to "rgmii-id" which gets
the Ethernet working again on this board.
Signed-off-by: Bastien Roucariès <rouca@debian.org>
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
Link: https://lore.kernel.org/r/20210916081721.237137-1-rouca@debian.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 8017c99680 ]
On arm64 randconfig builds, hyperv sometimes fails with this
error:
In file included from drivers/hv/hv_trace.c:3:
In file included from drivers/hv/hyperv_vmbus.h:16:
In file included from arch/arm64/include/asm/sync_bitops.h:5:
arch/arm64/include/asm/bitops.h:11:2: error: only <linux/bitops.h> can be included directly
In file included from include/asm-generic/bitops/hweight.h:5:
include/asm-generic/bitops/arch_hweight.h:9:9: error: implicit declaration of function '__sw_hweight32' [-Werror,-Wimplicit-function-declaration]
include/asm-generic/bitops/atomic.h:17:7: error: implicit declaration of function 'BIT_WORD' [-Werror,-Wimplicit-function-declaration]
Include the correct header first.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20211018131929.2260087-1-arnd@kernel.org
Signed-off-by: Wei Liu <wei.liu@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit bf6abf345d ]
Use pci_info instead to avoid unnamed/uninitialized noise:
[197088.688729] sfc 0000:01:00.0: Solarflare NIC detected
[197088.690333] sfc 0000:01:00.0: Part Number : SFN5122F
[197088.729061] sfc 0000:01:00.0 (unnamed net_device) (uninitialized): no SR-IOV VFs probed
[197088.729071] sfc 0000:01:00.0 (unnamed net_device) (uninitialized): no PTP support
Inspired by fa44821a4d ("sfc: don't use netif_info et al before
net_device is registered") from Heiner Kallweit.
Signed-off-by: Erik Ekman <erik@kryo.se>
Acked-by: Martin Habets <habetsm.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit c62041c5ba ]
The 1/10GbaseT modes were set up for cards with SFP+ cages in
3497ed8c85 ("sfc: report supported link speeds on SFP connections").
10GbaseT was likely used since no 10G fibre mode existed.
The missing fibre modes for 1/10G were added to ethtool.h in 5711a98221
("net: ethtool: add support for 1000BaseX and missing 10G link modes")
shortly thereafter.
The user guide available at https://support-nic.xilinx.com/wp/drivers
lists support for the following cable and transceiver types in section 2.9:
- QSFP28 100G Direct Attach Cables
- QSFP28 100G SR Optical Transceivers (with SR4 modules listed)
- SFP28 25G Direct Attach Cables
- SFP28 25G SR Optical Transceivers
- QSFP+ 40G Direct Attach Cables
- QSFP+ 40G Active Optical Cables
- QSFP+ 40G SR4 Optical Transceivers
- QSFP+ to SFP+ Breakout Direct Attach Cables
- QSFP+ to SFP+ Breakout Active Optical Cables
- SFP+ 10G Direct Attach Cables
- SFP+ 10G SR Optical Transceivers
- SFP+ 10G LR Optical Transceivers
- SFP 1000BASE‐T Transceivers
- 1G Optical Transceivers
(From user guide issue 28. Issue 16 which also includes older cards like
SFN5xxx/SFN6xxx has matching lists for 1/10/40G transceiver types.)
Regarding SFP+ 10GBASE‐T transceivers the latest guide says:
"Solarflare adapters do not support 10GBASE‐T transceiver modules."
Tested using SFN5122F-R7 (with 2 SFP+ ports). Supported link modes do not change
depending on module used (tested with 1000BASE-T, 1000BASE-BX10, 10GBASE-LR).
Before:
$ ethtool ext
Settings for ext:
Supported ports: [ FIBRE ]
Supported link modes: 1000baseT/Full
10000baseT/Full
Supported pause frame use: Symmetric Receive-only
Supports auto-negotiation: No
Supported FEC modes: Not reported
Advertised link modes: Not reported
Advertised pause frame use: No
Advertised auto-negotiation: No
Advertised FEC modes: Not reported
Link partner advertised link modes: Not reported
Link partner advertised pause frame use: No
Link partner advertised auto-negotiation: No
Link partner advertised FEC modes: Not reported
Speed: 1000Mb/s
Duplex: Full
Auto-negotiation: off
Port: FIBRE
PHYAD: 255
Transceiver: internal
Current message level: 0x000020f7 (8439)
drv probe link ifdown ifup rx_err tx_err hw
Link detected: yes
After:
$ ethtool ext
Settings for ext:
Supported ports: [ FIBRE ]
Supported link modes: 1000baseT/Full
1000baseX/Full
10000baseCR/Full
10000baseSR/Full
10000baseLR/Full
Supported pause frame use: Symmetric Receive-only
Supports auto-negotiation: No
Supported FEC modes: Not reported
Advertised link modes: Not reported
Advertised pause frame use: No
Advertised auto-negotiation: No
Advertised FEC modes: Not reported
Link partner advertised link modes: Not reported
Link partner advertised pause frame use: No
Link partner advertised auto-negotiation: No
Link partner advertised FEC modes: Not reported
Speed: 1000Mb/s
Duplex: Full
Auto-negotiation: off
Port: FIBRE
PHYAD: 255
Transceiver: internal
Supports Wake-on: g
Wake-on: d
Current message level: 0x000020f7 (8439)
drv probe link ifdown ifup rx_err tx_err hw
Link detected: yes
Signed-off-by: Erik Ekman <erik@kryo.se>
Acked-by: Martin Habets <habetsm.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit c69b2f4687 ]
During the process of driver probing, the probe function should return < 0
for failure, otherwise, the kernel will treat value > 0 as success.
Signed-off-by: Zheyu Ma <zheyuma97@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit e211210098 ]
During the process of driver probing, the probe function should return < 0
for failure, otherwise, the kernel will treat value > 0 as success.
Signed-off-by: Zheyu Ma <zheyuma97@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 4a8f71014b ]
The sgl is freed in the target stack in target_release_cmd_kref() before
calling qlt_free_cmd() but there is an unmap of sgl in qlt_free_cmd() that
causes a panic if sgl is not yet DMA unmapped:
NIP dma_direct_unmap_sg+0xdc/0x180
LR dma_direct_unmap_sg+0xc8/0x180
Call Trace:
ql_dbg_prefix+0x68/0xc0 [qla2xxx] (unreliable)
dma_unmap_sg_attrs+0x54/0xf0
qlt_unmap_sg.part.19+0x54/0x1c0 [qla2xxx]
qlt_free_cmd+0x124/0x1d0 [qla2xxx]
tcm_qla2xxx_release_cmd+0x4c/0xa0 [tcm_qla2xxx]
target_put_sess_cmd+0x198/0x370 [target_core_mod]
transport_generic_free_cmd+0x6c/0x1b0 [target_core_mod]
tcm_qla2xxx_complete_free+0x6c/0x90 [tcm_qla2xxx]
The sgl may be left unmapped in error cases of response sending. For
instance, qlt_rdy_to_xfer() maps sgl and exits when session is being
deleted keeping the sgl mapped.
This patch removes use-after-free of the sgl and ensures that the sgl is
unmapped for any command that was not sent to firmware.
Link: https://lore.kernel.org/r/20211018122650.11846-1-d.bogdanov@yadro.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Dmitry Bogdanov <d.bogdanov@yadro.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b2cddb44bd ]
During the process of driver probing, the probe function should return < 0
for failure, otherwise, the kernel will treat value > 0 as success.
Signed-off-by: Zheyu Ma <zheyuma97@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit d1a7b9e469 ]
Fix following coccicheck warning:
./drivers/net/ethernet/mscc/ocelot_vsc7514.c:946:1-33: WARNING: Function
for_each_available_child_of_node should have of_node_put() before goto.
Early exits from for_each_available_child_of_node should decrement the
node reference counter.
Signed-off-by: Wan Jiabing <wanjiabing@vivo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit d9fd7e9fcc ]
Fix following coccicheck warning:
./drivers/net/ethernet/microchip/sparx5/s4parx5_main.c:723:1-33: WARNING: Function
for_each_available_child_of_node should have of_node_put() before goto
Early exits from for_each_available_child_of_node should decrement the
node reference counter.
Signed-off-by: Wan Jiabing <wanjiabing@vivo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit c2402d43d1 ]
Commit a86ed2cfa1 ("ptp: Don't print an error if ptp_kvm is not supported")
fixes the error message print on ARM platform by only concerning about
the case that the error returned from kvm_arch_ptp_init() is not -EOPNOTSUPP.
Although the ARM platform returns -EOPNOTSUPP if ptp_kvm is not supported
while X86_64 platform returns -KVM_EOPNOTSUPP, both error codes share the
same value 95.
Actually kvm_arch_ptp_init() on X86_64 platform can return three kinds of
errors (-KVM_ENOSYS, -KVM_EOPNOTSUPP and -KVM_EFAULT). The problem is that
-KVM_EOPNOTSUPP is masked out and -KVM_EFAULT is ignored among them.
This patch fixes this by returning them to ptp_kvm_init() respectively.
Signed-off-by: Kele Huang <huangkele@bytedance.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit d94befbb5a ]
In laptop 'HP Spectre x360 Convertible 15-eb1xxx/8811' both front and
rear speakers are silent, this patch fixes that by overriding the pin
layout and by initializing the amplifier which needs a GPIO pin to be
set to 1 then 0, similar to the existing HP Spectre x360 14 model.
In order to have volume control, both front and rear speakers were
forced to use the DAC1.
This patch also correctly map the mute LED but since there is no
microphone on/off switch exposed by the alsa subsystem it never turns
on by itself.
There are still known audio issues in this laptop: headset microphone
doesn't work, the button to mute/unmute microphone is not yet mapped,
the LED of the mute/unmute speakers doesn't seems to be exposed via
GPIO and never turns on.
BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=213953
Signed-off-by: Davide Baldo <davide@baldo.me>
Link: https://lore.kernel.org/r/20211015072121.5287-1-davide@baldo.me
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit c448b7aa3e ]
'component' is allocated in snd_soc_register_component(), but component->list
is not initalized, this may cause snd_soc_del_component_unlocked() deref null
ptr in the error handing case.
KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
RIP: 0010:__list_del_entry_valid+0x81/0xf0
Call Trace:
snd_soc_del_component_unlocked+0x69/0x1b0 [snd_soc_core]
snd_soc_add_component.cold+0x54/0x6c [snd_soc_core]
snd_soc_register_component+0x70/0x90 [snd_soc_core]
devm_snd_soc_register_component+0x5e/0xd0 [snd_soc_core]
tas2552_probe+0x265/0x320 [snd_soc_tas2552]
? tas2552_component_probe+0x1e0/0x1e0 [snd_soc_tas2552]
i2c_device_probe+0xa31/0xbe0
Fix by adding INIT_LIST_HEAD() to snd_soc_component_initialize().
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Link: https://lore.kernel.org/r/20211009065840.3196239-1-yangyingliang@huawei.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit b968e84b50 upstream.
Since commit c8137ace56 ("x86/iopl: Restrict iopl() permission
scope") it's possible to emulate iopl(3) using ioperm(), except for
the CLI/STI usage.
Userspace CLI/STI usage is very dubious (read broken), since any
exception taken during that window can lead to rescheduling anyway (or
worse). The IOPL(2) manpage even states that usage of CLI/STI is highly
discouraged and might even crash the system.
Of course, that won't stop people and HP has the dubious honour of
being the first vendor to be found using this in their hp-health
package.
In order to enable this 'software' to still 'work', have the #GP treat
the CLI/STI instructions as NOPs when iopl(3). Warn the user that
their program is doing dubious things.
Fixes: a24ca99768 ("x86/iopl: Remove legacy IOPL option")
Reported-by: Ondrej Zary <linux@zary.sk>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@kernel.org # v5.5+
Link: https://lkml.kernel.org/r/20210918090641.GD5106@worktop.programming.kicks-ass.net
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6ff53f6a43 upstream.
Add a synchronize_rcu() after clearing the posted interrupt wakeup handler
to ensure all readers, i.e. in-flight IRQ handlers, see the new handler
before returning to the caller. If the caller is an exiting module and
is unregistering its handler, failure to wait could result in the IRQ
handler jumping into an unloaded module.
The registration path doesn't require synchronization, as it's the
caller's responsibility to not generate interrupts it cares about until
after its handler is registered.
Fixes: f6b3c72c23 ("x86/irq: Define a global vector for VT-d Posted-Interrupts")
Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20211009001107.3936588-2-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 415de44076 upstream.
Currently, Linux probes for X86_BUG_NULL_SEL unconditionally which
makes it unsafe to migrate in a virtualised environment as the
properties across the migration pool might differ.
To be specific, the case which goes wrong is:
1. Zen1 (or earlier) and Zen2 (or later) in a migration pool
2. Linux boots on Zen2, probes and finds the absence of X86_BUG_NULL_SEL
3. Linux is then migrated to Zen1
Linux is now running on a X86_BUG_NULL_SEL-impacted CPU while believing
that the bug is fixed.
The only way to address the problem is to fully trust the "no longer
affected" CPUID bit when virtualised, because in the above case it would
be clear deliberately to indicate the fact "you might migrate to
somewhere which has this behaviour".
Zen3 adds the NullSelectorClearsBase CPUID bit to indicate that loading
a NULL segment selector zeroes the base and limit fields, as well as
just attributes. Zen2 also has this behaviour but doesn't have the NSCB
bit.
[ bp: Minor touchups. ]
Signed-off-by: Jane Malalane <jane.malalane@citrix.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
CC: <stable@vger.kernel.org>
Link: https://lkml.kernel.org/r/20211021104744.24126-1-jane.malalane@citrix.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e7d445ab26 upstream.
When runtime support for converting between 4-level and 5-level pagetables
was added to the kernel, the SME code that built pagetables was updated
to use the pagetable functions, e.g. p4d_offset(), etc., in order to
simplify the code. However, the use of the pagetable functions in early
boot code requires the use of the USE_EARLY_PGTABLE_L5 #define in order to
ensure that the proper definition of pgtable_l5_enabled() is used.
Without the #define, pgtable_l5_enabled() is #defined as
cpu_feature_enabled(X86_FEATURE_LA57). In early boot, the CPU features
have not yet been discovered and populated, so pgtable_l5_enabled() will
return false even when 5-level paging is enabled. This causes the SME code
to always build 4-level pagetables to perform the in-place encryption.
If 5-level paging is enabled, switching to the SME pagetables results in
a page-fault that kills the boot.
Adding the #define results in pgtable_l5_enabled() using the
__pgtable_l5_enabled variable set in early boot and the SME code building
pagetables for the proper paging level.
Fixes: aad983913d ("x86/mm/encrypt: Simplify sme_populate_pgd() and sme_populate_pgd_large()")
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: <stable@vger.kernel.org> # 4.18.x
Link: https://lkml.kernel.org/r/2cb8329655f5c753905812d951e212022a480475.1634318656.git.thomas.lendacky@amd.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 712a951025 upstream.
It is possible to trigger a crash by splicing anon pipe bufs to the fuse
device.
The reason for this is that anon_pipe_buf_release() will reuse buf->page if
the refcount is 1, but that page might have already been stolen and its
flags modified (e.g. PG_lru added).
This happens in the unlikely case of fuse_dev_splice_write() getting around
to calling pipe_buf_release() after a page has been stolen, added to the
page cache and removed from the page cache.
Fix by calling pipe_buf_release() right after the page was inserted into
the page cache. In this case the page has an elevated refcount so any
release function will know that the page isn't reusable.
Reported-by: Frank Dinoff <fdinoff@google.com>
Link: https://lore.kernel.org/r/CAAmZXrsGg2xsP1CK+cbuEMumtrqdvD-NKnWzhNcvn71RV3c1yw@mail.gmail.com/
Fixes: dd3bb14f44 ("fuse: support splice() writing to fuse device")
Cc: <stable@vger.kernel.org> # v2.6.35
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1811bc401a upstream.
After we drop i_data sem, we need to reload the ext4_ext_path
structure since the extent tree can change once i_data_sem is
released.
This addresses the BUG:
[52117.465187] ------------[ cut here ]------------
[52117.465686] kernel BUG at fs/ext4/extents.c:1756!
...
[52117.478306] Call Trace:
[52117.478565] ext4_ext_shift_extents+0x3ee/0x710
[52117.479020] ext4_fallocate+0x139c/0x1b40
[52117.479405] ? __do_sys_newfstat+0x6b/0x80
[52117.479805] vfs_fallocate+0x151/0x4b0
[52117.480177] ksys_fallocate+0x4a/0xa0
[52117.480533] __x64_sys_fallocate+0x22/0x30
[52117.480930] do_syscall_64+0x35/0x80
[52117.481277] entry_SYSCALL_64_after_hwframe+0x44/0xae
[52117.481769] RIP: 0033:0x7fa062f855ca
Cc: stable@kernel.org
Link: https://lore.kernel.org/r/20210903062748.4118886-4-yangerkun@huawei.com
Signed-off-by: yangerkun <yangerkun@huawei.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 39fec6889d upstream.
Ext4 file system has default lazy inode table initialization setup once
it is mounted. However, it has issue on computing the next schedule time
that makes the timeout same amount in jiffies but different real time in
secs if with various HZ values. Therefore, fix by measuring the current
time in a more granular unit nanoseconds and make the next schedule time
independent of the HZ value.
Fixes: bfff68738f ("ext4: add support for lazy inode table initialization")
Signed-off-by: Shaoying Xu <shaoyi@amazon.com>
Cc: stable@vger.kernel.org
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Link: https://lore.kernel.org/r/20210902164412.9994-2-shaoyi@amazon.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c0317c0e87 upstream.
When the timer instance was add into ack_list but was not currently in
process, the user could stop it via snd_timer_stop1() without delete it
from the ack_list. Then the user could free the timer instance and when
it was actually processed UAF occurred.
This issue could be reproduced via testcase snd_timer01 in ltp - running
several instances of that testcase at the same time.
What I actually met was that the ack_list of the timer broken and the
kernel went into deadloop with irqoff. That could be detected by
hardlockup detector on board or when we run it on qemu, we could use gdb
to dump the ack_list when the console has no response.
To fix this issue, we delete the timer instance from ack_list and
active_list unconditionally in snd_timer_stop1().
Signed-off-by: Wang Wensheng <wangwensheng4@huawei.com>
Suggested-by: Takashi Iwai <tiwai@suse.de>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20211103033517.80531-1-wangwensheng4@huawei.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 39173303c8 upstream.
The recent change in hda-intel driver to allow repeated probes
surfaced a problem that has been hidden until; the probe process in
the work calls azx_free() at the error path, and this skips the card
free process that eventually releases codec instances. As a result,
we get a kernel WARNING like:
snd_hda_intel 0000:00:1f.3: Cannot probe codecs, giving up
------------[ cut here ]------------
WARNING: CPU: 14 PID: 186 at sound/hda/hdac_bus.c:73
....
For fixing this, we need to call snd_card_free() instead of
azx_free(). Additionally, the device drvdata has to be cleared, as
the driver binding itself is still active. Then the PM and other
driver callbacks will ignore the procedure.
Fixes: c0f1886de7 ("ALSA: hda: intel: Allow repeatedly probing on codec configuration errors")
Reported-and-tested-by: Scott Branden <scott.branden@broadcom.com>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/063e2397-7edb-5f48-7b0d-618b938d9dd8@broadcom.com
Link: https://lore.kernel.org/r/20211110194633.19098-1-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8f27b68906 upstream.
Adding the Line6 HX-Stomp XL USB_ID as it needs this fixed frequency
quirk as well.
The device is basically just the HX-Stomp with some more buttons on
the face. I've done some recording with it after adding it, and it
seems to function properly with this fix. The Midi features appear to
be working as well.
[ a coding style fix and patch reformat by tiwai ]
Signed-off-by: Jason Ormes <skryking@gmail.com>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20211030200405.1358678-1-skryking@gmail.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 55f261b73a upstream.
Add the missing endpoint max-packet sanity check to probe() to avoid
division by zero in alloc_stream_buffers() in case a malicious device
has broken descriptors (or when doing descriptor fuzz testing).
Note that USB core will reject URBs submitted for endpoints with zero
wMaxPacketSize but that drivers doing packet-size calculations still
need to handle this (cf. commit 2548288b4f ("USB: Fix: Don't skip
endpoint descriptors with maxpacket=0")).
Fixes: 63978ab3e3 ("sound: add Edirol UA-101 support")
Cc: stable@vger.kernel.org # 2.6.34
Signed-off-by: Johan Hovold <johan@kernel.org>
Link: https://lore.kernel.org/r/20211026095401.26522-1-johan@kernel.org
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 861f92cb91 upstream.
Drivers that do not use the ctrl-framework use this function instead.
Fix the following issues:
- Do not check for multiple classes when getting the DEF_VAL.
- Return -EINVAL for request_api calls
- Default value cannot be changed, return EINVAL as soon as possible.
- Return the right error_idx
[If an error is found when validating the list of controls passed with
VIDIOC_G_EXT_CTRLS, then error_idx shall be set to ctrls->count to
indicate to userspace that no actual hardware was touched.
It would have been much nicer of course if error_idx could point to the
control index that failed the validation, but sadly that's not how the
API was designed.]
Fixes v4l2-compliance:
Control ioctls (Input 0):
warn: v4l2-test-controls.cpp(834): error_idx should be equal to count
warn: v4l2-test-controls.cpp(855): error_idx should be equal to count
fail: v4l2-test-controls.cpp(813): doioctl(node, VIDIOC_G_EXT_CTRLS, &ctrls)
test VIDIOC_G/S/TRY_EXT_CTRLS: FAIL
Buffer ioctls (Input 0):
fail: v4l2-test-buffers.cpp(1994): ret != EINVAL && ret != EBADR && ret != ENOTTY
test Requests: FAIL
Cc: stable@vger.kernel.org
Fixes: 6fa6f831f0 ("media: v4l2-ctrls: add core request support")
Suggested-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Reviewed-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Ricardo Ribalda <ribalda@chromium.org>
Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0887e9e152 upstream.
The mem-to-mem stateless decoder API specifies support for dynamic
resolution changes. In particular, the decoder should accept format
changes on the OUTPUT queue even when buffers have been allocated,
as long as it is not streaming.
Relax restrictions for S_FMT as described in the previous paragraph,
and as long as the codec format remains the same. This aligns it with
the Hantro and Cedrus decoders. This change was mostly based on commit
ae02d49493 ("media: hantro: Fix s_fmt for dynamic resolution changes").
Since rkvdec_s_fmt() is now just a wrapper around the output/capture
variants without any additional shared functionality, drop the wrapper
and call the respective functions directly.
Fixes: cd33c83044 ("media: rkvdec: Add the rkvdec driver")
Cc: <stable@vger.kernel.org>
Signed-off-by: Chen-Yu Tsai <wenst@chromium.org>
Reviewed-by: Nicolas Dufresne <nicolas.dufresne@collabora.com>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 298d8e8f7b upstream.
The rkvdec H.264 decoder currently overrides sizeimage for the output
format. This causes issues when userspace requires and requests a larger
buffer, but ends up with one of insufficient size.
Instead, only provide a default size if none was requested. This fixes
the video_decode_accelerator_tests from Chromium failing on the first
frame due to insufficient buffer space. It also aligns the behavior
of the rkvdec driver with the Hantro and Cedrus drivers.
Fixes: cd33c83044 ("media: rkvdec: Add the rkvdec driver")
Cc: <stable@vger.kernel.org>
Signed-off-by: Chen-Yu Tsai <wenst@chromium.org>
Reviewed-by: Nicolas Dufresne <nicolas.dufresne@collabora.com>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2ac5fb35cd upstream.
sizeof when applied to a pointer typed expression gives the size of
the pointer.
./drivers/firmware/psci/psci_checker.c:158:41-47: ERROR application of sizeof to pointer
This issue was detected with the help of Coccinelle.
Fixes: 7401056de5 ("drivers/firmware: psci_checker: stash and use topology_core_cpumask for hotplug tests")
Cc: stable@vger.kernel.org
Reported-by: Zeal Robot <zealci@zte.com.cn>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: jing yangyang <jing.yangyang@zte.com.cn>
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8779e05ba8 upstream.
The TIF_XXX flags are stored in the flags field in the thread_info
struct (TI_FLAGS), not in the flags field of the task_struct structure
(TASK_FLAGS).
It seems this bug didn't generate any important side-effects, otherwise it
wouldn't have went unnoticed for 12 years (since v2.6.32).
Signed-off-by: Helge Deller <deller@gmx.de>
Fixes: ecd3d4bc06 ("parisc: stop using task->ptrace for {single,block}step flags")
Cc: Kyle McMartin <kyle@mcmartin.ca>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6e866a4628 upstream.
Fix a kernel crash which happens on PA1.x CPUs while initializing the
FTRACE/KPROBE breakpoints. The PTE table entries for the fixmap area
were not created correctly.
Signed-off-by: Helge Deller <deller@gmx.de>
Fixes: ccfbc68d41 ("parisc: add set_fixmap()/clear_fixmap()")
Cc: stable@vger.kernel.org # v5.2+
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0c336d6e33 upstream.
When calculating i_blocks, there was a mistake that was masked with a
32-bit variable. So i_blocks for files larger than 4 GiB had incorrect
values. Mask with a 64-bit variable instead of 32-bit one.
Fixes: 5f2aa07507 ("exfat: add inode operations")
Cc: stable@vger.kernel.org # v5.7+
Reported-by: Ganapathi Kamath <hgkamath@hotmail.com>
Signed-off-by: Sungjong Seo <sj1557.seo@samsung.com>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 43592c8736 upstream.
Only wait for DRTO on reads, otherwise the driver hangs.
The driver prevents sending CMD12 on response errors like CRCs. According
to the comment this is because some cards have problems with this during
the UHS tuning sequence. Unfortunately this workaround currently also
applies for any command with data. On reads this will set the drto timer,
which then triggers after a while. On writes this will not set any timer
and the tasklet will not be scheduled again.
I cannot test for the UHS workarounds need, but even if so, it should at
most apply to reads. I have observed many hangs when CMD25 response
contained a CRC error. This patch fixes this without touching the actual
UHS tuning workaround.
Signed-off-by: Christian Loehle <cloehle@hyperstone.com>
Reviewed-by: Jaehoon Chung <jh80.chung@samsung.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/af8f8b8674ba4fcc9a781019e4aeb72c@hyperstone.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 15c9a35909 upstream.
When endpoint_alloc() return failed in xillyusb_setup_base_eps(),
'xdev->msg_ep' will be freed but not set to NULL. That lets program
enter fail handling to cleanup_dev() in xillyusb_probe(). Check for
'xdev->msg_ep' is invalid in cleanup_dev() because 'xdev->msg_ep' did
not set to NULL when was freed. So the UAF problem for 'xdev->msg_ep'
is triggered.
==================================================================
BUG: KASAN: use-after-free in fifo_mem_release+0x1f4/0x210
CPU: 0 PID: 166 Comm: kworker/0:2 Not tainted 5.15.0-rc5+ #19
Call Trace:
dump_stack_lvl+0xe2/0x152
print_address_description.constprop.0+0x21/0x140
? fifo_mem_release+0x1f4/0x210
kasan_report.cold+0x7f/0x11b
? xillyusb_probe+0x530/0x700
? fifo_mem_release+0x1f4/0x210
fifo_mem_release+0x1f4/0x210
? __sanitizer_cov_trace_pc+0x1d/0x50
endpoint_dealloc+0x35/0x2b0
cleanup_dev+0x90/0x120
xillyusb_probe+0x59a/0x700
...
Freed by task 166:
kasan_save_stack+0x1b/0x40
kasan_set_track+0x1c/0x30
kasan_set_free_info+0x20/0x30
__kasan_slab_free+0x109/0x140
kfree+0x117/0x4c0
xillyusb_probe+0x606/0x700
Set 'xdev->msg_ep' to NULL after being freed in xillyusb_setup_base_eps()
to fix the UAF problem.
Fixes: a53d1202ae ("char: xillybus: Add driver for XillyUSB (Xillybus variant for USB)")
Cc: stable <stable@vger.kernel.org>
Acked-by: Eli Billauer <eli.billauer@gmail.com>
Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com>
Link: https://lore.kernel.org/r/20211016052047.1611983-1-william.xuanziyang@huawei.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 93f43ed81a upstream.
The code which constructs the modules for each engine present on the GPU
passes -1 for 'instance' on non-instanced engines, which affects how the
name for a sub-device is generated. This is then stored as 'instance 0'
in nvkm_subdev.inst, so code can potentially be shared with earlier GPUs
that only had a single instance of an engine.
However, GF100's CE constructor uses this value to calculate the address
of its falcon before it's translated, resulting in CE0 getting the wrong
address.
This slightly modifies the approach, always passing a valid instance for
engines that *can* have multiple copies, and having the code for earlier
GPUs explicitly ask for non-instanced name generation.
Bug: https://gitlab.freedesktop.org/drm/nouveau/-/issues/91
Fixes: 50551b15c7 ("drm/nouveau/ce: switch to instanced constructor")
Cc: <stable@vger.kernel.org> # v5.12+
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Tested-by: Karol Herbst <kherbst@redhat.com>
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211103011057.15344-1-skeggsb@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit cd8a36a90b upstream.
A prior patch inadvertently caused lpfc_sli_sum_iocb() to exclude counting
of outstanding aborted I/Os and ABORT IOCBs. Thus,
lpfc_reset_flush_io_context() called from any TMF routine does not properly
wait to flush all outstanding FCP IOCBs leading to a block layer crash on
an invalid scsi_cmnd->request pointer.
kernel BUG at ../block/blk-core.c:1489!
RIP: 0010:blk_requeue_request+0xaf/0xc0
...
Call Trace:
<IRQ>
__scsi_queue_insert+0x90/0xe0 [scsi_mod]
blk_done_softirq+0x7e/0x90
__do_softirq+0xd2/0x280
irq_exit+0xd5/0xe0
do_IRQ+0x4c/0xd0
common_interrupt+0x87/0x87
</IRQ>
Fix by separating out the LPFC_IO_FCP, LPFC_IO_ON_TXCMPLQ,
LPFC_DRIVER_ABORTED, and CMD_ABORT_XRI_CN || CMD_CLOSE_XRI_CN checks into a
new lpfc_sli_validate_fcp_iocb_for_abort() routine when determining to
build an ABORT iocb.
Restore lpfc_reset_flush_io_context() functionality by including counting
of outstanding aborted IOCBs and ABORT IOCBs in lpfc_sli_sum_iocb().
Link: https://lore.kernel.org/r/20210910233159.115896-9-jsmart2021@gmail.com
Fixes: e136471135 ("scsi: lpfc: Fix illegal memory access on Abort IOCBs")
Cc: <stable@vger.kernel.org> # v5.12+
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 982fc3965d upstream.
In a rarely executed path, FLOGI failure, there is a refcounting error. If
FLOGI completed with an error, typically a timeout, the initial completion
handler would remove the job reference. However, the job completion isn't
the actual end of the job/exchange as the timeout usually initiates an
ABTS, and upon that ABTS completion, a final completion is sent. The driver
removes the reference again in the final completion. Thus the imbalance.
In the buggy cases, if there was a link bounce while the delayed response
is outstanding, the fport node may be referenced again but there was no
additional reference as it is already present. The delayed completion then
occurs and removes the last reference freeing the node and causing issues
in the link up processed that is using the node.
Fix this scenario by removing the snippet that removed the reference in the
initial FLOGI completion. The bad snippet was poorly trying to identify the
FLOGI as OK to do so by realizing the node was not registered with either
SCSI or NVMe transport.
Link: https://lore.kernel.org/r/20210910233159.115896-3-jsmart2021@gmail.com
Fixes: 618e2ee146 ("scsi: lpfc: Fix FLOGI failure due to accessing a freed node")
Cc: <stable@vger.kernel.org> # v5.13+
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 703535e6ae upstream.
No need to deduce command size in scsi_setup_scsi_cmnd() anymore as
appropriate checks have been added to scsi_fill_sghdr_rq() function and the
cmd_len should never be zero here. The code to do that wasn't correct
anyway, as it used uninitialized cmd->cmnd, which caused a null-ptr-deref
if the command size was zero as in the trace below. Fix this by removing
the unneeded code.
KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
CPU: 0 PID: 1822 Comm: repro Not tainted 5.15.0 #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-4.fc34 04/01/2014
Call Trace:
blk_mq_dispatch_rq_list+0x7c7/0x12d0
__blk_mq_sched_dispatch_requests+0x244/0x380
blk_mq_sched_dispatch_requests+0xf0/0x160
__blk_mq_run_hw_queue+0xe8/0x160
__blk_mq_delay_run_hw_queue+0x252/0x5d0
blk_mq_run_hw_queue+0x1dd/0x3b0
blk_mq_sched_insert_request+0x1ff/0x3e0
blk_execute_rq_nowait+0x173/0x1e0
blk_execute_rq+0x15c/0x540
sg_io+0x97c/0x1370
scsi_ioctl+0xe16/0x28e0
sd_ioctl+0x134/0x170
blkdev_ioctl+0x362/0x6e0
block_ioctl+0xb0/0xf0
vfs_ioctl+0xa7/0xf0
do_syscall_64+0x3d/0xb0
entry_SYSCALL_64_after_hwframe+0x44/0xae
---[ end trace 8b086e334adef6d2 ]---
Kernel panic - not syncing: Fatal exception
Link: https://lore.kernel.org/r/20211103170659.22151-2-tadeusz.struk@linaro.org
Fixes: 2ceda20f0a ("scsi: core: Move command size detection out of the fast path")
Cc: Bart Van Assche <bvanassche@acm.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: James E.J. Bottomley <jejb@linux.ibm.com>
Cc: Martin K. Petersen <martin.petersen@oracle.com>
Cc: <linux-scsi@vger.kernel.org>
Cc: <linux-kernel@vger.kernel.org>
Cc: <stable@vger.kernel.org> # 5.15, 5.14, 5.10
Reported-by: syzbot+5516b30f5401d4dcbcae@syzkaller.appspotmail.com
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Tadeusz Struk <tadeusz.struk@linaro.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5ae17501bc upstream.
The changes to issue the abort from the scmd->abort_work instead of the EH
thread introduced a problem if eh_deadline is used. If aborting the
command(s) is successful, and there are never any scmds added to the
shost->eh_cmd_q, there is no code path which will reset the ->last_reset
value back to zero.
The effect of this is that after a successful abort with no EH thread
activity, a subsequent timeout, perhaps a long time later, might
immediately be considered past a user-set eh_deadline time, and the host
will be reset with no attempt at recovery.
Fix this by resetting ->last_reset back to zero in scmd_eh_abort_handler()
if it is determined that the EH thread will not run to do this.
Thanks to Gopinath Marappan for investigating this problem.
Link: https://lore.kernel.org/r/20211029194311.17504-2-emilne@redhat.com
Fixes: e494f6a728 ("[SCSI] improved eh timeout handler")
Cc: stable@vger.kernel.org
Signed-off-by: Ewan D. Milne <emilne@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 839b63860e upstream.
Patch series "ocfs2: Truncate data corruption fix".
As further testing has shown, commit 5314454ea3 ("ocfs2: fix data
corruption after conversion from inline format") didn't fix all the data
corruption issues the customer started observing after 6dbf7bb555
("fs: Don't invalidate page buffers in block_write_full_page()") This
time I have tracked them down to two bugs in ocfs2 truncation code.
One bug (truncating page cache before clearing tail cluster and setting
i_size) could cause data corruption even before 6dbf7bb555, but before
that commit it needed a race with page fault, after 6dbf7bb555 it
started to be pretty deterministic.
Another bug (zeroing pages beyond old i_size) used to be harmless
inefficiency before commit 6dbf7bb555. But after commit 6dbf7bb555
in combination with the first bug it resulted in deterministic data
corruption.
Although fixing only the first problem is needed to stop data
corruption, I've fixed both issues to make the code more robust.
This patch (of 2):
ocfs2_truncate_file() did unmap invalidate page cache pages before
zeroing partial tail cluster and setting i_size. Thus some pages could
be left (and likely have left if the cluster zeroing happened) in the
page cache beyond i_size after truncate finished letting user possibly
see stale data once the file was extended again. Also the tail cluster
zeroing was not guaranteed to finish before truncate finished causing
possible stale data exposure. The problem started to be particularly
easy to hit after commit 6dbf7bb555 "fs: Don't invalidate page buffers
in block_write_full_page()" stopped invalidation of pages beyond i_size
from page writeback path.
Fix these problems by unmapping and invalidating pages in the page cache
after the i_size is reduced and tail cluster is zeroed out.
Link: https://lkml.kernel.org/r/20211025150008.29002-1-jack@suse.cz
Link: https://lkml.kernel.org/r/20211025151332.11301-1-jack@suse.cz
Fixes: ccd979bdbc ("[PATCH] OCFS2: The Second Oracle Cluster Filesystem")
Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Gang He <ghe@suse.com>
Cc: Jun Piao <piaojun@huawei.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 68dbbe7d5b upstream.
Some ATA drives are very slow to respond to READ_LOG_EXT and
READ_LOG_DMA_EXT commands issued from ata_dev_configure() when the
device is revalidated right after resuming a system or inserting the
ATA adapter driver (e.g. ahci). The default 5s timeout
(ATA_EH_CMD_DFL_TIMEOUT) used for these commands is too short, causing
errors during the device configuration. Ex:
...
ata9: SATA max UDMA/133 abar m524288@0x9d200000 port 0x9d200400 irq 209
ata9: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
ata9.00: ATA-9: XXX XXXXXXXXXXXXXXX, XXXXXXXX, max UDMA/133
ata9.00: qc timeout (cmd 0x2f)
ata9.00: Read log page 0x00 failed, Emask 0x4
ata9.00: Read log page 0x00 failed, Emask 0x40
ata9.00: NCQ Send/Recv Log not supported
ata9.00: Read log page 0x08 failed, Emask 0x40
ata9.00: 27344764928 sectors, multi 16: LBA48 NCQ (depth 32), AA
ata9.00: Read log page 0x00 failed, Emask 0x40
ata9.00: ATA Identify Device Log not supported
ata9.00: failed to set xfermode (err_mask=0x40)
ata9: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
ata9.00: configured for UDMA/133
...
The timeout error causes a soft reset of the drive link, followed in
most cases by a successful revalidation as that give enough time to the
drive to become fully ready to quickly process the read log commands.
However, in some cases, this also fails resulting in the device being
dropped.
Fix this by using adding the ata_eh_revalidate_timeouts entries for the
READ_LOG_EXT and READ_LOG_DMA_EXT commands. This defines a timeout
increased to 15s, retriable one time.
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Cc: stable@vger.kernel.org
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e1959faf08 upstream.
Some USB 3.1 enumeration issues were reported after the hub driver removed
the minimum 100ms limit for the power-on-good delay.
Since commit 90d28fb53d ("usb: core: reduce power-on-good delay time of
root hub") the hub driver sets the power-on-delay based on the
bPwrOn2PwrGood value in the hub descriptor.
xhci driver has a 20ms bPwrOn2PwrGood value for both roothubs based
on xhci spec section 5.4.8, but it's clearly not enough for the
USB 3.1 devices, causing enumeration issues.
Tests indicate full 100ms delay is needed.
Reported-by: Walt Jr. Brake <mr.yming81@gmail.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Fixes: 90d28fb53d ("usb: core: reduce power-on-good delay time of root hub")
Cc: stable <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20211105160036.549516-1-mathias.nyman@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a44f9d6f9d upstream.
There is a wrong comparison of the total size of the loaded firmware
css->fw->size with the size of a pointer to struct imgu_fw_header.
Turn binary_header into a flexible-array member[1][2], use the
struct_size() helper and fix the wrong size comparison. Notice
that the loaded firmware needs to contain at least one 'struct
imgu_fw_info' item in the binary_header[] array.
It's also worth mentioning that
"css->fw->size < struct_size(css->fwp, binary_header, 1)"
with binary_header declared as a flexible-array member is equivalent
to
"css->fw->size < sizeof(struct imgu_fw_header)"
with binary_header declared as a one-element array (as in the original
code).
The replacement of the one-element array with a flexible-array member
also helps with the ongoing efforts to globally enable -Warray-bounds
and get us closer to being able to tighten the FORTIFY_SOURCE routines
on memcpy().
[1] https://en.wikipedia.org/wiki/Flexible_array_member
[2] https://www.kernel.org/doc/html/v5.10/process/deprecated.html#zero-length-and-one-element-arrays
Link: https://github.com/KSPP/linux/issues/79
Link: https://github.com/KSPP/linux/issues/109
Fixes: 09d290f0ba ("media: staging/intel-ipu3: css: Add support for firmware management")
Cc: stable@vger.kernel.org
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a56d3e40bd upstream.
USB bulk and interrupt message timeouts are specified in milliseconds
and should specifically not vary with CONFIG_HZ.
Note that the bulk-out transfer timeout was set to the endpoint
bInterval value, which should be ignored for bulk endpoints and is
typically set to zero. This meant that a failing bulk-out transfer
would never time out.
Assume that the 10 second timeout used for all other transfers is more
than enough also for the bulk-out endpoint.
Fixes: 985cafccbf ("Staging: Comedi: vmk80xx: Add k8061 support")
Fixes: 951348b377 ("staging: comedi: vmk80xx: wait for URBs to complete")
Cc: stable@vger.kernel.org # 2.6.31
Signed-off-by: Johan Hovold <johan@kernel.org>
Reviewed-by: Ian Abbott <abbotti@mev.co.uk>
Link: https://lore.kernel.org/r/20211025114532.4599-6-johan@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a23461c474 upstream.
The driver uses endpoint-sized USB transfer buffers but up until
recently had no sanity checks on the sizes.
Commit e1f13c879a ("staging: comedi: check validity of wMaxPacketSize
of usb endpoints found") inadvertently fixed NULL-pointer dereferences
when accessing the transfer buffers in case a malicious device has a
zero wMaxPacketSize.
Make sure to allocate buffers large enough to handle also the other
accesses that are done without a size check (e.g. byte 18 in
vmk80xx_cnt_insn_read() for the VMK8061_MODEL) to avoid writing beyond
the buffers, for example, when doing descriptor fuzzing.
The original driver was for a low-speed device with 8-byte buffers.
Support was later added for a device that uses bulk transfers and is
presumably a full-speed device with a maximum 64-byte wMaxPacketSize.
Fixes: 985cafccbf ("Staging: Comedi: vmk80xx: Add k8061 support")
Cc: stable@vger.kernel.org # 2.6.31
Signed-off-by: Johan Hovold <johan@kernel.org>
Reviewed-by: Ian Abbott <abbotti@mev.co.uk>
Link: https://lore.kernel.org/r/20211025114532.4599-4-johan@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 536de747bc upstream.
USB transfer buffers are typically mapped for DMA and must not be
allocated on the stack or transfers will fail.
Allocate proper transfer buffers in the various command helpers and
return an error on short transfers instead of acting on random stack
data.
Note that this also fixes a stack info leak on systems where DMA is not
used as 32 bytes are always sent to the device regardless of how short
the command is.
Fixes: 63274cd7d3 ("Staging: comedi: add usb dt9812 driver")
Cc: stable@vger.kernel.org # 2.6.29
Reviewed-by: Ian Abbott <abbotti@mev.co.uk>
Signed-off-by: Johan Hovold <johan@kernel.org>
Link: https://lore.kernel.org/r/20211027093529.30896-3-johan@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c052cc1a06 upstream.
Syzbot reported use-after-free in rtl8712_dl_fw(). The problem was in
race condition between r871xu_dev_remove() ->ndo_open() callback.
It's easy to see from crash log, that driver accesses released firmware
in ->ndo_open() callback. It may happen, since driver was releasing
firmware _before_ unregistering netdev. Fix it by moving
unregister_netdev() before cleaning up resources.
Call Trace:
...
rtl871x_open_fw drivers/staging/rtl8712/hal_init.c:83 [inline]
rtl8712_dl_fw+0xd95/0xe10 drivers/staging/rtl8712/hal_init.c:170
rtl8712_hal_init drivers/staging/rtl8712/hal_init.c:330 [inline]
rtl871x_hal_init+0xae/0x180 drivers/staging/rtl8712/hal_init.c:394
netdev_open+0xe6/0x6c0 drivers/staging/rtl8712/os_intfs.c:380
__dev_open+0x2bc/0x4d0 net/core/dev.c:1484
Freed by task 1306:
...
release_firmware+0x1b/0x30 drivers/base/firmware_loader/main.c:1053
r871xu_dev_remove+0xcc/0x2c0 drivers/staging/rtl8712/usb_intf.c:599
usb_unbind_interface+0x1d8/0x8d0 drivers/usb/core/driver.c:458
Fixes: 8c213fa591 ("staging: r8712u: Use asynchronous firmware loading")
Cc: stable <stable@vger.kernel.org>
Reported-and-tested-by: syzbot+c55162be492189fb4f51@syzkaller.appspotmail.com
Signed-off-by: Pavel Skripkin <paskripkin@gmail.com>
Link: https://lore.kernel.org/r/20211019211718.26354-1-paskripkin@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 32e9f56a96 upstream.
When freeing txn buffers, binder_transaction_buffer_release()
attempts to detect whether the current context is the target by
comparing current->group_leader to proc->tsk. This is an unreliable
test. Instead explicitly pass an 'is_failure' boolean.
Detecting the sender was being used as a way to tell if the
transaction failed to be sent. When cleaning up after
failing to send a transaction, there is no need to close
the fds associated with a BINDER_TYPE_FDA object. Now
'is_failure' can be used to accurately detect this case.
Fixes: 44d8047f1d ("binder: use standard functions to allocate fds")
Cc: stable <stable@vger.kernel.org>
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Todd Kjos <tkjos@google.com>
Link: https://lore.kernel.org/r/20211015233811.3532235-1-tkjos@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 52f8869337 upstream.
Since binder was integrated with selinux, it has passed
'struct task_struct' associated with the binder_proc
to represent the source and target of transactions.
The conversion of task to SID was then done in the hook
implementations. It turns out that there are race conditions
which can result in an incorrect security context being used.
Fix by using the 'struct cred' saved during binder_open and pass
it to the selinux subsystem.
Cc: stable@vger.kernel.org # 5.14 (need backport for earlier stables)
Fixes: 79af73079d ("Add security hooks to binder and implement the hooks for SELinux.")
Suggested-by: Jann Horn <jannh@google.com>
Signed-off-by: Todd Kjos <tkjos@google.com>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 29bc22ac5e upstream.
Save the 'struct cred' associated with a binder process
at initial open to avoid potential race conditions
when converting to an euid.
Set a transaction's sender_euid from the 'struct cred'
saved at binder_open() instead of looking up the euid
from the binder proc's 'struct task'. This ensures
the euid is associated with the security context that
of the task that opened binder.
Cc: stable@vger.kernel.org # 4.4+
Fixes: 457b9a6f09 ("Staging: android: add binder driver")
Signed-off-by: Todd Kjos <tkjos@google.com>
Suggested-by: Stephen Smalley <stephen.smalley.work@gmail.com>
Suggested-by: Jann Horn <jannh@google.com>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 21b5fcdccb upstream.
musb_gadget_queue() adds the passed request to musb_ep::req_list. If the
endpoint is idle and it is the first request then it invokes
musb_queue_resume_work(). If the function returns an error then the
error is passed to the caller without any clean-up and the request
remains enqueued on the list. If the caller enqueues the request again
then the list corrupts.
Remove the request from the list on error.
Fixes: ea2f35c01d ("usb: musb: Fix sleeping function called from invalid context for hdrc glue")
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Viraj Shah <viraj.shah@linutronix.de>
Link: https://lore.kernel.org/r/20211021093644.4734-1-viraj.shah@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a0548b2690 upstream.
On 64-bit:
drivers/usb/gadget/udc/fsl_qe_udc.c: In function ‘qe_ep0_rx’:
drivers/usb/gadget/udc/fsl_qe_udc.c:842:13: error: cast from pointer to integer of different size [-Werror=pointer-to-int-cast]
842 | vaddr = (u32)phys_to_virt(in_be32(&bd->buf));
| ^
In file included from drivers/usb/gadget/udc/fsl_qe_udc.c:41:
drivers/usb/gadget/udc/fsl_qe_udc.c:843:28: error: cast to pointer from integer of different size [-Werror=int-to-pointer-cast]
843 | frame_set_data(pframe, (u8 *)vaddr);
| ^
The driver assumes physical and virtual addresses are 32-bit, hence it
cannot work on 64-bit platforms.
Acked-by: Li Yang <leoyang.li@nxp.com>
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Link: https://lore.kernel.org/r/20211027080849.3276289-1-geert@linux-m68k.org
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3d5e7a28b1 upstream.
This is a new warning in clang top-of-tree (will be clang 14):
In file included from arch/x86/kvm/mmu/mmu.c:27:
arch/x86/kvm/mmu/spte.h:318:9: error: use of bitwise '|' with boolean operands [-Werror,-Wbitwise-instead-of-logical]
return __is_bad_mt_xwr(rsvd_check, spte) |
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
||
arch/x86/kvm/mmu/spte.h:318:9: note: cast one or both operands to int to silence this warning
The code is fine, but change it anyway to shut up this clever clogs
of a compiler.
Reported-by: torvic9@mailbox.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Cc: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4d9e9153f1 upstream.
CS46xx driver switches the buffer depending on the number of periods,
and in some cases it switches to the own buffer without updating the
buffer type properly. This may cause a problem with the mmap on
exotic architectures that require the own mmap call for the coherent
DMA buffer.
This patch addresses the potential breakage by replacing the buffer
setup with the proper macro. It also simplifies the source code,
too.
Link: https://lore.kernel.org/r/20210809071829.22238-4-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit cbea6e5a77 upstream.
Currently we check only the substream->dma_buffer as the preset of the
buffer configuration for verifying the availability of mmap. But a
few drivers rather set up the buffer in the own way without the
standard buffer preallocation using substream->dma_buffer, and they
miss the proper checks. (Now it's working more or less fine as most
of them are running only on x86).
Actually, they may set up the runtime dma_buffer (referred via
snd_pcm_get_dma_buf()) at the open callback, though. That is, this
could have been used as the primary source.
This patch changes the hw_support_mmap() function to check the runtime
dma buffer at first. It's usually NULL with the standard buffer
preallocation, and in that case, we continue checking
substream->dma_buffer as fallback.
Link: https://lore.kernel.org/r/20210809071829.22238-2-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit df0380b953 upstream.
This is a fix equivalent with the upstream commit df0380b953 ("ALSA:
usb-audio: Add quirk for Audient iD14"), adapted to the earlier
kernels up to 5.14.y. It adds the quirk entry with the old
ignore_ctl_error flag to the usbmix_ctl_maps, instead.
The original commit description says:
Audient iD14 (2708:0002) may get a control message error that
interferes the operation e.g. with alsactl. Add the quirk to ignore
such errors like other devices.
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 22390ce786 upstream.
This is a fix equivalent with the upstream commit 22390ce786 ("ALSA:
usb-audio: add Schiit Hel device to quirk table"), adapted to the
earlier kernels up to 5.14.y. It adds the quirk entry with the old
ignore_ctl_error flag to the usbmix_ctl_maps, instead.
The original patch description says:
The Shciit Hel device responds to the ctl message for the mic capture
switch with a timeout of -EPIPE:
usb 7-2.2: cannot get ctl value: req = 0x81, wValue = 0x100, wIndex = 0x1100, type = 1
usb 7-2.2: cannot get ctl value: req = 0x81, wValue = 0x100, wIndex = 0x1100, type = 1
usb 7-2.2: cannot get ctl value: req = 0x81, wValue = 0x100, wIndex = 0x1100, type = 1
usb 7-2.2: cannot get ctl value: req = 0x81, wValue = 0x100, wIndex = 0x1100, type = 1
This seems safe to ignore as the device works properly with the control
message quirk, so add it to the quirk table so all is good.
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ac653dd799 upstream.
Propagating errors to dependent fences is broken and can lead to errors
from one client ending up in another. In commit 3761baae90 ("Revert
"drm/i915: Propagate errors on awaiting already signaled fences""), we
attempted to get rid of fence error propagation but missed the case
added in commit 8e9f84cf5c ("drm/i915/gt: Propagate change in error
status to children on unhold"). Revert that one too. This error was
found by an up-and-coming selftest which triggers a reset during
request cancellation and verifies that subsequent requests complete
successfully.
v2:
(Daniel Vetter)
- Use revert
v3:
(Jason)
- Update commit message
v4 (Daniele):
- fix checkpatch error in commit message.
References: '3761baae908a ("Revert "drm/i915: Propagate errors on awaiting already signaled fences"")'
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210909164744.31249-8-matthew.brost@intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit eb4f756915 upstream.
After commit 77a7300aba ("of/irq: Get rid of NO_IRQ usage"),
no irq case has been removed, irq_of_parse_and_map() will return
0 in all cases when get error from parse and map an interrupt into
linux virq space.
amba_device_register() is only used on no-DT initialization, see
s3c64xx_pl080_init() arch/arm/mach-s3c/pl080.c
ep93xx_init_devices() arch/arm/mach-ep93xx/core.c
They won't set -1 to irq[0], so no need the warn.
This reverts commit 2eac58d502.
Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2b2f106eb5 upstream.
This reverts commit a77ebdd9f553. It turns out that the VPU domain has no
different requirements, even though the downstream ATF implementation seems
to suggest otherwise. Powering on the domain with the reset asserted works
fine. As the changed sequence has caused sporadic issues with the GPU
domains, just revert the change to go back to the working sequence.
Cc: <stable@vger.kernel.org> # 5.14
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Acked-by: Peng Fan <peng.fan@nxp.com>
Tested-by: Adam Ford <aford173@gmail.com> #imx8mm-beacon
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 35d2969ea3 upstream.
The bounds checking in avc_ca_pmt() is not strict enough. It should
be checking "read_pos + 4" because it's reading 5 bytes. If the
"es_info_length" is non-zero then it reads a 6th byte so there needs to
be an additional check for that.
I also added checks for the "write_pos". I don't think these are
required because "read_pos" and "write_pos" are tied together so
checking one ought to be enough. But they make the code easier to
understand for me. The check on write_pos is:
if (write_pos + 4 >= sizeof(c->operand) - 4) {
The first "+ 4" is because we're writing 5 bytes and the last " - 4"
is to leave space for the CRC.
The other problem is that "length" can be invalid. It comes from
"data_length" in fdtv_ca_pmt().
Cc: stable@vger.kernel.org
Reported-by: Luo Likang <luolikang@nsfocus.com>
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 55161e67d4 upstream.
This reverts commit 09e856d54b.
When an interface is enslaved in a VRF, prerouting conntrack hook is
called twice: once in the context of the original input interface, and
once in the context of the VRF interface. If no special precausions are
taken, this leads to creation of two conntrack entries instead of one,
and breaks SNAT.
Commit above was intended to avoid creation of extra conntrack entries
when input interface is enslaved in a VRF. It did so by resetting
conntrack related data associated with the skb when it enters VRF context.
However it breaks netfilter operation. Imagine a use case when conntrack
zone must be assigned based on the original input interface, rather than
VRF interface (that would make original interfaces indistinguishable). One
could create netfilter rules similar to these:
chain rawprerouting {
type filter hook prerouting priority raw;
iif realiface1 ct zone set 1 return
iif realiface2 ct zone set 2 return
}
This works before the mentioned commit, but not after: zone assignment
is "forgotten", and any subsequent NAT or filtering that is dependent
on the conntrack zone does not work.
Here is a reproducer script that demonstrates the difference in behaviour.
==========
#!/bin/sh
# This script demonstrates unexpected change of nftables behaviour
# caused by commit 09e856d54b ""vrf: Reset skb conntrack
# connection on VRF rcv"
# https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=09e856d54bda5f288ef8437a90ab2b9b3eab83d1
#
# Before the commit, it was possible to assign conntrack zone to a
# packet (or mark it for `notracking`) in the prerouting chanin, raw
# priority, based on the `iif` (interface from which the packet
# arrived).
# After the change, # if the interface is enslaved in a VRF, such
# assignment is lost. Instead, assignment based on the `iif` matching
# the VRF master interface is honored. Thus it is impossible to
# distinguish packets based on the original interface.
#
# This script demonstrates this change of behaviour: conntrack zone 1
# or 2 is assigned depending on the match with the original interface
# or the vrf master interface. It can be observed that conntrack entry
# appears in different zone in the kernel versions before and after
# the commit.
IPIN=172.30.30.1
IPOUT=172.30.30.2
PFXL=30
ip li sh vein >/dev/null 2>&1 && ip li del vein
ip li sh tvrf >/dev/null 2>&1 && ip li del tvrf
nft list table testct >/dev/null 2>&1 && nft delete table testct
ip li add vein type veth peer veout
ip li add tvrf type vrf table 9876
ip li set veout master tvrf
ip li set vein up
ip li set veout up
ip li set tvrf up
/sbin/sysctl -w net.ipv4.conf.veout.accept_local=1
/sbin/sysctl -w net.ipv4.conf.veout.rp_filter=0
ip addr add $IPIN/$PFXL dev vein
ip addr add $IPOUT/$PFXL dev veout
nft -f - <<__END__
table testct {
chain rawpre {
type filter hook prerouting priority raw;
iif { veout, tvrf } meta nftrace set 1
iif veout ct zone set 1 return
iif tvrf ct zone set 2 return
notrack
}
chain rawout {
type filter hook output priority raw;
notrack
}
}
__END__
uname -rv
conntrack -F
ping -W 1 -c 1 -I vein $IPOUT
conntrack -L
Signed-off-by: Eugene Crosser <crosser@average.org>
Acked-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cc: Florian Westphal <fw@strlen.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 041c614882 upstream.
Everything except the first 32 bits was lost when the pause flags were
added. This makes the 50000baseCR2 mode flag (bit 34) not appear.
I have tested this with a 10G card (SFN5122F-R7) by modifying it to
return a non-legacy link mode (10000baseCR).
Signed-off-by: Erik Ekman <erik@kryo.se>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9b0971ca7f upstream.
If the guest requests string I/O from the hypervisor via VMGEXIT,
SW_EXITINFO2 will contain the REP count. However, sev_es_string_io
was incorrectly treating it as the size of the GHCB buffer in
bytes.
This fixes the "outsw" test in the experimental SEV tests of
kvm-unit-tests.
Cc: stable@vger.kernel.org
Fixes: 7ed9abfe8e ("KVM: SVM: Support string IO operations for an SEV-ES guest")
Reported-by: Marc Orr <marcorr@google.com>
Tested-by: Marc Orr <marcorr@google.com>
Reviewed-by: Marc Orr <marcorr@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0985dba842 upstream.
In kvm_vcpu_block, the current task is set to TASK_INTERRUPTIBLE before
making a final check whether the vCPU should be woken from HLT by any
incoming interrupt.
This is a problem for the get_user() in __kvm_xen_has_interrupt(), which
really shouldn't be sleeping when the task state has already been set.
I think it's actually harmless as it would just manifest itself as a
spurious wakeup, but it's causing a debug warning:
[ 230.963649] do not call blocking ops when !TASK_RUNNING; state=1 set at [<00000000b6bcdbc9>] prepare_to_swait_exclusive+0x30/0x80
Fix the warning by turning it into an *explicit* spurious wakeup. When
invoked with !task_is_running(current) (and we might as well add
in_atomic() there while we're at it), just return 1 to indicate that
an IRQ is pending, which will cause a wakeup and then something will
call it again in a context that *can* sleep so it can fault the page
back in.
Cc: stable@vger.kernel.org
Fixes: 40da8ccd72 ("KVM: x86/xen: Add event channel interrupt vector upcall")
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Message-Id: <168bf8c689561da904e48e2ff5ae4713eaef9e2d.camel@infradead.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 29c77550ee upstream.
When perf.data is not written cleanly, we would like to process existing
data as much as possible (please see f_header.data.size == 0 condition
in perf_session__read_header). However, perf.data with partial data may
crash perf. Specifically, we see crash in 'perf script' for NULL
session->header.env.arch.
Fix this by checking session->header.env.arch before using it to determine
native_arch. Also split the if condition so it is easier to read.
Committer notes:
If it is a pipe, we already assume is a native arch, so no need to check
session->header.env.arch.
Signed-off-by: Song Liu <songliubraving@fb.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: kernel-team@fb.com
Cc: stable@vger.kernel.org
Link: http://lore.kernel.org/lkml/20211004053238.514936-1-songliubraving@fb.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 54c5639d8f upstream.
Nathan reported that because KASAN_SHADOW_OFFSET was not defined in
Kconfig, it prevents asan-stack from getting disabled with clang even
when CONFIG_KASAN_STACK is disabled: fix this by defining the
corresponding config.
Reported-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Alexandre Ghiti <alexandre.ghiti@canonical.com>
Fixes: 8ad8b72721 ("riscv: Add KASAN support")
Cc: stable@vger.kernel.org
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit cf11d01135 upstream.
When calling this function, all the shadow memory is already populated
with kasan_early_shadow_pte which has PAGE_KERNEL protection.
kasan_populate_early_shadow write-protects the mapping of the range
of addresses passed in argument in zero_pte_populate, which actually
write-protects all the shadow memory mapping since kasan_early_shadow_pte
is used for all the shadow memory at this point. And then when using
memblock API to populate the shadow memory, the first write access to the
kernel stack triggers a trap. This becomes visible with the next commit
that contains a fix for asan-stack.
We already manually populate all the shadow memory in kasan_early_init
and we write-protect kasan_early_shadow_pte at the end of kasan_init
which makes the calls to kasan_populate_early_shadow superfluous so
we can remove them.
Signed-off-by: Alexandre Ghiti <alexandre.ghiti@canonical.com>
Fixes: e178d670f2 ("riscv/kasan: add KASAN_VMALLOC support")
Fixes: 8ad8b72721 ("riscv: Add KASAN support")
Cc: stable@vger.kernel.org
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e20f80b9b1 upstream.
Commit a264cf5e81 ("scsi: ibmvfc: Fix command state accounting and stale
response detection") introduced a regression in detecting duplicate
responses. This was observed in test where a command was sent to the VIOS
and completed before ibmvfc_send_event() set the active flag to 1, which
resulted in the atomic_dec_if_positive() call in ibmvfc_handle_crq()
thinking this was a duplicate response, which resulted in scsi_done() not
getting called, so we then hit a SCSI command timeout for this command once
the timeout expires. This simply ensures the active flag gets set prior to
making the hcall to send the command to the VIOS, in order to close this
window.
Link: https://lore.kernel.org/r/20211019152129.16558-1-brking@linux.vnet.ibm.com
Fixes: a264cf5e81 ("scsi: ibmvfc: Fix command state accounting and stale response detection")
Cc: stable@vger.kernel.org
Acked-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 27730c8cd6 ]
-F weight in perf script is broken.
# ./perf mem record
# ./perf script -F weight
Samples for 'dummy:HG' event do not have WEIGHT attribute set. Cannot
print 'weight' field.
The sample type, PERF_SAMPLE_WEIGHT_STRUCT, is an alternative of the
PERF_SAMPLE_WEIGHT sample type. They share the same space, weight. The
lower 32 bits are exactly the same for both sample type. The higher 32
bits may be different for different architecture. For a new kernel on
x86, the PERF_SAMPLE_WEIGHT_STRUCT is used. For an old kernel or other
ARCHs, the PERF_SAMPLE_WEIGHT is used.
With -F weight, current perf script will only check the input string
"weight" with the PERF_SAMPLE_WEIGHT sample type. Because the commit
ea8d0ed6ea ("perf tools: Support PERF_SAMPLE_WEIGHT_STRUCT") didn't
update the PERF_SAMPLE_WEIGHT_STRUCT sample type for perf script. For a
new kernel on x86, the check fails.
Use PERF_SAMPLE_WEIGHT_TYPE, which supports both sample types, to
replace PERF_SAMPLE_WEIGHT
Fixes: ea8d0ed6ea ("perf tools: Support PERF_SAMPLE_WEIGHT_STRUCT")
Reported-by: Joe Mario <jmario@redhat.com>
Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Tested-by: Jiri Olsa <jolsa@redhat.com>
Tested-by: Joe Mario <jmario@redhat.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Acked-by: Joe Mario <jmario@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Link: https://lore.kernel.org/r/1632929894-102778-1-git-send-email-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 9b57e9d501 ]
The idea behind kicked mask is that we should not re-kick a vcpu that
is already in the "kick" process, i.e. that was kicked and is
is about to be dispatched if certain conditions are met.
The problem with the current implementation is, that it assumes the
kicked vcpu is going to enter SIE shortly. But under certain
circumstances, the vcpu we just kicked will be deemed non-runnable and
will remain in wait state. This can happen, if the interrupt(s) this
vcpu got kicked to deal with got already cleared (because the interrupts
got delivered to another vcpu). In this case kvm_arch_vcpu_runnable()
would return false, and the vcpu would remain in kvm_vcpu_block(),
but this time with its kicked_mask bit set. So next time around we
wouldn't kick the vcpu form __airqs_kick_single_vcpu(), but would assume
that we just kicked it.
Let us make sure the kicked_mask is cleared before we give up on
re-dispatching the vcpu.
Fixes: 9f30f62163 ("KVM: s390: add gib_alert_irq_handler()")
Reported-by: Matthew Rosato <mjrosato@linux.ibm.com>
Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Michael Mueller <mimu@linux.ibm.com>
Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Link: https://lore.kernel.org/r/20211019175401.3757927-2-pasic@linux.ibm.com
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit cc45b96e2d ]
While displaying ingress policers information in
debugfs check whether ingress policers exist in
the hardware or not because some platforms(CN9XXX)
do not have this feature.
Fixes: e7d8971763 ("octeontx2-af: cn10k: Debugfs support for bandwidth")
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: Rakesh Babu <rsaladi2@marvell.com>
Signed-off-by: Sunil Kovvuri Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit c7a6e3978e ]
The specified buffer length for three debugfs files fd_tcam, uc and tqp
is not enough for their maximum needs, so this patch fixes them.
Fixes: b5a0b70d77 ("net: hns3: refactor dump fd tcam of debugfs")
Fixes: 1556ea9120 ("net: hns3: refactor dump mac list of debugfs")
Fixes: d96b0e5946 ("net: hns3: refactor dump reg of debugfs")
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 6754614a78 ]
As the width of packets number registers is 32 bits, they needs at most
10 characters for decimal data printing, but now the string spaces is not
enough, so this patch fixes it.
Fixes: e44c495d95 ("net: hns3: refactor queue info of debugfs")
Signed-off-by: Jie Wang <wangjie125@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 99d0a3831e ]
bpf_types.h has BPF_MAP_TYPE_INODE_STORAGE and BPF_MAP_TYPE_TASK_STORAGE
declared inside #ifdef CONFIG_NET although they are built regardless of
CONFIG_NET. So, when CONFIG_BPF_SYSCALL && !CONFIG_NET, they are built
without the declarations leading to spurious build failures and not
registered to bpf_map_types making them unavailable.
Fix it by moving the BPF_MAP_TYPE for the two map types outside of
CONFIG_NET.
Reported-by: kernel test robot <lkp@intel.com>
Fixes: a10787e6d5 ("bpf: Enable task local storage for tracing programs")
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/YXG1cuuSJDqHQfRY@slm.duckdns.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f31afb502c ]
SBSA says of the generic watchdog:
All registers are 32 bits in size and should be accessed using 32-bit
reads and writes. If an access size other than 32 bits is used then
the results are IMPLEMENTATION DEFINED.
and for qemu, the implementation will only allow 32-bit accesses
resulting in a synchronous external abort when configuring the watchdog.
Use lo_hi_* accessors rather than a readq/writeq.
Fixes: abd3ac7902 ("watchdog: sbsa: Support architecture version 1")
Signed-off-by: Jamie Iles <quic_jiles@quicinc.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Reviewed-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
Link: https://lore.kernel.org/r/20210903112101.493552-1-quic_jiles@quicinc.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 9d02831e51 ]
sctp_sf_ootb() is called when processing DATA chunk in closed state,
and many other places are also using it.
The vtag in the chunk's sctphdr should be verified, otherwise, as
later in chunk length check, it may send abort with the existent
asoc's vtag, which can be exploited by one to cook a malicious
chunk to terminate a SCTP asoc.
When fails to verify the vtag from the chunk, this patch sets asoc
to NULL, so that the abort will be made with the vtag from the
received chunk later.
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit ef16b1734f ]
sctp_sf_do_8_5_1_E_sa() is called when processing SHUTDOWN_ACK chunk
in cookie_wait and cookie_echoed state.
The vtag in the chunk's sctphdr should be verified, otherwise, as
later in chunk length check, it may send abort with the existent
asoc's vtag, which can be exploited by one to cook a malicious
chunk to terminate a SCTP asoc.
Note that when fails to verify the vtag from SHUTDOWN-ACK chunk,
SHUTDOWN COMPLETE message will still be sent back to peer, but
with the vtag from SHUTDOWN-ACK chunk, as said in 5) of
rfc4960#section-8.4.
While at it, also remove the unnecessary chunk length check from
sctp_sf_shut_8_4_5(), as it's already done in both places where
it calls sctp_sf_shut_8_4_5().
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit aa0f697e45 ]
sctp_sf_violation() is called when processing HEARTBEAT_ACK chunk
in cookie_wait state, and some other places are also using it.
The vtag in the chunk's sctphdr should be verified, otherwise, as
later in chunk length check, it may send abort with the existent
asoc's vtag, which can be exploited by one to cook a malicious
chunk to terminate a SCTP asoc.
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit a64b341b86 ]
1. In closed state: in sctp_sf_do_5_1D_ce():
When asoc is NULL, making packet for abort will use chunk's vtag
in sctp_ootb_pkt_new(). But when asoc exists, vtag from the chunk
should be verified before using peer.i.init_tag to make packet
for abort in sctp_ootb_pkt_new(), and just discard it if vtag is
not correct.
2. In the other states: in sctp_sf_do_5_2_4_dupcook():
asoc always exists, but duplicate cookie_echo's vtag will be
handled by sctp_tietags_compare() and then take actions, so before
that we only verify the vtag for the abort sent for invalid chunk
length.
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 438b95a7c9 ]
Currently INIT_ACK chunk in non-cookie_echoed state is processed in
sctp_sf_discard_chunk() to send an abort with the existent asoc's
vtag if the chunk length is not valid. But the vtag in the chunk's
sctphdr is not verified, which may be exploited by one to cook a
malicious chunk to terminal a SCTP asoc.
sctp_sf_discard_chunk() also is called in many other places to send
an abort, and most of those have this problem. This patch is to fix
it by sending abort with the existent asoc's vtag only if the vtag
from the chunk's sctphdr is verified in sctp_sf_discard_chunk().
Note on sctp_sf_do_9_1_abort() and sctp_sf_shutdown_pending_abort(),
the chunk length has been verified before sctp_sf_discard_chunk(),
so replace it with sctp_sf_discard(). On sctp_sf_do_asconf_ack() and
sctp_sf_do_asconf(), move the sctp_chunk_length_valid check ahead of
sctp_sf_discard_chunk(), then replace it with sctp_sf_discard().
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit eae5783908 ]
This patch fixes the problems below:
1. In non-shutdown_ack_sent states: in sctp_sf_do_5_1B_init() and
sctp_sf_do_5_2_2_dupinit():
chunk length check should be done before any checks that may cause
to send abort, as making packet for abort will access the init_tag
from init_hdr in sctp_ootb_pkt_new().
2. In shutdown_ack_sent state: in sctp_sf_do_9_2_reshutack():
The same checks as does in sctp_sf_do_5_2_2_dupinit() is needed
for sctp_sf_do_9_2_reshutack().
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 4f7019c7eb ]
Currently Linux SCTP uses the verification tag of the existing SCTP
asoc when failing to process and sending the packet with the ABORT
chunk. This will result in the peer accepting the ABORT chunk and
removing the SCTP asoc. One could exploit this to terminate a SCTP
asoc.
This patch is to fix it by always using the initiate tag of the
received INIT chunk for the ABORT chunk to be sent.
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 2dace185ca ]
When irdma_ws_add fails, irdma_ws_remove is used to cleanup the leaf node.
This lead to holding the qos mutex twice in the QP resume path. Fix this
by avoiding the call to irdma_ws_remove and unwinding the error in
irdma_ws_add. This skips the call to irdma_tc_in_use function which is not
needed in the error unwind cases.
Fixes: 3ae331c751 ("RDMA/irdma: Add QoS definitions")
Link: https://lore.kernel.org/r/20211019151654.1943-2-shiraz.saleem@intel.com
Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit e93c7d8e8c ]
The valid bit for extended CQE's written by HW is retrieved from the
incorrect quad-word. This leads to missed completions for any UD traffic
particularly after a wrap-around.
Get the valid bit for extended CQE's from the correct quad-word in the
descriptor.
Fixes: 551c46edc7 ("RDMA/irdma: Add user/kernel shared libraries")
Link: https://lore.kernel.org/r/20211005182302.374-1-shiraz.saleem@intel.com
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit af1a02aa23 upstream.
There is a race condition where the PHY state machine can change
members of the phydev structure at the same time userspace requests a
change via ethtool. To prevent this, have phy_ethtool_ksettings_set
take the PHY lock.
Fixes: 2d55173e71 ("phy: add generic function to support ksetting support")
Reported-by: Walter Stoll <Walter.Stoll@duagon.com>
Suggested-by: Walter Stoll <Walter.Stoll@duagon.com>
Tested-by: Walter Stoll <Walter.Stoll@duagon.com>
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 707293a56f upstream.
Split phy_start_aneg into a wrapper which takes the PHY lock, and a
helper doing the real work. This will be needed when
phy_ethtook_ksettings_set takes the lock.
Fixes: 2d55173e71 ("phy: add generic function to support ksetting support")
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 64cd92d5e8 upstream.
This allows it to make use of a helper which assume the PHY is already
locked.
Fixes: 2d55173e71 ("phy: add generic function to support ksetting support")
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c10a485c3d upstream.
The PHY structure should be locked while copying information out if
it, otherwise there is no guarantee of self consistency. Without the
lock the PHY state machine could be updating the structure.
Fixes: 2d55173e71 ("phy: add generic function to support ksetting support")
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1d9d6fd21a upstream.
sk->sk_err contains a positive number, yet async_wait.err wants the
opposite. Fix the missed sign flip, which Jakub caught by inspection.
Fixes: a42055e8d2 ("net/tls: Add support for async encryption of records for performance")
Suggested-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e8684db191 upstream.
The driver allocates skb during ndo_open with GFP_ATOMIC which has high chance of failure when there are multiple instances.
GFP_KERNEL is enough while open and use GFP_ATOMIC only from interrupt context.
Fixes: 23f0703c12 ("lan743x: Add main source files for new lan743x driver")
Signed-off-by: Yuiko Oshino <yuiko.oshino@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2a21dab594 upstream.
The member data in struct hclge_desc is type of __le32, it needs endian
conversion before using it, and some functions of debugfs didn't do that,
so this patch fixes it.
Fixes: c0ebebb9cc ("net: hns3: Add "dcb register" status information query function")
Signed-off-by: Jie Wang <wangjie125@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3bda2e5df4 upstream.
If a TP port is configured by follow steps:
1.ethtool -s ethx autoneg off speed 100 duplex full
2.ethtool -A ethx rx on tx on
3.ethtool -s ethx autoneg on(rx&tx negotiated pause results are off)
4.ethtool -s ethx autoneg off speed 100 duplex full
In step 3, driver will set rx&tx pause parameters of hardware to off as
pause parameters negotiated with link partner are off.
After step 4, the "ethtool -a ethx" command shows both rx and tx pause
parameters are on. However, pause parameters of hardware are still off
and port has no flow control function actually.
To fix this problem, if autoneg is disabled, driver uses its saved
parameters to restore pause of hardware. If the speed is not changed in
this case, there is no link state changed for phy, it will cause the pause
parameter is not taken effect, so we need to force phy to go down and up.
Fixes: aacbe27e82 ("net: hns3: modify how pause options is displayed")
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ace19b9924 upstream.
A hard hang is observed whenever the ethernet interface is brought
down. If the PHY is stopped before the LPC core block is reset,
the SoC will hang. Comparing lpc_eth_close() and lpc_eth_open() I
re-arranged the ordering of the functions calls in lpc_eth_close() to
reset the hardware before stopping the PHY.
Fixes: b7370112f5 ("lpc32xx: Added ethernet driver")
Signed-off-by: Trevor Woerner <twoerner@gmail.com>
Acked-by: Vladimir Zapolskiy <vz@mleia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 759635760a upstream.
When the driver fails to allocate a new Rx buffer, it passes an empty Rx
descriptor (contains zero address and size) to the device and marks it
as invalid by setting the skb pointer in the descriptor's metadata to
NULL.
After processing enough Rx descriptors, the driver will try to process
the invalid descriptor, but will return immediately seeing that the skb
pointer is NULL. Since the driver no longer passes new Rx descriptors to
the device, the Rx queue will eventually become full and the device will
start to drop packets.
Fix this by recycling the received packet if allocation of the new
packet failed. This means that allocation is no longer performed at the
end of the Rx routine, but at the start, before tearing down the DMA
mapping of the received packet.
Remove the comment about the descriptor being zeroed as it is no longer
correct. This is OK because we either use the descriptor as-is (when
recycling) or overwrite its address and size fields with that of the
newly allocated Rx buffer.
The issue was discovered when a process ("perf") consumed too much
memory and put the system under memory pressure. It can be reproduced by
injecting slab allocation failures [1]. After the fix, the Rx queue no
longer comes to a halt.
[1]
# echo 10 > /sys/kernel/debug/failslab/times
# echo 1000 > /sys/kernel/debug/failslab/interval
# echo 100 > /sys/kernel/debug/failslab/probability
FAULT_INJECTION: forcing a failure.
name failslab, interval 1000, probability 100, space 0, times 8
[...]
Call Trace:
<IRQ>
dump_stack_lvl+0x34/0x44
should_fail.cold+0x32/0x37
should_failslab+0x5/0x10
kmem_cache_alloc_node+0x23/0x190
__alloc_skb+0x1f9/0x280
__netdev_alloc_skb+0x3a/0x150
mlxsw_pci_rdq_skb_alloc+0x24/0x90
mlxsw_pci_cq_tasklet+0x3dc/0x1200
tasklet_action_common.constprop.0+0x9f/0x100
__do_softirq+0xb5/0x252
irq_exit_rcu+0x7a/0xa0
common_interrupt+0x83/0xa0
</IRQ>
asm_common_interrupt+0x1e/0x40
RIP: 0010:cpuidle_enter_state+0xc8/0x340
[...]
mlxsw_spectrum2 0000:06:00.0: Failed to alloc skb for RDQ
Fixes: eda6500a98 ("mlxsw: Add PCI bus implementation")
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Link: https://lore.kernel.org/r/20211024064014.1060919-1-idosch@idosch.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4a089e95b4 upstream.
nios2:allmodconfig builds fail with
make[1]: *** No rule to make target 'arch/nios2/boot/dts/""',
needed by 'arch/nios2/boot/dts/built-in.a'. Stop.
make: [Makefile:1868: arch/nios2/boot/dts] Error 2 (ignored)
This is seen with compile tests since those enable NIOS2_DTB_SOURCE_BOOL,
which in turn enables NIOS2_DTB_SOURCE. This causes the build error
because the default value for NIOS2_DTB_SOURCE is an empty string.
Disable NIOS2_DTB_SOURCE_BOOL for compile tests to avoid the error.
Fixes: 2fc8483fdc ("nios2: Build infrastructure")
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Dinh Nguyen <dinguyen@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0c57eeecc5 upstream.
Drivers call netdev_set_num_tc() and then netdev_set_tc_queue()
to set the queue count and offset for each TC. So the queue count
and offset for the TCs may be zero for a short period after dev->num_tc
has been set. If a TX packet is being transmitted at this time in the
code path netdev_pick_tx() -> skb_tx_hash(), skb_tx_hash() may see
nonzero dev->num_tc but zero qcount for the TC. The while loop that
keeps looping while hash >= qcount will not end.
Fix it by checking the TC's qcount to be nonzero before using it.
Fixes: eadec877ce ("net: Add support for subordinate traffic classes to netdev_pick_tx")
Reviewed-by: Andy Gospodarek <gospo@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f7a1e76d0f upstream.
Currently in net_ns_get_ownership() it may not be able to set uid or gid
if make_kuid or make_kgid returns an invalid value, and an uninit-value
issue can be triggered by this.
This patch is to fix it by initializing the uid and gid before calling
net_ns_get_ownership(), as it does in kobject_get_ownership()
Fixes: e6dee9f389 ("net-sysfs: add netdev_change_owner()")
Reported-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6f68cd6348 upstream.
Syzbot reported ODEBUG warning in batadv_nc_mesh_free(). The problem was
in wrong error handling in batadv_mesh_init().
Before this patch batadv_mesh_init() was calling batadv_mesh_free() in case
of any batadv_*_init() calls failure. This approach may work well, when
there is some kind of indicator, which can tell which parts of batadv are
initialized; but there isn't any.
All written above lead to cleaning up uninitialized fields. Even if we hide
ODEBUG warning by initializing bat_priv->nc.work, syzbot was able to hit
GPF in batadv_nc_purge_paths(), because hash pointer in still NULL. [1]
To fix these bugs we can unwind batadv_*_init() calls one by one.
It is good approach for 2 reasons: 1) It fixes bugs on error handling
path 2) It improves the performance, since we won't call unneeded
batadv_*_free() functions.
So, this patch makes all batadv_*_init() clean up all allocated memory
before returning with an error to no call correspoing batadv_*_free()
and open-codes batadv_mesh_free() with proper order to avoid touching
uninitialized fields.
Link: https://lore.kernel.org/netdev/000000000000c87fbd05cef6bcb0@google.com/ [1]
Reported-and-tested-by: syzbot+28b0702ada0bf7381f58@syzkaller.appspotmail.com
Fixes: c6c8fea297 ("net: Add batman-adv meshing protocol")
Signed-off-by: Pavel Skripkin <paskripkin@gmail.com>
Acked-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 55e6d80378 upstream.
In regcache_rbtree_insert_to_block(), when 'present' realloc failed,
the 'blk' which is supposed to assign to 'rbnode->block' will be freed,
so 'rbnode->block' points a freed memory, in the error handling path of
regcache_rbtree_init(), 'rbnode->block' will be freed again in
regcache_rbtree_exit(), KASAN will report double-free as follows:
BUG: KASAN: double-free or invalid-free in kfree+0xce/0x390
Call Trace:
slab_free_freelist_hook+0x10d/0x240
kfree+0xce/0x390
regcache_rbtree_exit+0x15d/0x1a0
regcache_rbtree_init+0x224/0x2c0
regcache_init+0x88d/0x1310
__regmap_init+0x3151/0x4a80
__devm_regmap_init+0x7d/0x100
madera_spi_probe+0x10f/0x333 [madera_spi]
spi_probe+0x183/0x210
really_probe+0x285/0xc30
To fix this, moving up the assignment of rbnode->block to immediately after
the reallocation has succeeded so that the data structure stays valid even
if the second reallocation fails.
Reported-by: Hulk Robot <hulkci@huawei.com>
Fixes: 3f4ff561bc ("regmap: rbtree: Make cache_present bitmap per node")
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Link: https://lore.kernel.org/r/20211012023735.1632786-1-yangyingliang@huawei.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit fd1b5beb17 upstream.
PTP is currently only supported on E810 devices, it is checked
in ice_ptp_init(). However, there is no check in ice_ptp_release().
For other E800 series devices, ice_ptp_release() will be wrongly executed.
Fix the following calltrace.
INFO: trying to register non-static key.
The code is fine but needs lockdep annotation, or maybe
you didn't initialize this object before use?
turning off the locking correctness validator.
Workqueue: ice ice_service_task [ice]
Call Trace:
dump_stack_lvl+0x5b/0x82
dump_stack+0x10/0x12
register_lock_class+0x495/0x4a0
? find_held_lock+0x3c/0xb0
__lock_acquire+0x71/0x1830
lock_acquire+0x1e6/0x330
? ice_ptp_release+0x3c/0x1e0 [ice]
? _raw_spin_lock+0x19/0x70
? ice_ptp_release+0x3c/0x1e0 [ice]
_raw_spin_lock+0x38/0x70
? ice_ptp_release+0x3c/0x1e0 [ice]
ice_ptp_release+0x3c/0x1e0 [ice]
ice_prepare_for_reset+0xcb/0xe0 [ice]
ice_do_reset+0x38/0x110 [ice]
ice_service_task+0x138/0xf10 [ice]
? __this_cpu_preempt_check+0x13/0x20
process_one_work+0x26a/0x650
worker_thread+0x3f/0x3b0
? __kthread_parkme+0x51/0xb0
? process_one_work+0x650/0x650
kthread+0x161/0x190
? set_kthread_struct+0x40/0x40
ret_from_fork+0x1f/0x30
Fixes: 4dd0d5c33c ("ice: add lock around Tx timestamp tracker flush")
Signed-off-by: Yongxin Liu <yongxin.liu@windriver.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Gurucharan G <gurucharanx.g@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6a8b357278 upstream.
When the PF is a member of a link aggregate, and the driver
is removed, the process will hang unless we respond to the
NETDEV_UNREGISTER event that is sent to the event_handler
for LAG.
Add a case statement for the ice_lag_event_handler to unlink
the PF from the link aggregate.
Also remove code that was incorrectly applying a dev_hold to
peer_netdevs that were associated with the ice driver.
Fixes: df006dd4b1 ("ice: Add initial support framework for LAG")
Signed-off-by: Dave Ertman <david.m.ertman@intel.com>
Tested-by: Tony Brelinski <tony.brelinski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c2d4c543f7 upstream.
This patch fixes possible null pointer dereference in files
"rvu_debugfs.c" and "rvu_nix.c"
Fixes: 8756828a81 ("octeontx2-af: Add NPA aura and pool contexts to debugfs")
Fixes: 9a946def26 ("octeontx2-af: Modify nix_vtag_cfg mailbox to support TX VTAG entries")
Signed-off-by: Rakesh Babu Saladi <rsaladi2@marvell.com>
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e77bcdd1f6 upstream.
Currently, we are using a fixed buffer size of length 2048 to display
rsrc_alloc output. As a result a maximum of 2048 characters of
rsrc_alloc output is displayed, which may lead sometimes to display only
partial output. This patch fixes this dependency on max limit of buffer
size and displays all PF VF entries.
Each column of the debugfs entry "rsrc_alloc" uses a fixed width of 12
characters to print the list of LFs of each block for a PF/VF. If the
length of list of LFs of a block exceeds this fixed width then the list
gets truncated and displays only a part of the list. This patch fixes
this by using the maximum possible length of list of LFs among all
blocks of all PFs and VFs entries as the width size.
Fixes: f788409714 ("octeontx2-af: Formatting debugfs entry rsrc_alloc.")
Fixes: 23205e6d06 ("octeontx2-af: Dump current resource provisioning status")
Signed-off-by: Rakesh Babu <rsaladi2@marvell.com>
Signed-off-by: Nithin Dabilpuram <ndabilpuram@marvell.com>
Signed-off-by: Sunil Kovvuri Goutham <Sunil.Goutham@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ce7723e9cd upstream.
With commit db5ad6b7f8 ("nvme-tcp: try to send request in queue_rq
context") r2t and response PDU can get processed while send function
is executing.
Current data digest send code uses req->offset after kernel_sendmsg(),
this creates a race condition where req->offset gets reset before it
is used in send function.
This can happen in two cases -
1. Target sends r2t PDU which resets req->offset.
2. Target send response PDU which completes the req and then req is
used for a new command, nvme_tcp_setup_cmd_pdu() resets req->offset.
Fix this by storing req->offset in a local variable and using
this local variable after kernel_sendmsg().
Fixes: db5ad6b7f8 ("nvme-tcp: try to send request in queue_rq context")
Signed-off-by: Varun Prakash <varun@chelsio.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 54713c85f5 upstream.
Lorenzo noticed that the code testing for program type compatibility of
tail call maps is potentially racy in that two threads could encounter a
map with an unset type simultaneously and both return true even though they
are inserting incompatible programs.
The race window is quite small, but artificially enlarging it by adding a
usleep_range() inside the check in bpf_prog_array_compatible() makes it
trivial to trigger from userspace with a program that does, essentially:
map_fd = bpf_create_map(BPF_MAP_TYPE_PROG_ARRAY, 4, 4, 2, 0);
pid = fork();
if (pid) {
key = 0;
value = xdp_fd;
} else {
key = 1;
value = tc_fd;
}
err = bpf_map_update_elem(map_fd, &key, &value, 0);
While the race window is small, it has potentially serious ramifications in
that triggering it would allow a BPF program to tail call to a program of a
different type. So let's get rid of it by protecting the update with a
spinlock. The commit in the Fixes tag is the last commit that touches the
code in question.
v2:
- Use a spinlock instead of an atomic variable and cmpxchg() (Alexei)
v3:
- Put lock and the members it protects into an embedded 'owner' struct (Daniel)
Fixes: 3324b584b6 ("ebpf: misc core cleanup")
Reported-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211026110019.363464-1-toke@redhat.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit cd9733f5d7 upstream.
With two Msgs, msgA and msgB and a user doing nonblocking sendmsg calls (or
multiple cores) on a single socket 'sk' we could get the following flow.
msgA, sk msgB, sk
----------- ---------------
tcp_bpf_sendmsg()
lock(sk)
psock = sk->psock
tcp_bpf_sendmsg()
lock(sk) ... blocking
tcp_bpf_send_verdict
if (psock->eval == NONE)
psock->eval = sk_psock_msg_verdict
..
< handle SK_REDIRECT case >
release_sock(sk) < lock dropped so grab here >
ret = tcp_bpf_sendmsg_redir
psock = sk->psock
tcp_bpf_send_verdict
lock_sock(sk) ... blocking on B
if (psock->eval == NONE) <- boom.
psock->eval will have msgA state
The problem here is we dropped the lock on msgA and grabbed it with msgB.
Now we have old state in psock and importantly psock->eval has not been
cleared. So msgB will run whatever action was done on A and the verdict
program may never see it.
Fixes: 604326b41a ("bpf, sockmap: convert to generic sk_msg interface")
Signed-off-by: Liu Jian <liujian56@huawei.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20211012052019.184398-1-liujian56@huawei.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 04f8ef5643 upstream.
When enabling CONFIG_CGROUP_BPF, kmemleak can be observed by running
the command as below:
$mount -t cgroup -o none,name=foo cgroup cgroup/
$umount cgroup/
unreferenced object 0xc3585c40 (size 64):
comm "mount", pid 425, jiffies 4294959825 (age 31.990s)
hex dump (first 32 bytes):
01 00 00 80 84 8c 28 c0 00 00 00 00 00 00 00 00 ......(.........
00 00 00 00 00 00 00 00 6c 43 a0 c3 00 00 00 00 ........lC......
backtrace:
[<e95a2f9e>] cgroup_bpf_inherit+0x44/0x24c
[<1f03679c>] cgroup_setup_root+0x174/0x37c
[<ed4b0ac5>] cgroup1_get_tree+0x2c0/0x4a0
[<f85b12fd>] vfs_get_tree+0x24/0x108
[<f55aec5c>] path_mount+0x384/0x988
[<e2d5e9cd>] do_mount+0x64/0x9c
[<208c9cfe>] sys_mount+0xfc/0x1f4
[<06dd06e0>] ret_fast_syscall+0x0/0x48
[<a8308cb3>] 0xbeb4daa8
This is because that since the commit 2b0d3d3e4f ("percpu_ref: reduce
memory footprint of percpu_ref in fast path") root_cgrp->bpf.refcnt.data
is allocated by the function percpu_ref_init in cgroup_bpf_inherit which
is called by cgroup_setup_root when mounting, but not freed along with
root_cgrp when umounting. Adding cgroup_bpf_offline which calls
percpu_ref_kill to cgroup_kill_sb can free root_cgrp->bpf.refcnt.data in
umount path.
This patch also fixes the commit 4bfc0bb2c6 ("bpf: decouple the lifetime
of cgroup_bpf from cgroup itself"). A cgroup_bpf_offline is needed to do a
cleanup that frees the resources which are allocated by cgroup_bpf_inherit
in cgroup_setup_root.
And inside cgroup_bpf_offline, cgroup_get() is at the beginning and
cgroup_put is at the end of cgroup_bpf_release which is called by
cgroup_bpf_offline. So cgroup_bpf_offline can keep the balance of
cgroup's refcount.
Fixes: 2b0d3d3e4f ("percpu_ref: reduce memory footprint of percpu_ref in fast path")
Fixes: 4bfc0bb2c6 ("bpf: decouple the lifetime of cgroup_bpf from cgroup itself")
Signed-off-by: Quanyang Wang <quanyang.wang@windriver.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Roman Gushchin <guro@fb.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20211018075623.26884-1-quanyang.wang@windriver.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ad76744b04 upstream.
[Why]
A deadlock in the kernel occurs when we fallback from the V3 to V2
add_topology_to_display or remove_topology_to_display because they
both try to acquire the dtm_mutex but recursive locking isn't
supported on mutex_lock().
[How]
Make the mutex_lock/unlock more fine grained and move them up such that
they're only required for the psp invocation itself.
Fixes: bf62221e9d ("drm/amd/display: Add DCN3.1 HDCP support")
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Reviewed-by: Aric Cyr <aric.cyr@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 54149d13f3 upstream.
[WHY]
On certain configs, SMU clock table voltages don't match which cause parser
to behave incorrectly by leaving dcfclk and socclk table entries unpopulated.
[HOW]
Currently the function that finds the corresponding clock for a given voltage
only checks for exact voltage level matches. In the case that no match gets
found, parser now falls back to searching for the max clock which meets the
requested voltage (i.e. its corresponding voltage is below requested).
Signed-off-by: Michael Strauss <michael.strauss@amd.com>
Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3f4e54bd31 upstream.
CVE-2021-42327 was fixed by:
commit f23750b5b3
Author: Thelford Williams <tdwilliamsiv@gmail.com>
Date: Wed Oct 13 16:04:13 2021 -0400
drm/amdgpu: fix out of bounds write
but amdgpu_dm_debugfs.c contains more of the same issue so fix the
remaining ones.
v2:
* Add missing fix in dp_max_bpc_write (Harry Wentland)
Fixes: 918698d5c2 ("drm/amd/display: Return the number of bytes parsed than allocated")
Signed-off-by: Patrik Jakobsson <pjakobsson@suse.de>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6e6f966308 upstream.
Reading out the DP encoders' DPCD during booting or resume is only
required for enabled encoders: such encoders may be modesetted during
the initial commit and the link training this involves depends on an
initialized DPCD. For DDI encoders reading out the DPCD is skipped, do
the same on pre-DDI platforms.
Atm, the first DPCD readout without a sink connected - which is a likely
scneario if the encoder is disabled - leaves intel_dp->num_common_rates
at 0, which resulted in
intel_dp_sync_state()->intel_dp_max_common_rate()
in a
intel_dp->common_rates[-1]
access. This by definition results in an undefined behaviour, though to
my best knowledge in all HW/compiler configurations it actually results
in accessing the array item type value preceding the array. In this
case the preceding value happens to be intel_dp->num_common_rates,
which is 0, so this issue - by luck - didn't cause a user visible
problem.
Nevertheless it's still an undefined behaviour and in CONFIG_UBSAN
builds leads to a kernel BUG() (which revealed this problem for us),
hence CC:stable.
A related problem in case the encoder is enabled but the sink is not
connected or the DPCD readout fails is fixed by the next patch.
v2: Amend the commit message describing the root cause of the
CONFIG_UBSAN BUG().
Fixes: a532cde31d ("drm/i915/tc: Fix TypeC port init/resume time sanitization")
References: https://gitlab.freedesktop.org/drm/intel/-/issues/4297
Reported-and-tested-by: Mat Jonczyk <mat.jonczyk@o2.pl>
Cc: Mat Jonczyk <mat.jonczyk@o2.pl>
Cc: José Roberto de Souza <jose.souza@intel.com>
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Acked-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211018094154.1407705-2-imre.deak@intel.com
(cherry picked from commit 4ec5ffc341)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 82a4f329b1 upstream.
It looks like the voltages for the SOC and DRAM supply weren't properly
validated before. The datasheet and uboot-imx code tells us that VDD_SOC
should be 800 mV in suspend and 850 mV in run mode. VDD_DRAM should be
950 mV for DDR clock frequencies of up to 1.5 GHz.
Let's fix these values to make sure the voltages are within the required
range.
Fixes: 8668d8b2e6 ("arm64: dts: Add the Kontron i.MX8M Mini SoMs and baseboards")
Cc: stable@vger.kernel.org
Signed-off-by: Frieder Schrempf <frieder.schrempf@kontron.de>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0b28c41e3c upstream.
Previously we falsely relied on the PHY driver to unconditionally
enable the internal RX delay. Since the following fix for the PHY
driver this is not the case anymore:
commit 7b005a1742 ("net: phy: mscc: configure both RX and TX internal
delays for RGMII")
In order to enable the delay we need to set the connection type to
"rgmii-rxid". Without the RX delay the ethernet is not functional at
all.
Fixes: 8668d8b2e6 ("arm64: dts: Add the Kontron i.MX8M Mini SoMs and baseboards")
Cc: stable@vger.kernel.org
Signed-off-by: Frieder Schrempf <frieder.schrempf@kontron.de>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ca6f9d85d5 upstream.
The MCP2515 can be used with an SPI clock of up to 10 MHz. Set the
limit accordingly to prevent any performance issues caused by the
really low clock speed of 100 kHz.
This removes the arbitrarily low limit on the SPI frequency, that was
caused by a typo in the original dts.
Without this change, receiving CAN messages on the board beyond a
certain bitrate will cause overrun errors (see 'ip -det -stat link show
can0').
With this fix, receiving messages on the bus works without any overrun
errors for bitrates up to 1 MBit.
Fixes: 8668d8b2e6 ("arm64: dts: Add the Kontron i.MX8M Mini SoMs and baseboards")
Cc: stable@vger.kernel.org
Signed-off-by: Frieder Schrempf <frieder.schrempf@kontron.de>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6562d6e350 upstream.
The regulator reg_rst_eth2 should keep the reset signal of the USB ethernet
adapter deasserted anytime. Fix the polarity and mark it as always-on.
Anyway, using the regulator is only a workaround for the missing support of
specifying a reset GPIO for USB devices in a generic way. As we don't
have a solution for this at the moment, at least fix the current
workaround.
Fixes: 8668d8b2e6 ("arm64: dts: Add the Kontron i.MX8M Mini SoMs and baseboards")
Cc: stable@vger.kernel.org
Signed-off-by: Frieder Schrempf <frieder.schrempf@kontron.de>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit eac96c3efd upstream.
When handling shmem page fault the THP with corrupted subpage could be
PMD mapped if certain conditions are satisfied. But kernel is supposed
to send SIGBUS when trying to map hwpoisoned page.
There are two paths which may do PMD map: fault around and regular
fault.
Before commit f9ce0be71d ("mm: Cleanup faultaround and finish_fault()
codepaths") the thing was even worse in fault around path. The THP
could be PMD mapped as long as the VMA fits regardless what subpage is
accessed and corrupted. After this commit as long as head page is not
corrupted the THP could be PMD mapped.
In the regular fault path the THP could be PMD mapped as long as the
corrupted page is not accessed and the VMA fits.
This loophole could be fixed by iterating every subpage to check if any
of them is hwpoisoned or not, but it is somewhat costly in page fault
path.
So introduce a new page flag called HasHWPoisoned on the first tail
page. It indicates the THP has hwpoisoned subpage(s). It is set if any
subpage of THP is found hwpoisoned by memory failure and after the
refcount is bumped successfully, then cleared when the THP is freed or
split.
The soft offline path doesn't need this since soft offline handler just
marks a subpage hwpoisoned when the subpage is migrated successfully.
But shmem THP didn't get split then migrated at all.
Link: https://lkml.kernel.org/r/20211020210755.23964-3-shy828301@gmail.com
Fixes: 800d8c63b2 ("shmem: add huge pages support")
Signed-off-by: Yang Shi <shy828301@gmail.com>
Reviewed-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
Suggested-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Peter Xu <peterx@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 672437486e upstream.
[Why]
Immediate flip can be enabled dynamically and has higher BW requirements
when validating which voltage mode to use.
If we validate when it's not set then potentially DCFCLK will be too low
and we will underflow.
[How]
DM always requires support so always require it as part of DML input
parameters.
This can't be enabled unconditionally on older ASIC because it blocks
some expected modes so only target DCN3.1 for now.
Reviewed-by: Dmytro Laktyushkin <Dmytro.Laktyushkin@amd.com>
Acked-by: Agustin Gutierrez Sanchez <agustin.gutierrez@amd.com>
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit db6c3c064f upstream.
Add the missing endpoint max-packet sanity check to probe() to avoid
division by zero in lan78xx_tx_bh() in case a malicious device has
broken descriptors (or when doing descriptor fuzz testing).
Note that USB core will reject URBs submitted for endpoints with zero
wMaxPacketSize but that drivers doing packet-size calculations still
need to handle this (cf. commit 2548288b4f ("USB: Fix: Don't skip
endpoint descriptors with maxpacket=0")).
Fixes: 55d7de9de6 ("Microchip's LAN7800 family USB 2/3 to 10/100/1000 Ethernet device driver")
Cc: stable@vger.kernel.org # 4.3
Cc: Woojung.Huh@microchip.com <Woojung.Huh@microchip.com>
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 09b1d5dc6c upstream.
The management registrations locking was broken, the list was
locked for each wdev, but cfg80211_mgmt_registrations_update()
iterated it without holding all the correct spinlocks, causing
list corruption.
Rather than trying to fix it with fine-grained locking, just
move the lock to the wiphy/rdev (still need the list on each
wdev), we already need to hold the wdev lock to change it, so
there's no contention on the lock in any case. This trivially
fixes the bug since we hold one wdev's lock already, and now
will hold the lock that protects all lists.
Cc: stable@vger.kernel.org
Reported-by: Jouni Malinen <j@w1.fi>
Fixes: 6cd536fe62 ("cfg80211: change internal management frame registration API")
Link: https://lore.kernel.org/r/20211025133111.5cf733eab0f4.I7b0abb0494ab712f74e2efcd24bb31ac33f7eee9@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 25e1f67eda upstream.
We should not access request members after the last send, even to
determine if indeed it was the last data payload send. The reason is
that a completion could have arrived and trigger a new execution of the
request which overridden these members. This was fixed by commit
825619b09a ("nvme-tcp: fix possible use-after-completion").
Commit e371af033c broke that assumption again to address cases where
multiple r2t pdus are sent per request. To fix it, we need to record the
request data_sent and data_len and after the payload network send we
reference these counters to determine weather we should advance the
request iterator.
Fixes: e371af033c ("nvme-tcp: fix incorrect h2cdata pdu offset accounting")
Reported-by: Keith Busch <kbusch@kernel.org>
Cc: stable@vger.kernel.org # 5.10+
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6f1b228529 upstream.
Encountered a race between ocfs2_test_bg_bit_allocatable() and
jbd2_journal_put_journal_head() resulting in the below vmcore.
PID: 106879 TASK: ffff880244ba9c00 CPU: 2 COMMAND: "loop3"
Call trace:
panic
oops_end
no_context
__bad_area_nosemaphore
bad_area_nosemaphore
__do_page_fault
do_page_fault
page_fault
[exception RIP: ocfs2_block_group_find_clear_bits+316]
ocfs2_block_group_find_clear_bits [ocfs2]
ocfs2_cluster_group_search [ocfs2]
ocfs2_search_chain [ocfs2]
ocfs2_claim_suballoc_bits [ocfs2]
__ocfs2_claim_clusters [ocfs2]
ocfs2_claim_clusters [ocfs2]
ocfs2_local_alloc_slide_window [ocfs2]
ocfs2_reserve_local_alloc_bits [ocfs2]
ocfs2_reserve_clusters_with_limit [ocfs2]
ocfs2_reserve_clusters [ocfs2]
ocfs2_lock_refcount_allocators [ocfs2]
ocfs2_make_clusters_writable [ocfs2]
ocfs2_replace_cow [ocfs2]
ocfs2_refcount_cow [ocfs2]
ocfs2_file_write_iter [ocfs2]
lo_rw_aio
loop_queue_work
kthread_worker_fn
kthread
ret_from_fork
When ocfs2_test_bg_bit_allocatable() called bh2jh(bg_bh), the
bg_bh->b_private NULL as jbd2_journal_put_journal_head() raced and
released the jounal head from the buffer head. Needed to take bit lock
for the bit 'BH_JournalHead' to fix this race.
Link: https://lkml.kernel.org/r/1634820718-6043-1-git-send-email-gautham.ananthakrishna@oracle.com
Signed-off-by: Gautham Ananthakrishna <gautham.ananthakrishna@oracle.com>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: <rajesh.sivaramasubramaniom@oracle.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Gang He <ghe@suse.com>
Cc: Jun Piao <piaojun@huawei.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e0c60d0102 upstream.
Commit a33df75c63 ("block: use an xarray for disk->part_tbl") modified
the method to check partition existence in host-aware zoned block
devices from disk_has_partitions() helper function call to empty check
of xarray disk->part_tbl. However, disk->part_tbl always has single
entry for disk->part0 and never becomes empty. This resulted in the
host-aware zoned devices always judged to have partitions, and it made
the sysfs queue/zoned attribute to be "none" instead of "host-aware"
regardless of partition existence in the devices.
This also caused DEBUG_LOCKS_WARN_ON(lock->magic != lock) for
sdkp->rev_mutex in scsi layer when the kernel detects host-aware zoned
device. Since block layer handled the host-aware zoned devices as non-
zoned devices, scsi layer did not have chance to initialize the mutex
for zone revalidation. Therefore, the warning was triggered.
To fix the issues, call the helper function disk_has_partitions() in
place of disk->part_tbl empty check. Since the function was removed with
the commit a33df75c63, reimplement it to walk through entries in the
xarray disk->part_tbl.
Fixes: a33df75c63 ("block: use an xarray for disk->part_tbl")
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Cc: stable@vger.kernel.org # v5.14+
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20211026060115.753746-1-shinichiro.kawasaki@wdc.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9af372dc70 upstream.
To reset standard tuning circuit completely, after clear ESDHC_MIX_CTRL_EXE_TUNE,
also need to clear bit buffer_read_ready, this operation will finally clear the
USDHC IP internal logic flag execute_tuning_with_clr_buf, make sure the following
normal data transfer will not be impacted by standard tuning logic used before.
Find this issue when do quick SD card insert/remove stress test. During standard
tuning prodedure, if remove SD card, USDHC standard tuning logic can't clear the
internal flag execute_tuning_with_clr_buf. Next time when insert SD card, all
data related commands can't get any data related interrupts, include data transfer
complete interrupt, data timeout interrupt, data CRC interrupt, data end bit interrupt.
Always trigger software timeout issue. Even reset the USDHC through bits in register
SYS_CTRL (0x2C, bit28 reset tuning, bit26 reset data, bit 25 reset command, bit 24
reset all) can't recover this. From the user's point of view, USDHC stuck, SD can't
be recognized any more.
Fixes: d9370424c9 ("mmc: sdhci-esdhc-imx: reset tuning circuit when power on mmc card")
Signed-off-by: Haibo Chen <haibo.chen@nxp.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/1634263236-6111-1-git-send-email-haibo.chen@nxp.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 92b18252b9 upstream.
While mmc0 enter suspend state, we need halt CQE to send legacy cmd(flush
cache) and disable cqe, for resume back, we enable CQE and not clear HALT
state.
In this case MediaTek mmc host controller will keep the value for HALT
state after CQE disable/enable flow, so the next CQE transfer after resume
will be timeout due to CQE is in HALT state, the log as below:
<4>.(4)[318:kworker/4:1H]mmc0: cqhci: timeout for tag 2
<4>.(4)[318:kworker/4:1H]mmc0: cqhci: ============ CQHCI REGISTER DUMP ===========
<4>.(4)[318:kworker/4:1H]mmc0: cqhci: Caps: 0x100020b6 | Version: 0x00000510
<4>.(4)[318:kworker/4:1H]mmc0: cqhci: Config: 0x00001103 | Control: 0x00000001
<4>.(4)[318:kworker/4:1H]mmc0: cqhci: Int stat: 0x00000000 | Int enab: 0x00000006
<4>.(4)[318:kworker/4:1H]mmc0: cqhci: Int sig: 0x00000006 | Int Coal: 0x00000000
<4>.(4)[318:kworker/4:1H]mmc0: cqhci: TDL base: 0xfd05f000 | TDL up32: 0x00000000
<4>.(4)[318:kworker/4:1H]mmc0: cqhci: Doorbell: 0x8000203c | TCN: 0x00000000
<4>.(4)[318:kworker/4:1H]mmc0: cqhci: Dev queue: 0x00000000 | Dev Pend: 0x00000000
<4>.(4)[318:kworker/4:1H]mmc0: cqhci: Task clr: 0x00000000 | SSC1: 0x00001000
<4>.(4)[318:kworker/4:1H]mmc0: cqhci: SSC2: 0x00000001 | DCMD rsp: 0x00000000
<4>.(4)[318:kworker/4:1H]mmc0: cqhci: RED mask: 0xfdf9a080 | TERRI: 0x00000000
<4>.(4)[318:kworker/4:1H]mmc0: cqhci: Resp idx: 0x00000000 | Resp arg: 0x00000000
<4>.(4)[318:kworker/4:1H]mmc0: cqhci: CRNQP: 0x00000000 | CRNQDUN: 0x00000000
<4>.(4)[318:kworker/4:1H]mmc0: cqhci: CRNQIS: 0x00000000 | CRNQIE: 0x00000000
This change check HALT state after CQE enable, if CQE is in HALT state, we
will clear it.
Signed-off-by: Wenbin Mei <wenbin.mei@mediatek.com>
Cc: stable@vger.kernel.org
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Fixes: a4080225f5 ("mmc: cqhci: support for command queue enabled host")
Link: https://lore.kernel.org/r/20211026070812.9359-1-wenbin.mei@mediatek.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit da353fac65 upstream.
sk->sk_err appears to expect a positive value, a convention that ktls
doesn't always follow and that leads to memory corruption in other code.
For instance,
[kworker]
tls_encrypt_done(..., err=<negative error from crypto request>)
tls_err_abort(.., err)
sk->sk_err = err;
[task]
splice_from_pipe_feed
...
tls_sw_do_sendpage
if (sk->sk_err) {
ret = -sk->sk_err; // ret is positive
splice_from_pipe_feed (continued)
ret = actor(...) // ret is still positive and interpreted as bytes
// written, resulting in underflow of buf->len and
// sd->len, leading to huge buf->offset and bogus
// addresses computed in later calls to actor()
Fix all tls_err_abort() callers to pass a negative error code
consistently and centralize the error-prone sign flip there, throwing in
a warning to catch future misuse and uninlining the function so it
really does only warn once.
Cc: stable@vger.kernel.org
Fixes: c46234ebb4 ("tls: RX path for ktls")
Reported-by: syzbot+b187b77c8474f9648fae@syzkaller.appspotmail.com
Signed-off-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2195f2062e upstream.
During probing, the driver tries to get a list (mask) of supported
command types in port100_get_command_type_mask() function. The value
is u64 and 0 is treated as invalid mask (no commands supported). The
function however returns also -ERRNO as u64 which will be interpret as
valid command mask.
Return 0 on every error case of port100_get_command_type_mask(), so the
probing will stop.
Cc: <stable@vger.kernel.org>
Fixes: 0347a6ab30 ("NFC: port100: Commands mechanism implementation")
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit fa40d9734a upstream.
The function tipc_crypto_key_rcv is used to parse MSG_CRYPTO messages
to receive keys from other nodes in the cluster in order to decrypt any
further messages from them.
This patch verifies that any supplied sizes in the message body are
valid for the received message.
Fixes: 1ef6f7c939 ("tipc: add automatic session key exchange")
Signed-off-by: Max VA <maxv@sentinelone.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Jon Maloy <jmaloy@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a0023bb9dd upstream.
mv_init_host() propagates the value returned by mv_chip_id() which in turn
gets propagated by mv_pci_init_one() and hits local_pci_probe().
During the process of driver probing, the probe function should return < 0
for failure, otherwise, the kernel will treat value > 0 as success.
Since this is a bug rather than a recoverable runtime error we should
use dev_alert() instead of dev_err().
Signed-off-by: Zheyu Ma <zheyuma97@gmail.com>
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4e5a04be88 upstream.
Some systems such as the Microsoft Surface Laptop 4 leave interrupts
enabled and configured for use in sleep states on boot, which cause
unexpected behaviour such as spurious wakes and failed resumes in
s2idle states.
As interrupts should not be enabled until they are claimed and
explicitly enabled, disabling any interrupts mistakenly left enabled by
firmware should be safe.
Signed-off-by: Sachi King <nakato@nakato.io>
Link: https://lore.kernel.org/r/20211009033240.21543-1-nakato@nakato.io
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6dba4bdfd7 upstream.
This reverts commit a49d784d5a.
The updated binding was wrong / invalid and has been reverted. There
isn't any upstream kernel DTS using it and Broadcom isn't known to use
it neither. There is close to zero chance this will cause regression for
anyone.
Actually in-kernel bcm5301x.dtsi still uses the old good binding and so
it's broken since the driver update. This revert fixes it.
Signed-off-by: Rafał Miłecki <rafal@milecki.pl>
Link: https://lore.kernel.org/r/20211008205938.29925-3-zajec5@gmail.com
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 00568b8a63 upstream.
My intel-ixp42x-welltech-epbx100 no longer boot since 4.14.
This is due to commit 463dbba4d1 ("ARM: 9104/2: Fix Keystone 2 kernel
mapping regression")
which forgot to handle CONFIG_CPU_ENDIAN_BE32 as possible BE config.
Suggested-by: Krzysztof Hałasa <khalasa@piap.pl>
Fixes: 463dbba4d1 ("ARM: 9104/2: Fix Keystone 2 kernel mapping regression")
Signed-off-by: Corentin Labbe <clabbe.montjoie@gmail.com>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 48ccc8edf5 upstream.
In randconfig builds, we sometimes come across this warning:
arm-linux-gnueabi-ld: XIP start address may cause MPU programming issues
While this is helpful for actual systems to figure out why it
fails, the warning does not provide any benefit for build testing,
so guard it in a check for CONFIG_COMPILE_TEST, which is usually
set on randconfig builds.
Fixes: 216218308c ("ARM: 8713/1: NOMMU: Support MPU in XIP configuration")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 44cc6412e6 upstream.
When frame pointers are used instead of the ARM unwinder,
and the kernel is built using clang with an external assembler
and CONFIG_XIP_KERNEL, every file produces two warnings
like:
arm-linux-gnueabi-ld: warning: orphan section `.ARM.extab' from `net/mac802154/util.o' being placed in section `.ARM.extab'
arm-linux-gnueabi-ld: warning: orphan section `.ARM.exidx' from `net/mac802154/util.o' being placed in section `.ARM.exidx'
The same fix was already merged for the normal (non-XIP)
linker script, with a longer description.
Fixes: c39866f268 ("arm/build: Always handle .ARM.exidx and .ARM.extab sections")
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit eaf6cc7165 upstream.
Both the decompressor code and the kasan logic try to override
the memcpy() and memmove() definitions, which leading to a clash
in a KASAN-enabled kernel with XZ decompression:
arch/arm/boot/compressed/decompress.c:50:9: error: 'memmove' macro redefined [-Werror,-Wmacro-redefined]
#define memmove memmove
^
arch/arm/include/asm/string.h:59:9: note: previous definition is here
#define memmove(dst, src, len) __memmove(dst, src, len)
^
arch/arm/boot/compressed/decompress.c:51:9: error: 'memcpy' macro redefined [-Werror,-Wmacro-redefined]
#define memcpy memcpy
^
arch/arm/include/asm/string.h:58:9: note: previous definition is here
#define memcpy(dst, src, len) __memcpy(dst, src, len)
^
Here we want the set of functions from the decompressor, so undefine
the other macros before the override.
Link: https://lore.kernel.org/linux-arm-kernel/CACRpkdZYJogU_SN3H9oeVq=zJkRgRT1gDz3xp59gdqWXxw-B=w@mail.gmail.com/
Link: https://lore.kernel.org/lkml/202105091112.F5rmd4By-lkp@intel.com/
Fixes: d6d51a96c7 ("ARM: 9014/2: Replace string mem* functions for KASan")
Fixes: a7f464f3db ("ARM: 7001/2: Wire up support for the XZ decompressor")
Reported-by: kernel test robot <lkp@intel.com>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit df909df077 upstream.
ARM: kasan: Fix __get_user_check failure with kasan
In macro __get_user_check defined in arch/arm/include/asm/uaccess.h,
error code is store in register int __e(r0). When kasan is
enabled, assigning value to kernel address might trigger kasan check,
which unexpectedly overwrites r0 and causes undefined behavior on arm
kasan images.
One example is failure in do_futex and results in process soft lockup.
Log:
watchdog: BUG: soft lockup - CPU#0 stuck for 62946ms! [rs:main
Q:Reg:1151]
...
(__asan_store4) from (futex_wait_setup+0xf8/0x2b4)
(futex_wait_setup) from (futex_wait+0x138/0x394)
(futex_wait) from (do_futex+0x164/0xe40)
(do_futex) from (sys_futex_time32+0x178/0x230)
(sys_futex_time32) from (ret_fast_syscall+0x0/0x50)
The soft lockup happens in function futex_wait_setup. The reason is
function get_futex_value_locked always return EINVAL, thus pc jump
back to retry label and causes looping.
This line in function get_futex_value_locked
ret = __get_user(*dest, from);
is expanded to
*dest = (typeof(*(p))) __r2; ,
in macro __get_user_check. Writing to pointer dest triggers kasan check
and overwrites the return value of __get_user_x function.
The assembly code of get_futex_value_locked in kernel/futex.c:
...
c01f6dc8: eb0b020e bl c04b7608 <__get_user_4>
// "x = (typeof(*(p))) __r2;" triggers kasan check and r0 is overwritten
c01f6dCc: e1a00007 mov r0, r7
c01f6dd0: e1a05002 mov r5, r2
c01f6dd4: eb04f1e6 bl c0333574 <__asan_store4>
c01f6dd8: e5875000 str r5, [r7]
// save ret value of __get_user(*dest, from), which is dest address now
c01f6ddc: e1a05000 mov r5, r0
...
// checking return value of __get_user failed
c01f6e00: e3550000 cmp r5, #0
...
c01f6e0c: 01a00005 moveq r0, r5
// assign return value to EINVAL
c01f6e10: 13e0000d mvnne r0, #13
Return value is the destination address of get_user thus certainly
non-zero, so get_futex_value_locked always return EINVAL.
Fix it by using a tmp vairable to store the error code before the
assignment. This fix has no effects to non-kasan images thanks to compiler
optimization. It only affects cases that overwrite r0 due to kasan check.
This should fix bug discussed in Link:
[1] https://lore.kernel.org/linux-arm-kernel/0ef7c2a5-5d8b-c5e0-63fa-31693fd4495c@gmail.com/
Fixes: 421015713b ("ARM: 9017/2: Enable KASan for ARM")
Signed-off-by: Lexi Shao <shaolexi@huawei.com>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9d417cbe36 upstream.
tglx notes:
This function [futex_detect_cmpxchg] is only needed when an
architecture has to runtime discover whether the CPU supports it or
not. ARM has unconditional support for this, so the obvious thing to
do is the below.
Fixes linkage failure from Clang randconfigs:
kernel/futex.o:(.text.fixup+0x5c): relocation truncated to fit: R_ARM_JUMP24 against `.init.text'
and boot failures for CONFIG_THUMB2_KERNEL.
Link: https://github.com/ClangBuiltLinux/linux/issues/325
Comments from Nick Desaulniers:
See-also: 03b8c7b623 ("futex: Allow architectures to skip
futex_atomic_cmpxchg_inatomic() test")
Reported-by: Arnd Bergmann <arnd@arndb.de>
Reported-by: Nathan Chancellor <nathan@kernel.org>
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Cc: stable@vger.kernel.org # v3.14+
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a46044a92a upstream.
Since commit 2a671f77ee ("s390/pci: fix use after free of zpci_dev")
the reference count of a zpci_dev is incremented between
pcibios_add_device() and pcibios_release_device() which was supposed to
prevent the zpci_dev from being freed while the common PCI code has
access to it. It was missed however that the handling of zPCI
availability events assumed that once zpci_zdev_put() was called no
later availability event would still see the device. With the previously
mentioned commit however this assumption no longer holds and we must
make sure that we only drop the initial long-lived reference the zPCI
subsystem holds exactly once.
Do so by introducing a zpci_device_reserved() function that handles when
a device is reserved. Here we make sure the zpci_dev will not be
considered for further events by removing it from the zpci_list.
This also means that the device actually stays in the
ZPCI_FN_STATE_RESERVED state between the time we know it has been
reserved and the final reference going away. We thus need to consider it
a real state instead of just a conceptual state after the removal. The
final cleanup of PCI resources, removal from zbus, and destruction of
the IOMMU stays in zpci_release_device() to make sure holders of the
reference do see valid data until the release.
Fixes: 2a671f77ee ("s390/pci: fix use after free of zpci_dev")
Cc: stable@vger.kernel.org
Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 02368b7cf6 upstream.
It's currently safe to call zpci_cleanup_bus_resources() even if the
resources were never created but it makes no sense so check
zdev->has_resources before we call zpci_cleanup_bus_resources() in
zpci_release_device().
Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com>
Acked-by: Pierre Morel <pmorel@linux.ibm.com>
Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 25f54d08f1 upstream.
There's a mistake in commit 2be7828c9f ("get rid of autofs_getpath()")
that affects kernels from v5.13.0, basically missed because of me not
fully testing the change for Al.
The problem is that the hash calculation for the wait name qstr hasn't
been updated to account for the change to use dentry_path_raw(). This
prevents the correct matching an existing wait resulting in multiple
notifications being sent to the daemon for the same mount which must
not occur.
The problem wasn't discovered earlier because it only occurs when
multiple processes trigger a request for the same mount concurrently
so it only shows up in more aggressive testing.
Fixes: 2be7828c9f ("get rid of autofs_getpath()")
Cc: stable@vger.kernel.org
Signed-off-by: Ian Kent <raven@themaw.net>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit c026565fe9 ]
Enable one additional plane that is alpha blended on top
of the primary plane.
This also fixes the below warnings when building with
-Warray-bounds:
drivers/gpu/drm/kmb/kmb_plane.c:135:20: warning: array subscript 3 is
above array bounds of 'struct layer_status[1]' [-Warray-bounds]
drivers/gpu/drm/kmb/kmb_plane.c:132:20: warning: array subscript 2 is
above array bounds of 'struct layer_status[1]' [-Warray-bounds]
drivers/gpu/drm/kmb/kmb_plane.c:129:20: warning: array subscript 1 is
above array bounds of 'struct layer_status[1]' [-Warray-bounds]
v2: corrected previous patch dependecies so it builds
Signed-off-by: Edmund Dea <edmund.j.dea@intel.com>
Signed-off-by: Anitha Chrisanthus <anitha.chrisanthus@intel.com>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Link: https://patchwork.kernel.org/project/dri-devel/patch/20210728003126.1425028-13-anitha.chrisanthus@intel.com/
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 14fe2471c6 ]
Both multipath and bonding events are changing the HW LAG state
independently.
Handling one of the features events while the other is already
enabled can cause unwanted behavior, for example handling
bonding event while multipath enabled will disable the lag and
cause multipath to stop working.
Fix it by ignoring bonding event while in multipath and ignoring FIB
events while in bonding mode.
Fixes: 544fe7c2e6 ("net/mlx5e: Activate HW multipath and handle port affinity based on FIB events")
Signed-off-by: Maor Dickman <maord@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 63d4a9afbc ]
If a netdev is removed from the lag the lag should be destroyed.
With downstream patches this might trigger a reconfiguration of
representors on a different eswitch and such we don't have the proper
locking to so from this path. Move the destruction to be done by the
workqueue.
As the destruction won't affect the netdev side it okay to do so.
The RDMA side will be reconfigured and it already coded to handle such
reconfiguration.
Signed-off-by: Mark Bloch <mbloch@nvidia.com>
Reviewed-by: Mark Zhang <markzhang@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 9f9f0f1999 ]
rx unused desc is the desc that need attatching new buffer
before refilling to hw to receive new packet, the number of
desc need attatching new buffer is calculated using next_to_use
and next_to_clean. when next_to_use == next_to_clean, currently
hns3 driver assumes that all the desc has the buffer attatched,
but 'next_to_use == next_to_clean' also means all the desc need
attatching new buffer if hw has comsumed all the desc and the
driver has not attatched any buffer to the desc yet.
This patch adds 'refill' in desc_cb to indicate whether a new
buffer has been refilled to a desc.
Fixes: 76ad4f0ee7 ("net: hns3: Add support of HNS3 Ethernet Driver for hip08 SoC")
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 63acd42c0d ]
Commit f1a0a376ca ("sched/core: Initialize the idle task with
preemption disabled") removed the init_idle() call from
idle_thread_get(). This was the sole call-path on hotplug that resets
the Shadow Call Stack (scs) Stack Pointer (sp).
Not resetting the scs-sp leads to scs overflow after enough hotplug
cycles. Therefore add an explicit scs_task_reset() to the hotplug code
to make sure the scs-sp does get reset on hotplug.
Fixes: f1a0a376ca ("sched/core: Initialize the idle task with preemption disabled")
Signed-off-by: Woody Lin <woodylin@google.com>
[peterz: Changelog]
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Valentin Schneider <valentin.schneider@arm.com>
Link: https://lore.kernel.org/r/20211012083521.973587-1-woodylin@google.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 7fb223d0ad ]
Commit 8c0eb596ba ("[SCSI] qla2xxx: Fix a memory leak in an error path of
qla2x00_process_els()"), intended to change:
bsg_job->request->msgcode == FC_BSG_HST_ELS_NOLOGIN
to:
bsg_job->request->msgcode != FC_BSG_RPT_ELS
but changed it to:
bsg_job->request->msgcode == FC_BSG_RPT_ELS
instead.
Change the == to a != to avoid leaking the fcport structure or freeing
unallocated memory.
Link: https://lore.kernel.org/r/20211012191834.90306-2-jgu@purestorage.com
Fixes: 8c0eb596ba ("[SCSI] qla2xxx: Fix a memory leak in an error path of qla2x00_process_els()")
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Joy Gu <jgu@purestorage.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 187a580c9e ]
In commit 9e67600ed6 ("scsi: iscsi: Fix race condition between login and
sync thread") we meant to add a check where before we call ->set_param() we
make sure the iscsi_cls_connection is bound. The problem is that between
versions 4 and 5 of the patch the deletion of the unchecked set_param()
call was dropped so we ended up with 2 calls. As a result we can still hit
a crash where we access the unbound connection on the first call.
This patch removes that first call.
Fixes: 9e67600ed6 ("scsi: iscsi: Fix race condition between login and sync thread")
Link: https://lore.kernel.org/r/20211010161904.60471-1-michael.christie@oracle.com
Reviewed-by: Lee Duncan <lduncan@suse.com>
Reviewed-by: Li Feng <fengli@smartx.com>
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit d997cc1715 ]
On i.MX7S and i.MX8M* (but not i.MX6*) the pwrkey device has an
associated clock. Accessing the registers requires that this clock is
enabled. Binding the driver on at least i.MX7S and i.MX8MP while not
having the clock enabled results in a complete hang of the machine.
(This usually only happens if snvs_pwrkey is built as a module and the
rtc-snvs driver isn't already bound because at bootup the required clk
is on and only gets disabled when the clk framework disables unused clks
late during boot.)
This completes the fix in commit 135be16d35 ("ARM: dts: imx7s: add
snvs clock to pwrkey").
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Link: https://lore.kernel.org/r/20211013062848.2667192-1-u.kleine-koenig@pengutronix.de
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 3ff6d64e68 ]
The `cpu` argument of perf_evsel__read() must specify the cpu index.
perf_cpu_map__for_each_cpu() is for iterating the cpu number (not index)
and is thus not appropriate for use with perf_evsel__read().
So, if there is an offline CPU, the cpu number specified in the argument
may point out of range because the cpu number and the cpu index are
different.
Fix test_stat_cpu().
Testing it:
# make tests -C tools/lib/perf/
make: Entering directory '/home/nakamura/kernel_src/linux-5.15-rc4_fix/tools/lib/perf'
running static:
- running tests/test-cpumap.c...OK
- running tests/test-threadmap.c...OK
- running tests/test-evlist.c...OK
- running tests/test-evsel.c...OK
running dynamic:
- running tests/test-cpumap.c...OK
- running tests/test-threadmap.c...OK
- running tests/test-evlist.c...OK
- running tests/test-evsel.c...OK
make: Leaving directory '/home/nakamura/kernel_src/linux-5.15-rc4_fix/tools/lib/perf'
Signed-off-by: Shunsuke Nakamura <nakamura.shun@fujitsu.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20211011083704.4108720-1-nakamura.shun@fujitsu.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 16a8e2fbb2 ]
io_mutex is taken by spi_setup() and spi-mux's .setup() callback calls
spi_setup() which results in a nested lock of io_mutex.
add_lock is taken by spi_add_device(). The device_add() call in there
can result in calling spi-mux's .probe() callback which registers its
own spi controller which in turn results in spi_add_device() being
called again.
To fix this initialize the controller's locks already in
spi_alloc_controller() to give spi_mux_probe() a chance to set the
lockdep subclass.
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Link: https://lore.kernel.org/r/20211013133710.2679703-2-u.kleine-koenig@pengutronix.de
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 6098475d4c ]
Currently we have a global spi_add_lock which we take when adding new
devices so that we can check that we're not trying to reuse a chip
select that's already controlled. This means that if the SPI device is
itself a SPI controller and triggers the instantiation of further SPI
devices we trigger a deadlock as we try to register and instantiate
those devices while in the process of doing so for the parent controller
and hence already holding the global spi_add_lock. Since we only care
about concurrency within a single SPI bus move the lock to be per
controller, avoiding the deadlock.
This can be easily triggered in the case of spi-mux.
Reported-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b37a15188e ]
The snd_hdac_bus_reset_link() contains logic to clear STATESTS register
before performing controller reset. This code dates back to an old
bugfix in commit e8a7f136f5 ("[ALSA] hda-intel - Improve HD-audio
codec probing robustness"). Originally the code was added to
azx_reset().
The code was moved around in commit a41d122449 ("ALSA: hda - Embed bus
into controller object") and ended up to snd_hdac_bus_reset_link() and
called primarily via snd_hdac_bus_init_chip().
The logic to clear STATESTS is correct when snd_hdac_bus_init_chip() is
called when controller is not in reset. In this case, STATESTS can be
cleared. This can be useful e.g. when forcing a controller reset to retry
codec probe. A normal non-power-on reset will not clear the bits.
However, this old logic is problematic when controller is already in
reset. The HDA specification states that controller must be taken out of
reset before writing to registers other than GCTL.CRST (1.0a spec,
3.3.7). The write to STATESTS in snd_hdac_bus_reset_link() will be lost
if the controller is already in reset per the HDA specification mentioned.
This has been harmless on older hardware. On newer generation of Intel
PCIe based HDA controllers, if configured to report issues, this write
will emit an unsupported request error. If ACPI Platform Error Interface
(APEI) is enabled in kernel, this will end up to kernel log.
Fix the code in snd_hdac_bus_reset_link() to only clear the STATESTS if
the function is called when controller is not in reset. Otherwise
clearing the bits is not possible and should be skipped.
Signed-off-by: Kai Vehmanen <kai.vehmanen@linux.intel.com>
Link: https://lore.kernel.org/r/20211012142935.3731820-1-kai.vehmanen@linux.intel.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 6636fec29c ]
On SPEAr3xx, ethernet driver is not compatible with the SPEAr600
one.
Indeed, SPEAr3xx uses an earlier version of this IP (v3.40) and
needs some driver tuning compare to SPEAr600.
The v3.40 IP support was added to stmmac driver and this patch
fixes this issue and use the correct compatible string for
SPEAr3xx
Signed-off-by: Herve Codina <herve.codina@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 77a5b9e3d1 ]
Currently inode_in_dir() ignores errors returned from
btrfs_lookup_dir_index_item() and from btrfs_lookup_dir_item(), treating
any errors as if the directory entry does not exists in the fs/subvolume
tree, which is obviously not correct, as we can get errors such as -EIO
when reading extent buffers while searching the fs/subvolume's tree.
Fix that by making inode_in_dir() return the errors and making its only
caller, add_inode_ref(), deal with returned errors as well.
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit c0f1886de7 ]
It seems that a few recent AMD systems show the codec configuration
errors at the early boot, while loading the driver at a later stage
works magically. Although the root cause of the error isn't clear,
it's certainly not bad to allow retrying the codec probe in such a
case if that helps.
This patch adds the capability for retrying the probe upon codec probe
errors on the certain AMD platforms. The probe_work is changed to a
delayed work, and at the secondary call, it'll jump to the codec
probing.
Note that, not only adding the re-probing, this includes the behavior
changes in the codec configuration function. Namely,
snd_hda_codec_configure() won't unregister the codec at errors any
longer. Instead, its caller, azx_codec_configure() unregisters the
codecs with the probe failures *if* any codec has been successfully
configured. If all codec probe failed, it doesn't unregister but let
it re-probed -- which is the most case we're seeing and this patch
tries to improve.
Even if the driver doesn't re-probe or give up, it will go to the
"free-all" error path, hence the leftover codecs shall be disabled /
deleted in anyway.
BugLink: https://bugzilla.suse.com/show_bug.cgi?id=1190801
Link: https://lore.kernel.org/r/20211006141940.2897-1-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit a8cf90332a ]
The structleak plugin causes the stack frame size to grow immensely:
lib/bitfield_kunit.c: In function 'test_bitfields_constants':
lib/bitfield_kunit.c:93:1: error: the frame size of 7440 bytes is larger than 2048 bytes [-Werror=frame-larger-than=]
Turn it off in this file.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Brendan Higgins <brendanhiggins@google.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 33d4951e02 ]
The structleak plugin causes the stack frame size to grow immensely when
used with KUnit:
drivers/thunderbolt/test.c:1529:1: error: the frame size of 1176 bytes is larger than 1024 bytes [-Werror=frame-larger-than=]
Turn it off in this file.
Linus already split up tests in this file, so this change *should* be
redundant now.
Signed-off-by: Brendan Higgins <brendanhiggins@google.com>
Suggested-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 6a1e2d93d5 ]
The structleak plugin causes the stack frame size to grow immensely when
used with KUnit:
../drivers/base/test/property-entry-test.c:492:1: warning: the frame size of 2832 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/base/test/property-entry-test.c:322:1: warning: the frame size of 2080 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/base/test/property-entry-test.c:250:1: warning: the frame size of 4976 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/base/test/property-entry-test.c:115:1: warning: the frame size of 3280 bytes is larger than 2048 bytes [-Wframe-larger-than=]
Turn it off in this file.
Signed-off-by: Brendan Higgins <brendanhiggins@google.com>
Suggested-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 2326f3cdba ]
The structleak plugin causes the stack frame size to grow immensely when
used with KUnit:
../drivers/iio/test/iio-test-format.c: In function ‘iio_test_iio_format_value_fixedpoint’:
../drivers/iio/test/iio-test-format.c:98:1: warning: the frame size of 2336 bytes is larger than 2048 bytes [-Wframe-larger-than=]
Turn it off in this file.
Signed-off-by: Brendan Higgins <brendanhiggins@google.com>
Suggested-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Kees Cook <keescook@chromium.org>
Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f6f59072e8 ]
I've seen some crashes in our crash reporting that *look* like multiple
threads stomping on each other while communicating with GMU. So wrap
all those paths in a lock.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f62314b1ce ]
The reference counting issue happens in the normal path of
kfree_at_end(). When kunit_alloc_and_get_resource() is invoked, the
function forgets to handle the returned resource object, whose refcount
increased inside, causing a refcount leak.
Fix this issue by calling kunit_alloc_resource() instead of
kunit_alloc_and_get_resource().
Fixed the following when applying:
Shuah Khan <skhan@linuxfoundation.org>
CHECK: Alignment should match open parenthesis
+ kunit_alloc_resource(test, NULL, kfree_res_free, GFP_KERNEL,
(void *)to_free);
Signed-off-by: Xiyu Yang <xiyuyang19@fudan.edu.cn>
Signed-off-by: Xin Tan <tanxin.ctf@gmail.com>
Reviewed-by: Daniel Latypov <dlatypov@google.com>
Reviewed-by: Brendan Higgins <brendanhiggins@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit adfb7b4966 upstream.
Currently the max tx size supported by the hw is calculated by
using the max BD num supported by the hw. According to the hw
user manual, the max tx size is fixed value for both non-TSO and
TSO skb.
This patch updates the max tx size according to the manual.
Fixes: 8ae10cfb5089("net: hns3: support tx-scatter-gather-fraglist feature")
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit baa1e5ca17 upstream.
The refactoring in commit bb18a67774 ("KVM: SEV: Acquire
vcpu mutex when updating VMSA") left behind the assignment to
svm->vcpu.arch.guest_state_protected; add it back.
Signed-off-by: Peter Gonda <pgonda@google.com>
[Delta between v2 and v3 of Peter's patch, which had already been
committed; the commit message is my own. - Paolo]
Fixes: bb18a67774 ("KVM: SEV: Acquire vcpu mutex when updating VMSA")
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit fac3cb82a5 upstream.
When I added IGMPv3 support I decided to follow the RFC for computing
the GMI dynamically:
" 8.4. Group Membership Interval
The Group Membership Interval is the amount of time that must pass
before a multicast router decides there are no more members of a
group or a particular source on a network.
This value MUST be ((the Robustness Variable) times (the Query
Interval)) plus (one Query Response Interval)."
But that actually is inconsistent with how the bridge used to compute it
for IGMPv2, where it was user-configurable that has a correct default value
but it is up to user-space to maintain it. This would make it consistent
with the other timer values which are also maintained correct by the user
instead of being dynamically computed. It also changes back to the previous
user-expected GMI behaviour for IGMPv3 queries which were supported before
IGMPv3 was added. Note that to properly compute it dynamically we would
need to add support for "Robustness Variable" which is currently missing.
Reported-by: Hangbin Liu <liuhangbin@gmail.com>
Fixes: 0436862e41 ("net: bridge: mcast: support for IGMPv3/MLDv2 ALLOW_NEW_SOURCES report")
Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1b1499a817 upstream.
The nci_core_conn_close_rsp_packet() function will release the conn_info
with given conn_id. However, it needs to set the rf_conn_info to NULL to
prevent other routines like nci_rf_intf_activated_ntf_packet() to trigger
the UAF.
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
Signed-off-by: Lin Ma <linma@zju.edu.cn>
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6b5efc930b upstream.
complete_emulator_pio_in can expect that vcpu->arch.pio has been filled in,
and therefore does not need the size and count arguments. This makes things
nicer when the function is called directly from a complete_userspace_io
callback.
No functional change intended.
Cc: stable@vger.kernel.org
Fixes: 7ed9abfe8e ("KVM: SVM: Support string IO operations for an SEV-ES guest")
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3b27de2718 upstream.
emulator_pio_in handles both the case where the data is pending in
vcpu->arch.pio.count, and the case where I/O has to be done via either
an in-kernel device or a userspace exit. For SEV-ES we would like
to split these, to identify clearly the moment at which the
sev_pio_data is consumed. To this end, create two different
functions: __emulator_pio_in fills in vcpu->arch.pio.count, while
complete_emulator_pio_in clears it and releases vcpu->arch.pio.data.
Because this patch has to be backported, things are left a bit messy.
kernel_pio() operates on vcpu->arch.pio, which leads to emulator_pio_in()
having with two calls to complete_emulator_pio_in(). It will be fixed
in the next release.
While at it, remove the unused void* val argument of emulator_pio_in_out.
The function currently hardcodes vcpu->arch.pio_data as the
source/destination buffer, which sucks but will be fixed after the more
severe SEV-ES buffer overflow.
No functional change intended.
Cc: stable@vger.kernel.org
Fixes: 7ed9abfe8e ("KVM: SVM: Support string IO operations for an SEV-ES guest")
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit de7cd3f676 upstream.
The kvm_x86_sync_pir_to_irr callback can sometimes set KVM_REQ_EVENT.
If that happens exactly at the time that an exit is handled as
EXIT_FASTPATH_REENTER_GUEST, vcpu_enter_guest will go incorrectly
through the loop that calls kvm_x86_run, instead of processing
the request promptly.
Fixes: 379a3c8ee4 ("KVM: VMX: Optimize posted-interrupt delivery for timer fastpath")
Cc: stable@vger.kernel.org
Reviewed-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0d33b1baeb upstream.
Currently emulator_pio_in clears vcpu->arch.pio.count twice if
emulator_pio_in_out performs kernel PIO. Move the clear into
emulator_pio_out where it is actually necessary.
No functional change intended.
Cc: stable@vger.kernel.org
Fixes: 7ed9abfe8e ("KVM: SVM: Support string IO operations for an SEV-ES guest")
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9f1ee7b169 upstream.
The size of the GHCB scratch area is limited to 16 KiB (GHCB_SCRATCH_AREA_LIMIT),
so there is no need for it to be a u64. This fixes a build error on 32-bit
systems:
i686-linux-gnu-ld: arch/x86/kvm/svm/sev.o: in function `sev_es_string_io:
sev.c:(.text+0x110f): undefined reference to `__udivdi3'
Cc: stable@vger.kernel.org
Fixes: 019057bd73 ("KVM: SEV-ES: fix length of string I/O")
Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 95e16b4792 upstream.
The PIO scratch buffer is larger than a single page, and therefore
it is not possible to copy it in a single step to vcpu->arch/pio_data.
Bound each call to emulator_pio_in/out to a single page; keep
track of how many I/O operations are left in vcpu->arch.sev_pio_count,
so that the operation can be restarted in the complete_userspace_io
callback.
For OUT, this means that the previous kvm_sev_es_outs implementation
becomes an iterator of the loop, and we can consume the sev_pio_data
buffer before leaving to userspace.
For IN, instead, consuming the buffer and decreasing sev_pio_count
is always done in the complete_userspace_io callback, because that
is when the memcpy is done into sev_pio_data.
Cc: stable@vger.kernel.org
Fixes: 7ed9abfe8e ("KVM: SVM: Support string IO operations for an SEV-ES guest")
Reported-by: Felix Wilhelm <fwilhelm@google.com>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 019057bd73 upstream.
The size of the data in the scratch buffer is not divided by the size of
each port I/O operation, so vcpu->arch.pio.count ends up being larger
than it should be by a factor of size.
Cc: stable@vger.kernel.org
Fixes: 7ed9abfe8e ("KVM: SVM: Support string IO operations for an SEV-ES guest")
Acked-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ea724ea420 upstream.
A few very small cleanups to the functions, smushed together because
the patch is already very small like this:
- inline emulator_pio_in_emulated and emulator_pio_out_emulated,
since we already have the vCPU
- remove the data argument and pull setting vcpu->arch.sev_pio_data into
the caller
- remove unnecessary clearing of vcpu->arch.pio.count when
emulation is done by the kernel (and therefore vcpu->arch.pio.count
is already clear on exit from emulator_pio_in and emulator_pio_out).
No functional change intended.
Cc: stable@vger.kernel.org
Fixes: 7ed9abfe8e ("KVM: SVM: Support string IO operations for an SEV-ES guest")
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b5998402e3 upstream.
We will be using this field for OUTS emulation as well, in case the
data that is pushed via OUTS spans more than one page. In that case,
there will be a need to save the data pointer across exits to userspace.
So, change the name to something that refers to any kind of PIO.
Also spell out what it is used for, namely SEV-ES.
No functional change intended.
Cc: stable@vger.kernel.org
Fixes: 7ed9abfe8e ("KVM: SVM: Support string IO operations for an SEV-ES guest")
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3a25dfa67f upstream.
Since commit c300ab9f08 ("KVM: x86: Replace late check_nested_events() hack with
more precise fix") there is no longer the certainty that check_nested_events()
tries to inject an external interrupt vmexit to L1 on every call to vcpu_enter_guest.
Therefore, even in that case we need to set KVM_REQ_EVENT. This ensures
that inject_pending_event() is called, and from there kvm_check_nested_events().
Fixes: c300ab9f08 ("KVM: x86: Replace late check_nested_events() hack with more precise fix")
Cc: stable@vger.kernel.org
Reviewed-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 496c5fe25c upstream.
In isa206_idle_insn_mayloss() we store various registers into the stack
red zone, which is allowed.
However inside the IDLE_STATE_ENTER_SEQ_NORET macro we save r2 again,
to 0(r1), which corrupts the stack back chain.
We used to do the same in isa206_idle_insn_mayloss() itself, but we
fixed that in 73287caa92 ("powerpc64/idle: Fix SP offsets when saving
GPRs"), however we missed that the macro also corrupts the back chain.
Corrupting the back chain is bad for debuggability but doesn't
necessarily cause a bug.
However we recently changed the stack handling in some KVM code, and it
now relies on the stack back chain being valid when it returns. The
corruption causes that code to return with r1 pointing somewhere in
kernel data, at some point LR is restored from the stack and we branch
to NULL or somewhere else invalid.
Only affects Power8 hosts running KVM guests, with dynamic_mt_modes
enabled (which it is by default).
The fixes tag below points to the commit that changed the KVM stack
handling, exposing this bug. The actual corruption of the back chain has
always existed since 948cf67c47 ("powerpc: Add NAP mode support on
Power7 in HV mode").
Fixes: 9b4416c509 ("KVM: PPC: Book3S HV: Fix stack handling in idle_kvm_start_guest()")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20211020094826.3222052-1-mpe@ellerman.id.au
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit cdeb5d7d89 upstream.
We call idle_kvm_start_guest() from power7_offline() if the thread has
been requested to enter KVM. We pass it the SRR1 value that was returned
from power7_idle_insn() which tells us what sort of wakeup we're
processing.
Depending on the SRR1 value we pass in, the KVM code might enter the
guest, or it might return to us to do some host action if the wakeup
requires it.
If idle_kvm_start_guest() is able to handle the wakeup, and enter the
guest it is supposed to indicate that by returning a zero SRR1 value to
us.
That was the behaviour prior to commit 10d91611f4 ("powerpc/64s:
Reimplement book3s idle code in C"), however in that commit the
handling of SRR1 was reworked, and the zeroing behaviour was lost.
Returning from idle_kvm_start_guest() without zeroing the SRR1 value can
confuse the host offline code, causing the guest to crash and other
weirdness.
Fixes: 10d91611f4 ("powerpc/64s: Reimplement book3s idle code in C")
Cc: stable@vger.kernel.org # v5.2+
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20211015133929.832061-2-mpe@ellerman.id.au
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9b4416c509 upstream.
In commit 10d91611f4 ("powerpc/64s: Reimplement book3s idle code in
C") kvm_start_guest() became idle_kvm_start_guest(). The old code
allocated a stack frame on the emergency stack, but didn't use the
frame to store anything, and also didn't store anything in its caller's
frame.
idle_kvm_start_guest() on the other hand is written more like a normal C
function, it creates a frame on entry, and also stores CR/LR into its
callers frame (per the ABI). The problem is that there is no caller
frame on the emergency stack.
The emergency stack for a given CPU is allocated with:
paca_ptrs[i]->emergency_sp = alloc_stack(limit, i) + THREAD_SIZE;
So emergency_sp actually points to the first address above the emergency
stack allocation for a given CPU, we must not store above it without
first decrementing it to create a frame. This is different to the
regular kernel stack, paca->kstack, which is initialised to point at an
initial frame that is ready to use.
idle_kvm_start_guest() stores the backchain, CR and LR all of which
write outside the allocation for the emergency stack. It then creates a
stack frame and saves the non-volatile registers. Unfortunately the
frame it creates is not large enough to fit the non-volatiles, and so
the saving of the non-volatile registers also writes outside the
emergency stack allocation.
The end result is that we corrupt whatever is at 0-24 bytes, and 112-248
bytes above the emergency stack allocation.
In practice this has gone unnoticed because the memory immediately above
the emergency stack happens to be used for other stack allocations,
either another CPUs mc_emergency_sp or an IRQ stack. See the order of
calls to irqstack_early_init() and emergency_stack_init().
The low addresses of another stack are the top of that stack, and so are
only used if that stack is under extreme pressue, which essentially
never happens in practice - and if it did there's a high likelyhood we'd
crash due to that stack overflowing.
Still, we shouldn't be corrupting someone else's stack, and it is purely
luck that we aren't corrupting something else.
To fix it we save CR/LR into the caller's frame using the existing r1 on
entry, we then create a SWITCH_FRAME_SIZE frame (which has space for
pt_regs) on the emergency stack with the backchain pointing to the
existing stack, and then finally we switch to the new frame on the
emergency stack.
Fixes: 10d91611f4 ("powerpc/64s: Reimplement book3s idle code in C")
Cc: stable@vger.kernel.org # v5.2+
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20211015133929.832061-1-mpe@ellerman.id.au
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 15bc01effe upstream.
In commit fda31c5029 ("signal: avoid double atomic counter
increments for user accounting") Linus made a clever optimization to
how rlimits and the struct user_struct. Unfortunately that
optimization does not work in the obvious way when moved to nested
rlimits. The problem is that the last decrement of the per user
namespace per user sigpending counter might also be the last decrement
of the sigpending counter in the parent user namespace as well. Which
means that simply freeing the leaf ucount in __free_sigqueue is not
enough.
Maintain the optimization and handle the tricky cases by introducing
inc_rlimit_get_ucounts and dec_rlimit_put_ucounts.
By moving the entire optimization into functions that perform all of
the work it becomes possible to ensure that every level is handled
properly.
The new function inc_rlimit_get_ucounts returns 0 on failure to
increment the ucount. This is different than inc_rlimit_ucounts which
increments the ucounts and returns LONG_MAX if the ucount counter has
exceeded it's maximum or it wrapped (to indicate the counter needs to
decremented).
I wish we had a single user to account all pending signals to across
all of the threads of a process so this complexity was not necessary
Cc: stable@vger.kernel.org
Fixes: d646969055 ("Reimplement RLIMIT_SIGPENDING on top of ucounts")
v1: https://lkml.kernel.org/r/87mtnavszx.fsf_-_@disp2133
Link: https://lkml.kernel.org/r/87fssytizw.fsf_-_@disp2133
Reviewed-by: Alexey Gladkov <legion@kernel.org>
Tested-by: Rune Kleveland <rune.kleveland@infomedia.dk>
Tested-by: Yu Zhao <yuzhao@google.com>
Tested-by: Jordan Glover <Golden_Miller83@protonmail.ch>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5ebcbe342b upstream.
Setting cred->ucounts in cred_alloc_blank does not make sense. The
uid and user_ns are deliberately not set in cred_alloc_blank but
instead the setting is delayed until key_change_session_keyring.
So move dealing with ucounts into key_change_session_keyring as well.
Unfortunately that movement of get_ucounts adds a new failure mode to
key_change_session_keyring. I do not see anything stopping the parent
process from calling setuid and changing the relevant part of it's
cred while keyctl_session_to_parent is running making it fundamentally
necessary to call get_ucounts in key_change_session_keyring. Which
means that the new failure mode cannot be avoided.
A failure of key_change_session_keyring results in a single threaded
parent keeping it's existing credentials. Which results in the parent
process not being able to access the session keyring and whichever
keys are in the new keyring.
Further get_ucounts is only expected to fail if the number of bits in
the refernece count for the structure is too few.
Since the code has no other way to report the failure of get_ucounts
and because such failures are not expected to be common add a WARN_ONCE
to report this problem to userspace.
Between the WARN_ONCE and the parent process not having access to
the keys in the new session keyring I expect any failure of get_ucounts
will be noticed and reported and we can find another way to handle this
condition. (Possibly by just making ucounts->count an atomic_long_t).
Cc: stable@vger.kernel.org
Fixes: 905ae01c4a ("Add a reference to ucounts for each cred")
Link: https://lkml.kernel.org/r/7k0ias0uf.fsf_-_@disp2133
Tested-by: Yu Zhao <yuzhao@google.com>
Reviewed-by: Alexey Gladkov <legion@kernel.org>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 342afce10d upstream.
Setting ds->num_ports to DSA_MAX_PORTS made DSA core allocate unnecessary
dsa_port's and call mt7530_port_disable for non-existent ports.
Set it to MT7530_NUM_PORTS to fix that, and dsa_is_user_port check in
port_enable/disable is no longer required.
Cc: stable@vger.kernel.org
Signed-off-by: DENG Qingfang <dqfext@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 42871e95a3 upstream.
Commit 1d25684e22 ("ASoC: nau8824: Fix open coded prefix handling")
replaced the nau8824_dapm_enable_pin() helper with direct calls to
snd_soc_dapm_enable_pin(), but the helper was using
snd_soc_dapm_force_enable_pin() and not forcing the MICBIAS + SAR
supplies on breaks headphone vs headset and button-press detection.
Replace the snd_soc_dapm_enable_pin() calls with
snd_soc_dapm_force_enable_pin() to fix this.
Cc: stable@vger.kernel.org
Fixes: 1d25684e22 ("ASoC: nau8824: Fix open coded prefix handling")
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Link: https://lore.kernel.org/r/20210929201512.460360-1-hdegoede@redhat.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8913970c19 upstream.
In RHEL's gating selftests we've encountered memory corruption in the
uffd event test even with upstream kernel:
# ./userfaultfd anon 128 4
nr_pages: 32768, nr_pages_per_cpu: 32768
bounces: 3, mode: rnd racing read, userfaults: 6240 missing (6240) 14729 wp (14729)
bounces: 2, mode: racing read, userfaults: 1444 missing (1444) 28877 wp (28877)
bounces: 1, mode: rnd read, userfaults: 6055 missing (6055) 14699 wp (14699)
bounces: 0, mode: read, userfaults: 82 missing (82) 25196 wp (25196)
testing uffd-wp with pagemap (pgsize=4096): done
testing uffd-wp with pagemap (pgsize=2097152): done
testing events (fork, remap, remove): ERROR: nr 32427 memory corruption 0 1 (errno=0, line=963)
ERROR: faulting process failed (errno=0, line=1117)
It can be easily reproduced when global thp enabled, which is the
default for RHEL.
It's also known as a side effect of commit 0db282ba2c ("selftest: use
mmap instead of posix_memalign to allocate memory", 2021-07-23), which
is imho right itself on using mmap() to make sure the addresses will be
untagged even on arm.
The problem is, for each test we allocate buffers using two
allocate_area() calls. We assumed these two buffers won't affect each
other, however they could, because mmap() could have found that the two
buffers are near each other and having the same VMA flags, so they got
merged into one VMA.
It won't be a big problem if thp is not enabled, but when thp is
agressively enabled it means when initializing the src buffer it could
accidentally setup part of the dest buffer too when there's a shared THP
that overlaps the two regions. Then some of the dest buffer won't be
able to be trapped by userfaultfd missing mode, then it'll cause memory
corruption as described.
To fix it, do release_pages() after initializing the src buffer.
Since the previous two release_pages() calls are after
uffd_test_ctx_clear() which will unmap all the buffers anyway (which is
stronger than release pages; as unmap() also tear town pgtables), drop
them as they shouldn't really be anything useful.
We can mark the Fixes tag upon 0db282ba2c as it's reported to only
happen there, however the real "Fixes" IMHO should be 8ba6e86408, as
before that commit we'll always do explicit release_pages() before
registration of uffd, and 8ba6e86408 changed that logic by adding
extra unmap/map and we didn't release the pages at the right place.
Meanwhile I don't have a solid glue anyway on whether posix_memalign()
could always avoid triggering this bug, hence it's safer to attach this
fix to commit 8ba6e86408.
Link: https://lkml.kernel.org/r/20210923232512.210092-1-peterx@redhat.com
Fixes: 8ba6e86408 ("userfaultfd/selftests: reinitialize test context in each test")
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1994931
Signed-off-by: Peter Xu <peterx@redhat.com>
Reported-by: Li Wang <liwan@redhat.com>
Tested-by: Li Wang <liwang@redhat.com>
Reviewed-by: Axel Rasmussen <axelrasmussen@google.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Nadav Amit <nadav.amit@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b15fa9224e upstream.
Starting with kernel 5.11 built with CONFIG_FORTIFY_SOURCE mouting an
ocfs2 filesystem with either o2cb or pcmk cluster stack fails with the
trace below. Problem seems to be that strings for cluster stack and
cluster name are not guaranteed to be null terminated in the disk
representation, while strlcpy assumes that the source string is always
null terminated. This causes a read outside of the source string
triggering the buffer overflow detection.
detected buffer overflow in strlen
------------[ cut here ]------------
kernel BUG at lib/string.c:1149!
invalid opcode: 0000 [#1] SMP PTI
CPU: 1 PID: 910 Comm: mount.ocfs2 Not tainted 5.14.0-1-amd64 #1
Debian 5.14.6-2
RIP: 0010:fortify_panic+0xf/0x11
...
Call Trace:
ocfs2_initialize_super.isra.0.cold+0xc/0x18 [ocfs2]
ocfs2_fill_super+0x359/0x19b0 [ocfs2]
mount_bdev+0x185/0x1b0
legacy_get_tree+0x27/0x40
vfs_get_tree+0x25/0xb0
path_mount+0x454/0xa20
__x64_sys_mount+0x103/0x140
do_syscall_64+0x3b/0xc0
entry_SYSCALL_64_after_hwframe+0x44/0xae
Link: https://lkml.kernel.org/r/20210929180654.32460-1-vvidic@valentin-vidic.from.hr
Signed-off-by: Valentin Vidic <vvidic@valentin-vidic.from.hr>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Gang He <ghe@suse.com>
Cc: Jun Piao <piaojun@huawei.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5314454ea3 upstream.
Commit 6dbf7bb555 ("fs: Don't invalidate page buffers in
block_write_full_page()") uncovered a latent bug in ocfs2 conversion
from inline inode format to a normal inode format.
The code in ocfs2_convert_inline_data_to_extents() attempts to zero out
the whole cluster allocated for file data by grabbing, zeroing, and
dirtying all pages covering this cluster. However these pages are
beyond i_size, thus writeback code generally ignores these dirty pages
and no blocks were ever actually zeroed on the disk.
This oversight was fixed by commit 693c241a5f ("ocfs2: No need to zero
pages past i_size.") for standard ocfs2 write path, inline conversion
path was apparently forgotten; the commit log also has a reasoning why
the zeroing actually is not needed.
After commit 6dbf7bb555, things became worse as writeback code stopped
invalidating buffers on pages beyond i_size and thus these pages end up
with clean PageDirty bit but with buffers attached to these pages being
still dirty. So when a file is converted from inline format, then
writeback triggers, and then the file is grown so that these pages
become valid, the invalid dirtiness state is preserved,
mark_buffer_dirty() does nothing on these pages (buffers are already
dirty) but page is never written back because it is clean. So data
written to these pages is lost once pages are reclaimed.
Simple reproducer for the problem is:
xfs_io -f -c "pwrite 0 2000" -c "pwrite 2000 2000" -c "fsync" \
-c "pwrite 4000 2000" ocfs2_file
After unmounting and mounting the fs again, you can observe that end of
'ocfs2_file' has lost its contents.
Fix the problem by not doing the pointless zeroing during conversion
from inline format similarly as in the standard write path.
[akpm@linux-foundation.org: fix whitespace, per Joseph]
Link: https://lkml.kernel.org/r/20210930095405.21433-1-jack@suse.cz
Fixes: 6dbf7bb555 ("fs: Don't invalidate page buffers in block_write_full_page()")
Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Tested-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Acked-by: Gang He <ghe@suse.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Jun Piao <piaojun@huawei.com>
Cc: "Markov, Andrey" <Markov.Andrey@Dell.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ed65df63a3 upstream.
While writing an email explaining the "bit = 0" logic for a discussion on
making ftrace_test_recursion_trylock() disable preemption, I discovered a
path that makes the "not do the logic if bit is zero" unsafe.
The recursion logic is done in hot paths like the function tracer. Thus,
any code executed causes noticeable overhead. Thus, tricks are done to try
to limit the amount of code executed. This included the recursion testing
logic.
Having recursion testing is important, as there are many paths that can
end up in an infinite recursion cycle when tracing every function in the
kernel. Thus protection is needed to prevent that from happening.
Because it is OK to recurse due to different running context levels (e.g.
an interrupt preempts a trace, and then a trace occurs in the interrupt
handler), a set of bits are used to know which context one is in (normal,
softirq, irq and NMI). If a recursion occurs in the same level, it is
prevented*.
Then there are infrastructure levels of recursion as well. When more than
one callback is attached to the same function to trace, it calls a loop
function to iterate over all the callbacks. Both the callbacks and the
loop function have recursion protection. The callbacks use the
"ftrace_test_recursion_trylock()" which has a "function" set of context
bits to test, and the loop function calls the internal
trace_test_and_set_recursion() directly, with an "internal" set of bits.
If an architecture does not implement all the features supported by ftrace
then the callbacks are never called directly, and the loop function is
called instead, which will implement the features of ftrace.
Since both the loop function and the callbacks do recursion protection, it
was seemed unnecessary to do it in both locations. Thus, a trick was made
to have the internal set of recursion bits at a more significant bit
location than the function bits. Then, if any of the higher bits were set,
the logic of the function bits could be skipped, as any new recursion
would first have to go through the loop function.
This is true for architectures that do not support all the ftrace
features, because all functions being traced must first go through the
loop function before going to the callbacks. But this is not true for
architectures that support all the ftrace features. That's because the
loop function could be called due to two callbacks attached to the same
function, but then a recursion function inside the callback could be
called that does not share any other callback, and it will be called
directly.
i.e.
traced_function_1: [ more than one callback tracing it ]
call loop_func
loop_func:
trace_recursion set internal bit
call callback
callback:
trace_recursion [ skipped because internal bit is set, return 0 ]
call traced_function_2
traced_function_2: [ only traced by above callback ]
call callback
callback:
trace_recursion [ skipped because internal bit is set, return 0 ]
call traced_function_2
[ wash, rinse, repeat, BOOM! out of shampoo! ]
Thus, the "bit == 0 skip" trick is not safe, unless the loop function is
call for all functions.
Since we want to encourage architectures to implement all ftrace features,
having them slow down due to this extra logic may encourage the
maintainers to update to the latest ftrace features. And because this
logic is only safe for them, remove it completely.
[*] There is on layer of recursion that is allowed, and that is to allow
for the transition between interrupt context (normal -> softirq ->
irq -> NMI), because a trace may occur before the context update is
visible to the trace recursion logic.
Link: https://lore.kernel.org/all/609b565a-ed6e-a1da-f025-166691b5d994@linux.alibaba.com/
Link: https://lkml.kernel.org/r/20211018154412.09fcad3c@gandalf.local.home
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "James E.J. Bottomley" <James.Bottomley@hansenpartnership.com>
Cc: Helge Deller <deller@gmx.de>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Jiri Kosina <jikos@kernel.org>
Cc: Miroslav Benes <mbenes@suse.cz>
Cc: Joe Lawrence <joe.lawrence@redhat.com>
Cc: Colin Ian King <colin.king@canonical.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Jisheng Zhang <jszhang@kernel.org>
Cc: =?utf-8?b?546L6LSH?= <yun.wang@linux.alibaba.com>
Cc: Guo Ren <guoren@kernel.org>
Cc: stable@vger.kernel.org
Fixes: edc15cafcb ("tracing: Avoid unnecessary multiple recursion checks")
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1bd85aa65d upstream.
Currently, we check the wb_err too early for directories, before all of
the unsafe child requests have been waited on. In order to fix that we
need to check the mapping->wb_err later nearer to the end of ceph_fsync.
We also have an overly-complex method for tracking errors after
blocklisting. The errors recorded in cleanup_session_requests go to a
completely separate field in the inode, but we end up reporting them the
same way we would for any other error (in fsync).
There's no real benefit to tracking these errors in two different
places, since the only reporting mechanism for them is in fsync, and
we'd need to advance them both every time.
Given that, we can just remove i_meta_err, and convert the places that
used it to instead just use mapping->wb_err instead. That also fixes
the original problem by ensuring that we do a check_and_advance of the
wb_err at the end of the fsync op.
Cc: stable@vger.kernel.org
URL: https://tracker.ceph.com/issues/52864
Reported-by: Patrick Donnelly <pdonnell@redhat.com>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Xiubo Li <xiubli@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 98d0a6fb73 upstream.
Currently when mounting, we may end up finding an existing superblock
that corresponds to a blocklisted MDS client. This means that the new
mount ends up being unusable.
If we've found an existing superblock with a client that is already
blocklisted, and the client is not configured to recover on its own,
fail the match. Ditto if the superblock has been forcibly unmounted.
While we're in here, also rename "other" to the more conventional "fsc".
Cc: stable@vger.kernel.org
URL: https://bugzilla.redhat.com/show_bug.cgi?id=1901499
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Xiubo Li <xiubli@redhat.com>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a4fbe70c5c upstream.
The receiver should abort TP if 'total message size' in TP.CM_RTS and
TP.CM_BAM is less than 9 or greater than 1785 [1], but currently the
j1939 stack only checks the upper bound and the receiver will accept
the following broadcast message:
vcan1 18ECFF00 [8] 20 08 00 02 FF 00 23 01
vcan1 18EBFF00 [8] 01 00 00 00 00 00 00 00
vcan1 18EBFF00 [8] 02 00 FF FF FF FF FF FF
This patch adds check for the lower bound and abort illegal TP.
[1] SAE-J1939-82 A.3.4 Row 2 and A.3.6 Row 6.
Fixes: 9d71dd0c70 ("can: add support of SAE J1939 protocol")
Link: https://lore.kernel.org/all/1634203601-3460-1-git-send-email-zhangchangzhong@huawei.com
Cc: stable@vger.kernel.org
Signed-off-by: Zhang Changzhong <zhangchangzhong@huawei.com>
Acked-by: Oleksij Rempel <o.rempel@pengutronix.de>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 43a08c3bda upstream.
When isotp_sendmsg() concurrent, tx.state of all TX processes can be
ISOTP_IDLE. The conditions so->tx.state != ISOTP_IDLE and
wq_has_sleeper(&so->wait) can not protect TX buffer from being
accessed by multiple TX processes.
We can use cmpxchg() to try to modify tx.state to ISOTP_SENDING firstly.
If the modification of the previous process succeed, the later process
must wait tx.state to ISOTP_IDLE firstly. Thus, we can ensure TX buffer
is accessed by only one process at the same time. And we should also
restore the original tx.state at the subsequent error processes.
Fixes: e057dd3fc2 ("can: add ISO 15765-2:2016 transport protocol")
Link: https://lore.kernel.org/all/c2517874fbdf4188585cf9ddf67a8fa74d5dbde5.1633764159.git.william.xuanziyang@huawei.com
Cc: stable@vger.kernel.org
Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com>
Acked-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9acf636215 upstream.
Using wait_event_interruptible() to wait for complete transmission,
but do not check the result of wait_event_interruptible() which can be
interrupted. It will result in TX buffer has multiple accessors and
the later process interferes with the previous process.
Following is one of the problems reported by syzbot.
=============================================================
WARNING: CPU: 0 PID: 0 at net/can/isotp.c:840 isotp_tx_timer_handler+0x2e0/0x4c0
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.13.0-rc7+ #68
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1 04/01/2014
RIP: 0010:isotp_tx_timer_handler+0x2e0/0x4c0
Call Trace:
<IRQ>
? isotp_setsockopt+0x390/0x390
__hrtimer_run_queues+0xb8/0x610
hrtimer_run_softirq+0x91/0xd0
? rcu_read_lock_sched_held+0x4d/0x80
__do_softirq+0xe8/0x553
irq_exit_rcu+0xf8/0x100
sysvec_apic_timer_interrupt+0x9e/0xc0
</IRQ>
asm_sysvec_apic_timer_interrupt+0x12/0x20
Add result check for wait_event_interruptible() in isotp_sendmsg()
to avoid multiple accessers for tx buffer.
Fixes: e057dd3fc2 ("can: add ISO 15765-2:2016 transport protocol")
Link: https://lore.kernel.org/all/10ca695732c9dd267c76a3c30f37aefe1ff7e32f.1633764159.git.william.xuanziyang@huawei.com
Cc: stable@vger.kernel.org
Reported-by: syzbot+78bab6958a614b0c80b9@syzkaller.appspotmail.com
Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com>
Acked-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit e378f4967c ]
The enetc driver does not implement .ndo_change_mtu, instead it
configures the MAC register field PTC{Traffic Class}MSDUR[MAXSDU]
statically to a large value during probe time.
The driver used to configure only the max SDU for traffic class 0, and
that was fine while the driver could only use traffic class 0. But with
the introduction of mqprio, sending a large frame into any other TC than
0 is broken.
This patch fixes that by replicating per traffic class the static
configuration done in enetc_configure_port_mac().
Fixes: cbe9e83594 ("enetc: Enable TC offloading with mqprio")
Reported-by: Richie Pearn <richard.pearn@nxp.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: <Claudiu Manoil <claudiu.manoil@nxp.com>
Link: https://lore.kernel.org/r/20211020173340.1089992-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 1d00032394 ]
Current Work Queue Entry (WQE) checksum (csum) flags in the ethernet
segment (eseg) in case of IPsec crypto offload datapath are not aligned
with PRM/HW expectations.
Currently the driver always sets the l3_inner_csum flag in case of IPsec
because of the wrong usage of skb->encapsulation as indicator for inner
IPsec header since skb->encapsulation is always ON for IPsec packets
since IPsec itself is an encapsulation protocol. The above forced a
failing attempts of calculating csum of non-existing segments (like in
the IP|ESP|TCP packet case which does not have an l3_inner) which led
to lots of packet drops hence the low throughput.
Fix by using xo->inner_ipproto as indicator for inner IPsec header
instead of skb->encapsulation in addition to setting the csum flags
as following:
* Tunnel Mode:
* Pkt: MAC IP ESP IP L4
* CSUM: l3_cs | l3_inner_cs | l4_inner_cs
*
* Transport Mode:
* Pkt: MAC IP ESP L4
* CSUM: l3_cs [ | l4_cs (checksum partial case)]
*
* Tunnel(VXLAN TCP/UDP) over Transport Mode
* Pkt: MAC IP ESP UDP VXLAN IP L4
* CSUM: l3_cs | l3_inner_cs | l4_inner_cs
Fixes: f1267798c9 ("net/mlx5: Fix checksum issue of VXLAN and IPsec crypto offload")
Signed-off-by: Emeel Hakim <ehakim@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit d10457f85d ]
IPsec crypto offload current Software Parser (SWP) fields settings in
the ethernet segment (eseg) are not aligned with PRM/HW expectations.
Among others in case of IP|ESP|TCP packet, current driver sets the
offsets for inner_l3 and inner_l4 although there is no inner l3/l4
headers relative to ESP header in such packets.
SWP provides the offsets for HW ,so it can be used to find csum fields
to offload the checksum, however these are not necessarily used by HW
and are used as fallback in case HW fails to parse the packet, e.g
when performing IPSec Transport Aware (IP | ESP | TCP) there is no
need to add SW parse on inner packet. So in some cases packets csum
was calculated correctly , whereas in other cases it failed. The later
faced csum errors (caused by wrong packet length calculations) which
led to lots of packet drops hence the low throughput.
Fix by setting the SWP fields as expected in a IP|ESP|TCP packet.
the following describe the expected SWP offsets:
* Tunnel Mode:
* SWP: OutL3 InL3 InL4
* Pkt: MAC IP ESP IP L4
*
* Transport Mode:
* SWP: OutL3 OutL4
* Pkt: MAC IP ESP L4
*
* Tunnel(VXLAN TCP/UDP) over Transport Mode
* SWP: OutL3 InL3 InL4
* Pkt: MAC IP ESP UDP VXLAN IP L4
Fixes: f1267798c9 ("net/mlx5: Fix checksum issue of VXLAN and IPsec crypto offload")
Signed-off-by: Emeel Hakim <ehakim@nvidia.com>
Reviewed-by: Raed Salem <raeds@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 4225fea1cb ]
I got memory leak as follows when doing fault injection test:
unreferenced object 0xffff88800906c618 (size 8):
comm "i2c-idt82p33931", pid 4421, jiffies 4294948083 (age 13.188s)
hex dump (first 8 bytes):
70 74 70 30 00 00 00 00 ptp0....
backtrace:
[<00000000312ed458>] __kmalloc_track_caller+0x19f/0x3a0
[<0000000079f6e2ff>] kvasprintf+0xb5/0x150
[<0000000026aae54f>] kvasprintf_const+0x60/0x190
[<00000000f323a5f7>] kobject_set_name_vargs+0x56/0x150
[<000000004e35abdd>] dev_set_name+0xc0/0x100
[<00000000f20cfe25>] ptp_clock_register+0x9f4/0xd30 [ptp]
[<000000008bb9f0de>] idt82p33_probe.cold+0x8b6/0x1561 [ptp_idt82p33]
When posix_clock_register() returns an error, the name allocated
in dev_set_name() will be leaked, the put_device() should be used
to give up the device reference, then the name will be freed in
kobject_cleanup() and other memory will be freed in ptp_clock_release().
Reported-by: Hulk Robot <hulkci@huawei.com>
Fixes: a33121e548 ("ptp: fix the race between the release of ptp_clock and cdev")
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 3cb958027c ]
When utilizing End to End delay mechanism, the following error messages show up:
|root@ehl1:~# ptp4l --tx_timestamp_timeout=50 -H -i eno2 -E -m
|ptp4l[950.573]: selected /dev/ptp3 as PTP clock
|ptp4l[950.586]: port 1: INITIALIZING to LISTENING on INIT_COMPLETE
|ptp4l[950.586]: port 0: INITIALIZING to LISTENING on INIT_COMPLETE
|ptp4l[952.879]: port 1: new foreign master 001395.fffe.4897b4-1
|ptp4l[956.879]: selected best master clock 001395.fffe.4897b4
|ptp4l[956.879]: port 1: assuming the grand master role
|ptp4l[956.879]: port 1: LISTENING to GRAND_MASTER on RS_GRAND_MASTER
|ptp4l[962.017]: port 1: received DELAY_REQ without timestamp
|ptp4l[962.273]: port 1: received DELAY_REQ without timestamp
|ptp4l[963.090]: port 1: received DELAY_REQ without timestamp
Commit f2fb6b6275 ("net: stmmac: enable timestamp snapshot for required PTP
packets in dwmac v5.10a") already addresses this problem for the dwmac
v5.10. However, same holds true for all dwmacs above version v4.10. Correct the
check accordingly. Afterwards everything works as expected.
Tested on Intel Atom(R) x6414RE Processor.
Fixes: 14f347334b ("net: stmmac: Correctly take timestamp for PTPv2")
Fixes: f2fb6b6275 ("net: stmmac: enable timestamp snapshot for required PTP packets in dwmac v5.10a")
Suggested-by: Ong Boon Leong <boon.leong.ong@intel.com>
Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 0dd8a25f35 ]
HNS3 driver includes hns3.ko, hnae3.ko and hclge.ko.
hns3.ko includes network stack and pci_driver, hclge.ko includes
HW device action, algo_ops and timer task, hnae3.ko includes some
register function.
When SRIOV is enable and hclge.ko is removed, HW device is unloaded
but VF still exists, PF will not reply VF mbx messages, and cause
errors.
This patch fix it by disable SRIOV before remove hclge.ko.
Fixes: e2cb1dec97 ("net: hns3: Add HNS3 VF HCL(Hardware Compatibility Layer) Support")
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 1385cc81ba ]
The task of VF reset is performed through the workqueue. It checks the
value of hdev->reset_pending to determine whether to exit the loop.
However, the value of hdev->reset_pending may also be assigned by
the interrupt function hclgevf_misc_irq_handle(), which may cause the
loop fail to exit and keep occupying the workqueue. This loop is not
necessary, so remove it and the workqueue will be rescheduled if the
reset needs to be retried or a new reset occurs.
Fixes: 1cc9bc6e58 ("net: hns3: split hclgevf_reset() into preparing and rebuilding part")
Signed-off-by: Yufeng Mo <moyufeng@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 68752b24f5 ]
Currently when there is a rx page allocation failure, it is
possible that polling may be stopped if there is no more packet
to be reveiced, which may cause queue stall problem under memory
pressure.
This patch makes sure polling is scheduled again when there is
any rx page allocation failure, and polling will try to allocate
receive buffers until it succeeds.
Now the allocation retry is added, it is unnecessary to do the rx
page allocation at the end of rx cleaning, so remove it. And reset
the unused_count to zero after calling hns3_nic_alloc_rx_buffers()
to avoid calling hns3_nic_alloc_rx_buffers() repeatedly under
memory pressure.
Fixes: 76ad4f0ee7 ("net: hns3: Add support of HNS3 Ethernet Driver for hip08 SoC")
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 731797fdff ]
If ets dwrr bandwidth of tc is set to 0, the hardware will switch to SP
mode. In this case, this tc may occupy all the tx bandwidth if it has
huge traffic, so it violates the purpose of the user setting.
To fix this problem, limit the ets dwrr bandwidth must greater than 0.
Fixes: cacde272dd ("net: hns3: Add hclge_dcb module for the support of DCB feature")
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b63fcaab95 ]
Currently, DWRR of tc will be initialized to a fixed value when this tc
is enabled, but it is not been reset to 0 when this tc is disabled. It
cause a problem that the DWRR of unused tc is not 0 after using tc tool
to add and delete multi-tc parameters.
For examples, after enabling 4 TCs and restoring to 1 TC by follow
tc commands:
$ tc qdisc add dev eth0 root mqprio num_tc 4 map 0 1 2 3 0 1 2 3 queues \
8@0 8@8 8@16 8@24 hw 1 mode channel
$ tc qdisc del dev eth0 root
Now there is just one TC is enabled for eth0, but the tc info querying by
debugfs is shown as follow:
$ cat /mnt/hns3/0000:7d:00.0/tm/tc_sch_info
enabled tc number: 1
weight_offset: 14
TC MODE WEIGHT
0 dwrr 100
1 dwrr 100
2 dwrr 100
3 dwrr 100
4 dwrr 0
5 dwrr 0
6 dwrr 0
7 dwrr 0
This patch fixes it by resetting DWRR of tc to 0 when tc is disabled.
Fixes: 848440544b ("net: hns3: Add support of TX Scheduler & Shaper to HNS3 driver")
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 60484103d5 ]
Add configuration of interrupt type and fifo interrupt enable of TM QCN
error event if enabled, otherwise this event will not be reported when
there is error.
Fixes: d914971df0 ("net: hns3: remove redundant query in hclge_config_tm_hw_err_int()")
Signed-off-by: Jiaran Zhang <zhangjiaran@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 787252a10d ]
With PREEMPT_COUNT=y, when a CPU is offlined and then onlined again, we
get:
BUG: scheduling while atomic: swapper/1/0/0x00000000
no locks held by swapper/1/0.
CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.15.0-rc2+ #100
Call Trace:
dump_stack_lvl+0xac/0x108
__schedule_bug+0xac/0xe0
__schedule+0xcf8/0x10d0
schedule_idle+0x3c/0x70
do_idle+0x2d8/0x4a0
cpu_startup_entry+0x38/0x40
start_secondary+0x2ec/0x3a0
start_secondary_prolog+0x10/0x14
This is because powerpc's arch_cpu_idle_dead() decrements the idle task's
preempt count, for reasons explained in commit a7c2bb8279 ("powerpc:
Re-enable preemption before cpu_die()"), specifically "start_secondary()
expects a preempt_count() of 0."
However, since commit 2c669ef697 ("powerpc/preempt: Don't touch the idle
task's preempt_count during hotplug") and commit f1a0a376ca ("sched/core:
Initialize the idle task with preemption disabled"), that justification no
longer holds.
The idle task isn't supposed to re-enable preemption, so remove the
vestigial preempt_enable() from the CPU offline path.
Tested with pseries and powernv in qemu, and pseries on PowerVM.
Fixes: 2c669ef697 ("powerpc/preempt: Don't touch the idle task's preempt_count during hotplug")
Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Reviewed-by: Valentin Schneider <valentin.schneider@arm.com>
Reviewed-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20211015173902.2278118-1-nathanl@linux.ibm.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 4cce60f15c ]
Both arch/nios2/ and drivers/mmc/host/tmio_mmc.c define a macro
with the name "CTL_STATUS". Change the one in arch/nios2/ to be
"CTL_FSTATUS" (flags status) to eliminate the build warning.
In file included from ../drivers/mmc/host/tmio_mmc.c:22:
drivers/mmc/host/tmio_mmc.h:31: warning: "CTL_STATUS" redefined
31 | #define CTL_STATUS 0x1c
arch/nios2/include/asm/registers.h:14: note: this is the location of the previous definition
14 | #define CTL_STATUS 0
Fixes: b31ebd8055 ("nios2: Nios2 registers")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Dinh Nguyen <dinguyen@kernel.org>
Signed-off-by: Dinh Nguyen <dinguyen@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 2dc4e9e88c ]
First fragmented packets (frag offset = 0) byte len is zeroed
when stolen by ip_defrag(). And since act_ct update the stats
only afterwards (at end of execute), bytes aren't correctly
accounted for such packets.
To fix this, move stats update to start of action execute.
Fixes: b57dc7c13e ("net/sched: Introduce action ct")
Signed-off-by: Paul Blakey <paulb@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 66d262804a ]
I compared the register definitions with the D-Link DWR-966
GPL sources and found that the PUAFD field definition was
incorrect. This definition is unused and causes no issues.
Fixes: 14fceff477 ("net: dsa: Add Lantiq / Intel DSA driver for vrx200")
Signed-off-by: Aleksander Jan Bajkowski <olek2@wp.pl>
Acked-by: Hauke Mehrtens <hauke@hauke-m.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 0a9bb11a5e ]
On i386, the baycom_epp driver wants to inspect X86 CPU features (TSC)
and then act on that data, but that info is not available when running
on UML, so prevent that test and do the default action.
Prevents this build error on UML + i386:
../drivers/net/hamradio/baycom_epp.c: In function ‘epp_bh’:
../drivers/net/hamradio/baycom_epp.c:630:6: error: implicit declaration of function ‘boot_cpu_has’; did you mean ‘get_cpu_mask’? [-Werror=implicit-function-declaration]
if (boot_cpu_has(X86_FEATURE_TSC)) \
^
../drivers/net/hamradio/baycom_epp.c:658:2: note: in expansion of macro ‘GETTICK’
GETTICK(time1);
Fixes: 68f5d3f3b6 ("um: add PCI over virtio emulation driver")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: linux-um@lists.infradead.org
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Thomas Sailer <t.sailer@alumni.ethz.ch>
Cc: linux-hams@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 86f1e3a848 ]
With net.ipv4.tcp_l3mdev_accept=1 it is possible for a listen socket to
accept connection from the same client address in different VRFs. It is
also possible to set different MD5 keys for these clients which differ
only in the tcpm_l3index field.
This appears to work when distinguishing between different VRFs but not
between non-VRF and VRF connections. In particular:
* tcp_md5_do_lookup_exact will match a non-vrf key against a vrf key.
This means that adding a key with l3index != 0 after a key with l3index
== 0 will cause the earlier key to be deleted. Both keys can be present
if the non-vrf key is added later.
* _tcp_md5_do_lookup can match a non-vrf key before a vrf key. This
casues failures if the passwords differ.
Fix this by making tcp_md5_do_lookup_exact perform an actual exact
comparison on l3index and by making __tcp_md5_do_lookup perfer
vrf-bound keys above other considerations like prefixlen.
Fixes: dea53bb80e ("tcp: Add l3index to tcp_md5sig_key and md5 functions")
Signed-off-by: Leonard Crestez <cdleonard@gmail.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 46393d61a3 ]
Fix the following build/link error by adding a dependency on the CRC32
routines:
ld: drivers/net/usb/lan78xx.o: in function `lan78xx_set_multicast':
lan78xx.c:(.text+0x48cf): undefined reference to `crc32_le'
The actual use of crc32_le() comes indirectly through ether_crc().
Fixes: 55d7de9de6 ("Microchip's LAN7800 family USB 2/3 to 10/100/1000 Ethernet device driver")
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 075718fdaf ]
transport encap_port update should be updated when sctp_vtag_verify()
succeeds, namely, returns 1, not returns 0. Correct it in this patch.
While at it, also fix the indentation.
Fixes: a1dd2cf2f1 ("sctp: allow changing transport encap_port by peer packets")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 174c376278 ]
Because the data pointer of net/ipv4/vs/debug_level is not updated per
netns, it must be marked as read-only in non-init netns.
Fixes: c6d2d445d8 ("IPVS: netns, final patch enabling network name space.")
Signed-off-by: Antoine Tenart <atenart@kernel.org>
Acked-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit a482c5e00a ]
In rt_mt6(), when it's a nonlinear skb, the 1st skb_header_pointer()
only copies sizeof(struct ipv6_rt_hdr) to _route that rh points to.
The access by ((const struct rt0_hdr *)rh)->reserved will overflow
the buffer. So this access should be moved below the 2nd call to
skb_header_pointer().
Besides, after the 2nd skb_header_pointer(), its return value should
also be checked, othersize, *rp may cause null-pointer-ref.
v1->v2:
- clean up some old debugging log.
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b726ddf984 ]
Currently when a user uses "devlink dev info", the fw.mgmt.api will be
the major.minor numbers as shown below:
devlink dev info pci/0000:3b:00.0
pci/0000:3b:00.0:
driver ice
serial_number 00-01-00-ff-ff-00-00-00
versions:
fixed:
board.id K91258-000
running:
fw.mgmt 6.1.2
fw.mgmt.api 1.7 <--- No patch number included
fw.mgmt.build 0xd75e7d06
fw.mgmt.srev 5
fw.undi 1.2992.0
fw.undi.srev 5
fw.psid.api 3.10
fw.bundle_id 0x800085cc
fw.app.name ICE OS Default Package
fw.app 1.3.27.0
fw.app.bundle_id 0xc0000001
fw.netlist 3.10.2000-3.1e.0
fw.netlist.build 0x2a76e110
stored:
fw.mgmt.srev 5
fw.undi 1.2992.0
fw.undi.srev 5
fw.psid.api 3.10
fw.bundle_id 0x800085cc
fw.netlist 3.10.2000-3.1e.0
fw.netlist.build 0x2a76e110
There are many features in the driver that depend on the major, minor,
and patch version of the FW. Without the patch number in the output for
fw.mgmt.api debugging issues related to the FW API version is difficult.
Also, using major.minor.patch aligns with the existing firmware version
which uses a 3 digit value.
Fix this by making the fw.mgmt.api print the major.minor.patch
versions. Shown below is the result:
devlink dev info pci/0000:3b:00.0
pci/0000:3b:00.0:
driver ice
serial_number 00-01-00-ff-ff-00-00-00
versions:
fixed:
board.id K91258-000
running:
fw.mgmt 6.1.2
fw.mgmt.api 1.7.9 <--- patch number included
fw.mgmt.build 0xd75e7d06
fw.mgmt.srev 5
fw.undi 1.2992.0
fw.undi.srev 5
fw.psid.api 3.10
fw.bundle_id 0x800085cc
fw.app.name ICE OS Default Package
fw.app 1.3.27.0
fw.app.bundle_id 0xc0000001
fw.netlist 3.10.2000-3.1e.0
fw.netlist.build 0x2a76e110
stored:
fw.mgmt.srev 5
fw.undi 1.2992.0
fw.undi.srev 5
fw.psid.api 3.10
fw.bundle_id 0x800085cc
fw.netlist 3.10.2000-3.1e.0
fw.netlist.build 0x2a76e110
Fixes: ff2e5c700e ("ice: add basic handler for devlink .info_get")
Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Tested-by: Gurucharan G <gurucharanx.g@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit e4c2efa139 ]
Correct parameters order in call to ice_tunnel_idx_to_entry function.
Entry in sparse port table is correct when the idx is 0. For idx 1 one
correct entry should be skipped, for idx 2 two of them should be skipped
etc. Change if condition to be true when idx is 0, which means that
previous valid entry of this tunnel type were skipped.
Fixes: b20e6c17c4 ("ice: convert to new udp_tunnel infrastructure")
Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Tested-by: Gurucharan G <gurucharanx.g@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 73e30a62b1 ]
In the remove path, there is an attempt to free the aux_idx IDA whether
it was allocated or not. This can potentially cause a crash when
unloading the driver on systems that do not initialize support for RDMA.
But, this free cannot be gated by the status bit for RDMA, since it is
allocated if the driver detects support for RDMA at probe time, but the
driver can enter into a state where RDMA is not supported after the IDA
has been allocated at probe time and this would lead to a memory leak.
Initialize aux_idx to an invalid value and check for a valid value when
unloading to determine if an IDA free is necessary.
Fixes: d25a0fc41c ("ice: Initialize RDMA support")
Reported-by: Jun Miao <jun.miao@windriver.com>
Signed-off-by: Dave Ertman <david.m.ertman@intel.com>
Tested-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit ff7e932194 ]
Currently if the VSI is rebuilt/removed and the RDMA PF driver is active
the RDMA Tx queue scheduler node configuration will not be cleaned up.
This will cause the rebuild/re-add of the VSI to fail due to the
software structures not being correctly cleaned up for the VSI index.
Fix this by always calling ice_rm_vsi_rdma_cfg() for all VSI. If there
are no RDMA scheduler nodes created, then there is no harm in calling
ice_rm_vsi_rdma_cfg(). This change applies to all VSI types, so if
RDMA support is added for other VSI types they will also get this
change.
Fixes: 348048e724 ("ice: Implement iidc operations")
Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Tested-by: Jerzy Wiktor Jurkowski <jerzy.wiktor.jurkowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 6b9b546dc0 ]
There is a noise issue for 8kHz sample rate on slave mode.
Compared with master mode, the difference is the DACDIV
setting, after correcting the DACDIV, the noise is gone.
There is no noise issue for 48kHz sample rate, because
the default value of DACDIV is correct for 48kHz.
So wm8960_configure_clocking() should be functional for
ADC and DAC function even if it is slave mode.
In order to be compatible for old use case, just add
condition for checking that sysclk is zero with
slave mode.
Fixes: 0e50b51aa2 ("ASoC: wm8960: Let wm8960 driver configure its bit clock and frame clock")
Signed-off-by: Shengjiu Wang <shengjiu.wang@nxp.com>
Acked-by: Charles Keepax <ckeepax@opensource.cirrus.com>
Link: https://lore.kernel.org/r/1634102224-3922-1-git-send-email-shengjiu.wang@nxp.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 293d92cbbd ]
The following warning occurred sporadically on s390:
DMA-API: nvme 0006:00:00.0: device driver maps memory from kernel text or rodata [addr=0000000048cc5e2f] [len=131072]
WARNING: CPU: 4 PID: 825 at kernel/dma/debug.c:1083 check_for_illegal_area+0xa8/0x138
It is a false-positive warning, due to broken logic in debug_dma_map_sg().
check_for_illegal_area() checks for overlay of sg elements with kernel text
or rodata. It is called with sg_dma_len(s) instead of s->length as
parameter. After the call to ->map_sg(), sg_dma_len() will contain the
length of possibly combined sg elements in the DMA address space, and not
the individual sg element length, which would be s->length.
The check will then use the physical start address of an sg element, and
add the DMA length for the overlap check, which could result in the false
warning, because the DMA length can be larger than the actual single sg
element length.
In addition, the call to check_for_illegal_area() happens in the iteration
over mapped_ents, which will not include all individual sg elements if
any of them were combined in ->map_sg().
Fix this by using s->length instead of sg_dma_len(s). Also put the call to
check_for_illegal_area() in a separate loop, iterating over all the
individual sg elements ("nents" instead of "mapped_ents").
While at it, as suggested by Robin Murphy, also move check_for_stack()
inside the new loop, as it is similarly concerned with validating the
individual sg elements.
Link: https://lore.kernel.org/lkml/20210705185252.4074653-1-gerald.schaefer@linux.ibm.com
Fixes: 884d05970b ("dma-debug: use sg_dma_len accessor")
Signed-off-by: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 68a3765c65 ]
syzbot reported following (harmless) WARN:
WARNING: CPU: 1 PID: 2648 at net/netfilter/core.c:468
nft_netdev_unregister_hooks net/netfilter/nf_tables_api.c:230 [inline]
nf_tables_unregister_hook include/net/netfilter/nf_tables.h:1090 [inline]
__nft_release_basechain+0x138/0x640 net/netfilter/nf_tables_api.c:9524
nft_netdev_event net/netfilter/nft_chain_filter.c:351 [inline]
nf_tables_netdev_event+0x521/0x8a0 net/netfilter/nft_chain_filter.c:382
reproducer:
unshare -n bash -c 'ip link add br0 type bridge; nft add table netdev t ; \
nft add chain netdev t ingress \{ type filter hook ingress device "br0" \
priority 0\; policy drop\; \}'
Problem is that when netns device exit hooks create the UNREGISTER
event, the .pre_exit hook for nf_tables core has already removed the
base hook. Notifier attempts to do this again.
The need to do base hook unregister unconditionally was needed in the past,
because notifier was last stage where reg->dev dereference was safe.
Now that nf_tables does the hook removal in .pre_exit, this isn't
needed anymore.
Reported-and-tested-by: syzbot+154bd5be532a63aa778b@syzkaller.appspotmail.com
Fixes: 767d1216bf ("netfilter: nftables: fix possible UAF over chains from packet path in netns")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 6e6a8ef088 ]
VM_SHARED mappings are currently forbidden in a memslot with MTE to
prevent two VMs racing to sanitise the same page. However, this check
is performed while holding current->mm's mmap_lock, but fails to release
it. Fix this by releasing the lock when needed.
Fixes: ea7fc1bb1c ("KVM: arm64: Introduce MTE VM feature")
Signed-off-by: Quentin Perret <qperret@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20211005122031.809857-1-qperret@google.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 1d58a17ef5 ]
The KVM page-table library refcounts the pages of concatenated stage-2
PGDs individually. However, when running KVM in protected mode, the
host's stage-2 PGD is currently managed by EL2 as a single high-order
compound page, which can cause the refcount of the tail pages to reach 0
when they shouldn't, hence corrupting the page-table.
Fix this by introducing a new hyp_split_page() helper in the EL2 page
allocator (matching the kernel's split_page() function), and make use of
it from host_s2_zalloc_pages_exact().
Fixes: 1025c8c0c6 ("KVM: arm64: Wrap the host with a stage 2")
Acked-by: Will Deacon <will@kernel.org>
Suggested-by: Will Deacon <will@kernel.org>
Signed-off-by: Quentin Perret <qperret@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20211005090155.734578-5-qperret@google.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit ceef3240f9 ]
Currently autoloading for SPI devices does not use the DT ID table, it uses
SPI modalises. Supporting OF modalises is going to be difficult if not
impractical, an attempt was made but has been reverted, so ensure that
module autoloading works for this driver by adding SPI IDs for parts that
only have a compatible listed.
Fixes: 96c8395e21 ("spi: Revert modalias changes")
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20210924194956.46079-1-broonie@kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 74b7ee0e7b ]
With pause and resume test for ARC, there is occasionally
channel swap issue. The reason is that currently driver set
the DPATH out of reset first, then start the DMA, the first
data got from FIFO may not be the Left channel.
Moving DPATH out of reset operation after the dma enablement
to fix this issue.
Fixes: 2856448686 ("ASoC: fsl_xcvr: Add XCVR ASoC CPU DAI driver")
Signed-off-by: Shengjiu Wang <shengjiu.wang@nxp.com>
Link: https://lore.kernel.org/r/1631265510-27384-1-git-send-email-shengjiu.wang@nxp.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit c20106944e ]
If nfsd has existing listening sockets without any processes, then an error
returned from svc_create_xprt() for an additional transport will remove
those existing listeners. We're seeing this in practice when userspace
attempts to create rpcrdma transports without having the rpcrdma modules
present before creating nfsd kernel processes. Fix this by checking for
existing sockets before calling nfsd_destroy().
Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 012e974501 ]
Rebooting xtensa images loaded with the '-kernel' option in qemu does
not work. When executing a reboot command, the qemu session either hangs
or experiences an endless sequence of error messages.
Kernel panic - not syncing: Unrecoverable error in exception handler
Reset code jumps to the CPU restart address, but Linux can not recover
from there because code and data in the kernel init sections have been
discarded and overwritten at this point.
XTFPGA platforms have a means to reset the CPU by writing 0xdead into a
specific FPGA IO address. When used in QEMU the kernel image loaded with
the '-kernel' option gets restored to its original state allowing the
machine to boot successfully.
Use that mechanism to attempt a platform reset. If it does not work,
fall back to the existing mechanism.
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f3d7c2cdf6 ]
Use platform data to initialize xtfpga device drivers when CONFIG_USE_OF
is not selected. This fixes xtfpga networking when CONFIG_USE_OF is not
selected but CONFIG_OF is.
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit baf33d7a75 ]
For the situation that the disconnect event comes very late when the
device is unplugged, the driver would resubmit the RX bulk transfer
after getting the callback with -EPROTO immediately and continually.
Finally, soft lockup occurs.
This patch avoids to resubmit RX immediately. It uses a workqueue to
schedule the RX NAPI. And the NAPI would resubmit the RX. It let the
disconnect event have opportunity to stop the submission before soft
lockup.
Reported-by: Jason-ch Chen <jason-ch.chen@mediatek.com>
Tested-by: Jason-ch Chen <jason-ch.chen@mediatek.com>
Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 9172b5c4a7 ]
Like xen_start_flags, xen_domain_type gets set before .bss gets cleared.
Hence this variable also needs to be prevented from getting put in .bss,
which is possible because XEN_NATIVE is an enumerator evaluating to
zero. Any use prior to init_hvm_pv_info() setting the variable again
would lead to wrong decisions; one such case is xenboot_console_setup()
when called as a result of "earlyprintk=xen".
Use __ro_after_init as more applicable than either __section(".data") or
__read_mostly.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Link: https://lore.kernel.org/r/d301677b-6f22-5ae6-bd36-458e1f323d0b@suse.com
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit 6f1fce595b upstream.
Fix lots of fallthrough warnings, e.g.:
arch/parisc/math-emu/fpudispatch.c:323:33: warning: this statement may fall through [-Wimplicit-fallthrough=]
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 55a51ea140 upstream.
If CONFIG_BLK_DEBUG_FS=n:
block/mq-deadline.c:274:12: warning: ‘dd_queued’ defined but not used [-Wunused-function]
274 | static u32 dd_queued(struct deadline_data *dd, enum dd_prio prio)
| ^~~~~~~~~
Fix this by moving dd_queued() just before the sole function that calls
it.
Fixes: 7b05bf7710 ("Revert "block/mq-deadline: Prioritize high-priority requests"")
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Fixes: 38ba64d12d ("block/mq-deadline: Track I/O statistics")
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Link: https://lore.kernel.org/r/20210830091128.1854266-1-geert@linux-m68k.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5c976a5657 upstream.
Bridging, and possibly other upper stack gizmos, adds the
lower device's netdev->dev_addr to its own uc list, and
then requests it be deleted when the upper bridge device is
removed. This delete request also happens with the bridging
vlan_filtering is enabled and then disabled.
Bonding has a similar behavior with the uc list, but since it
also uses set_mac to manage netdev->dev_addr, it doesn't have
the same the failure case.
Because we store our netdev->dev_addr in our uc list, we need
to ignore the delete request from dev_uc_sync so as to not
lose the address and all hope of communicating. Note that
ndo_set_mac_address is expressly changing netdev->dev_addr,
so no limitation is set there.
Fixes: 2a654540be ("ionic: Add Rx filter and rx_mode ndo support")
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8d5f7954b7 upstream.
The NXP LS1028A switch has two Ethernet ports towards the CPU, but only
one of them is capable of acting as an NPI port at a time (inject and
extract packets using DSA tags).
However, using the alternative ocelot-8021q tagging protocol, it should
be possible to use both CPU ports symmetrically, but for that we need to
mark both ports in the device tree as DSA masters.
In the process of doing that, it can be seen that traffic to/from the
network stack gets broken, and this is because the Felix driver iterates
through all DSA CPU ports and configures them as NPI ports. But since
there can only be a single NPI port, we effectively end up in a
situation where DSA thinks the default CPU port is the first one, but
the hardware port configured to be an NPI is the last one.
I would like to treat this as a bug, because if the updated device trees
are going to start circulating, it would be really good for existing
kernels to support them, too.
Fixes: adb3dccf09 ("net: dsa: felix: convert to the new .change_tag_protocol DSA API")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ebb4c6a990 upstream.
The sad reality is that when a PTP frame with a TX timestamping request
is transmitted, it isn't guaranteed that it will make it all the way to
the wire (due to congestion inside the switch), and that a timestamp
will be taken by the hardware and placed in the timestamp FIFO where an
IRQ will be raised for it.
The implication is that if enough PTP frames are silently dropped by the
hardware such that the timestamp ID has rolled over, it is possible to
match a timestamp to an old skb.
Furthermore, nobody will match on the real skb corresponding to this
timestamp, since we stupidly matched on a previous one that was stale in
the queue, and stopped there.
So PTP timestamping will be broken and there will be no way to recover.
It looks like the hardware parses the sequenceID from the PTP header,
and also provides that metadata for each timestamp. The driver currently
ignores this, but it shouldn't.
As an extra resiliency measure, do the following:
- check whether the PTP sequenceID also matches between the skb and the
timestamp, treat the skb as stale otherwise and free it
- if we see a stale skb, don't stop there and try to match an skb one
more time, chances are there's one more skb in the queue with the same
timestamp ID, otherwise we wouldn't have ever found the stale one (it
is by timestamp ID that we matched it).
While this does not prevent PTP packet drops, it at least prevents
the catastrophic consequences of incorrect timestamp matching.
Since we already call ptp_classify_raw in the TX path, save the result
in the skb->cb of the clone, and just use that result in the interrupt
code path.
Fixes: 4e3b0468e6 ("net: mscc: PTP Hardware Clock (PHC) support")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit fba01283d8 upstream.
It appears that Ocelot switches cannot timestamp non-PTP frames,
I tested this using the isochron program at:
https://github.com/vladimiroltean/tsn-scripts
with the result that the driver increments the ocelot_port->ts_id
counter as expected, puts it in the REW_OP, but the hardware seems to
not timestamp these packets at all, since no IRQ is emitted.
Therefore check whether we are sending PTP frames, and refuse to
populate REW_OP otherwise.
Fixes: 4e3b0468e6 ("net: mscc: PTP Hardware Clock (PHC) support")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9fde506e0c upstream.
When skb_match is NULL, it means we received a PTP IRQ for a timestamp
ID that the kernel has no idea about, since there is no skb in the
timestamping queue with that timestamp ID.
This is a grave error and not something to just "continue" over.
So print a big warning in case this happens.
Also, move the check above ocelot_get_hwtimestamp(), there is no point
in reading the full 64-bit current PTP time if we're not going to do
anything with it anyway for this skb.
Fixes: 4e3b0468e6 ("net: mscc: PTP Hardware Clock (PHC) support")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 52849bcf00 upstream.
PTP packets with 2-step TX timestamp requests are matched to packets
based on the egress port number and a 6-bit timestamp identifier.
All PTP timestamps are held in a common FIFO that is 128 entry deep.
This patch ensures that back-to-back timestamping requests cannot exceed
the hardware FIFO capacity. If that happens, simply send the packets
without requesting a TX timestamp to be taken (in the case of felix,
since the DSA API has a void return code in ds->ops->port_txtstamp) or
drop them (in the case of ocelot).
I've moved the ts_id_lock from a per-port basis to a per-switch basis,
because we need separate accounting for both numbers of PTP frames in
flight. And since we need locking to inc/dec the per-switch counter,
that also offers protection for the per-port counter and hence there is
no reason to have a per-port counter anymore.
Fixes: 4e3b0468e6 ("net: mscc: PTP Hardware Clock (PHC) support")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c57fe0037a upstream.
At present, there is a problem when user space bombards a port with PTP
event frames which have TX timestamping requests (or when a tc-taprio
offload is installed on a port, which delays the TX timestamps by a
significant amount of time). The driver will happily roll over the 2-bit
timestamp ID and this will cause incorrect matches between an skb and
the TX timestamp collected from the FIFO.
The Ocelot switches have a 6-bit PTP timestamp identifier, and the value
63 is reserved, so that leaves identifiers 0-62 to be used.
The timestamp identifiers are selected by the REW_OP packet field, and
are actually shared between CPU-injected frames and frames which match a
VCAP IS2 rule that modifies the REW_OP. The hardware supports
partitioning between the two uses of the REW_OP field through the
PTP_ID_LOW and PTP_ID_HIGH registers, and by default reserves the PTP
IDs 0-3 for CPU-injected traffic and the rest for VCAP IS2.
The driver does not use VCAP IS2 to set REW_OP for 2-step timestamping,
and it also writes 0xffffffff to both PTP_ID_HIGH and PTP_ID_LOW in
ocelot_init_timestamp() which makes all timestamp identifiers available
to CPU injection.
Therefore, we can make use of all 63 timestamp identifiers, which should
allow more timestampable packets to be in flight on each port. This is
only part of the solution, more issues will be addressed in future changes.
Fixes: 4e3b0468e6 ("net: mscc: PTP Hardware Clock (PHC) support")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4d4a223a86 upstream.
Commit 4dd0d5c33c ("ice: add lock around Tx timestamp tracker flush")
added a lock around the Tx timestamp tracker flow which is used to
cleanup any left over SKBs and prepare for device removal.
This lock is problematic because it is being held around a call to
ice_clear_phy_tstamp. The clear function takes a mutex to send a PHY
write command to firmware. This could lead to a deadlock if the mutex
actually sleeps, and causes the following warning on a kernel with
preemption debugging enabled:
[ 715.419426] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:573
[ 715.427900] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 3100, name: rmmod
[ 715.435652] INFO: lockdep is turned off.
[ 715.439591] Preemption disabled at:
[ 715.439594] [<0000000000000000>] 0x0
[ 715.446678] CPU: 52 PID: 3100 Comm: rmmod Tainted: G W OE 5.15.0-rc4+ #42 bdd7ec3018e725f159ca0d372ce8c2c0e784891c
[ 715.458058] Hardware name: Intel Corporation S2600STQ/S2600STQ, BIOS SE5C620.86B.02.01.0010.010620200716 01/06/2020
[ 715.468483] Call Trace:
[ 715.470940] dump_stack_lvl+0x6a/0x9a
[ 715.474613] ___might_sleep.cold+0x224/0x26a
[ 715.478895] __mutex_lock+0xb3/0x1440
[ 715.482569] ? stack_depot_save+0x378/0x500
[ 715.486763] ? ice_sq_send_cmd+0x78/0x14c0 [ice 9a7e1ec00971c89ecd3fe0d4dc7da2b3786a421d]
[ 715.494979] ? kfree+0xc1/0x520
[ 715.498128] ? mutex_lock_io_nested+0x12a0/0x12a0
[ 715.502837] ? kasan_set_free_info+0x20/0x30
[ 715.507110] ? __kasan_slab_free+0x10b/0x140
[ 715.511385] ? slab_free_freelist_hook+0xc7/0x220
[ 715.516092] ? kfree+0xc1/0x520
[ 715.519235] ? ice_deinit_lag+0x16c/0x220 [ice 9a7e1ec00971c89ecd3fe0d4dc7da2b3786a421d]
[ 715.527359] ? ice_remove+0x1cf/0x6a0 [ice 9a7e1ec00971c89ecd3fe0d4dc7da2b3786a421d]
[ 715.535133] ? pci_device_remove+0xab/0x1d0
[ 715.539318] ? __device_release_driver+0x35b/0x690
[ 715.544110] ? driver_detach+0x214/0x2f0
[ 715.548035] ? bus_remove_driver+0x11d/0x2f0
[ 715.552309] ? pci_unregister_driver+0x26/0x250
[ 715.556840] ? ice_module_exit+0xc/0x2f [ice 9a7e1ec00971c89ecd3fe0d4dc7da2b3786a421d]
[ 715.564799] ? __do_sys_delete_module.constprop.0+0x2d8/0x4e0
[ 715.570554] ? do_syscall_64+0x3b/0x90
[ 715.574303] ? entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 715.579529] ? start_flush_work+0x542/0x8f0
[ 715.583719] ? ice_sq_send_cmd+0x78/0x14c0 [ice 9a7e1ec00971c89ecd3fe0d4dc7da2b3786a421d]
[ 715.591923] ice_sq_send_cmd+0x78/0x14c0 [ice 9a7e1ec00971c89ecd3fe0d4dc7da2b3786a421d]
[ 715.599960] ? wait_for_completion_io+0x250/0x250
[ 715.604662] ? lock_acquire+0x196/0x200
[ 715.608504] ? do_raw_spin_trylock+0xa5/0x160
[ 715.612864] ice_sbq_rw_reg+0x1e6/0x2f0 [ice 9a7e1ec00971c89ecd3fe0d4dc7da2b3786a421d]
[ 715.620813] ? ice_reset+0x130/0x130 [ice 9a7e1ec00971c89ecd3fe0d4dc7da2b3786a421d]
[ 715.628497] ? __debug_check_no_obj_freed+0x1e8/0x3c0
[ 715.633550] ? trace_hardirqs_on+0x1c/0x130
[ 715.637748] ice_write_phy_reg_e810+0x70/0xf0 [ice 9a7e1ec00971c89ecd3fe0d4dc7da2b3786a421d]
[ 715.646220] ? do_raw_spin_trylock+0xa5/0x160
[ 715.650581] ? ice_ptp_release+0x910/0x910 [ice 9a7e1ec00971c89ecd3fe0d4dc7da2b3786a421d]
[ 715.658797] ? ice_ptp_release+0x255/0x910 [ice 9a7e1ec00971c89ecd3fe0d4dc7da2b3786a421d]
[ 715.667013] ice_clear_phy_tstamp+0x2c/0x110 [ice 9a7e1ec00971c89ecd3fe0d4dc7da2b3786a421d]
[ 715.675403] ice_ptp_release+0x408/0x910 [ice 9a7e1ec00971c89ecd3fe0d4dc7da2b3786a421d]
[ 715.683440] ice_remove+0x560/0x6a0 [ice 9a7e1ec00971c89ecd3fe0d4dc7da2b3786a421d]
[ 715.691037] ? _raw_spin_unlock_irqrestore+0x46/0x73
[ 715.696005] pci_device_remove+0xab/0x1d0
[ 715.700018] __device_release_driver+0x35b/0x690
[ 715.704637] driver_detach+0x214/0x2f0
[ 715.708389] bus_remove_driver+0x11d/0x2f0
[ 715.712489] pci_unregister_driver+0x26/0x250
[ 715.716857] ice_module_exit+0xc/0x2f [ice 9a7e1ec00971c89ecd3fe0d4dc7da2b3786a421d]
[ 715.724637] __do_sys_delete_module.constprop.0+0x2d8/0x4e0
[ 715.730210] ? free_module+0x6d0/0x6d0
[ 715.733963] ? task_work_run+0xe1/0x170
[ 715.737803] ? exit_to_user_mode_loop+0x17f/0x1d0
[ 715.742509] ? rcu_read_lock_sched_held+0x12/0x80
[ 715.747215] ? trace_hardirqs_on+0x1c/0x130
[ 715.751401] do_syscall_64+0x3b/0x90
[ 715.754981] entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 715.760033] RIP: 0033:0x7f4dfe59000b
[ 715.763612] Code: 73 01 c3 48 8b 0d 6d 1e 0c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa b8 b0 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 3d 1e 0c 00 f7 d8 64 89 01 48
[ 715.782357] RSP: 002b:00007ffe8c891708 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
[ 715.789923] RAX: ffffffffffffffda RBX: 00005558a20468b0 RCX: 00007f4dfe59000b
[ 715.797054] RDX: 000000000000000a RSI: 0000000000000800 RDI: 00005558a2046918
[ 715.804189] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
[ 715.811319] R10: 00007f4dfe603ac0 R11: 0000000000000206 R12: 00007ffe8c891940
[ 715.818455] R13: 00007ffe8c8920a3 R14: 00005558a20462a0 R15: 00005558a20468b0
Notice that this is the only case where we use the lock in this way. In
the cleanup kthread and work kthread the lock is only taken around the
bit accesses. This was done intentionally to avoid this kind of issue.
The way the lock is used, we only protect ordering of bit sets vs bit
clears. The Tx writers in the hot path don't need to be protected
against the entire kthread loop. The Tx queues threads only need to
ensure that they do not re-use an index that is currently in use. The
cleanup loop does not need to block all new set bits, since it will
re-queue itself if new timestamps are present.
Fix the tracker flow so that it uses the same flow as the standard
cleanup thread. In addition, ensure the in_use bitmap actually gets
cleared properly.
This fixes the warning and also avoids the potential deadlock that might
have occurred otherwise.
Fixes: 4dd0d5c33c ("ice: add lock around Tx timestamp tracker flush")
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9973a43012 upstream.
Fix the following build/link errors by adding a dependency on
CRYPTO, CRYPTO_HASH, CRYPTO_SHA256 and CRC32:
ld: drivers/net/usb/r8152.o: in function `rtl8152_fw_verify_checksum':
r8152.c:(.text+0x2b2a): undefined reference to `crypto_alloc_shash'
ld: r8152.c:(.text+0x2bed): undefined reference to `crypto_shash_digest'
ld: r8152.c:(.text+0x2c50): undefined reference to `crypto_destroy_tfm'
ld: drivers/net/usb/r8152.o: in function `_rtl8152_set_rx_mode':
r8152.c:(.text+0xdcb0): undefined reference to `crc32_le'
Fixes: 9370f2d05a ("r8152: support request_firmware for RTL8153")
Fixes: ac718b6930 ("net/usb: new driver for RTL8152")
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a5a14ea7b4 upstream.
The error code is missing in this code scenario, add the error code
'-EINVAL' to the return value 'rc'.
Eliminate the follow smatch warning:
drivers/net/ethernet/qlogic/qed/qed_main.c:1298 qed_slowpath_start()
warn: missing error code 'rc'.
Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Fixes: d51e4af5c2 ("qed: aRFS infrastructure support")
Signed-off-by: chongjiapeng <jiapeng.chong@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1413269086 upstream.
Introduction of lockless subqueues broke the class statistics.
Before the change stats were accumulated in `bstats' and `qstats'
on the stack which was then copied to struct gnet_dump.
After the change the `bstats' and `qstats' are initialized to 0
and never updated, yet still fed to gnet_dump. The code updates
the global qdisc->cpu_bstats and qdisc->cpu_qstats instead,
clobbering them. Most likely a copy-paste error from the code in
mqprio_dump().
__gnet_stats_copy_basic() and __gnet_stats_copy_queue() accumulate
the values for per-CPU case but for global stats they overwrite
the value, so only stats from the last loop iteration / tc end up
in sch->[bq]stats.
Use the on-stack [bq]stats variables again and add the stats manually
in the global case.
Fixes: ce679e8df7 ("net: sched: add support for TCQ_F_NOLOCK subqueues to sch_mqprio")
Cc: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
https://lore.kernel.org/all/20211007175000.2334713-2-bigeasy@linutronix.de/
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 596143e3ae upstream.
Fix modpost Section mismatch error in next_platform_timer().
[...]
WARNING: modpost: vmlinux.o(.text.unlikely+0x26e60): Section mismatch in reference from the function next_platform_timer() to the variable .init.data:acpi_gtdt_desc
The function next_platform_timer() references
the variable __initdata acpi_gtdt_desc.
This is often because next_platform_timer lacks a __initdata
annotation or the annotation of acpi_gtdt_desc is wrong.
WARNING: modpost: vmlinux.o(.text.unlikely+0x26e64): Section mismatch in reference from the function next_platform_timer() to the variable .init.data:acpi_gtdt_desc
The function next_platform_timer() references
the variable __initdata acpi_gtdt_desc.
This is often because next_platform_timer lacks a __initdata
annotation or the annotation of acpi_gtdt_desc is wrong.
ERROR: modpost: Section mismatches detected.
Set CONFIG_SECTION_MISMATCH_WARN_ONLY=y to allow them.
make[1]: *** [scripts/Makefile.modpost:59: vmlinux.symvers] Error 1
make[1]: *** Deleting file 'vmlinux.symvers'
make: *** [Makefile:1176: vmlinux] Error 2
[...]
Fixes: a712c3ed9b ("acpi/arm64: Add memory-mapped timer support in GTDT driver")
Signed-off-by: Jackie Liu <liuyun01@kylinos.cn>
Acked-by: Hanjun Guo <guohanjun@huawei.com>
Link: https://lore.kernel.org/r/20210823092526.2407526-1-liu.yun@linux.dev
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 14eb0cb4e9 upstream.
In theory a context can be destroyed and a new one allocated at the same
address, making the pointer comparision to detect when we don't need to
update the current pagetables invalid. Instead assign a sequence number
to each context on creation, and use this for the check.
Fixes: 84c31ee16f ("drm/msm/a6xx: Add support for per-instance pagetables")
Signed-off-by: Rob Clark <robdclark@chromium.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 95c58291ee upstream.
The overflow check does causes a warning from clang-14 when 'sz' is a type
that is smaller than size_t:
drivers/gpu/drm/msm/msm_gem_submit.c:217:10: error: result of comparison of constant 18446744073709551615 with expression of type 'unsigned int' is always false [-Werror,-Wtautological-constant-out-of-range-compare]
if (sz == SIZE_MAX) {
Change the type accordingly.
Fixes: 20224d715a ("drm/msm/submit: Move copy_from_user ahead of locking bos")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Stephen Boyd <swboyd@chromium.org>
Link: https://lore.kernel.org/r/20210927113632.3849987-1-arnd@kernel.org
Signed-off-by: Rob Clark <robdclark@chromium.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 97794170b6 upstream.
In commit e11f5bd822 ("drm: Add support for DP 1.4 Compliance edid
corruption test") the function connector_bad_edid() started assuming
that the memory for the EDID passed to it was big enough to hold
`edid[0x7e] + 1` blocks of data (1 extra for the base block). It
completely ignored the fact that the function was passed `num_blocks`
which indicated how much memory had been allocated for the EDID.
Let's fix this by adding a bounds check.
This is important for handling the case where there's an error in the
first block of the EDID. In that case we will call
connector_bad_edid() without having re-allocated memory based on
`edid[0x7e]`.
Fixes: e11f5bd822 ("drm: Add support for DP 1.4 Compliance edid corruption test")
Reported-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211005192905.v2.1.Ib059f9c23c2611cb5a9d760e7d0a700c1295928d@changeid
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 75b3cb97eb upstream.
Intermittent Kernel crash has been observed on probe in
bcm_qspi_mspi_l2_isr() handler when the MSPI spifie interrupt bit
has not been cleared before registering for interrupts.
Fix the driver to move SoC specific custom interrupt handling code
before we register IRQ in probe. Also clear MSPI interrupt status
resgiter prior to registering IRQ handlers.
Fixes: cc20a38612 ("spi: iproc-qspi: Add Broadcom iProc SoCs support")
Signed-off-by: Kamal Dasu <kdasu@broadcom.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Link: https://lore.kernel.org/r/20211008203603.40915-3-kdasu.kdev@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6840615f85 upstream.
Currently autoloading for SPI devices does not use the DT ID table, it uses
SPI modalises. Supporting OF modalises is going to be difficult if not
impractical, an attempt was made but has been reverted, so ensure that
module autoloading works for this driver by adding an id_table listing the
SPI IDs for everything.
Fixes: 96c8395e21 ("spi: Revert modalias changes")
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20210923170023.1683-1-broonie@kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9b02420169 upstream.
Change kstrtou32() argument 'base' to be zero instead of 'len'.
It works by chance for setting one bit value, but it is not supposed to
work in case value passed to mlxreg_io_attr_store() is greater than 1.
It works for example, for:
echo 1 > /sys/devices/platform/mlxplat/mlxreg-io/hwmon/.../jtag_enable
But it will fail for:
echo n > /sys/devices/platform/mlxplat/mlxreg-io/hwmon/.../jtag_enable,
where n > 1.
The flow for input buffer conversion is as below:
_kstrtoull(const char *s, unsigned int base, unsigned long long *res)
calls:
rv = _parse_integer(s, base, &_res);
For the second case, where n > 1:
- _parse_integer() converts 's' to 'val'.
For n=2, 'len' is set to 2 (string buffer is 0x32 0x0a), for n=3
'len' is set to 3 (string buffer 0x33 0x0a), etcetera.
- 'base' is equal or greater then '2' (length of input buffer).
As a result, _parse_integer() exits with result zero (rv):
rv = 0;
while (1) {
...
if (val >= base)-> (2 >= 2)
break;
...
rv++;
...
}
And _kstrtoull() in their turn will fail:
if (rv == 0)
return -EINVAL;
Fixes: 5ec4a8ace0 ("platform/mellanox: Introduce support for Mellanox register access driver")
Signed-off-by: Vadim Pasternak <vadimp@nvidia.com>
Link: https://lore.kernel.org/r/20210927142214.2613929-2-vadimp@nvidia.com
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 332fdf951d upstream.
Currently, mlxsw allows cooling states to be set above the maximum
cooling state supported by the driver:
# cat /sys/class/thermal/thermal_zone2/cdev0/type
mlxsw_fan
# cat /sys/class/thermal/thermal_zone2/cdev0/max_state
10
# echo 18 > /sys/class/thermal/thermal_zone2/cdev0/cur_state
# echo $?
0
This results in out-of-bounds memory accesses when thermal state
transition statistics are enabled (CONFIG_THERMAL_STATISTICS=y), as the
transition table is accessed with a too large index (state) [1].
According to the thermal maintainer, it is the responsibility of the
driver to reject such operations [2].
Therefore, return an error when the state to be set exceeds the maximum
cooling state supported by the driver.
To avoid dead code, as suggested by the thermal maintainer [3],
partially revert commit a421ce088a ("mlxsw: core: Extend cooling
device with cooling levels") that tried to interpret these invalid
cooling states (above the maximum) in a special way. The cooling levels
array is not removed in order to prevent the fans going below 20% PWM,
which would cause them to get stuck at 0% PWM.
[1]
BUG: KASAN: slab-out-of-bounds in thermal_cooling_device_stats_update+0x271/0x290
Read of size 4 at addr ffff8881052f7bf8 by task kworker/0:0/5
CPU: 0 PID: 5 Comm: kworker/0:0 Not tainted 5.15.0-rc3-custom-45935-gce1adf704b14 #122
Hardware name: Mellanox Technologies Ltd. "MSN2410-CB2FO"/"SA000874", BIOS 4.6.5 03/08/2016
Workqueue: events_freezable_power_ thermal_zone_device_check
Call Trace:
dump_stack_lvl+0x8b/0xb3
print_address_description.constprop.0+0x1f/0x140
kasan_report.cold+0x7f/0x11b
thermal_cooling_device_stats_update+0x271/0x290
__thermal_cdev_update+0x15e/0x4e0
thermal_cdev_update+0x9f/0xe0
step_wise_throttle+0x770/0xee0
thermal_zone_device_update+0x3f6/0xdf0
process_one_work+0xa42/0x1770
worker_thread+0x62f/0x13e0
kthread+0x3ee/0x4e0
ret_from_fork+0x1f/0x30
Allocated by task 1:
kasan_save_stack+0x1b/0x40
__kasan_kmalloc+0x7c/0x90
thermal_cooling_device_setup_sysfs+0x153/0x2c0
__thermal_cooling_device_register.part.0+0x25b/0x9c0
thermal_cooling_device_register+0xb3/0x100
mlxsw_thermal_init+0x5c5/0x7e0
__mlxsw_core_bus_device_register+0xcb3/0x19c0
mlxsw_core_bus_device_register+0x56/0xb0
mlxsw_pci_probe+0x54f/0x710
local_pci_probe+0xc6/0x170
pci_device_probe+0x2b2/0x4d0
really_probe+0x293/0xd10
__driver_probe_device+0x2af/0x440
driver_probe_device+0x51/0x1e0
__driver_attach+0x21b/0x530
bus_for_each_dev+0x14c/0x1d0
bus_add_driver+0x3ac/0x650
driver_register+0x241/0x3d0
mlxsw_sp_module_init+0xa2/0x174
do_one_initcall+0xee/0x5f0
kernel_init_freeable+0x45a/0x4de
kernel_init+0x1f/0x210
ret_from_fork+0x1f/0x30
The buggy address belongs to the object at ffff8881052f7800
which belongs to the cache kmalloc-1k of size 1024
The buggy address is located 1016 bytes inside of
1024-byte region [ffff8881052f7800, ffff8881052f7c00)
The buggy address belongs to the page:
page:0000000052355272 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x1052f0
head:0000000052355272 order:3 compound_mapcount:0 compound_pincount:0
flags: 0x200000000010200(slab|head|node=0|zone=2)
raw: 0200000000010200 ffffea0005034800 0000000300000003 ffff888100041dc0
raw: 0000000000000000 0000000000100010 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected
Memory state around the buggy address:
ffff8881052f7a80: 00 00 00 00 00 00 04 fc fc fc fc fc fc fc fc fc
ffff8881052f7b00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>ffff8881052f7b80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
^
ffff8881052f7c00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
ffff8881052f7c80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[2] https://lore.kernel.org/linux-pm/9aca37cb-1629-5c67-1895-1fdc45c0244e@linaro.org/
[3] https://lore.kernel.org/linux-pm/af9857f2-578e-de3a-e62b-6baff7e69fd4@linaro.org/
CC: Daniel Lezcano <daniel.lezcano@linaro.org>
Fixes: a50c1e3565 ("mlxsw: core: Implement thermal zone")
Fixes: a421ce088a ("mlxsw: core: Extend cooling device with cooling levels")
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Tested-by: Vadim Pasternak <vadimp@nvidia.com>
Link: https://lore.kernel.org/r/20211012174955.472928-1-idosch@idosch.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 776c750108 upstream.
I got a null-ptr-deref report:
KASAN: null-ptr-deref in range [0x0000000000000090-0x0000000000000097]
...
RIP: 0010:regulator_enable+0x84/0x260
...
Call Trace:
ahci_platform_enable_regulators+0xae/0x320
ahci_platform_enable_resources+0x1a/0x120
ahci_probe+0x4f/0x1b9
platform_probe+0x10b/0x280
...
entry_SYSCALL_64_after_hwframe+0x44/0xae
If devm_regulator_get() in ahci_platform_get_resources() fails,
hpriv->phy_regulator will point to NULL, when enabling or disabling it,
null-ptr-deref will occur.
ahci_probe()
ahci_platform_get_resources()
devm_regulator_get(, "phy") // failed, let phy_regulator = NULL
ahci_platform_enable_resources()
ahci_platform_enable_regulators()
regulator_enable(hpriv->phy_regulator) // null-ptr-deref
commit 962399bb7f ("ata: libahci_platform: Fix regulator_get_optional()
misuse") replaces devm_regulator_get_optional() with devm_regulator_get(),
but PHY regulator omits to delete "hpriv->phy_regulator = NULL;" like AHCI.
Delete it like AHCI regulator to fix this bug.
Fixes: commit 962399bb7f ("ata: libahci_platform: Fix regulator_get_optional() misuse")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wang Hai <wanghai38@huawei.com>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 291c932fc3 upstream.
'skb' is allocated in digital_in_send_sdd_req(), but not free when
digital_in_send_cmd() failed, which will cause memory leak. Fix it
by freeing 'skb' if digital_in_send_cmd() return failed.
Fixes: 2c66daecc4 ("NFC Digital: Add NFC-A technology support")
Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 58e7dcc9ca upstream.
'params' is allocated in digital_tg_listen_mdaa(), but not free when
digital_send_cmd() failed, which will cause memory leak. Fix it by
freeing 'params' if digital_send_cmd() return failed.
Fixes: 1c7a4c24fb ("NFC Digital: Add target NFC-DEP support")
Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 40507e7aad upstream.
After recent cleanups, gcc started warning about a suspicious
memcpy() call during the s2io_io_resume() function:
In function '__dev_addr_set',
inlined from 'eth_hw_addr_set' at include/linux/etherdevice.h:318:2,
inlined from 's2io_set_mac_addr' at drivers/net/ethernet/neterion/s2io.c:5205:2,
inlined from 's2io_io_resume' at drivers/net/ethernet/neterion/s2io.c:8569:7:
arch/x86/include/asm/string_32.h:182:25: error: '__builtin_memcpy' accessing 6 bytes at offsets 0 and 2 overlaps 4 bytes at offset 2 [-Werror=restrict]
182 | #define memcpy(t, f, n) __builtin_memcpy(t, f, n)
| ^~~~~~~~~~~~~~~~~~~~~~~~~
include/linux/netdevice.h:4648:9: note: in expansion of macro 'memcpy'
4648 | memcpy(dev->dev_addr, addr, len);
| ^~~~~~
What apparently happened is that an old cleanup changed the calling
conventions for s2io_set_mac_addr() from taking an ethernet address
as a character array to taking a struct sockaddr, but one of the
callers was not changed at the same time.
Change it to instead call the low-level do_s2io_prog_unicast() function
that still takes the old argument type.
Fixes: 2fd3768845 ("S2io: Added support set_mac_address driver entry point")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20211013143613.2049096-1-arnd@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ef1100ef20 upstream.
When the ksz module is installed and removed using rmmod, kernel crashes
with null pointer dereferrence error. During rmmod, ksz_switch_remove
function tries to cancel the mib_read_workqueue using
cancel_delayed_work_sync routine and unregister switch from dsa.
During dsa_unregister_switch it calls ksz_mac_link_down, which in turn
reschedules the workqueue since mib_interval is non-zero.
Due to which queue executed after mib_interval and it tries to access
dp->slave. But the slave is unregistered in the ksz_switch_remove
function. Hence kernel crashes.
To avoid this crash, before canceling the workqueue, resetted the
mib_interval to 0.
v1 -> v2:
-Removed the if condition in ksz_mib_read_work
Fixes: 469b390e1b ("net: dsa: microchip: use delayed_work instead of timer + work")
Signed-off-by: Arun Ramadoss <arun.ramadoss@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4a3e0aeddf upstream.
mv88e6xxx_port_ppu_updates() interpretes data in the PORT_STS
register incorrectly for internal ports (ie no PPU). In these
cases, the PHY_DETECT bit indicates link status. This results
in forcing the MAC state whenever the PHY link goes down which
is not intended. As a side effect, LED's configured to show
link status stay lit even though the physical link is down.
Add a check in mac_link_down and mac_link_up to see if it
concerns an external port and only then, look at PPU status.
Fixes: 5d5b231da7 (net: dsa: mv88e6xxx: use PHY_DETECT in mac_link_up/mac_link_down)
Reported-by: Maarten Zanders <m.zanders@televic.com>
Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: Maarten Zanders <maarten.zanders@mind.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f49823939e upstream.
In case a PHY device was probed thus in the PHY_READY state, but not
configured and with no network device attached yet, we should not be
trying to shut it down because it has been brought back into reset by
phy_device_reset() towards the end of phy_probe() and anyway we have not
configured the PHY yet.
Fixes: e2f016cf77 ("net: phy: add a shutdown procedure")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 075da584ba upstream.
Some old IPs do not provide the hardware feature register.
On these IPs, this register is read 0x00000000.
In old driver version, this feature was handled but a regression came
with the commit f10a6a3541 ("stmmac: rework get_hw_feature function").
Indeed, this commit removes the return value in dma->get_hw_feature().
This return value was used to indicate the validity of retrieved
information and used later on in stmmac_hw_init() to override
priv->plat data if this hardware feature were valid.
This patch restores the return code in ->get_hw_feature() in order
to indicate the hardware feature validity and override priv->plat
data only if this hardware feature is valid.
Fixes: f10a6a3541 ("stmmac: rework get_hw_feature function")
Signed-off-by: Herve Codina <herve.codina@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b2107cdc43 upstream.
Before this patch, mlx5 representors advertised the
NETIF_F_VLAN_CHALLENGED bit, this could lead to missing features when
using reps with vxlan/bridge and maybe other virtual interfaces,
when such interfaces inherit this bit and block vlan usage in their
topology.
Example:
$ip link add dev bridge type bridge
# add representor interface to the bridge
$ip link set dev pf0hpf master
$ip link add link bridge name vlan10 type vlan id 10 protocol 802.1q
Error: 8021q: VLANs not supported on device.
Reps are perfectly capable of handling vlan traffic, although they don't
implement vlan_{add,kill}_vid ndos, hence, remove
NETIF_F_VLAN_CHALLENGED advertisement.
Fixes: cb67b83292 ("net/mlx5e: Introduce SRIOV VF representors")
Reported-by: Roopa Prabhu <roopa@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0bc73ad46a upstream.
Due to current HW arch limitations, RX-FCS (scattering FCS frame field
to software) and RX-port-timestamp (improved timestamp accuracy on the
receive side) can't work together.
RX-port-timestamp is not controlled by the user and it is enabled by
default when supported by the HW/FW.
This patch sets RX-port-timestamp opposite to RX-FCS configuration.
Fixes: 102722fc68 ("net/mlx5e: Add support for RXFCS feature flag")
Signed-off-by: Aya Levin <ayal@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 95f7f3e7dc upstream.
Commit 8f3d65c166 ("net/smc: fix wait on already cleared link")
introduced link refcounting to avoid waits on already cleared links.
This patch extents and improves the refcounting to cover all
remaining possible cases for this kind of error situation.
Fixes: 15e1b99aad ("net/smc: no WR buffer wait for terminating link group")
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e599ee234a upstream.
Fix the following build/link error by adding a dependency on the CRC32
routines:
ld: drivers/net/ethernet/arc/emac_main.o: in function `arc_emac_set_rx_mode':
emac_main.c:(.text+0xb11): undefined reference to `crc32_le'
The crc32_le() call comes through the ether_crc_le() call in
arc_emac_set_rx_mode().
[v2: moved the select to ARC_EMAC_CORE; the Makefile is a bit confusing,
but the error comes from emac_main.o, which is part of the arc_emac module,
which in turn is enabled by CONFIG_ARC_EMAC_CORE. Note that arc_emac is
different from emac_arc...]
Fixes: 775dd682e2 ("arc_emac: implement promiscuous mode and multicast filtering")
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
Link: https://lore.kernel.org/r/20211012093446.1575-1-vegard.nossum@oracle.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 55a9968c7e upstream.
The commit 15add06841 ("gpio: pca953x: add ->set_config implementation")
introduced support for bias setting. However this, due to being half-baked,
brought potential issues:
- the turning bias via disabling makes the pin floating for a while;
- once enabled, bias can't be disabled.
Fix all these by adding support for bias disabling and move the disabling
part under the corresponding conditional.
While at it, add support for default setting, since it's cheap to add.
Fixes: 15add06841 ("gpio: pca953x: add ->set_config implementation")
Cc: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit be44918383 upstream.
Currently autoloading for SPI devices does not use the DT ID table, it uses
SPI modalises. Supporting OF modalises is going to be difficult if not
impractical, an attempt was made but has been reverted, so ensure that
module autoloading works for this driver by adding a SPI device ID table.
Fixes: 96c8395e21 ("spi: Revert modalias changes")
Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 85f74acf09 upstream.
The request tag is no longer the only component of the command id.
Fixes: e7006de6c2 ("nvme: code command_id with a genctr for use-after-free validation")
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit fa2a30f8e0 upstream.
As per RZ/G2L HW(Rev.0.50) manual, clock monitor register value
0 means clock is not supplied and 1 means clock is supplied.
This patch fixes the issue by removing the inverted logic.
Fixing the above, triggered following 2 issues
1) GIC interrupts don't work if we disable IA55_CLK and DMAC_ACLK.
Fixed this issue by adding these clocks as critical clocks.
2) DMA is not working, since the DMA driver is not turning on DMAC_PCLK.
So will provide a fix in the DMA driver to turn on DMA_PCLK.
Fixes: ef3c613ccd ("clk: renesas: Add CPG core wrapper for RZ/G2L SoC")
Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>
Link: https://lore.kernel.org/r/20210922112405.26413-2-biju.das.jz@bp.renesas.com
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 13dbc954b3 upstream.
dtbs_check currently complains that:
arch/arm/boot/dts/bcm2711-rpi-4-b.dts:220.10-231.4: Warning
(pci_device_reg): /scb/pcie@7d500000/pci@1,0: PCI unit address format
error, expected "0,0"
Unsurprisingly pci@0,0 is the right address, as illustrated by its reg
property:
&pcie0 {
pci@0,0 {
/*
* As defined in the IEEE Std 1275-1994 document,
* reg is a five-cell address encoded as (phys.hi
* phys.mid phys.lo size.hi size.lo). phys.hi
* should contain the device's BDF as 0b00000000
* bbbbbbbb dddddfff 00000000. The other cells
* should be zero.
*/
reg = <0 0 0 0 0>;
};
};
The device is clearly 0. So fix it.
Also add a missing 'device_type = "pci"'.
Fixes: 258f92d2f8 ("ARM: dts: bcm2711: Add reset controller to xHCI node")
Suggested-by: Rob Herring <robh@kernel.org>
Reviewed-by: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20210831125843.1233488-1-nsaenzju@redhat.com
Signed-off-by: Nicolas Saenz Julienne <nsaenz@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 244f5d597e upstream.
Currently the arm_ffa firmware driver can be built as module and hence
all the users of FFA driver. If any driver on the ffa bus is removed or
unregistered, the remove callback on all the device bound to the driver
being removed should be callback. For that to happen, we must register
a remove callback on the ffa_bus which is currently missing. This results
in the probe getting called again without the previous remove callback
on a device which may result in kernel crash.
Fix the issue by registering the remove callback on the FFA bus.
Link: https://lore.kernel.org/r/20210924092859.3057562-1-sudeep.holla@arm.com
Fixes: e781858488 ("firmware: arm_ffa: Add initial FFA bus support for device enumeration")
Reported-by: Jens Wiklander <jens.wiklander@linaro.org>
Tested-by: Jens Wiklander <jens.wiklander@linaro.org>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit eb7b52e6db upstream.
When arm_ffa firmware driver module is unloaded or removed we call
__ffa_devices_unregister on all the devices on the ffa bus. It must
unregister all the devices instead it is currently just releasing the
devices without unregistering. That is pure wrong as when we try to
load the module back again, it will result in the kernel crash something
like below.
-->8
CPU: 2 PID: 232 Comm: modprobe Not tainted 5.15.0-rc2+ #169
Hardware name: FVP Base RevC (DT)
Call trace:
dump_backtrace+0x0/0x1cc
show_stack+0x18/0x64
dump_stack_lvl+0x64/0x7c
dump_stack+0x18/0x38
sysfs_create_dir_ns+0xe4/0x140
kobject_add_internal+0x170/0x358
kobject_add+0x94/0x100
device_add+0x178/0x5f0
device_register+0x20/0x30
ffa_device_register+0x80/0xcc [ffa_module]
ffa_setup_partitions+0x7c/0x108 [ffa_module]
init_module+0x290/0x2dc [ffa_module]
do_one_initcall+0xbc/0x230
do_init_module+0x58/0x304
load_module+0x15e0/0x1f68
__arm64_sys_finit_module+0xb8/0xf4
invoke_syscall+0x44/0x140
el0_svc_common+0xb4/0xf0
do_el0_svc+0x24/0x80
el0_svc+0x20/0x50
el0t_64_sync_handler+0x84/0xe4
el0t_64_sync+0x1a0/0x1a4
kobject_add_internal failed for arm-ffa-8001 with -EEXIST, don't try to
register things with the same name in the same directory.
----
Fix the issue by calling device_unregister in __ffa_devices_unregister
which will also take care of calling device_release(which is mapped to
ffa_release_device)
Link: https://lore.kernel.org/r/20210924092859.3057562-2-sudeep.holla@arm.com
Fixes: e781858488 ("firmware: arm_ffa: Add initial FFA bus support for device enumeration")
Tested-by: Jens Wiklander <jens.wiklander@linaro.org>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7f565d0ead upstream.
When OP-TEE driver is built as a module, OP-TEE client devices
registered on TEE bus during probe should be unregistered during
optee_remove. So implement optee_unregister_devices() accordingly.
Fixes: c3fa24af92 ("tee: optee: add TEE bus device enumeration support")
Reported-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Sumit Garg <sumit.garg@linaro.org>
Signed-off-by: Jens Wiklander <jens.wiklander@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2a2a79577d upstream.
Currently autoloading for SPI devices does not use the DT ID table, it uses
SPI modalises. Supporting OF modalises is going to be difficult if not
impractical, an attempt was made but has been reverted, so ensure that
module autoloading works for this driver by adding a SPI ID table.
Fixes: 96c8395e21 ("spi: Revert modalias changes")
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Moritz Fischer <mdf@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9e2cd44490 upstream.
Currently autoloading for SPI devices does not use the DT ID table, it uses
SPI modalises. Supporting OF modalises is going to be difficult if not
impractical, an attempt was made but has been reverted, so ensure that
module autoloading works for this driver by adding an id_table listing the
SPI IDs for everything.
Fixes: 96c8395e21 ("spi: Revert modalias changes")
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20210923172453.4921-1-broonie@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f42752729e upstream.
The newly added SPI device ID table does not work because the
entry is incorrectly copied from the OF device table.
During build testing, this shows as a compile failure when building
it as a loadable module:
drivers/misc/eeprom/eeprom_93xx46.c:424:1: error: redefinition of '__mod_of__eeprom_93xx46_of_table_device_table'
MODULE_DEVICE_TABLE(of, eeprom_93xx46_of_table);
Change the entry to refer to the correct symbol.
Fixes: 137879f7ff ("eeprom: 93xx46: Add SPI device ID table")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20211014153730.3821376-1-arnd@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 137879f7ff upstream.
Currently autoloading for SPI devices does not use the DT ID table, it uses
SPI modalises. Supporting OF modalises is going to be difficult if not
impractical, an attempt was made but has been reverted, so ensure that
module autoloading works for this driver by adding a SPI device ID table.
Fixes: 96c8395e21 ("spi: Revert modalias changes")
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20210922184048.34770-1-broonie@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1a913270e5 upstream.
In Sigma-Delta devices the SDO line is also used as an interrupt.
Leaving IRQ on level instead of falling might trigger a sample read
when the IRQ is enabled, as the SDO line is already low. Not sure
if SDO line will always immediately go high in ad_sd_buffer_postenable
before the IRQ is enabled.
Also the datasheet seem to explicitly say the falling edge of the SDO
should be used as an interrupt:
>From the AD7793 datasheet: " The DOUT/RDY falling edge can be
used as an interrupt to a processor"
Fixes: da4d3d6bb9 ("iio: adc: ad-sigma-delta: Allow custom IRQ flags")
Signed-off-by: Alexandru Tachici <alexandru.tachici@analog.com>
Cc: <Stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20210906065630.16325-4-alexandru.tachici@analog.com
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e081102f30 upstream.
Correct IRQ flag here is falling.
In Sigma-Delta devices the SDO line is also used as an interrupt.
Leaving IRQ on level instead of falling might trigger a sample read
when the IRQ is enabled, as the SDO line is already low. Not sure
if SDO line will always immediately go high in ad_sd_buffer_postenable
before the IRQ is enabled.
Also the datasheet seem to explicitly say the falling edge of the SDO
should be used as an interrupt:
>From the AD7780 datasheet: " The DOUT/Figure 22 RDY falling edge
can be used as an interrupt to a processor"
Fixes: da4d3d6bb9 ("iio: adc: ad-sigma-delta: Allow custom IRQ flags")
Signed-off-by: Alexandru Tachici <alexandru.tachici@analog.com>
Cc: <Stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20210906065630.16325-3-alexandru.tachici@analog.com
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 89a86da5cb upstream.
IRQ type in ad_sigma_delta_info struct was missing.
In Sigma-Delta devices the SDO line is also used as an interrupt.
Leaving IRQ on level instead of falling might trigger a sample read
when the IRQ is enabled, as the SDO line is already low. Not sure
if SDO line will always immediately go high in ad_sd_buffer_postenable
before the IRQ is enabled.
Also the datasheet seem to explicitly say the falling edge of the SDO
should be used as an interrupt:
>From the AD7192 datasheet: "The DOUT/RDY falling edge can be used
as an interrupt to a processor,"
Fixes: da4d3d6bb9 ("iio: adc: ad-sigma-delta: Allow custom IRQ flags")
Signed-off-by: Alexandru Tachici <alexandru.tachici@analog.com>
Cc: <Stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20210906065630.16325-2-alexandru.tachici@analog.com
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f729a592ad upstream.
SYNC_STATE_ONLY device links intentionally allow cycles because cyclic
sync_state() dependencies are valid and necessary.
However a SYNC_STATE_ONLY device link where the consumer and the supplier
are the same device is pointless because the device link would be deleted
as soon as the device probes (because it's also the consumer) and won't
affect when the sync_state() callback is called. It's a waste of CPU cycles
and memory to create this device link. So reject any attempts to create
such a device link.
Fixes: 05ef983e0d ("driver core: Add device link support for SYNC_STATE_ONLY flag")
Cc: stable <stable@vger.kernel.org>
Reported-by: Ulf Hansson <ulf.hansson@linaro.org>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Saravana Kannan <saravanak@google.com>
Link: https://lore.kernel.org/r/20210929190549.860541-1-saravanak@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6f779e1d35 upstream.
When an interrupt is passed through, the KVM XIVE device calls the
set_vcpu_affinity() handler which raises the P bit to mask the
interrupt and to catch any in-flight interrupts while routing the
interrupt to the guest.
On the guest side, drivers (like some Intels) can request at probe
time some MSIs and call synchronize_irq() to check that there are no
in flight interrupts. This will call the XIVE get_irqchip_state()
handler which will always return true as the interrupt P bit has been
set on the host side and lock the CPU in an infinite loop.
Fix that by discarding disabled interrupts in get_irqchip_state().
Fixes: da15c03b04 ("powerpc/xive: Implement get_irqchip_state method for XIVE to fix shutdown race")
Cc: stable@vger.kernel.org #v5.4+
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Tested-by: seeteena <s1seetee@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20211011070203.99726-1-clg@kaod.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 711885906b upstream.
This Kconfig option was added initially so that memory encryption is
enabled by default on machines which support it.
However, devices which have DMA masks that are less than the bit
position of the encryption bit, aka C-bit, require the use of an IOMMU
or the use of SWIOTLB.
If the IOMMU is disabled or in passthrough mode, the kernel would switch
to SWIOTLB bounce-buffering for those transfers.
In order to avoid that,
2cc13bb4f5 ("iommu: Disable passthrough mode when SME is active")
disables the default IOMMU passthrough mode so that devices for which the
default 256K DMA is insufficient, can use the IOMMU instead.
However 2, there are cases where the IOMMU is disabled in the BIOS, etc.
(think the usual hardware folk "oops, I dropped the ball there" cases) or a
driver doesn't properly use the DMA APIs or a device has a firmware or
hardware bug, e.g.:
ea68573d40 ("drm/amdgpu: Fail to load on RAVEN if SME is active")
However 3, in the above GPU use case, there are APIs like Vulkan and
some OpenGL/OpenCL extensions which are under the assumption that
user-allocated memory can be passed in to the kernel driver and both the
GPU and CPU can do coherent and concurrent access to the same memory.
That cannot work with SWIOTLB bounce buffers, of course.
So, in order for those devices to function, drop the "default y" for the
SME by default active option so that users who want to have SME enabled,
will need to either enable it in their config or use "mem_encrypt=on" on
the kernel command line.
[ tlendacky: Generalize commit message. ]
Fixes: 7744ccdbc1 ("x86/mm: Add Secure Memory Encryption (SME) support")
Reported-by: Paul Menzel <pmenzel@molgen.mpg.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Tom Lendacky <thomas.lendacky@amd.com>
Cc: <stable@vger.kernel.org>
Link: https://lkml.kernel.org/r/8bbacd0e-4580-3194-19d2-a0ecad7df09c@molgen.mpg.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ff63198850 upstream.
It turns out that access to config space before completing the feature
negotiation is broken for big endian guests at least with QEMU hosts up
to 6.1 inclusive. This affects any device that accesses config space in
the validate callback: at the moment that is virtio-net with
VIRTIO_NET_F_MTU but since 82e89ea077 ("virtio-blk: Add validation for
block size in config space") that also started affecting virtio-blk with
VIRTIO_BLK_F_BLK_SIZE. Further, unlike VIRTIO_NET_F_MTU which is off by
default on QEMU, VIRTIO_BLK_F_BLK_SIZE is on by default, which resulted
in lots of people not being able to boot VMs on BE.
The spec is very clear that what we are doing is legal so QEMU needs to
be fixed, but given it's been broken for so many years and no one
noticed, we need to give QEMU a bit more time before applying this.
Further, this patch is incomplete (does not check blk size is a power
of two) and it duplicates the logic from nbd.
Revert for now, and we'll reapply a cleaner logic in the next release.
Cc: stable@vger.kernel.org
Fixes: 82e89ea077 ("virtio-blk: Add validation for block size in config space")
Cc: Xie Yongji <xieyongji@bytedance.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5d388fa01f upstream.
If a cell has 'nbits' equal to a multiple of BITS_PER_BYTE the logic
*p &= GENMASK((cell->nbits%BITS_PER_BYTE) - 1, 0);
will become undefined behavior because nbits modulo BITS_PER_BYTE is 0, and we
subtract one from that making a large number that is then shifted more than the
number of bits that fit into an unsigned long.
UBSAN reports this problem:
UBSAN: shift-out-of-bounds in drivers/nvmem/core.c:1386:8
shift exponent 64 is too large for 64-bit type 'unsigned long'
CPU: 6 PID: 7 Comm: kworker/u16:0 Not tainted 5.15.0-rc3+ #9
Hardware name: Google Lazor (rev3+) with KB Backlight (DT)
Workqueue: events_unbound deferred_probe_work_func
Call trace:
dump_backtrace+0x0/0x170
show_stack+0x24/0x30
dump_stack_lvl+0x64/0x7c
dump_stack+0x18/0x38
ubsan_epilogue+0x10/0x54
__ubsan_handle_shift_out_of_bounds+0x180/0x194
__nvmem_cell_read+0x1ec/0x21c
nvmem_cell_read+0x58/0x94
nvmem_cell_read_variable_common+0x4c/0xb0
nvmem_cell_read_variable_le_u32+0x40/0x100
a6xx_gpu_init+0x170/0x2f4
adreno_bind+0x174/0x284
component_bind_all+0xf0/0x264
msm_drm_bind+0x1d8/0x7a0
try_to_bring_up_master+0x164/0x1ac
__component_add+0xbc/0x13c
component_add+0x20/0x2c
dp_display_probe+0x340/0x384
platform_probe+0xc0/0x100
really_probe+0x110/0x304
__driver_probe_device+0xb8/0x120
driver_probe_device+0x4c/0xfc
__device_attach_driver+0xb0/0x128
bus_for_each_drv+0x90/0xdc
__device_attach+0xc8/0x174
device_initial_probe+0x20/0x2c
bus_probe_device+0x40/0xa4
deferred_probe_work_func+0x7c/0xb8
process_one_work+0x128/0x21c
process_scheduled_works+0x40/0x54
worker_thread+0x1ec/0x2a8
kthread+0x138/0x158
ret_from_fork+0x10/0x20
Fix it by making sure there are any bits to mask out.
Fixes: 69aba7948c ("nvmem: Add a simple NVMEM framework for consumers")
Cc: Douglas Anderson <dianders@chromium.org>
Cc: stable@vger.kernel.org
Signed-off-by: Stephen Boyd <swboyd@chromium.org>
Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Link: https://lore.kernel.org/r/20211013124511.18726-1-srinivas.kandagatla@linaro.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d9b7748ffc upstream.
The number of correctable errors is displayed as uncorrectable
errors because the "SBE" error count is passed to both calls of
edac_mc_handle_error().
Pass the correct uncorrectable error count to the second
edac_mc_handle_error() call when logging uncorrectable errors.
[ bp: Massage commit message. ]
Fixes: 7f6998a412 ("ARM: 8888/1: EDAC: Add driver for the Marvell Armada XP SDRAM and L2 cache ECC")
Signed-off-by: Hans Potsch <hans.potsch@nokia.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: <stable@vger.kernel.org>
Link: https://lkml.kernel.org/r/20211006121332.58788-1-hans.potsch@nokia.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2f9a174f91 upstream.
The virtio specification virtio-v1.1-cs01 states: "Transitional devices
MUST detect Legacy drivers by detecting that VIRTIO_F_VERSION_1 has not
been acknowledged by the driver." This is exactly what QEMU as of 6.1
has done relying solely on VIRTIO_F_VERSION_1 for detecting that.
However, the specification also says: "... the driver MAY read (but MUST
NOT write) the device-specific configuration fields to check that it can
support the device ..." before setting FEATURES_OK.
In that case, any transitional device relying solely on
VIRTIO_F_VERSION_1 for detecting legacy drivers will return data in
legacy format. In particular, this implies that it is in big endian
format for big endian guests. This naturally confuses the driver which
expects little endian in the modern mode.
It is probably a good idea to amend the spec to clarify that
VIRTIO_F_VERSION_1 can only be relied on after the feature negotiation
is complete. Before validate callback existed, config space was only
read after FEATURES_OK. However, we already have two regressions, so
let's address this here as well.
The regressions affect the VIRTIO_NET_F_MTU feature of virtio-net and
the VIRTIO_BLK_F_BLK_SIZE feature of virtio-blk for BE guests when
virtio 1.0 is used on both sides. The latter renders virtio-blk unusable
with DASD backing, because things simply don't work with the default.
See Fixes tags for relevant commits.
For QEMU, we can work around the issue by writing out the feature bits
with VIRTIO_F_VERSION_1 bit set. We (ab)use the finalize_features
config op for this. This isn't enough to address all vhost devices since
these do not get the features until FEATURES_OK, however it looks like
the affected devices actually never handled the endianness for legacy
mode correctly, so at least that's not a regression.
No devices except virtio net and virtio blk seem to be affected.
Long term the right thing to do is to fix the hypervisors.
Cc: <stable@vger.kernel.org> #v4.11
Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
Fixes: 82e89ea077 ("virtio-blk: Add validation for block size in config space")
Fixes: fe36cbe067 ("virtio_net: clear MTU when out of range")
Reported-by: markver@us.ibm.com
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Link: https://lore.kernel.org/r/20211011053921.1198936-1-pasic@linux.ibm.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f9a470db27 upstream.
fastrpc driver is using find_vma() without any protection, as a
result we see below warning due to recent patch 5b78ed24e8
("mm/pagemap: add mmap_assert_locked() annotations to find_vma*()")
which added mmap_assert_locked() in find_vma() function.
This bug went un-noticed in previous versions. Fix this issue by adding
required protection while calling find_vma().
CPU: 0 PID: 209746 Comm: benchmark_model Not tainted 5.15.0-rc2-00445-ge14fe2bf817a-dirty #969
Hardware name: Qualcomm Technologies, Inc. Robotics RB5 (DT)
pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : find_vma+0x64/0xd0
lr : find_vma+0x60/0xd0
sp : ffff8000158ebc40
...
Call trace:
find_vma+0x64/0xd0
fastrpc_internal_invoke+0x570/0xda8
fastrpc_device_ioctl+0x3e0/0x928
__arm64_sys_ioctl+0xac/0xf0
invoke_syscall+0x44/0x100
el0_svc_common.constprop.3+0x70/0xf8
do_el0_svc+0x24/0x88
el0_svc+0x3c/0x138
el0t_64_sync_handler+0x90/0xb8
el0t_64_sync+0x180/0x184
Fixes: 80f3afd72b ("misc: fastrpc: consider address offset before sending to DSP")
Cc: stable@vger.kernel.org
Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Link: https://lore.kernel.org/r/20210922154326.8927-1-srinivas.kandagatla@linaro.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c2115b2b16 upstream.
Commit 7c75bde329 ("usb: musb: musb_dsps: request_irq() after
initializing musb") has inverted the calls to
dsps_setup_optional_vbus_irq() and dsps_create_musb_pdev() without
updating correctly the error path. dsps_create_musb_pdev() allocates and
registers a new platform device which must be unregistered and freed
with platform_device_unregister(), and this is missing upon
dsps_setup_optional_vbus_irq() error.
While on the master branch it seems not to trigger any issue, I observed
a kernel crash because of a NULL pointer dereference with a v5.10.70
stable kernel where the patch mentioned above was backported. With this
kernel version, -EPROBE_DEFER is returned the first time
dsps_setup_optional_vbus_irq() is called which triggers the probe to
error out without unregistering the platform device. Unfortunately, on
the Beagle Bone Black Wireless, the platform device still living in the
system is being used by the USB Ethernet gadget driver, which during the
boot phase triggers the crash.
My limited knowledge of the musb world prevents me to revert this commit
which was sent to silence a robot warning which, as far as I understand,
does not make sense. The goal of this patch was to prevent an IRQ to
fire before the platform device being registered. I think this cannot
ever happen due to the fact that enabling the interrupts is done by the
->enable() callback of the platform musb device, and this platform
device must be already registered in order for the core or any other
user to use this callback.
Hence, I decided to fix the error path, which might prevent future
errors on mainline kernels while also fixing older ones.
Fixes: 7c75bde329 ("usb: musb: musb_dsps: request_irq() after initializing musb")
Cc: stable@vger.kernel.org
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Link: https://lore.kernel.org/r/20211005221631.1529448-1-miquel.raynal@bootlin.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 38fa3206bf upstream.
While reboot the system by sysrq, the following bug will be occur.
BUG: sleeping function called from invalid context at kernel/locking/semaphore.c:90
in_atomic(): 0, irqs_disabled(): 128, non_block: 0, pid: 10052, name: rc.shutdown
CPU: 3 PID: 10052 Comm: rc.shutdown Tainted: G W O 5.10.0 #1
Call trace:
dump_backtrace+0x0/0x1c8
show_stack+0x18/0x28
dump_stack+0xd0/0x110
___might_sleep+0x14c/0x160
__might_sleep+0x74/0x88
down_interruptible+0x40/0x118
virt_efi_reset_system+0x3c/0xd0
efi_reboot+0xd4/0x11c
machine_restart+0x60/0x9c
emergency_restart+0x1c/0x2c
sysrq_handle_reboot+0x1c/0x2c
__handle_sysrq+0xd0/0x194
write_sysrq_trigger+0xbc/0xe4
proc_reg_write+0xd4/0xf0
vfs_write+0xa8/0x148
ksys_write+0x6c/0xd8
__arm64_sys_write+0x18/0x28
el0_svc_common.constprop.3+0xe4/0x16c
do_el0_svc+0x1c/0x2c
el0_svc+0x20/0x30
el0_sync_handler+0x80/0x17c
el0_sync+0x158/0x180
The reason for this problem is that irq has been disabled in
machine_restart() and then it calls down_interruptible() in
virt_efi_reset_system(), which would occur sleep in irq context,
it is dangerous! Commit 99409b935c9a("locking/semaphore: Add
might_sleep() to down_*() family") add might_sleep() in
down_interruptible(), so the bug info is here. down_trylock()
can solve this problem, cause there is no might_sleep.
--------
Cc: <stable@vger.kernel.org>
Signed-off-by: Zhang Jianhua <chris.zjh@huawei.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b3a72ca803 upstream.
Joe reports that using a statically allocated buffer for converting CPER
error records into human readable text is probably a bad idea. Even
though we are not aware of any actual issues, a stack buffer is clearly
a better choice here anyway, so let's move the buffer into the stack
frames of the two functions that refer to it.
Cc: <stable@vger.kernel.org>
Reported-by: Joe Perches <joe@perches.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 42641042c1 upstream.
clang-14 complains about an unusual way of converting a pointer to
an integer:
drivers/misc/cb710/sgbuf2.c:50:15: error: performing pointer subtraction with a null pointer has undefined behavior [-Werror,-Wnull-pointer-subtraction]
return ((ptr - NULL) & 3) != 0;
Replace this with a normal cast to uintptr_t.
Fixes: 5f5bac8272 ("mmc: Driver for CB710/720 memory card reader (MMC part)")
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20210927121408.939246-1-arnd@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ff0e50d356 upstream.
The command ring pointer is located at [6:63] bits of the command
ring control register (CRCR). All the control bits like command stop,
abort are located at [0:3] bits. While aborting a command, we read the
CRCR and set the abort bit and write to the CRCR. The read will always
give command ring pointer as all zeros. So we essentially write only
the control bits. Since we split the 64 bit write into two 32 bit writes,
there is a possibility of xHC command ring stopped before the upper
dword (all zeros) is written. If that happens, xHC updates the upper
dword of its internal command ring pointer with all zeros. Next time,
when the command ring is restarted, we see xHC memory access failures.
Fix this issue by only writing to the lower dword of CRCR where all
control bits are located.
Cc: stable@vger.kernel.org
Signed-off-by: Pavankumar Kondeti <pkondeti@codeaurora.org>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Link: https://lore.kernel.org/r/20211008092547.3996295-5-mathias.nyman@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0d67e332e6 upstream.
When CONFIG_MODULE_UNLOAD is disabled, the module->exit member
is not defined, causing a build failure:
kernel/module.c:4493:8: error: no member named 'exit' in 'struct module'
mod->exit = *exit;
add an #ifdef block around this.
Fixes: cf68fffb66 ("add support for Clang CFI")
Acked-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Sami Tolvanen <samitolvanen@google.com>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Jessica Yu <jeyu@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4afb912f43 upstream.
Error injection testing uncovered a case where we'd end up with a
corrupt file system with a missing extent in the middle of a file. This
occurs because the if statement to decide if we should abort is wrong.
The only way we would abort in this case is if we got a ret !=
-EOPNOTSUPP and we called from the file clone code. However the
prealloc code uses this path too. Instead we need to abort if there is
an error, and the only error we _don't_ abort on is -EOPNOTSUPP and only
if we came from the clone file code.
CC: stable@vger.kernel.org # 5.10+
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d175209be0 upstream.
I hit a stuck relocation on btrfs/061 during my overnight testing. This
turned out to be because we had left over extent entries in our extent
root for a data reloc inode that no longer existed. This happened
because in btrfs_drop_extents() we only update refs if we have SHAREABLE
set or we are the tree_root. This regression was introduced by
aeb935a455 ("btrfs: don't set SHAREABLE flag for data reloc tree")
where we stopped setting SHAREABLE for the data reloc tree.
The problem here is we actually do want to update extent references for
data extents in the data reloc tree, in fact we only don't want to
update extent references if the file extents are in the log tree.
Update this check to only skip updating references in the case of the
log tree.
This is relatively rare, because you have to be running scrub at the
same time, which is what btrfs/061 does. The data reloc inode has its
extents pre-allocated, and then we copy the extent into the
pre-allocated chunks. We theoretically should never be calling
btrfs_drop_extents() on a data reloc inode. The exception of course is
with scrub, if our pre-allocated extent falls inside of the block group
we are scrubbing, then the block group will be marked read only and we
will be forced to cow that extent. This means we will call
btrfs_drop_extents() on that range when we COW that file extent.
This isn't really problematic if we do this, the data reloc inode
requires that our extent lengths match exactly with the extent we are
copying, thankfully we validate the extent is correct with
get_new_location(), so if we happen to COW only part of the extent we
won't link it in when we do the relocation, so we are safe from any
other shenanigans that arise because of this interaction with scrub.
Fixes: aeb935a455 ("btrfs: don't set SHAREABLE flag for data reloc tree")
CC: stable@vger.kernel.org # 5.8+
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit cfd312695b upstream.
At replay_one_name(), we are treating any error from btrfs_lookup_inode()
as if the inode does not exists. Fix this by checking for an error and
returning it to the caller.
CC: stable@vger.kernel.org # 4.14+
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 52db77791f upstream.
At __inode_add_ref(), we treating any error returned from
btrfs_lookup_dir_item() or from btrfs_lookup_dir_index_item() as meaning
that there is no existing directory entry in the fs/subvolume tree.
This is not correct since we can get errors such as, for example, -EIO
when reading extent buffers while searching the fs/subvolume's btree.
So fix that and return the error to the caller when it is not -ENOENT.
CC: stable@vger.kernel.org # 4.14+
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e15ac64137 upstream.
At replay_one_one(), we are treating any error returned from
btrfs_lookup_dir_item() or from btrfs_lookup_dir_index_item() as meaning
that there is no existing directory entry in the fs/subvolume tree.
This is not correct since we can get errors such as, for example, -EIO
when reading extent buffers while searching the fs/subvolume's btree.
So fix that and return the error to the caller when it is not -ENOENT.
CC: stable@vger.kernel.org # 4.14+
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 171316a68d upstream.
The return type of ktime_divns() is s64. The timeout_to_jiffies() currently
assigns the result of this ktime_divns() to unsigned long, which on 32 bit
systems may overflow. Furthermore, the result of this function is sometimes
also passed to functions which expect signed long, dma_fence_wait_timeout()
is one such example.
Fix this by adjusting the type of remaining_jiffies to s64, so we do not
suffer overflow there, and return a value limited to range of 0..INT_MAX,
which is safe for all usecases of this timeout.
The above overflow can be triggered if userspace passes in too large timeout
value, larger than INT_MAX / HZ seconds. The kernel detects it and complains
about "schedule_timeout: wrong timeout value %lx" and generates a warning
backtrace.
Note that this fixes commit 6cedb8b377 ("drm/msm: avoid using 'timespec'"),
because the previously used timespec_to_jiffies() function returned unsigned
long instead of s64:
static inline unsigned long timespec_to_jiffies(const struct timespec *value)
Fixes: 6cedb8b377 ("drm/msm: avoid using 'timespec'")
Signed-off-by: Marek Vasut <marex@denx.de>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Jordan Crouse <jcrouse@codeaurora.org>
Cc: Rob Clark <robdclark@chromium.org>
Cc: stable@vger.kernel.org # 5.6+
Acked-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Link: https://lore.kernel.org/r/20210917005913.157379-1-marex@denx.de
Signed-off-by: Rob Clark <robdclark@chromium.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6a7e0b0e9f upstream.
Since commit 98659487b8 ("drm/msm: add support to take dpu snapshot")
the following NULL pointer dereference is seen on i.MX53:
[ 3.275493] msm msm: bound 30000000.gpu (ops a3xx_ops)
[ 3.287174] [drm] Initialized msm 1.8.0 20130625 for msm on minor 0
[ 3.293915] 8<--- cut here ---
[ 3.297012] Unable to handle kernel NULL pointer dereference at virtual address 00000028
[ 3.305244] pgd = (ptrval)
[ 3.307989] [00000028] *pgd=00000000
[ 3.311624] Internal error: Oops: 805 [#1] SMP ARM
[ 3.316430] Modules linked in:
[ 3.319503] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.14.0+g682d702b426b #1
[ 3.326652] Hardware name: Freescale i.MX53 (Device Tree Support)
[ 3.332754] PC is at __mutex_init+0x14/0x54
[ 3.336969] LR is at msm_disp_snapshot_init+0x24/0xa0
i.MX53 does not use the DPU controller.
Fix the problem by only calling msm_disp_snapshot_init() on platforms that
use the DPU controller.
Cc: stable@vger.kernel.org
Fixes: 98659487b8 ("drm/msm: add support to take dpu snapshot")
Signed-off-by: Fabio Estevam <festevam@gmail.com>
Link: https://lore.kernel.org/r/20210914174831.2044420-1-festevam@gmail.com
Signed-off-by: Rob Clark <robdclark@chromium.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d1d94b0129 upstream.
Commit 64f7c698be ("drm/nouveau/fifo: add engine_id hook") replaced
fifo/chang84.c g84_fifo_chan_engine() call with an indirect call of
fifo/g84.c g84_fifo_engine_id(). The G84_FIFO_ENGN_* values returned
from the later g84_fifo_engine_id() are incremented by 1 compared to
the previous g84_fifo_chan_engine() return values.
This is fine either way for most of the code, except this one line
where an engine bit programmed into the hardware is derived from the
return value. Decrement the return value accordingly, otherwise the
wrong engine bit is programmed into the hardware and that leads to
the following failure:
nouveau 0000:01:00.0: gr: 00000030 [ILLEGAL_MTHD ILLEGAL_CLASS] ch 1 [003fbce000 DRM] subc 3 class 0000 mthd 085c data 00000420
On the following hardware:
lspci -s 01:00.0
01:00.0 VGA compatible controller: NVIDIA Corporation GT216GLM [Quadro FX 880M] (rev a2)
lspci -ns 01:00.0
01:00.0 0300: 10de:0a3c (rev a2)
Fixes: 64f7c698be ("drm/nouveau/fifo: add engine_id hook")
Signed-off-by: Marek Vasut <marex@denx.de>
Cc: <stable@vger.kernel.org> # 5.12+
Cc: Ben Skeggs <bskeggs@redhat.com>
Cc: Karol Herbst <kherbst@redhat.com>
Cc: Lyude Paul <lyude@redhat.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211007214117.231472-1-marex@denx.de
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2e5809a4dd upstream.
For non-4K PAGE_SIZE configs, the largest gigantic huge page size is
CONT_PMD_SHIFT order. On arm64 with 64K PAGE_SIZE, the gigantic page is
16G. Therefore, one should be able to specify 'hugetlb_cma=16G' on the
kernel command line so that one gigantic page can be allocated from CMA.
However, when adding such an option the following message is produced:
hugetlb_cma: cma area should be at least 8796093022208 MiB
This is because the calculation for non-4K gigantic page order is
incorrect in the arm64 specific routine arm64_hugetlb_cma_reserve().
Fixes: abb7962adc ("arm64/hugetlb: Reserve CMA areas for gigantic pages on 16K and 64K configs")
Cc: <stable@vger.kernel.org> # 5.9.x
Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Link: https://lore.kernel.org/r/20211005202529.213812-1-mike.kravetz@oracle.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit af89ebaa64 upstream.
gpr_get() return the entire pt_regs (include sr) to userspace, if we
don't restore the C bit in gpr_set, it may break the ALU result in
that context. So the C flag bit is part of gpr context, that's why
riscv totally remove the C bit in the ISA. That makes sr reg clear
from userspace to supervisor privilege.
Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit fbd63c08cd upstream.
csky restore_sigcontext() blindly overwrites regs->sr with the value
it finds in sigcontext. Attacker can store whatever they want in there,
which includes things like S-bit. Userland shouldn't be able to set
that, or anything other than C flag (bit 0).
Do the same thing other architectures with protected bits in flags
register do - preserve everything that shouldn't be settable in
user mode, picking the rest from the value saved is sigcontext.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Guo Ren <guoren@kernel.org>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b4459b11e8 upstream.
DM uses blk-mq's quiesce/unquiesce to stop/start device mapper queue.
But blk-mq's unquiesce may come from outside events, such as elevator
switch, updating nr_requests or others, and request may come during
suspend, so simply ask for blk-mq to requeue it.
Fixes one kernel panic issue when running updating nr_requests and
dm-mpath suspend/resume stress test.
Cc: stable@vger.kernel.org
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit be358af119 upstream.
I received a build failure for a new patch I'm working on the nds32
architecture, and when I went to test it, I couldn't get to my build error,
because it failed to build with a bunch of:
Error: invalid operands (*UND* and *UND* sections) for `^'
issues with various files. Those files were temporary asm files that looked
like: kernel/.tmp_mc_fork.s
I decided to look deeper, and found that the "mc" portion of that name
stood for "mcount", and was created by the recordmcount.pl script. One that
I wrote over a decade ago. Once I knew the source of the problem, I was
able to investigate it further.
The way the recordmcount.pl script works (BTW, there's a C version that
simply modifies the ELF object) is by doing an "objdump" on the object
file. Looks for all the calls to "mcount", and creates an offset of those
locations from some global variable it can use (usually a global function
name, found with <.*>:). Creates a asm file that is a table of references
to these locations, using the found variable/function. Compiles it and
links it back into the original object file. This asm file is called
".tmp_mc_<object_base_name>.s".
The problem here is that the objdump produced by the nds32 object file,
contains things that look like:
0000159a <.L3^B1>:
159a: c6 00 beqz38 $r6, 159a <.L3^B1>
159a: R_NDS32_9_PCREL_RELA .text+0x159e
159c: 84 d2 movi55 $r6, #-14
159e: 80 06 mov55 $r0, $r6
15a0: ec 3c addi10.sp #0x3c
Where ".L3^B1 is somehow selected as the "global" variable to index off of.
Then the assembly file that holds the mcount locations looks like this:
.section __mcount_loc,"a",@progbits
.align 2
.long .L3^B1 + -5522
.long .L3^B1 + -5384
.long .L3^B1 + -5270
.long .L3^B1 + -5098
.long .L3^B1 + -4970
.long .L3^B1 + -4758
.long .L3^B1 + -4122
[...]
And when it is compiled back to an object to link to the original object,
the compile fails on the "^" symbol.
Simple solution for now, is to have the perl script ignore using function
symbols that have an "^" in the name.
Link: https://lkml.kernel.org/r/20211014143507.4ad2c0f7@gandalf.local.home
Cc: stable@vger.kernel.org
Acked-by: Greentime Hu <green.hu@gmail.com>
Fixes: fbf58a52ac ("nds32/ftrace: Add RECORD_MCOUNT support")
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f60f574100 upstream.
From QPIC V2 onwards there is a separate register to read
last code word "QPIC_NAND_READ_LOCATION_LAST_CW_n".
qcom_nandc_read_cw_raw() is used to read only one code word
at a time. If we will configure number of code words to 1 in
in QPIC_NAND_DEV0_CFG0 register then QPIC controller thinks
its reading the last code word, since from QPIC V2 onwards
we are having separate register to read the last code word,
we have to configure "QPIC_NAND_READ_LOCATION_LAST_CW_n"
register to fetch data from controller buffer to system
memory.
Fixes: 503ee5aad4 ("mtd: rawnand: qcom: update last code word register")
Cc: stable@kernel.org
Signed-off-by: Md Sadre Alam <mdalam@codeaurora.org>
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Link: https://lore.kernel.org/linux-mtd/1630998357-1359-1-git-send-email-mdalam@codeaurora.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a3fd1a986e upstream.
We need to define the codec pin 0x1b to be the mic, but somehow
the mic doesn't support hot plugging detection, and Windows also has
this issue, so we set it to phantom headset-mic.
Also the determine_headset_type() often returns the omtp type by a
mistake when we plug a ctia headset, this makes the mic can't record
sound at all. Because most of the headset are ctia type nowadays and
some machines have the fixed ctia type audio jack, it is possible this
machine has the fixed ctia jack too. Here we set this mic jack to
fixed ctia type, this could avoid the mic type detection mistake and
make the ctia headset work stable.
BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=214537
Reported-and-tested-by: msd <msd.mmq@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Hui Wang <hui.wang@canonical.com>
Link: https://lore.kernel.org/r/20211012114748.5238-1-hui.wang@canonical.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit dd6dd6e3c7 upstream.
This applies a SND_PCI_QUIRK(...) to the TongFang PHxTxX1 barebone. This
fixes the issue of the internal Microphone not working after booting
another OS.
When booting a certain another OS this barebone keeps some coeff settings
even after a cold shutdown. These coeffs prevent the microphone detection
from working in Linux, making the Laptop think that there is always an
external microphone plugged-in and therefore preventing the use of the
internal one.
The relevant indexes and values where gathered by naively diff-ing and
reading a working and a non-working coeff dump.
Signed-off-by: Werner Sembach <wse@tuxedocomputers.com>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20211006130415.538243-1-wse@tuxedocomputers.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2b987fe844 upstream.
The headphone mic is not working on Dell Latitude laptops with ALC3254.
The codec vendor id is 0x10ec0295 and share the same pincfg as defined
in ALC295_STANDARD_PINS. So the ALC269_FIXUP_DELL1_MIC_NO_PRESENCE will
be applied per alc269_pin_fixup_tbl[] but actually the headphone mic is
using NID 0x1b instead of 0x1a. The ALC269_FIXUP_DELL4_MIC_NO_PRESENCE
need to be applied instead.
Use ALC269_FIXUP_DELL4_MIC_NO_PRESENCE for particular models before
a generic fixup comes out.
Signed-off-by: Chris Chiu <chris.chiu@canonical.com>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20211001062856.1037901-1-chris.chiu@canonical.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1f8763c59c upstream.
John Keeping reported and posted a patch for a potential UAF in
rawmidi sequencer destruction: the snd_rawmidi_dev_seq_free() may be
called after the associated rawmidi object got already freed.
After a deeper look, it turned out that the bug is rather the
incorrect private_free call order for a snd_seq_device. The
snd_seq_device private_free gets called at the release callback of the
sequencer device object, while this was rather expected to be executed
at the snd_device call chains that runs at the beginning of the whole
card-free procedure. It's been broken since the rewrite of
sequencer-device binding (although it hasn't surfaced because the
sequencer device release happens usually right along with the card
device release).
This patch corrects the private_free call to be done in the right
place, at snd_seq_device_dev_free().
Fixes: 7c37ae5c62 ("ALSA: seq: Rewrite sequencer device binding with standard bus")
Reported-and-tested-by: John Keeping <john@metanate.com>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20210930114114.8645-1-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 228af5a4fa upstream.
Michael Forney reported an incorrect padding type that was defined in
the commit 80fe7430c7 ("ALSA: add new 32-bit layout for
snd_pcm_mmap_status/control") for PCM control mmap data.
His analysis is correct, and this caused the misplacements of PCM
control data on 32bit arch and 32bit compat mode.
The bug is that the __pad2 definition in __snd_pcm_mmap_control64
struct was wrongly with __pad_before_uframe, which should have been
__pad_after_uframe instead. This struct is used in SYNC_PTR ioctl and
control mmap. Basically this bug leads to two problems:
- The offset of avail_min field becomes wrong, it's placed right after
appl_ptr without padding on little-endian
- When appl_ptr and avail_min are read as 64bit values in kernel side,
the values become either zero or corrupted (mixed up)
One good news is that, because both user-space and kernel
misunderstand the wrong offset, at least, 32bit application running on
32bit kernel works as is. Also, 64bit applications are unaffected
because the padding size is zero. The remaining problem is the 32bit
compat mode; as mentioned in the above, avail_min is placed right
after appl_ptr on little-endian archs, 64bit kernel reads bogus values
for appl_ptr updates, which may lead to streaming bugs like jumping,
XRUN or whatever unexpected.
(However, we haven't heard any serious bug reports due to this over
years, so practically seen, it's fairly safe to assume that the impact
by this bug is limited.)
Ideally speaking, we should correct the wrong mmap status control
definition. But this would cause again incompatibility with the
existing binaries, and fixing it (e.g. by renumbering ioctls) would be
really messy.
So, as of this patch, we only correct the behavior of 32bit compat
mode and keep the rest as is. Namely, the SYNC_PTR ioctl is now
handled differently in compat mode to read/write the 32bit values at
the right offsets. The control mmap of 32bit apps on 64bit kernels
has been already disabled (which is likely rather an overlook, but
this worked fine at this time :), so covering SYNC_PTR ioctl should
suffice as a fallback.
Fixes: 80fe7430c7 ("ALSA: add new 32-bit layout for snd_pcm_mmap_status/control")
Reported-by: Michael Forney <mforney@mforney.org>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Cc: <stable@vger.kernel.org>
Cc: Rich Felker <dalias@libc.org>
Link: https://lore.kernel.org/r/29QBMJU8DE71E.2YZSH8IHT5HMH@mforney.org
Link: https://lore.kernel.org/r/20211010075546.23220-1-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit f067d5585c ]
The bytes for max_power_out from the ibm-cffps devices differ in byte
order for some power supplies.
The Witherspoon power supply returns the bytes in MSB/LSB order.
The Rainier power supply returns the bytes in LSB/MSB order.
The Witherspoon power supply uses version cffps1. The Rainier power
supply should use version cffps2. If version is cffps1, swap the bytes
before output to max_power_out.
Tested:
Witherspoon before: 3148. Witherspoon after: 3148.
Rainier before: 53255. Rainier after: 2000.
Signed-off-by: Brandon Wyman <bjwyman@gmail.com>
Reviewed-by: Eddie James <eajames@linux.ibm.com>
Link: https://lore.kernel.org/r/20210928205051.1222815-1-bjwyman@gmail.com
[groeck: Replaced yoda programming]
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f792565326 ]
Users of rdpmc rely on the mmapped user page to calculate accurate
time_enabled. Currently, userpage->time_enabled is only updated when the
event is added to the pmu. As a result, inactive event (due to counter
multiplexing) does not have accurate userpage->time_enabled. This can
be reproduced with something like:
/* open 20 task perf_event "cycles", to create multiplexing */
fd = perf_event_open(); /* open task perf_event "cycles" */
userpage = mmap(fd); /* use mmap and rdmpc */
while (true) {
time_enabled_mmap = xxx; /* use logic in perf_event_mmap_page */
time_enabled_read = read(fd).time_enabled;
if (time_enabled_mmap > time_enabled_read)
BUG();
}
Fix this by updating userpage for inactive events in merge_sched_in.
Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reported-and-tested-by: Lucian Grijincu <lucian@fb.com>
Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20210929194313.2398474-1-songliubraving@fb.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 103bde372f ]
When CONFIG_INET is not set, there are failing references to IPv4
functions, so make this driver depend on INET.
Fixes these build errors:
sparc64-linux-ld: drivers/net/ethernet/sun/sunvnet_common.o: in function `sunvnet_start_xmit_common':
sunvnet_common.c:(.text+0x1a68): undefined reference to `__icmp_send'
sparc64-linux-ld: drivers/net/ethernet/sun/sunvnet_common.o: in function `sunvnet_poll_common':
sunvnet_common.c:(.text+0x358c): undefined reference to `ip_send_check'
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Aaron Young <aaron.young@oracle.com>
Cc: Rashmi Narasimhan <rashmi.narasimhan@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 9b3b353ef3 ]
Commit 9d682ea6bc ("vboxsf: Fix the check for the old binary
mount-arguments struct") was meant to fix a build error due to sign
mismatch in 'char' and the use of character constants, but it just moved
the error elsewhere, in that on some architectures characters and signed
and on others they are unsigned, and that's just how the C standard
works.
The proper fix is a simple "don't do that then". The code was just
being silly and odd, and it should never have cared about signed vs
unsigned characters in the first place, since what it is testing is not
four "characters", but four bytes.
And the way to compare four bytes is by using "memcmp()".
Which compilers will know to just turn into a single 32-bit compare with
a constant, as long as you don't have crazy debug options enabled.
Link: https://lore.kernel.org/lkml/20210927094123.576521-1-arnd@kernel.org/
Cc: Arnd Bergmann <arnd@kernel.org>
Cc: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 763716a55c ]
This patch is a replication of Christian Lamparter's "net: bgmac-bcma:
handle deferred probe error due to mac-address" patch for the
bgmac-platform driver [1].
As is the case with the bgmac-bcma driver, this change is to cover the
scenario where the MAC address cannot yet be discovered due to reliance
on an nvmem provider which is yet to be instantiated, resulting in a
random address being assigned that has to be manually overridden.
[1] https://lore.kernel.org/netdev/20210919115725.29064-1-chunkeey@gmail.com
Signed-off-by: Matthew Hagan <mnhagan88@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b193e15ac6 ]
We observed below report when playing with netlink sock:
UBSAN: shift-out-of-bounds in net/sched/sch_api.c:580:10
shift exponent 249 is too large for 32-bit type
CPU: 0 PID: 685 Comm: a.out Not tainted
Call Trace:
dump_stack_lvl+0x8d/0xcf
ubsan_epilogue+0xa/0x4e
__ubsan_handle_shift_out_of_bounds+0x161/0x182
__qdisc_calculate_pkt_len+0xf0/0x190
__dev_queue_xmit+0x2ed/0x15b0
it seems like kernel won't check the stab log value passing from
user, and will use the insane value later to calculate pkt_len.
This patch just add a check on the size/cell_log to avoid insane
calculation.
Reported-by: Abaci <abaci@linux.alibaba.com>
Signed-off-by: Michael Wang <yun.wang@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 4bb0bd81ce ]
When we have several pending signals, have entered with the kernel
with large exception frame *and* have already built at least one
sigframe, regs->stkadj is going to be non-zero and regs->format/sr/pc
are going to be junk - the real values are in shifted exception stack
frame we'd built when putting together the first sigframe.
If that happens, subsequent sigframes are going to be garbage.
Not hard to fix - just need to find the "adjusted" frame first
and look for format/vector/sr/pc in it.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Tested-by: Michael Schmitz <schmitzmic@gmail.com>
Reviewed-by: Michael Schmitz <schmitzmic@gmail.com>
Tested-by: Finn Thain <fthain@linux-m68k.org>
Link: https://lore.kernel.org/r/YP2dBIAPTaVvHiZ6@zeniv-ca.linux.org.uk
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 7970a19b71 ]
The ipv4 and device notifiers are called with RTNL mutex held.
The table walk can take some time, better not block other RTNL users.
'ip a' has been reported to block for up to 20 seconds when conntrack table
has many entries and device down events are frequent (e.g., PPP).
Reported-and-tested-by: Martin Zaharinov <micron10@gmail.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 30db406923 ]
masq_inet6_event is called asynchronously from system work queue,
because the inet6 notifier is atomic and nf_iterate_cleanup can sleep.
The ipv4 and device notifiers call nf_iterate_cleanup directly.
This is legal, but these notifiers are called with RTNL mutex held.
A large conntrack table with many devices coming and going will have severe
impact on the system usability, with 'ip a' blocking for several seconds.
This change places the defer code into a helper and makes it more
generic so ipv4 and ifdown notifiers can be converted to defer the
cleanup walk as well in a follow patch.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 8a8e1813ff ]
Invoke release_firmware() when the firmware fails to boot in
sof_probe_continue().
The request_firmware() framework must be informed of failures in
sof_probe_continue() otherwise its internal "batching"
feature (different from caching) cached the firmware image
forever. Attempts to correct the file in /lib/firmware/ were then
silently and confusingly ignored until the next reboot. Unloading the
drivers did not help because from their disconnected perspective the
firmware had failed so there was nothing to release.
Also leverage the new snd_sof_fw_unload() function to simplify the
snd_sof_device_remove() function.
Signed-off-by: Marc Herbert <marc.herbert@intel.com>
Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Reviewed-by: Guennadi Liakhovetski <guennadi.liakhovetski@linux.intel.com>
Reviewed-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@linux.intel.com>
Link: https://lore.kernel.org/r/20210916085008.28929-1-peter.ujfalusi@linux.intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 0c8fbaa553 ]
Add the new PIDs to wacom_wac.c to support the new models in the Intuos series.
[jkosina@suse.cz: fix changelog]
Signed-off-by: Joshua Dickens <joshua.dickens@wacom.com>
Reviewed-by: Ping Cheng <ping.cheng@wacom.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 310e2d43c3 ]
ip6tables only sets the `IP6T_F_PROTO` flag on a rule if a protocol is
specified (`-p tcp`, for example). However, if the flag is not set,
`ip6_packet_match` doesn't call `ipv6_find_hdr` for the skb, in which
case the fragment offset is left uninitialized and a garbage value is
passed to each matcher.
Signed-off-by: Jeremy Sowden <jeremy@azazel.net>
Reviewed-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 7b9cf90366 ]
USB-audio driver assumes that the normal resume would preserve the
device configuration while reset_resume wouldn't, and tries to restore
the mixer elements only at reset_resume callback. However, this seems
too naive, and some devices do behave differently, resetting the
volume at the normal resume; this resulted in the inconsistent volume
that surprised users.
This patch changes the mixer resume code to handle both the normal and
reset resume in the same way, always restoring the original mixer
element values. This allows us to unify the both callbacks as well as
dropping the no longer used reset_resume field, which ends up with a
good code reduction.
A slight behavior change by this patch is that now we assign
restore_mixer_value() as the default resume callback, and the function
is no longer called at reset-resume when the resume callback is
overridden by the quirk function. That is, if needed, the quirk
resume function would have to handle similarly as
restore_mixer_value() by itself.
Reported-by: En-Shuo Hsu <enshuo@chromium.org>
Cc: Yu-Hsuan Hsu <yuhsuan@chromium.org>
Link: https://lore.kernel.org/r/CADDZ45UPsbpAAqP6=ZkTT8BE-yLii4Y7xSDnjK550G2DhQsMew@mail.gmail.com
Link: https://lore.kernel.org/r/20210910105155.12862-1-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 64794d6db4 ]
Loud Technologies Mackie Onyx 1640i (former model) is identified as
the model which uses OXFW971. The analysis of packet dump shows that
it transfers events in blocking method of IEC 61883-6, however the
default behaviour of ALSA oxfw driver is for non-blocking method.
This commit adds code to detect it assuming that all of loud models
based on OXFW971 have such quirk. It brings no functional change
except for alignment rule of PCM buffer.
Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp>
Link: https://lore.kernel.org/r/20210913021042.10085-1-o-takashi@sakamocchi.jp
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 55ce2f649b ]
Current error path of ext4_write_inline_data_end() is not correct.
Firstly, it should pass out the error value if ext4_get_inode_loc()
return fail, or else it could trigger infinite loop if we inject error
here. And then it's better to add inode to orphan list if it return fail
in ext4_journal_stop(), otherwise we could not restore inline xattr
entry after power failure. Finally, we need to reset the 'ret' value if
ext4_write_inline_data_end() return success in ext4_write_end() and
ext4_journalled_write_end(), otherwise we could not get the error return
value of ext4_journal_stop().
Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Link: https://lore.kernel.org/r/20210716122024.1105856-3-yi.zhang@huawei.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 4df031ff58 ]
After commit 3da40c7b08 ("ext4: only call ext4_truncate when size <=
isize"), i_disksize could always be updated to i_size in ext4_setattr(),
and we could sure that i_disksize <= i_size since holding inode lock and
if i_disksize < i_size there are delalloc writes pending in the range
upto i_size. If the end of the current write is <= i_size, there's no
need to touch i_disksize since writeback will push i_disksize upto
i_size eventually. So we can switch to check i_size instead of
i_disksize in ext4_da_write_end() when write to the end of the file.
we also could remove ext4_mark_inode_dirty() together because we defer
inode dirtying to generic_write_end() or ext4_da_write_inline_data_end().
Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Link: https://lore.kernel.org/r/20210716122024.1105856-2-yi.zhang@huawei.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit b44d52a50b upstream.
A packet received on a trunk will have bit 2 set in Forward DSA tagged
frame. Bit 1 can be either 0 or 1 and is otherwise undefined and bit 0
indicates the frame CFI. Masking with 7 thus results in frames as
being identified as being from a trunk when in fact they are not. Fix
the mask to just look at bit 2.
Fixes: 5b60dadb71 ("net: dsa: tag_dsa: Support reception of packets from LAG devices")
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6e3cd95234 upstream.
On recent Intel systems the HPET stops working when the system reaches PC10
idle state.
The approach of adding PCI ids to the early quirks to disable HPET on
these systems is a whack a mole game which makes no sense.
Check for PC10 instead and force disable HPET if supported. The check is
overbroad as it does not take ACPI, intel_idle enablement and command
line parameters into account. That's fine as long as there is at least
PMTIMER available to calibrate the TSC frequency. The decision can be
overruled by adding "hpet=force" on the kernel command line.
Remove the related early PCI quirks for affected Ice Cake and Coffin Lake
systems as they are not longer required. That should also cover all
other systems, i.e. Tiger Rag and newer generations, which are most
likely affected by this as well.
Fixes: Yet another hardware trainwreck
Reported-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Rafael J. Wysocki <rafael@kernel.org>
Cc: stable@vger.kernel.org
Cc: Kai-Heng Feng <kai.heng.feng@canonical.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3958b9c34c upstream.
Commit
3c73b81a91 ("x86/entry, selftests: Further improve user entry sanity checks")
added a warning if AC is set when in the kernel.
Commit
662a022189 ("x86/entry: Fix AC assertion")
changed the warning to only fire if the CPU supports SMAP.
However, the warning can still trigger on a machine that supports SMAP
but where it's disabled in the kernel config and when running the
syscall_nt selftest, for example:
------------[ cut here ]------------
WARNING: CPU: 0 PID: 49 at irqentry_enter_from_user_mode
CPU: 0 PID: 49 Comm: init Tainted: G T 5.15.0-rc4+ #98 e6202628ee053b4f310759978284bd8bb0ce6905
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014
RIP: 0010:irqentry_enter_from_user_mode
...
Call Trace:
? irqentry_enter
? exc_general_protection
? asm_exc_general_protection
? asm_exc_general_protectio
IS_ENABLED(CONFIG_X86_SMAP) could be added to the warning condition, but
even this would not be enough in case SMAP is disabled at boot time with
the "nosmap" parameter.
To be consistent with "nosmap" behaviour, clear X86_FEATURE_SMAP when
!CONFIG_X86_SMAP.
Found using entry-fuzz + satrandconfig.
[ bp: Massage commit message. ]
Fixes: 3c73b81a91 ("x86/entry, selftests: Further improve user entry sanity checks")
Fixes: 662a022189 ("x86/entry: Fix AC assertion")
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20211003223423.8666-1-vegard.nossum@oracle.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d298b03506 upstream.
Ser Olmy reported a boot failure:
init[1] bad frame in sigreturn frame:(ptrval) ip:b7c9fbe6 sp:bf933310 orax:ffffffff \
in libc-2.33.so[b7bed000+156000]
Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
CPU: 0 PID: 1 Comm: init Tainted: G W 5.14.9 #1
Hardware name: Hewlett-Packard HP PC/HP Board, BIOS JD.00.06 12/06/2001
Call Trace:
dump_stack_lvl
dump_stack
panic
do_exit.cold
do_group_exit
get_signal
arch_do_signal_or_restart
? force_sig_info_to_task
? force_sig
exit_to_user_mode_prepare
syscall_exit_to_user_mode
do_int80_syscall_32
entry_INT80_32
on an old 32-bit Intel CPU:
vendor_id : GenuineIntel
cpu family : 6
model : 6
model name : Celeron (Mendocino)
stepping : 5
microcode : 0x3
Ser bisected the problem to the commit in Fixes.
tglx suggested reverting the rejection of invalid MXCSR values which
this commit introduced and replacing it with what the old code did -
simply masking them out to zero.
Further debugging confirmed his suggestion:
fpu->state.fxsave.mxcsr: 0xb7be13b4, mxcsr_feature_mask: 0xffbf
WARNING: CPU: 0 PID: 1 at arch/x86/kernel/fpu/signal.c:384 __fpu_restore_sig+0x51f/0x540
so restore the original behavior only for 32-bit kernels where you have
ancient machines with buggy hardware. For 32-bit programs on 64-bit
kernels, user space which supplies wrong MXCSR values is considered
malicious so fail the sigframe restoration there.
Fixes: 6f9866a166 ("x86/fpu/signal: Let xrstor handle the features to init")
Reported-by: Ser Olmy <ser.olmy@protonmail.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Tested-by: Ser Olmy <ser.olmy@protonmail.com>
Cc: <stable@vger.kernel.org>
Link: https://lkml.kernel.org/r/YVtA67jImg3KlBTw@zn.tnic
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit eb8257a121 ]
On pseries LPAR when an empty slot is assigned to partition OR in single
LPAR mode, kdump kernel crashes during issuing PHB reset.
In the kdump scenario, we traverse all PHBs and issue reset using the
pe_config_addr of the first child device present under each PHB. However
the code assumes that none of the PHB slots can be empty and uses
list_first_entry() to get the first child device under the PHB. Since
list_first_entry() expects the list to be non-empty, it returns an
invalid pci_dn entry and ends up accessing NULL phb pointer under
pci_dn->phb causing kdump kernel crash.
This patch fixes the below kdump kernel crash by skipping empty slots:
audit: initializing netlink subsys (disabled)
thermal_sys: Registered thermal governor 'fair_share'
thermal_sys: Registered thermal governor 'step_wise'
cpuidle: using governor menu
pstore: Registered nvram as persistent store backend
Issue PHB reset ...
audit: type=2000 audit(1631267818.000:1): state=initialized audit_enabled=0 res=1
BUG: Kernel NULL pointer dereference on read at 0x00000268
Faulting instruction address: 0xc000000008101fb0
Oops: Kernel access of bad area, sig: 7 [#1]
LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA pSeries
Modules linked in:
CPU: 7 PID: 1 Comm: swapper/7 Not tainted 5.14.0 #1
NIP: c000000008101fb0 LR: c000000009284ccc CTR: c000000008029d70
REGS: c00000001161b840 TRAP: 0300 Not tainted (5.14.0)
MSR: 8000000002009033 <SF,VEC,EE,ME,IR,DR,RI,LE> CR: 28000224 XER: 20040002
CFAR: c000000008101f0c DAR: 0000000000000268 DSISR: 00080000 IRQMASK: 0
...
NIP pseries_eeh_get_pe_config_addr+0x100/0x1b0
LR __machine_initcall_pseries_eeh_pseries_init+0x2cc/0x350
Call Trace:
0xc00000001161bb80 (unreliable)
__machine_initcall_pseries_eeh_pseries_init+0x2cc/0x350
do_one_initcall+0x60/0x2d0
kernel_init_freeable+0x350/0x3f8
kernel_init+0x3c/0x17c
ret_from_kernel_thread+0x5c/0x64
Fixes: 5a090f7c36 ("powerpc/pseries: PCIE PHB reset")
Signed-off-by: Mahesh Salgaonkar <mahesh@linux.ibm.com>
[mpe: Tweak wording and trim oops]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/163215558252.413351.8600189949820258982.stgit@jupiter
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit d93f9e2374 ]
At interrupt exit, kuap_kernel_restore() calls kuap_unlock() with the
value contained in regs->kuap. However, when regs->kuap contains
0xffffffff it means that KUAP was not unlocked so calling kuap_unlock()
is unrelevant and results in jeopardising the contents of kernel space
segment registers.
So check that regs->kuap doesn't contain KUAP_NONE before calling
kuap_unlock(). In the meantime it also means that if KUAP has not
been correcly locked back at interrupt exit, it must be locked
before continuing. This is done by checking the content of
current->thread.kuap which was returned by kuap_get_and_assert_locked()
Fixes: 16132529ce ("powerpc/32s: Rework Kernel Userspace Access Protection")
Reported-by: Stan Johnson <userm57@yahoo.com>
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/0d0c4d0f050a637052287c09ba521bad960a2790.1631715131.git.christophe.leroy@csgroup.eu
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f08fb25bc6 ]
The machine check handler is not considered NMI on 64s. The early
handler is the true NMI handler, and then it schedules the
machine_check_exception handler to run when interrupts are enabled.
This works fine except the case of an unrecoverable MCE, where the true
NMI is taken when MSR[RI] is clear, it can not recover, so it calls
machine_check_exception directly so something might be done about it.
Calling an async handler from NMI context can result in irq state and
other things getting corrupted. This can also trigger the BUG at
arch/powerpc/include/asm/interrupt.h:168
BUG_ON(!arch_irq_disabled_regs(regs) && !(regs->msr & MSR_EE));
Fix this by making an _async version of the handler which is called
in the normal case, and a NMI version that is called for unrecoverable
interrupts.
Fixes: 2b43dd7653 ("powerpc/64: enable MSR[EE] in irq replay pt_regs")
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Tested-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20211004145642.1331214-6-npiggin@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit d0afd44c05 ]
_exception can be called by machine check handlers when the MCE hits
user code (e.g., pseries and powernv). This will enable local irqs
because, which is a dicey thing to do in NMI or hard irq context.
This seemed to worked out okay because a userspace MCE can basically be
treated like a synchronous interrupt (after async / imprecise MCEs are
filtered out). Since NMI and hard irq handlers have started growing
nmi_enter / irq_enter, and more irq state sanity checks, this has
started to cause problems (or at least trigger warnings).
The Fixes tag to the commit which introduced this rather than try to
work out exactly which commit was the first that could possibly cause a
problem because that may be difficult to prove.
Fixes: 9f2f79e3a3 ("powerpc: Disable interrupts in 64-bit kernel FP and vector faults")
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20211004145642.1331214-3-npiggin@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit dc02368164 ]
Commit e31694e0a7 ("objtool: Don't make .altinstructions writable")
aligned objtool-created and kernel-created .altinstructions section
flags, but there remains a minor discrepency in their use of a section
entry size: objtool sets one while the kernel build does not.
While sh_entsize of sizeof(struct alt_instr) seems intuitive, this small
deviation can cause failures with external tooling (kpatch-build).
Fix this by creating new .altinstructions sections with sh_entsize of 0
and then later updating sec->sh_size as alternatives are added to the
section. An added benefit is avoiding the data descriptor and buffer
created by elf_create_section(), but previously unused by
elf_add_alternative().
Fixes: 9bc0bb5072 ("objtool/x86: Rewrite retpoline thunk calls")
Signed-off-by: Joe Lawrence <joe.lawrence@redhat.com>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20210822225037.54620-2-joe.lawrence@redhat.com
Cc: Andy Lavr <andy.lavr@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: x86@kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 258aad75c6 ]
Commit d39df15851 ("scsi: iscsi: Have abort handler get ref to conn")
added iscsi_get_conn()/iscsi_put_conn() calls during abort handling but
then also changed the handling of the case where we detect an already
completed task where we now end up doing a goto to the common put/cleanup
code. This results in a iscsi_task use after free, because the common
cleanup code will do a put on the iscsi_task.
This reverts the goto and moves the iscsi_get_conn() to after we've checked
if the iscsi_task is valid.
Link: https://lore.kernel.org/r/20211004210608.9962-1-michael.christie@oracle.com
Fixes: d39df15851 ("scsi: iscsi: Have abort handler get ref to conn")
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 59a4e0d551 ]
As far as I can tell this should be enabled on rv32 as well, I'm not
sure why it's rv64-only. checksyscalls is complaining about our lack of
clone3() on rv32.
Fixes: 56ac5e2139 ("riscv: enable sys_clone3 syscall for rv64")
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit fa1049135c ]
Change setting for 400KHz frequency support by more accurate value.
Fixes: 66b0c2846b ("i2c: mlxcpld: Add support for I2C bus frequency setting")
Signed-off-by: Vadim Pasternak <vadimp@nvidia.com>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 52f57396c7 ]
Value for getting frequency capability wrongly has been taken from
register offset instead of register value.
Fixes: 66b0c2846b ("i2c: mlxcpld: Add support for I2C bus frequency setting")
Signed-off-by: Vadim Pasternak <vadimp@nvidia.com>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 8bb0ab3ae7 ]
riscv architectures relying on mmap_sem for write in their
arch_setup_additional_pages. If the waiting task gets killed by the oom
killer it would block oom_reaper from asynchronous address space reclaim
and reduce the chances of timely OOM resolving. Wait for the lock in
the killable mode and return with EINTR if the task got killed while
waiting.
Signed-off-by: Tong Tiangen <tongtiangen@huawei.com>
Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Fixes: 76d2a0493a ("RISC-V: Init and Halt Code")
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 78a743cd82 ]
As commit 601255ae3c ("arm64: vdso: move data page before code pages"), the
same issue exists on riscv, testcase is shown below, make sure that vdso.so is
bigger than page size,
struct timespec tp;
clock_gettime(5, &tp);
printf("tv_sec: %ld, tv_nsec: %ld\n", tp.tv_sec, tp.tv_nsec);
without this patch, test result : tv_sec: 0, tv_nsec: 0
with this patch, test result : tv_sec: 1629271537, tv_nsec: 748000000
Move the vdso data page in front of the VDSO area to fix the issue.
Fixes: ad5d1122b8 ("riscv: use vDSO common flow to reduce the latency of the time-related functions")
Signed-off-by: Tong Tiangen <tongtiangen@huawei.com>
Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit bb4a23c994 ]
The asm/vdso.h will be included in vdso.lds.S in the next patch, the
following cleanup is needed to avoid syntax error:
1.the declaration of sys_riscv_flush_icache() is moved into asm/syscall.h.
2.the definition of struct vdso_data is moved into kernel/vdso.c.
2.the definition of VDSO_SYMBOL is placed under "#ifndef __ASSEMBLY__".
Also remove the redundant linux/types.h include.
Signed-off-by: Tong Tiangen <tongtiangen@huawei.com>
Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit a290f510a1 ]
We don't have a VDSO for the !MMU configurations, so don't try to build
one.
Fixes: fde9c59aeb ("riscv: explicitly use symbol offsets for VDSO")
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit fde9c59aeb ]
The current implementation of the `__rt_sigaction` reference computed an
absolute offset relative to the mapped base of the VDSO. While this can
be handled in the medlow model, the medany model cannot handle this as
it is meant to be position independent. The current implementation
relied on the BFD linker relaxing the PC-relative relocation into an
absolute relocation as it was a near-zero address allowing it to be
referenced relative to `zero`.
We now extract the offsets and create a generated header allowing the
build with LLVM and lld to succeed as we no longer depend on the linker
rewriting address references near zero. This change was largely
modelled after the ARM64 target which does something similar.
Signed-off-by: Saleem Abdulrasool <abdulras@google.com>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 6558b646ce ]
acpi_i2c_find_adapter_by_handle() calls bus_find_device() which takes a
reference on the adapter which is never released which will result in a
reference count leak and render the adapter unremovable. Make sure to
put the adapter after creating the client in the same manner that we do
for OF.
Fixes: 525e6fabea ("i2c / ACPI: add support for ACPI reconfigure notifications")
Signed-off-by: Jamie Iles <quic_jiles@quicinc.com>
Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com>
[wsa: fixed title]
Signed-off-by: Wolfram Sang <wsa@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 23c216b335 ]
According to dma-api.rst, the dma_get_required_mask() helper should return
"the mask that the platform requires to operate efficiently". Which in
the case of PPC64 means the bypass mask and not a mask from an IOMMU table
which is shorter and slower to use due to map/unmap operations (especially
expensive on "pseries").
However the existing implementation ignores the possibility of bypassing
and returns the IOMMU table mask on the pseries platform which makes some
drivers (mpt3sas is one example) choose 32bit DMA even though bypass is
supported. The powernv platform sort of handles it by having a bigger
default window with a mask >=40 but it only works as drivers choose
63/64bit if the required mask is >32 which is rather pointless.
This reintroduces the bypass capability check to let drivers make
a better choice of the DMA mask.
Fixes: f1565c24b5 ("powerpc: use the generic dma_ops_bypass mode")
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210930034454.95794-1-aik@ozlabs.ru
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 8d6c414cd2 ]
The commit 6da5b0f027 ("net: ensure unbound datagram socket to be
chosen when not in a VRF") modified compute_score() so that a device
match is always made, not just in the case of an l3mdev skb, then
increments the score also for unbound sockets. This ensures that
sockets bound to an l3mdev are never selected when not in a VRF.
But as unbound and bound sockets are now scored equally, this results
in the last opened socket being selected if there are matches in the
default VRF for an unbound socket and a socket bound to a dev that is
not an l3mdev. However, handling prior to this commit was to always
select the bound socket in this case. Reinstate this handling by
incrementing the score only for bound sockets. The required isolation
due to choosing between an unbound socket and a socket bound to an
l3mdev remains in place due to the device match always being made.
The same approach is taken for compute_score() for stream sockets.
Fixes: 6da5b0f027 ("net: ensure unbound datagram socket to be chosen when not in a VRF")
Fixes: e78190581a ("net: ensure unbound stream socket to be chosen when not in a VRF")
Signed-off-by: Mike Manning <mmanning@vyatta.att-mail.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/cf0a8523-b362-1edf-ee78-eef63cbbb428@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 2e5a20573a ]
When VSI set up failed in i40e_probe() as part of PF switch set up
driver was trying to free misc IRQ vectors in
i40e_clear_interrupt_scheme and produced a kernel Oops:
Trying to free already-free IRQ 266
WARNING: CPU: 0 PID: 5 at kernel/irq/manage.c:1731 __free_irq+0x9a/0x300
Workqueue: events work_for_cpu_fn
RIP: 0010:__free_irq+0x9a/0x300
Call Trace:
? synchronize_irq+0x3a/0xa0
free_irq+0x2e/0x60
i40e_clear_interrupt_scheme+0x53/0x190 [i40e]
i40e_probe.part.108+0x134b/0x1a40 [i40e]
? kmem_cache_alloc+0x158/0x1c0
? acpi_ut_update_ref_count.part.1+0x8e/0x345
? acpi_ut_update_object_reference+0x15e/0x1e2
? strstr+0x21/0x70
? irq_get_irq_data+0xa/0x20
? mp_check_pin_attr+0x13/0xc0
? irq_get_irq_data+0xa/0x20
? mp_map_pin_to_irq+0xd3/0x2f0
? acpi_register_gsi_ioapic+0x93/0x170
? pci_conf1_read+0xa4/0x100
? pci_bus_read_config_word+0x49/0x70
? do_pci_enable_device+0xcc/0x100
local_pci_probe+0x41/0x90
work_for_cpu_fn+0x16/0x20
process_one_work+0x1a7/0x360
worker_thread+0x1cf/0x390
? create_worker+0x1a0/0x1a0
kthread+0x112/0x130
? kthread_flush_work_fn+0x10/0x10
ret_from_fork+0x1f/0x40
The problem is that at that point misc IRQ vectors
were not allocated yet and we get a call trace
that driver is trying to free already free IRQ vectors.
Add a check in i40e_clear_interrupt_scheme for __I40E_MISC_IRQ_REQUESTED
PF state before calling i40e_free_misc_vector. This state is set only if
misc IRQ vectors were properly initialized.
Fixes: c17401a1dd ("i40e: use separate state bit for miscellaneous IRQ setup")
Reported-by: PJ Waskiewicz <pwaskiewicz@jumptrading.com>
Signed-off-by: Sylwester Dziedziuch <sylwesterx.dziedziuch@intel.com>
Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com>
Tested-by: Dave Switzer <david.switzer@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 857b6c6f66 ]
The loop in i40e_get_capabilities can never end. The problem is that
although i40e_aq_discover_capabilities returns with an error if there's
a firmware problem, the returned error is not checked. There is a check for
pf->hw.aq.asq_last_status but that value is set to I40E_AQ_RC_OK on most
firmware problems.
When i40e_aq_discover_capabilities encounters a firmware problem, it will
encounter the same problem on its next invocation. As the result, the loop
becomes endless. We hit this with I40E_ERR_ADMIN_QUEUE_TIMEOUT but looking
at the code, it can happen with a range of other firmware errors.
I don't know what the correct behavior should be: whether the firmware
should be retried a few times, or whether pf->hw.aq.asq_last_status should
be always set to the encountered firmware error (but then it would be
pointless and can be just replaced by the i40e_aq_discover_capabilities
return value). However, the current behavior with an endless loop under the
rtnl mutex(!) is unacceptable and Intel has not submitted a fix, although we
explained the bug to them 7 months ago.
This may not be the best possible fix but it's better than hanging the whole
system on a firmware bug.
Fixes: 56a62fc868 ("i40e: init code and hardware support")
Tested-by: Stefan Assmann <sassmann@redhat.com>
Signed-off-by: Jiri Benc <jbenc@redhat.com>
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Dave Switzer <david.switzer@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit d343679919 ]
rtnl_fill_statsinfo() is filling skb with one mandatory if_stats_msg structure.
nlmsg_put(skb, pid, seq, type, sizeof(struct if_stats_msg), flags);
But if_nlmsg_stats_size() never considered the needed storage.
This bug did not show up because alloc_skb(X) allocates skb with
extra tailroom, because of added alignments. This could very well
be changed in the future to have deterministic behavior.
Fixes: 10c9ead9f3 ("rtnetlink: add new RTM_GETSTATS message to dump link stats")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Roopa Prabhu <roopa@nvidia.com>
Acked-by: Roopa Prabhu <roopa@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 922aa9bcac ]
Prevent possible crashes when cleaning up after unsuccessful
initializations.
Fixes: 893ce44df5 ("gve: Add basic driver framework for Compute Engine Virtual NIC")
Signed-off-by: Tao Liu <xliutaox@google.com>
Signed-off-by: Catherine Sully <csully@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit d03477ee10 ]
The qpl_map_size is rounded up to a multiple of sizeof(long), but the
number of qpls doesn't have to be.
Fixes: f5cedc84a3 ("gve: Add transmit and receive support")
Signed-off-by: Catherine Sullivan <csully@google.com>
Signed-off-by: Jeroen de Borst <jeroendb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit d4aeaed80b ]
The current implementation enable PCS EEE feature in the event of link
up, but PCS EEE feature is not disabled on link down.
This patch makes sure PCE EEE feature is disabled on link down.
Fixes: 656ed8b015 ("net: stmmac: fix EEE init issue when paired with EEE capable PHYs")
Signed-off-by: Wong Vee Khee <vee.khee.wong@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 590df78bc7 ]
When Energy-Efficient Ethernet(EEE) is disable from the MAC side,
we need to clear the DW_VR_MII_EEE_TRN_LPI bit of DW_VR_MII_EEE_MCTRL1
register.
Fixes: 7617af3d1a ("net: pcs: Introducing support for DWC xpcs Energy Efficient Ethernet")
Cc: Michael Sit Wei Hong <michael.wei.hong.sit@intel.com>
Signed-off-by: Wong Vee Khee <vee.khee.wong@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 11b8e2bb98 ]
The gbefb driver not only registers a driver but also the device for that
driver. This is all well and good when run on the IP32 machines that are
supported by the driver but since the driver supports building with
COMPILE_TEST we might also be building on other platforms which do not have
this hardware and will crash instantiating the driver. Add an IS_ENABLED()
check so we compile out the device registration if we don't have the Kconfig
option for the machine enabled.
Fixes: 552ccf6b25 ("video: fbdev: gbefb: add COMPILE_TEST support")
Signed-off-by: Mark Brown <broonie@kernel.org>
Cc: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210921212102.30803-1-broonie@kernel.org
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit c64c8e04a1 ]
Recent rework, which made HDMI PHY driver a platform device, inadvertely
reversed clock setup order. HW is very touchy about it. Proper way is to
handle controllers resets and clocks first and HDMI PHYs second.
Currently, without this fix, first mode set completely fails (nothing on
HDMI monitor) on H3 era PHYs. On H6, it still somehow work.
Move HDMI PHY reset & clocks handling to sun8i_hdmi_phy_init() which
will assure that code is executed after controllers reset & clocks are
handled. Additionally, add sun8i_hdmi_phy_deinit() which will deinit
them at controllers driver unload.
Tested on A64, H3, H6 and R40.
Fixes: 9bf3797796 ("drm/sun4i: dw-hdmi: Make HDMI PHY into a platform device")
Signed-off-by: Jernej Skrabec <jernej.skrabec@gmail.com>
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
Link: https://patchwork.freedesktop.org/patch/msgid/20210915175836.3158839-1-jernej.skrabec@gmail.com
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b13a270ace ]
Commit 94f6345712 ("bus: ti-sysc: Implement quirk handling for
CLKDM_NOAUTO") should have also added the quirk for dra7 dcan1 in
addition to dcan2 for errata i893 handling.
Let's also pass the quirk flag for legacy mode booting for if "ti,hwmods"
dts property is used with related dcan hwmod data. This should be only
needed if anybody needs to git bisect earlier stable trees though.
Fixes: 94f6345712 ("bus: ti-sysc: Implement quirk handling for CLKDM_NOAUTO")
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 248b061689 ]
In current code, when a PCI error state pci_channel_io_normal is detectd,
it will report PCI_ERS_RESULT_CAN_RECOVER status to PCI driver, and PCI
driver will continue the execution of PCI resume callback report_resume by
pci_walk_bridge, and the callback will go into amdgpu_pci_resume
finally, where write lock is releasd unconditionally without acquiring
such lock first. In this case, a deadlock will happen when other threads
start to acquire the read lock.
To fix this, add a member in amdgpu_device strucutre to cache
pci_channel_state, and only continue the execution in amdgpu_pci_resume
when it's pci_channel_io_frozen.
Fixes: c9a6b82f45 ("drm/amdgpu: Implement DPC recovery")
Suggested-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b072ef1215 ]
Memory is allocated for ttm->sg by kmalloc in kfd_mem_dmamap_userptr,
but isn't freed by kfree in kfd_mem_dmaunmap_userptr. Free it!
Fixes: 264fb4d332 ("drm/amdgpu: Add multi-GPU DMA mapping helpers")
Signed-off-by: Lang Yu <lang.yu@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 7707a4d01a ]
While existing code is correct, KCSAN is reporting
a data-race in netlink_insert / netlink_sendmsg [1]
It is correct to read nlk->bound without a lock, as netlink_autobind()
will acquire all needed locks.
[1]
BUG: KCSAN: data-race in netlink_insert / netlink_sendmsg
write to 0xffff8881031c8b30 of 1 bytes by task 18752 on cpu 0:
netlink_insert+0x5cc/0x7f0 net/netlink/af_netlink.c:597
netlink_autobind+0xa9/0x150 net/netlink/af_netlink.c:842
netlink_sendmsg+0x479/0x7c0 net/netlink/af_netlink.c:1892
sock_sendmsg_nosec net/socket.c:703 [inline]
sock_sendmsg net/socket.c:723 [inline]
____sys_sendmsg+0x360/0x4d0 net/socket.c:2392
___sys_sendmsg net/socket.c:2446 [inline]
__sys_sendmsg+0x1ed/0x270 net/socket.c:2475
__do_sys_sendmsg net/socket.c:2484 [inline]
__se_sys_sendmsg net/socket.c:2482 [inline]
__x64_sys_sendmsg+0x42/0x50 net/socket.c:2482
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x3d/0x90 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x44/0xae
read to 0xffff8881031c8b30 of 1 bytes by task 18751 on cpu 1:
netlink_sendmsg+0x270/0x7c0 net/netlink/af_netlink.c:1891
sock_sendmsg_nosec net/socket.c:703 [inline]
sock_sendmsg net/socket.c:723 [inline]
__sys_sendto+0x2a8/0x370 net/socket.c:2019
__do_sys_sendto net/socket.c:2031 [inline]
__se_sys_sendto net/socket.c:2027 [inline]
__x64_sys_sendto+0x74/0x90 net/socket.c:2027
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x3d/0x90 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x44/0xae
value changed: 0x00 -> 0x01
Reported by Kernel Concurrency Sanitizer on:
CPU: 1 PID: 18751 Comm: syz-executor.0 Not tainted 5.14.0-rc1-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Fixes: da314c9923 ("netlink: Replace rhash_portid with bound")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit e3cf002d5a ]
According to Synopsys DesignWare Cores Ethernet PCS databook, it is
required to disable Clause 37 auto-negotiation by programming bit-12
(AN_ENABLE) to 0 if it is already enabled, before programming various
fields of VR_MII_AN_CTRL registers.
After all these programming are done, it is then required to enable
Clause 37 auto-negotiation by programming bit-12 (AN_ENABLE) to 1.
Fixes: b97b5331b8 ("net: pcs: add C37 SGMII AN support for intel mGbE controller")
Cc: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Wong Vee Khee <vee.khee.wong@linux.intel.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Tested-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 0854a05133 ]
Commit de1799667b ("net: bridge: add STP xstats")
added an additional nla_reserve_64bit() in br_fill_linkxstats(),
but forgot to update br_get_linkxstats_size() accordingly.
This can trigger the following in rtnl_stats_get()
WARN_ON(err == -EMSGSIZE);
Fixes: de1799667b ("net: bridge: add STP xstats")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Vivien Didelot <vivien.didelot@gmail.com>
Cc: Nikolay Aleksandrov <nikolay@nvidia.com>
Acked-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit dbe0b88064 ]
bridge_fill_linkxstats() is using nla_reserve_64bit().
We must use nla_total_size_64bit() instead of nla_total_size()
for corresponding data structure.
Fixes: 1080ab95e3 ("net: bridge: add support for IGMP/MLD stats and export them via netlink")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Nikolay Aleksandrov <nikolay@nvidia.com>
Cc: Vivien Didelot <vivien.didelot@gmail.com>
Acked-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit fdddf8c3a4 ]
With patch "drm/i915/vbt: Fix backlight parsing for VBT 234+"
the size of bdb_lfp_backlight_data structure has been increased,
causing if-statement in the parse_lfp_backlight function
that comapres this structure size to the one retrieved from BDB,
always to fail for older revisions.
This patch calculates expected size of the structure for a given
BDB version and compares it with the value gathered from BDB.
Tested on Chromebook Pixelbook (Nocturne) (reports bdb->version = 221)
Fixes: d381baad29 ("drm/i915/vbt: Fix backlight parsing for VBT 234+")
Tested-by: Lukasz Majczak <lma@semihalf.com>
Signed-off-by: Lukasz Majczak <lma@semihalf.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210930134606.227234-1-lma@semihalf.com
(cherry picked from commit 4378daf5d0)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 544021e3f2 ]
When pipe A is disabled and MIPI DSI is enabled on pipe B,
the AMT KVMR feature will incorrectly see pipe A as enabled.
Set 0x42080 bit 23=1 before enabling DSI on pipe B and leave
it set while DSI is enabled on pipe B. No impact to setting
it all the time.
Changes since V5:
- Added reviewed-by
- Removed redundant braces and debug message format - Imre
Changes since V4:
- Modified function comment Wa_<number>:icl,jsl,ehl - Lucas
- Modified debug message in sync state - Imre
Changes since V3:
- More meaningful name to workaround - Imre
- Remove boolean check clear flag
- Add WA_verify hook in dsi sync_state
Changes since V2:
- Used REG_BIT, ignored pipe A and used sw state check - Jani
- Made function wrapper - Jani
Changes since V1:
- ./dim checkpatch errors addressed
Signed-off-by: Tejas Upadhyay <tejaskumarx.surendrakumar.upadhyay@intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210615105613.851491-1-tejaskumarx.surendrakumar.upadhyay@intel.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 783f3db030 ]
Any pending interrupt can prevent entering standby based power off state.
To avoid it, disable the GIC CPU interface.
Fixes: 8148d21360 ("ARM: imx6: register pm_power_off handler if "fsl,pmic-stby-poweroff" is set")
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 8b94aa318a ]
On the LS1028A this instance of the eSDHC controller is intended for
either an eMMC or eSDIO card. It doesn't provide a card detect pin and
its IO voltage is fixed at 1.8V.
Remove the bogus broken-cd property, instead add the non-removable
property. Fix the voltage-ranges property and set it to 1.8V only.
Fixes: 491d3a3fc1 ("arm64: dts: ls1028a: Add esdhc node in dts")
Signed-off-by: Michael Walle <michael@walle.cc>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 9786cca4b4 ]
The buck2 output of the PMIC is the VDD core voltage of the cpu.
Switching off this will poweroff the CPU. Add the 'regulator-always-on'
property to avoid this.
Fixes: 8668d8b2e6 ("arm64: dts: Add the Kontron i.MX8M Mini SoMs and baseboards")
Signed-off-by: Heiko Thiery <heiko.thiery@gmail.com>
Reviewed-by: Frieder Schrempf <frieder.schrempf@kontron.de>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 04aa946d57 ]
Before commit 0e30f47232 ("mtd: spi-nor: add support for DTR protocol"),
for all PP command, it only support 1-1-1 mode, no matter the tx setting
in dts. But after the upper commit, the logic change. It will choose
the best mode(fastest mode) which flash device and spi-nor host controller
both support.
qspi and fspi host controller do not support read 1-4-4 mode. so need to
set the tx to 1, let the common code finally select read 1-1-4 mode.
Signed-off-by: Haibo Chen <haibo.chen@nxp.com>
Fixes: 0e30f47232 ("mtd: spi-nor: add support for DTR protocol")
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b2a4f4a302 ]
Before commit 0e30f47232 ("mtd: spi-nor: add support for DTR protocol"),
for all PP command, it only support 1-1-1 mode, no matter the tx setting
in dts. But after the upper commit, the logic change. It will choose
the best mode(fastest mode) which flash device and spi-nor host controller
both support.
Though the spi-nor device on imx6sx-sdb/imx6ul(l/z)-14x14-evk board
do not support PP-1-4-4/PP-1-1-4, but if tx is 4 in dts file, it will also
impact the read mode selection. For the spi-nor device on the upper mentioned
boards, they support read 1-4-4 mode and read 1-1-4 mode according to the
device internal sfdp register. But qspi host controller do not support
read 1-4-4 mode. so need to set the tx to 1, let the common code finally
select read 1-1-4 mode, PP-1-1-1 mode.
Signed-off-by: Haibo Chen <haibo.chen@nxp.com>
Fixes: 0e30f47232 ("mtd: spi-nor: add support for DTR protocol")
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 7cd8b1542a ]
The driver can't be loaded automatically because it misses
module alias to be provided. Add corresponding MODULE_DEVICE_TABLE()
call to the driver.
Fixes: 863d08ece9 ("supports eg20t ptp clock")
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit eed183abc0 ]
Property phy-connection-type contains invalid value "sgmii-2500" per scheme
defined in file ethernet-controller.yaml.
Correct phy-connection-type value should be "2500base-x".
Signed-off-by: Pali Rohár <pali@kernel.org>
Fixes: 84e0f1c138 ("powerpc/mpc85xx: Add MDIO bus muxing support to the board device tree(s)")
Acked-by: Scott Wood <oss@buserror.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit aec3f415f7 ]
Commit 2d26f6e39a ("net: stmmac: dwmac-rk: fix unbalanced pm_runtime_enable warnings")
while getting rid of a runtime PM warning ended up breaking ethernet
on rk3399 based devices. By dropping an extra reference to the device,
the commit ends up enabling suspend / resume of the ethernet device -
which appears to be broken.
While the issue with runtime pm is being investigated, partially
revert commit 2d26f6e39a to restore the network on rk3399.
Fixes: 2d26f6e39a ("net: stmmac: dwmac-rk: fix unbalanced pm_runtime_enable warnings")
Suggested-by: Heiko Stuebner <heiko@sntech.de>
Signed-off-by: Punit Agrawal <punitagrawal@gmail.com>
Cc: Michael Riesch <michael.riesch@wolfvision.net>
Tested-by: Heiko Stuebner <heiko@sntech.de>
Link: https://lore.kernel.org/r/20210929135049.3426058-1-punitagrawal@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 019d9329e7 ]
When ocelot_flower.c calls ocelot_vcap_filter_add(), the filter has a
given filter->id.cookie. This filter is added to the block->rules list.
However, when ocelot_flower.c calls ocelot_vcap_block_find_filter_by_id()
which passes the cookie as argument, the filter is never found by
filter->id.cookie when searching through the block->rules list.
This is unsurprising, since the filter->id.cookie is an unsigned long,
but the cookie argument provided to ocelot_vcap_block_find_filter_by_id()
is a signed int, and the comparison fails.
Fixes: 50c6cc5b92 ("net: mscc: ocelot: store a namespaced VCAP filter ID")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://lore.kernel.org/r/20210930125330.2078625-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 4729445b47 ]
When fed an empty BPF object, bpftool gen skeleton -L crashes at
btf__set_fd() since it assumes presence of obj->btf, however for
the sequence below clang adds no .BTF section (hence no BTF).
Reproducer:
$ touch a.bpf.c
$ clang -O2 -g -target bpf -c a.bpf.c
$ bpftool gen skeleton -L a.bpf.o
/* SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause) */
/* THIS FILE IS AUTOGENERATED! */
struct a_bpf {
struct bpf_loader_ctx ctx;
Segmentation fault (core dumped)
The same occurs for files compiled without BTF info, i.e. without
clang's -g flag.
Fixes: 6723474373 (libbpf: Generate loader program out of BPF ELF file.)
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20210930061634.1840768-1-memxor@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f88c487634 ]
When setting number of completion EQs of the SF, consider number of
online CPUs.
Without this consideration, when number of online cpus are less than 8,
unnecessary 8 completion EQs are allocated.
Fixes: c36326d38d ("net/mlx5: Round-Robin EQs over IRQs")
Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit ac8b7d50ae ]
The maximum irq_index can be 2047, This means irq_name should have 4
characters reserve for the irq_index. Hence, increase it to 4.
Fixes: 3af26495a2 ("net/mlx5: Enlarge interrupt field in CREATE_EQ")
Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 99b9a678b2 ]
When in Real-time mode, HW clock is synced with the PTP daemon. Hence
driver should not re-calibrate the next pulse (via MTPPSE repetitive
events mechanism).
This patch arms repetitive events only in free-running mode.
Fixes: 432119de33 ("net/mlx5: Add cyc2time HW translation mode support")
Signed-off-by: Aya Levin <ayal@nvidia.com>
Reviewed-by: Eran Ben Elisha <eranbe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 6472829470 ]
Allow configuration of 1PPS start time only with time-stamp representing
a round second. Prior to this patch driver allowed setting of a
non-round-second which is not supported by the device. Avoid unexpected
behavior by restricting start-time configuration to a round-second.
Fixes: 4272f9b88d ("net/mlx5e: Change 1PPS out scheme")
Signed-off-by: Aya Levin <ayal@nvidia.com>
Reviewed-by: Eran Ben Elisha <eranbe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 9d758d4a3a ]
The value for maximum number of channels is first calculated based
on the netdev's profile and current function resources (specifically,
number of MSIX vectors, which depends among other things on the number
of online cores in the system).
This value is then used to calculate the netdev's number of rxqs/txqs.
Once created (by alloc_etherdev_mqs), the number of netdev's rxqs/txqs
is constant and we must not exceed it.
To achieve this, keep the maximum number of channels in sync upon any
netdevice re-attach.
Use mlx5e_get_max_num_channels() for calculating the number of netdev's
rxqs/txqs. After netdev is created, use mlx5e_calc_max_nch() (which
coinsiders core device resources, profile, and netdev) to init or
update priv->max_nch.
Before this patch, the value of priv->max_nch might get out of sync,
mistakenly allowing accesses to out-of-bounds objects, which would
crash the system.
Track the number of channels stats structures used in a separate
field, as they are persistent to suspend/resume operations. All the
collected stats of every channel index that ever existed should be
preserved. They are reset only when struct mlx5e_priv is,
in mlx5e_priv_cleanup(), which is part of the profile changing flow.
There is no point anymore in blocking a profile change due to max_nch
mismatch in mlx5e_netdev_change_profile(). Remove the limitation.
Fixes: a1f240f180 ("net/mlx5e: Adjust to max number of channles when re-attaching")
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Aya Levin <ayal@nvidia.com>
Reviewed-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f9a10440f0 ]
Currently in Rx data path IPsec crypto offloaded packets uses
csum_none flag, so checksum is handled by the stack, this naturally
have some performance/cpu utilization impact on such flows. As Nvidia
NIC starting from ConnectX6DX provides checksum complete value out of
the box also for such flows there is no sense in taking csum_none path,
furthermore the stack (xfrm) have the method to handle checksum complete
corrections for such flows i.e. IPsec trailer removal and consequently
checksum value adjustment.
Because of the above and in addition the ConnectX6DX is the first HW
which supports IPsec crypto offload then it is safe to report csum
complete for IPsec offloaded traffic.
Fixes: b2ac7541e3 ("net/mlx5e: IPsec: Add Connect-X IPsec Rx data path offload")
Signed-off-by: Raed Salem <raeds@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 30e29a9a2b ]
In prealloc_elems_and_freelist(), the multiplication to calculate the
size passed to bpf_map_area_alloc() could lead to an integer overflow.
As a result, out-of-bounds write could occur in pcpu_freelist_populate()
as reported by KASAN:
[...]
[ 16.968613] BUG: KASAN: slab-out-of-bounds in pcpu_freelist_populate+0xd9/0x100
[ 16.969408] Write of size 8 at addr ffff888104fc6ea0 by task crash/78
[ 16.970038]
[ 16.970195] CPU: 0 PID: 78 Comm: crash Not tainted 5.15.0-rc2+ #1
[ 16.970878] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014
[ 16.972026] Call Trace:
[ 16.972306] dump_stack_lvl+0x34/0x44
[ 16.972687] print_address_description.constprop.0+0x21/0x140
[ 16.973297] ? pcpu_freelist_populate+0xd9/0x100
[ 16.973777] ? pcpu_freelist_populate+0xd9/0x100
[ 16.974257] kasan_report.cold+0x7f/0x11b
[ 16.974681] ? pcpu_freelist_populate+0xd9/0x100
[ 16.975190] pcpu_freelist_populate+0xd9/0x100
[ 16.975669] stack_map_alloc+0x209/0x2a0
[ 16.976106] __sys_bpf+0xd83/0x2ce0
[...]
The possibility of this overflow was originally discussed in [0], but
was overlooked.
Fix the integer overflow by changing elem_size to u64 from u32.
[0] https://lore.kernel.org/bpf/728b238e-a481-eb50-98e9-b0f430ab01e7@gmail.com/
Fixes: 557c0c6e7d ("bpf: convert stackmap to pre-allocation")
Signed-off-by: Tatsuhiko Yasumatsu <th.yasumatsu@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20210930135545.173698-1-th.yasumatsu@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b232537074 ]
Starting with v5.15-rc1, we may now see some am335x beaglebone black
device produce the following error on pruss probe:
Unhandled fault: external abort on non-linefetch (0x1008) at 0xe0326000
This has started with the enabling of pruss for am335x in the dts files.
Turns out the is caused by the PRM reset handling not waiting for the
reset bit to clear. To fix the issue, let's always wait for the reset
bit to clear, even if there is a separate reset status register.
We attempted to fix a similar issue for dra7 iva with a udelay() in
commit effe89e400 ("soc: ti: omap-prm: Fix occasional abort on reset
deassert for dra7 iva"). There is no longer a need for the udelay()
for dra7 iva reset either with the check added for reset bit clearing.
Cc: Drew Fustini <pdp7pdp7@gmail.com>
Cc: Grygorii Strashko <grygorii.strashko@ti.com>
Cc: "H. Nikolaus Schaller" <hns@goldelico.com>
Cc: Robert Nelson <robertcnelson@gmail.com>
Cc: Yongqin Liu <yongqin.liu@linaro.org>
Fixes: effe89e400 ("soc: ti: omap-prm: Fix occasional abort on reset deassert for dra7 iva")
Reported-by: Matti Vaittinen <mazziesaccount@gmail.com>
Tested-by: Matti Vaittinen <matti.vaittinen@fi.rohmeurope.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 79e3445b38 ]
On ARM CPUs that lack div/mod instructions, ALU32 BPF_DIV and BPF_MOD are
implemented using a call to a helper function. Before, the emitted code
for those function calls failed to preserve caller-saved ARM registers.
Since some of those registers happen to be mapped to BPF registers, it
resulted in eBPF register values being overwritten.
This patch emits code to push and pop the remaining caller-saved ARM
registers r2-r3 into the stack during the div/mod function call. ARM
registers r0-r1 are used as arguments and return value, and those were
already saved and restored correctly.
Fixes: 39c13c204b ("arm: eBPF JIT compiler")
Signed-off-by: Johan Almbladh <johan.almbladh@anyfinetworks.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 2c964c5586 ]
Deactivate old rule first, then append the new rule, so rule replacement
notification via netlink first reports the deletion of the old rule with
handle X in first place, then it adds the new rule (reusing the handle X
of the replaced old rule).
Note that the abort path releases the transaction that has been created
by nft_delrule() on error.
Fixes: ca08987885 ("netfilter: nf_tables: deactivate expressions in rule replecement routine")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit e189ae161d ]
Add position handle to allow to identify the rule location from netlink
events. Otherwise, userspace cannot incrementally update a userspace
cache through monitoring events.
Skip handle dump if the rule has been either inserted (at the beginning
of the ruleset) or appended (at the end of the ruleset), the
NLM_F_APPEND netlink flag is sufficient in these two cases.
Handle NLM_F_REPLACE as NLM_F_APPEND since the rule replacement
expansion appends it after the specified rule handle.
Fixes: 96518518cc ("netfilter: add nftables")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 339031bafe ]
This is a revert of
7b1957b049 ("netfilter: nf_defrag_ipv4: use net_generic infra")
and a partial revert of
8b0adbe3e3 ("netfilter: nf_defrag_ipv6: use net_generic infra").
If conntrack is builtin and kernel is booted with:
nf_conntrack.enable_hooks=1
.... kernel will fail to boot due to a NULL deref in
nf_defrag_ipv4_enable(): Its called before the ipv4 defrag initcall is
made, so net_generic() returns NULL.
To resolve this, move the user refcount back to struct net so calls
to those functions are possible even before their initcalls have run.
Fixes: 7b1957b049 ("netfilter: nf_defrag_ipv4: use net_generic infra")
Fixes: 8b0adbe3e3 ("netfilter: nf_defrag_ipv6: use net_generic infra").
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit fe5c735d0d ]
There is a Killer AX1650 2x2 Wi-Fi 6 and Bluetooth 5.1 wireless adapter
found on Dell XPS 15 (9510) laptop, its configuration was present on
Linux v5.7, however accidentally it has been removed from the list of
supported devices, let's add it back.
The problem is manifested on driver initialization:
Intel(R) Wireless WiFi driver for Linux
iwlwifi 0000:00:14.3: enabling device (0000 -> 0002)
iwlwifi: No config found for PCI dev 43f0/1651, rev=0x354, rfid=0x10a100
iwlwifi: probe of 0000:00:14.3 failed with error -22
Bug: https://bugzilla.kernel.org/show_bug.cgi?id=213939
Fixes: 3f910a2583 ("iwlwifi: pcie: convert all AX101 devices to the device tables")
Cc: Julien Wajsberg <felash@gmail.com>
Signed-off-by: Vladimir Zapolskiy <vladimir.zapolskiy@linaro.org>
Acked-by: Luca Coelho <luca@coelho.fi>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20210924122154.2376577-1-vladimir.zapolskiy@linaro.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit d67ed2510d ]
CONFIG_OF can be set by a randconfig or by a user -- without setting the
early flattree option (OF_EARLY_FLATTREE). This causes build errors.
However, if randconfig or a user sets USE_OF in the Xtensa config,
the right kconfig symbols are set to fix the build.
Fixes these build errors:
../arch/xtensa/kernel/setup.c:67:19: error: ‘__dtb_start’ undeclared here (not in a function); did you mean ‘dtb_start’?
67 | void *dtb_start = __dtb_start;
| ^~~~~~~~~~~
../arch/xtensa/kernel/setup.c: In function 'xtensa_dt_io_area':
../arch/xtensa/kernel/setup.c:201:14: error: implicit declaration of function 'of_flat_dt_is_compatible'; did you mean 'of_machine_is_compatible'? [-Werror=implicit-function-declaration]
201 | if (!of_flat_dt_is_compatible(node, "simple-bus"))
../arch/xtensa/kernel/setup.c:204:18: error: implicit declaration of function 'of_get_flat_dt_prop' [-Werror=implicit-function-declaration]
204 | ranges = of_get_flat_dt_prop(node, "ranges", &len);
../arch/xtensa/kernel/setup.c:204:16: error: assignment to 'const __be32 *' {aka 'const unsigned int *'} from 'int' makes pointer from integer without a cast [-Werror=int-conversion]
204 | ranges = of_get_flat_dt_prop(node, "ranges", &len);
| ^
../arch/xtensa/kernel/setup.c: In function 'early_init_devtree':
../arch/xtensa/kernel/setup.c:228:9: error: implicit declaration of function 'early_init_dt_scan'; did you mean 'early_init_devtree'? [-Werror=implicit-function-declaration]
228 | early_init_dt_scan(params);
../arch/xtensa/kernel/setup.c:229:9: error: implicit declaration of function 'of_scan_flat_dt' [-Werror=implicit-function-declaration]
229 | of_scan_flat_dt(xtensa_dt_io_area, NULL);
xtensa-elf-ld: arch/xtensa/mm/mmu.o:(.text+0x0): undefined reference to `xtensa_kio_paddr'
Fixes: da844a8177 ("xtensa: add device trees support")
Fixes: 6cb971114f ("xtensa: remap io area defined in device tree")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit fb8c3a3c52 ]
Randconfig builds still show a failure for the ath5k driver,
similar to the one that was fixed for ath9k earlier:
WARNING: unmet direct dependencies detected for MAC80211_LEDS
Depends on [n]: NET [=y] && WIRELESS [=y] && MAC80211 [=y] && (LEDS_CLASS [=m]=y || LEDS_CLASS [=m]=MAC80211 [=y])
Selected by [m]:
- ATH5K [=m] && NETDEVICES [=y] && WLAN [=y] && WLAN_VENDOR_ATH [=y] && (PCI [=y] || ATH25) && MAC80211 [=y]
net/mac80211/led.c: In function 'ieee80211_alloc_led_names':
net/mac80211/led.c:34:22: error: 'struct led_trigger' has no member named 'name'
34 | local->rx_led.name = kasprintf(GFP_KERNEL, "%srx",
| ^
Copying the same logic from my ath9k patch makes this one work
as well, stubbing out the calls to the LED subsystem.
Fixes: b64acb28da ("ath9k: fix build error with LEDS_CLASS=m")
Fixes: 72cdab8087 ("ath9k: Do not select MAC80211_LEDS by default")
Fixes: 3a078876ca ("ath5k: convert LED code to use mac80211 triggers")
Link: https://lore.kernel.org/all/20210722105501.1000781-1-arnd@kernel.org/
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20210920122359.353810-1-arnd@kernel.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 450e7fe9b1 ]
Currently, it is no longer possible to retrieve a DHCP address
on the imx6qdl-pico board.
This issue has been exposed by commit f5d9aa79df ("ARM: imx6q:
remove clk-out fixup for the Atheros AR8031 and AR8035 PHYs").
Fix it by describing the qca,clk-out-frequency property as suggested
by the commit above.
Fixes: 98670a0bb0 ("ARM: dts: imx6qdl: Add imx6qdl-pico support")
Signed-off-by: Fabio Estevam <festevam@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 5c187e2eb3 ]
The MIC2025 switch input signal nEN is active low, describe it as such
in the DT. The previous change to this regulator polarity was incorrectly
influenced by broken quirks in gpiolib-of.c, which is now long fixed. So
fix this regulator polarity setting here once and for all.
Fixes: 3c3601cd6a ("ARM: dts: imx53: Update USB configuration on M53Menlo")
Signed-off-by: Marek Vasut <marex@denx.de>
Cc: Shawn Guo <shawnguo@kernel.org>
Cc: Fabio Estevam <festevam@gmail.com>
Cc: NXP Linux Team <linux-imx@nxp.com>
Reviewed-by: Fabio Estevam <festevam@gmail.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit c8c1efe14a ]
The panel already contains pinctrl-0 phandle, but it is missing
the default pinctrl-names property, so the pin configuration is
ignored. Fill in the missing pinctrl-names property, so the pin
configuration is applied.
Fixes: d81765d693 ("ARM: dts: imx53: Update LCD panel node on M53Menlo")
Signed-off-by: Marek Vasut <marex@denx.de>
Cc: Shawn Guo <shawnguo@kernel.org>
Cc: Fabio Estevam <festevam@gmail.com>
Cc: NXP Linux Team <linux-imx@nxp.com>
Reviewed-by: Fabio Estevam <festevam@gmail.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 833d51d7c6 ]
PT_LOAD type denotes that the segment should be loaded into the final
firmware memory region. Hash segment is not one such, because it's only
needed for PAS init and shouldn't be in the final firmware memory region.
That's why mdt_phdr_valid() explicitly reject non PT_LOAD segment and
hash segment. This actually makes the hash segment type check in
qcom_mdt_read_metadata() unnecessary and redundant. For a hash segment,
it won't be loaded into firmware memory region anyway, due to the
QCOM_MDT_TYPE_HASH check in mdt_phdr_valid(), even if it has a PT_LOAD
type for some reason (misusing or abusing?).
Some firmware files on Sony phones are such examples, e.g WCNSS firmware
of Sony Xperia M4 Aqua phone. The type of hash segment is just PT_LOAD.
Drop the unnecessary hash segment type check in qcom_mdt_read_metadata()
to fix firmware loading failure on these phones, while hash segment is
still kept away from the final firmware memory region.
Fixes: 498b98e939 ("soc: qcom: mdt_loader: Support loading non-split images")
Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
Reviewed-by: Marijn Suijten <marijn.suijten@somainline.org>
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Link: https://lore.kernel.org/r/20210828070202.7033-1-shawn.guo@linaro.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f1db21c315 ]
The 28NM DSI PLL driver for msm8960 calculates with a 27MHz reference
clock and should hence use PXO, not CXO which runs at 19.2MHz.
Note that none of the DSI PHY/PLL drivers currently use this "ref"
clock; they all rely on (sometimes inexistant) global clock names and
usually function normally without a parent clock. This discrepancy will
be corrected in a future patch, for which this change needs to be in
place first.
Fixes: 6969d1d9c6 ("ARM: dts: qcom-apq8064: Set 'cxo_board' as ref clock of the DSI PHY")
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Signed-off-by: Marijn Suijten <marijn.suijten@somainline.org>
Link: https://lore.kernel.org/r/20210829203027.276143-2-marijn.suijten@somainline.org
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit e879f855e5 ]
After commit a6d90e9f22 ("bus: ti-sysc: AM3: RNG is GP only"), clang
with -Wimplicit-fallthrough enabled warns:
drivers/bus/ti-sysc.c:2958:3: warning: unannotated fall-through between
switch labels [-Wimplicit-fallthrough]
default:
^
drivers/bus/ti-sysc.c:2958:3: note: insert 'break;' to avoid
fall-through
default:
^
break;
1 warning generated.
Clang's version of this warning is a little bit more pedantic than
GCC's. Add the missing break to satisfy it to match what has been done
all over the kernel tree.
Fixes: a6d90e9f22 ("bus: ti-sysc: AM3: RNG is GP only")
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit f5ef336fd2 upstream.
The UFS driver uses blk_mq_tagset_busy_iter() when identifying task
management requests to complete, however blk_mq_tagset_busy_iter() doesn't
work.
blk_mq_tagset_busy_iter() only iterates requests dispatched by the block
layer. That appears as if it might have started since commit 37f4a24c24
("blk-mq: centralise related handling into blk_mq_get_driver_tag") which
removed 'data->hctx->tags->rqs[rq->tag] = rq' from blk_mq_rq_ctx_init()
which gets called:
blk_get_request
blk_mq_alloc_request
__blk_mq_alloc_request
blk_mq_rq_ctx_init
Since UFS task management requests are not dispatched by the block layer,
hctx->tags->rqs[rq->tag] remains NULL, and since blk_mq_tagset_busy_iter()
relies on finding requests using hctx->tags->rqs[rq->tag], UFS task
management requests are never found by blk_mq_tagset_busy_iter().
By using blk_mq_tagset_busy_iter(), the UFS driver was relying on internal
details of the block layer, which was fragile and subsequently got
broken. Fix by removing the use of blk_mq_tagset_busy_iter() and having the
driver keep track of task management requests.
Link: https://lore.kernel.org/r/20210922091059.4040-1-adrian.hunter@intel.com
Fixes: 1235fc569e ("scsi: ufs: core: Fix task management request completion timeout")
Fixes: 69a6c269c0 ("scsi: ufs: Use blk_{get,put}_request() to allocate and free TMFs")
Cc: stable@vger.kernel.org
Tested-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9b663b34c9 upstream.
Since the LED multicolor framework support was added in commit
92a81562e6 ("leds: lp55xx: Add multicolor framework support to lp55xx")
LEDs on this platform stopped working.
Author of the framework attempted to accommodate this DT to the
framework in commit b86d3d21cd ("ARM: dts: imx6dl-yapp4: Add reg property
to the lp5562 channel node") but that is not sufficient. A color property
is now required even if the multicolor framework is not used, otherwise
the driver probe fails:
lp5562: probe of 1-0030 failed with error -22
Add the color property to fix this.
Fixes: 92a81562e6 ("leds: lp55xx: Add multicolor framework support to lp55xx")
Cc: <stable@vger.kernel.org>
Cc: linux-leds@vger.kernel.org
Signed-off-by: Michal Vokáč <michal.vokac@ysoft.com>
Acked-by: Pavel Machek <pavel@ucw.cz>
Reviewed-by: Fabio Estevam <festevam@gmail.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2ba5acfb34 upstream.
If sd_max is unsigned, then sd_max - GSS_SEQ_WIN is a very large number
whenever sd_max is less than GSS_SEQ_WIN, and the comparison:
seq_num <= sd->sd_max - GSS_SEQ_WIN
in gss_check_seq_num is pretty much always true, even when that's
clearly not what was intended.
This was causing pynfs to hang when using krb5, because pynfs uses zero
as the initial gss sequence number. That's perfectly legal, but this
logic error causes knfsd to drop the rpc in that case. Out-of-order
sequence IDs in the first GSS_SEQ_WIN (128) calls will also cause this.
Fixes: 10b9d99a3d ("SUNRPC: Augment server-side rpcgss tracepoints")
Cc: stable@vger.kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f2e717d655 upstream.
RFC3530 notes that the 'dircount' field may be zero, in which case the
recommendation is to ignore it, and only enforce the 'maxcount' field.
In RFC5661, this recommendation to ignore a zero valued field becomes a
requirement.
Fixes: aee3776441 ("nfsd4: fix rd_dircount enforcement")
Cc: <stable@vger.kernel.org>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1d625050c7 upstream.
init_nfsd() should not unregister pernet subsys if the register fails
but should instead unwind from the last successful operation which is
register_filesystem().
Unregistering a failed register_pernet_subsys() call can result in
a kernel GPF as revealed by programmatically injecting an error in
register_pernet_subsys().
Verified the fix handled failure gracefully with no lingering nfsd
entry in /proc/filesystems. This change was introduced by the commit
bd5ae9288d ("nfsd: register pernet ops last, unregister first"),
the original error handling logic was correct.
Fixes: bd5ae9288d ("nfsd: register pernet ops last, unregister first")
Cc: stable@vger.kernel.org
Signed-off-by: Patrick Ho <Patrick.Ho@netapp.com>
Acked-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1dc1eed46f upstream.
Normally the check at open time suffices, but e.g loop device does set
IOCB_DIRECT after doing its own checks (which are not sufficent for
overlayfs).
Make sure we don't call the underlying filesystem read/write method with
the IOCB_DIRECT if it's not supported.
Reported-by: Huang Jianan <huangjianan@oppo.com>
Fixes: 16914e6fc7 ("ovl: add ovl_read_iter()")
Cc: <stable@vger.kernel.org> # v4.19
Tested-by: Huang Jianan <huangjianan@oppo.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8a38a4d51c upstream.
The memory at the end of the controller only accepts 32bit read/write
accesses, but the arm64 memcpy_to/fromio implementation only uses 64bit
(which will be split into two 32bit access) and 8bit leading to incomplete
copies to/from this memory when the buffer is not multiple of 8bytes.
Add a local copy using writel/readl accesses to make sure we use the right
memory access width.
The switch to memcpy_to/fromio was done because of 285133040e
("arm64: Import latest memcpy()/memmove() implementation"), but using memcpy
worked before since it mainly used 32bit memory acceses.
Fixes: 103a5348c2 ("mmc: meson-gx: use memcpy_to/fromio for dram-access-quirk")
Reported-by: Christian Hewitt <christianshewitt@gmail.com>
Suggested-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Signed-off-by: Neil Armstrong <narmstrong@baylibre.com>
Tested-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20210928073652.434690-1-narmstrong@baylibre.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 49b2dfc081 upstream.
We don't currently have any kind of real acceleration on Ampere GPUs,
but the TTM memcpy() fallback paths aren't really designed to handle
copies between different devices, such as on Optimus systems, and
result in a kernel OOPS.
A few options were investigated to try and fix this, but didn't work
out, and likely would have resulted in a very unpleasant experience
for users anyway.
This commit adds just enough support for setting up a single channel
connected to a copy engine, which the kernel can use to accelerate
the buffer copies between devices. Userspace has no access to this
incomplete channel support, but it's suitable for TTM's needs.
A more complete implementation of host(fifo) for Ampere GPUs is in
the works, but the required changes are far too invasive that they
would be unsuitable to backport to fix this issue on current kernels.
v2: fix GPFIFO length in RAMFC (reported by Karol)
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Cc: Lyude Paul <lyude@redhat.com>
Cc: Karol Herbst <kherbst@redhat.com>
Cc: <stable@vger.kernel.org> # v5.12+
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Tested-by: Karol Herbst <kherbst@redhat.com>
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210916220406.666454-1-skeggsb@gmail.com
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 05300871c0 upstream.
USB TCPCI Spec, 4.4.3 Mask Registers:
"A masked register will still indicate in the ALERT register, but shall
not set the Alert# pin low."
Thus, the Extended Status will still indicate in ALERT register if vSafe0V
is detected by TCPC even though being masked. In current code, howerer,
this event will not be handled in detection time. Rather it will be
handled when next ALERT event coming(CC evnet, PD event, etc).
Tcpm might transition to a wrong state in this situation. Thus, the vSafe0V
event should not be handled when it's masked.
Fixes: 766c485b86 ("usb: typec: tcpci: Add support to report vSafe0V")
cc: <stable@vger.kernel.org>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Acked-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Signed-off-by: Xu Yang <xu.yang_2@nxp.com>
Link: https://lore.kernel.org/r/20210926101415.3775058-1-xu.yang_2@nxp.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 65a205e611 upstream.
A recent change that started reporting break events to the line
discipline caused the tty-buffer insertions to no longer be serialised
by inserting events also from the completion handler for the interrupt
endpoint.
Completion calls for distinct endpoints are not guaranteed to be
serialised. For example, in case a host-controller driver uses
bottom-half completion, the interrupt and bulk-in completion handlers
can end up running in parallel on two CPUs (high-and low-prio tasklets,
respectively) thereby breaking the tty layer's single producer
assumption.
Fix this by holding the read lock also when inserting characters from
the bulk endpoint.
Fixes: 08dff274ed ("cdc-acm: fix BREAK rx code path adding necessary calls")
Cc: stable@vger.kernel.org
Acked-by: Oliver Neukum <oneukum@suse.com>
Signed-off-by: Johan Hovold <johan@kernel.org>
Link: https://lore.kernel.org/r/20210929090937.7410-2-johan@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0560c9c552 upstream.
Async feedback patches broke enumeration on Windows 10 previously fixed
by commit 789ea77310 ("usb: gadget: f_uac2: always increase endpoint
max_packet_size by one audio slot").
While the existing calculation for EP OUT capture for async mode yields
size+1 frame due to uac2_opts->fb_max > 0, playback side lost the +1
feature. Therefore the +1 frame addition must be re-introduced for
playback. Win10 enumerates the device only when both EP IN and EP OUT
max packet sizes are (at least) +1 frame.
Fixes: e89bb42883 ("usb: gadget: u_audio: add real feedback implementation")
Cc: stable <stable@vger.kernel.org>
Tested-by: Henrik Enquist <henrik.enquist@gmail.com>
Tested-by: Jack Pham <jackp@codeaurora.org>
Signed-off-by: Pavel Hofman <pavel.hofman@ivitera.com>
Link: https://lore.kernel.org/r/20210924080027.5362-1-pavel.hofman@ivitera.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8253a34bfa upstream.
When passing 'phys' in the devicetree to describe the USB PHY phandle
(which is the recommended way according to
Documentation/devicetree/bindings/usb/ci-hdrc-usb2.txt) the
following NULL pointer dereference is observed on i.MX7 and i.MX8MM:
[ 1.489344] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000098
[ 1.498170] Mem abort info:
[ 1.500966] ESR = 0x96000044
[ 1.504030] EC = 0x25: DABT (current EL), IL = 32 bits
[ 1.509356] SET = 0, FnV = 0
[ 1.512416] EA = 0, S1PTW = 0
[ 1.515569] FSC = 0x04: level 0 translation fault
[ 1.520458] Data abort info:
[ 1.523349] ISV = 0, ISS = 0x00000044
[ 1.527196] CM = 0, WnR = 1
[ 1.530176] [0000000000000098] user address but active_mm is swapper
[ 1.536544] Internal error: Oops: 96000044 [#1] PREEMPT SMP
[ 1.542125] Modules linked in:
[ 1.545190] CPU: 3 PID: 7 Comm: kworker/u8:0 Not tainted 5.14.0-dirty #3
[ 1.551901] Hardware name: Kontron i.MX8MM N801X S (DT)
[ 1.557133] Workqueue: events_unbound deferred_probe_work_func
[ 1.562984] pstate: 80000005 (Nzcv daif -PAN -UAO -TCO BTYPE=--)
[ 1.568998] pc : imx7d_charger_detection+0x3f0/0x510
[ 1.573973] lr : imx7d_charger_detection+0x22c/0x510
This happens because the charger functions check for the phy presence
inside the imx_usbmisc_data structure (data->usb_phy), but the chipidea
core populates the usb_phy passed via 'phys' inside 'struct ci_hdrc'
(ci->usb_phy) instead.
This causes the NULL pointer dereference inside imx7d_charger_detection().
Fix it by also searching for 'phys' in case 'fsl,usbphy' is not found.
Tested on a imx7s-warp board.
Fixes: 746f316b75 ("usb: chipidea: introduce imx7d USB charger detection")
Cc: stable@vger.kernel.org
Reported-by: Heiko Thiery <heiko.thiery@gmail.com>
Tested-by: Frieder Schrempf <frieder.schrempf@kontron.de>
Reviewed-by: Frieder Schrempf <frieder.schrempf@kontron.de>
Acked-by: Peter Chen <peter.chen@kernel.org>
Signed-off-by: Fabio Estevam <festevam@gmail.com>
Link: https://lore.kernel.org/r/20210921113754.767631-1-festevam@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4497b40ca8 upstream.
This reverts commit cc8870bf4c.
Since commit cc8870bf4c ("ARM: imx6q: drop of_platform_default_populate()
from init_machine") the following errors are seen on boot:
[ 0.123372] imx6q_suspend_init: failed to find ocram device!
[ 0.123537] imx6_pm_common_init: No DDR LPM support with suspend -19!
, which break suspend/resume on imx6q/dl.
Revert the offeding commit to avoid the regression.
Thanks to Tim Harvey for bisecting this problem.
Cc: stable@vger.kernel.org
Fixes: cc8870bf4c ("ARM: imx6q: drop of_platform_default_populate() from init_machine")
Signed-off-by: Fabio Estevam <festevam@gmail.com>
Acked-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 151a7c12c4 upstream.
This reverts commit b0b524f079.
Commit b0b524f079 ("brcmfmac: use ISO3166 country code and 0 rev
as fallback") changes country setup to directly use ISO3166 country
codes if no more specific code is configured. This was done under
the assumption that brcmfmac firmwares can handle such simple
direct mapping from country codes to firmware ccode values.
Unfortunately this is not true for all chipset/firmware combinations.
E.g. BCM4359/9 devices stop working as access point with this change,
so revert the offending commit to avoid the regression.
Signed-off-by: Soeren Moch <smoch@web.de>
Cc: stable@vger.kernel.org # 5.14.x
Acked-by: Shawn Guo <shawn.guo@linaro.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20210926201905.211605-1-smoch@web.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7a8526a5cd upstream.
Many users are reporting that the Samsung 860 and 870 SSD are having
various issues when combined with AMD/ATI (vendor ID 0x1002) SATA
controllers and only completely disabling NCQ helps to avoid these
issues.
Always disabling NCQ for Samsung 860/870 SSDs regardless of the host
SATA adapter vendor will cause I/O performance degradation with well
behaved adapters. To limit the performance impact to ATI adapters,
introduce the ATA_HORKAGE_NO_NCQ_ON_ATI flag to force disable NCQ
only for these adapters.
Also, two libata.force parameters (noncqati and ncqati) are introduced
to disable and enable the NCQ for the system which equipped with ATI
SATA adapter and Samsung 860 and 870 SSDs. The user can determine NCQ
function to be enabled or disabled according to the demand.
After verifying the chipset from the user reports, the issue appears
on AMD/ATI SB7x0/SB8x0/SB9x0 SATA Controllers and does not appear on
recent AMD SATA adapters. The vendor ID of ATI should be 0x1002.
Therefore, ATA_HORKAGE_NO_NCQ_ON_AMD was modified to
ATA_HORKAGE_NO_NCQ_ON_ATI.
BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=201693
Signed-off-by: Kate Hsuan <hpa@redhat.com>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Link: https://lore.kernel.org/r/20210903094411.58749-1-hpa@redhat.com
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Cc: Krzysztof Olędzki <ole@ans.pl>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 02d029a41d upstream.
perf_init_event tries multiple init callbacks and does not reset the
event state between tries. When x86_pmu_event_init runs, it
unconditionally sets the destroy callback to hw_perf_event_destroy. On
the next init attempt after x86_pmu_event_init, in perf_try_init_event,
if the pmu's capabilities includes PERF_PMU_CAP_NO_EXCLUDE, the destroy
callback will be run. However, if the next init didn't set the destroy
callback, hw_perf_event_destroy will be run (since the callback wasn't
reset).
Looking at other pmu init functions, the common pattern is to only set
the destroy callback on a successful init. Resetting the callback on
failure tries to replicate that pattern.
This was discovered after commit f11dd0d805 ("perf/x86/amd/ibs: Extend
PERF_PMU_CAP_NO_EXCLUDE to IBS Op") when the second (and only second)
run of the perf tool after a reboot results in 0 samples being
generated. The extra run of hw_perf_event_destroy results in
active_events having an extra decrement on each perf run. The second run
has active_events == 0 and every subsequent run has active_events < 0.
When active_events == 0, the NMI handler will early-out and not record
any samples.
Signed-off-by: Anand K Mistry <amistry@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20210929170405.1.I078b98ee7727f9ae9d6df8262bad7e325e40faf0@changeid
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit e1fc1553cd ]
Intel PMU MSRs is in msrs_to_save_all[], so add AMD PMU MSRs to have a
consistent behavior between Intel and AMD when using KVM_GET_MSRS,
KVM_SET_MSRS or KVM_GET_MSR_INDEX_LIST.
We have to add legacy and new MSRs to handle guests running without
X86_FEATURE_PERFCTR_CORE.
Signed-off-by: Fares Mehanna <faresx@amazon.de>
Message-Id: <20210915133951.22389-1-faresx@amazon.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 37687c403a ]
When exiting SMM, pdpts are loaded again from the guest memory.
This fixes a theoretical bug, when exit from SMM triggers entry to the
nested guest which re-uses some of the migration
code which uses this flag as a workaround for a legacy userspace.
Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Message-Id: <20210913140954.165665-4-mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit ae232ea460 ]
grow_halt_poll_ns() ignores values between 0 and
halt_poll_ns_grow_start (10000 by default). However,
when we shrink halt_poll_ns we may fall way below
halt_poll_ns_grow_start and endup with halt_poll_ns
values that don't make a lot of sense: like 1 or 9,
or 19.
VCPU1 trace (halt_poll_ns_shrink equals 2):
VCPU1 grow 10000
VCPU1 shrink 5000
VCPU1 shrink 2500
VCPU1 shrink 1250
VCPU1 shrink 625
VCPU1 shrink 312
VCPU1 shrink 156
VCPU1 shrink 78
VCPU1 shrink 39
VCPU1 shrink 19
VCPU1 shrink 9
VCPU1 shrink 4
Mirror what grow_halt_poll_ns() does and set halt_poll_ns
to 0 as soon as new shrink-ed halt_poll_ns value falls
below halt_poll_ns_grow_start.
Signed-off-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20210902031100.252080-1-senozhatsky@chromium.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 01f91acb55 ]
The SMC64 calling convention passes a function identifier in w0 and its
parameters in x1-x17. Given this, there are two deviations in the
SMC64 call performed by the steal_time test: the function identifier is
assigned to a 64 bit register and the parameter is only 32 bits wide.
Align the call with the SMCCC by using a 32 bit register to handle the
function identifier and increasing the parameter width to 64 bits.
Suggested-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Oliver Upton <oupton@google.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Message-Id: <20210921171121.2148982-3-oupton@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 19532869fe ]
Currently, the asan-stack parameter is only passed along if
CFLAGS_KASAN_SHADOW is not empty, which requires KASAN_SHADOW_OFFSET to
be defined in Kconfig so that the value can be checked. In RISC-V's
case, KASAN_SHADOW_OFFSET is not defined in Kconfig, which means that
asan-stack does not get disabled with clang even when CONFIG_KASAN_STACK
is disabled, resulting in large stack warnings with allmodconfig:
drivers/video/fbdev/omap2/omapfb/displays/panel-lgphilips-lb035q02.c:117:12: error: stack frame size (14400) exceeds limit (2048) in function 'lb035q02_connect' [-Werror,-Wframe-larger-than]
static int lb035q02_connect(struct omap_dss_device *dssdev)
^
1 error generated.
Ensure that the value of CONFIG_KASAN_STACK is always passed along to
the compiler so that these warnings do not happen when
CONFIG_KASAN_STACK is disabled.
Link: https://github.com/ClangBuiltLinux/linux/issues/1453
References: 6baec880d7 ("kasan: turn off asan-stack for clang-8 and earlier")
Link: https://lkml.kernel.org/r/20210922205525.570068-1-nathan@kernel.org
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Marco Elver <elver@google.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit ebaeab2fe8 ]
Idle page tracking can also be used for process address space, not only
file mappings.
Without this change, using with '-i' option for process address space
encounters below errors reported.
$ sudo ./page-types -p $(pidof bash) -i
mark page idle: Bad file descriptor
mark page idle: Bad file descriptor
mark page idle: Bad file descriptor
mark page idle: Bad file descriptor
...
Link: https://lkml.kernel.org/r/20210917032826.10669-1-changbin.du@gmail.com
Signed-off-by: Changbin Du <changbin.du@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit a647a524a4 ]
rq_qos framework is only applied on request based driver, so:
1) rq_qos_done_bio() needn't to be called for bio based driver
2) rq_qos_done_bio() needn't to be called for bio which isn't tracked,
such as bios ended from error handling code.
Especially in bio_endio():
1) request queue is referred via bio->bi_bdev->bd_disk->queue, which
may be gone since request queue refcount may not be held in above two
cases
2) q->rq_qos may be freed in blk_cleanup_queue() when calling into
__rq_qos_done_bio()
Fix the potential kernel panic by not calling rq_qos_ops->done_bio if
the bio isn't tracked. This way is safe because both ioc_rqos_done_bio()
and blkcg_iolatency_done_bio() are nop if the bio isn't tracked.
Reported-by: Yu Kuai <yukuai3@huawei.com>
Cc: tj@kernel.org
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Tejun Heo <tj@kernel.org>
Link: https://lore.kernel.org/r/20210924110704.1541818-1-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 5ba1071f75 ]
Don't perform unaligned loads in __get_next() and __peek_nbyte_next() as
these are forms of undefined behavior:
"A pointer to an object or incomplete type may be converted to a pointer
to a different object or incomplete type. If the resulting pointer
is not correctly aligned for the pointed-to type, the behavior is
undefined."
(from http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1256.pdf)
These problems were identified using the undefined behavior sanitizer
(ubsan) with the tools version of the code and perf test.
[ bp: Massage commit message. ]
Signed-off-by: Numfor Mbiziwo-Tiapo <nums@google.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Link: https://lkml.kernel.org/r/20210923161843.751834-1-irogers@google.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b06d893ef2 ]
Address warning:
fs/smbfs_client/smb2pdu.c:2425 create_sd_buf()
warn: struct type mismatch 'smb3_acl vs cifs_acl'
Pointed out by Dan Carpenter via smatch code analysis tool
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b78f26926b ]
Geert reported that the GIC driver locks up on a Renesas system
since 005c34ae4b ("irqchip/gic: Atomically update affinity")
fixed the driver to use writeb_relaxed() instead of writel_relaxed().
As it turns out, the interconnect used on this system mandates
32bit wide accesses for all MMIO transactions, even if the GIC
architecture specifically mandates for some registers to be byte
accessible. Gahhh...
Work around the issue by crudly detecting the offending system,
and falling back to an inefficient RMW+lock implementation.
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/CAMuHMdV+Ev47K5NO8XHsanSq5YRMCHn2gWAQyV-q2LpJVy9HiQ@mail.gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit fbdac19e64 ]
Setting SCSI logging level with error=3, we saw some errors from enclosues:
[108017.360833] ses 0:0:9:0: tag#641 Done: NEEDS_RETRY Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK cmd_age=0s
[108017.360838] ses 0:0:9:0: tag#641 CDB: Receive Diagnostic 1c 01 01 00 20 00
[108017.427778] ses 0:0:9:0: Power-on or device reset occurred
[108017.427784] ses 0:0:9:0: tag#641 Done: SUCCESS Result: hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=0s
[108017.427788] ses 0:0:9:0: tag#641 CDB: Receive Diagnostic 1c 01 01 00 20 00
[108017.427791] ses 0:0:9:0: tag#641 Sense Key : Unit Attention [current]
[108017.427793] ses 0:0:9:0: tag#641 Add. Sense: Bus device reset function occurred
[108017.427801] ses 0:0:9:0: Failed to get diagnostic page 0x1
[108017.427804] ses 0:0:9:0: Failed to bind enclosure -19
[108017.427895] ses 0:0:10:0: Attached Enclosure device
[108017.427942] ses 0:0:10:0: Attached scsi generic sg18 type 13
Retry if the Send/Receive Diagnostic commands complete with a transient
error status (NOT_READY or UNIT_ATTENTION with ASC 0x29).
Link: https://lore.kernel.org/r/1631849061-10210-2-git-send-email-wenxiong@linux.ibm.com
Reviewed-by: Brian King <brking@linux.ibm.com>
Reviewed-by: James Bottomley <jejb@linux.ibm.com>
Signed-off-by: Wen Xiong <wenxiong@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit e5445dae29 ]
To avoid race between time out and tear down, in tear down process,
first we quiesce the queue, and then delete the timer and cancel
the time out work for the queue.
This patch merges the admin and io sync ops into the queue teardown logic
as shown in the RDMA patch 3017013dcc "nvme-rdma: avoid race between time
out and tear down". There is no teardown_lock in nvme-fc.
Signed-off-by: James Smart <jsmart2021@gmail.com>
Tested-by: Daniel Wagner <dwagner@suse.de>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Daniel Wagner <dwagner@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 555f66d0f8 ]
In case the number of hardware queues changes, we need to update the
tagset and the mapping of ctx to hctx first.
If we try to create and connect the I/O queues first, this operation
will fail (target will reject the connect call due to the wrong number
of queues) and hence we bail out of the recreate function. Then we
will to try the very same operation again, thus we don't make any
progress.
Signed-off-by: Daniel Wagner <dwagner@suse.de>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f5013d412a ]
Fix get_run_delay() to check fscanf() return value to get rid of the
following warning. When fscanf() fails return MIN_RUN_DELAY_NS from
get_run_delay(). Move MIN_RUN_DELAY_NS from steal_time.c to test_util.h
so get_run_delay() and steal_time.c can use it.
lib/test_util.c: In function ‘get_run_delay’:
lib/test_util.c:316:2: warning: ignoring return value of ‘fscanf’ declared with attribute ‘warn_unused_result’ [-Wunused-result]
316 | fscanf(fp, "%ld %ld ", &val[0], &val[1]);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 3a4f0cc693 ]
Fix get_trans_hugepagesz() to check fscanf() return value to get rid
of the following warning:
lib/test_util.c: In function ‘get_trans_hugepagesz’:
lib/test_util.c:138:2: warning: ignoring return value of ‘fscanf’ declared with attribute ‘warn_unused_result’ [-Wunused-result]
138 | fscanf(f, "%ld", &size);
| ^~~~~~~~~~~~~~~~~~~~~~~
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 39a71f712d ]
Fix get_warnings_count() to check fscanf() return value to get rid
of the following warning:
x86_64/mmio_warning_test.c: In function ‘get_warnings_count’:
x86_64/mmio_warning_test.c:85:2: warning: ignoring return value of ‘fscanf’ declared with attribute ‘warn_unused_result’ [-Wunused-result]
85 | fscanf(f, "%d", &warnings);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 8914a7a247 ]
LKP/0Day reported some building errors about kvm, and errors message
are not always same:
- lib/x86_64/processor.c:1083:31: error: ‘KVM_CAP_NESTED_STATE’ undeclared
(first use in this function); did you mean ‘KVM_CAP_PIT_STATE2’?
- lib/test_util.c:189:30: error: ‘MAP_HUGE_16KB’ undeclared (first use
in this function); did you mean ‘MAP_HUGE_16GB’?
Although kvm relies on the khdr, they still be built in parallel when -j
is specified. In this case, it will cause compiling errors.
Here we mark target khdr as NOTPARALLEL to make it be always built
first.
CC: Philip Li <philip.li@intel.com>
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 0a5ff77bf0 ]
Couple of fixes to the LBW RR configuration:
1. Add missing configuration of the SM RR registers in the DMA_IF.
2. Remove HBW range that doesn't belong.
3. Add entire gap + DBG area, from end of TPC7 to end of entire
DBG space.
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit d09ff62c82 ]
As collective wait operation is required only when NIC ports are
available, we disable the option to submit a CS in case all the ports
are disabled, which is the current situation in the upstream driver.
Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 3e08f157c2 ]
Due to FLR scenario when running inside a VM, we must not use indirect
MSI because it might cause some issues on VM destroy.
In a VM we use single MSI mode in contrary to multi MSI mode which is
used in bare-metal.
Hence direct MSI should be used in single MSI mode only.
Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f81c08f897 ]
testusb' application which uses 'usbtest' driver reports 'unknown speed'
from the function 'find_testdev'. The variable 'entry->speed' was not
updated from the application. The IOCTL mentioned in the FIXME comment can
only report whether the connection is low speed or not. Speed is read using
the IOCTL USBDEVFS_GET_SPEED which reports the proper speed grade. The
call is implemented in the function 'handle_testdev' where the file
descriptor was availble locally. Sample output is given below where 'high
speed' is printed as the connected speed.
sudo ./testusb -a
high speed /dev/bus/usb/001/011 0
/dev/bus/usb/001/011 test 0, 0.000015 secs
/dev/bus/usb/001/011 test 1, 0.194208 secs
/dev/bus/usb/001/011 test 2, 0.077289 secs
/dev/bus/usb/001/011 test 3, 0.170604 secs
/dev/bus/usb/001/011 test 4, 0.108335 secs
/dev/bus/usb/001/011 test 5, 2.788076 secs
/dev/bus/usb/001/011 test 6, 2.594610 secs
/dev/bus/usb/001/011 test 7, 2.905459 secs
/dev/bus/usb/001/011 test 8, 2.795193 secs
/dev/bus/usb/001/011 test 9, 8.372651 secs
/dev/bus/usb/001/011 test 10, 6.919731 secs
/dev/bus/usb/001/011 test 11, 16.372687 secs
/dev/bus/usb/001/011 test 12, 16.375233 secs
/dev/bus/usb/001/011 test 13, 2.977457 secs
/dev/bus/usb/001/011 test 14 --> 22 (Invalid argument)
/dev/bus/usb/001/011 test 17, 0.148826 secs
/dev/bus/usb/001/011 test 18, 0.068718 secs
/dev/bus/usb/001/011 test 19, 0.125992 secs
/dev/bus/usb/001/011 test 20, 0.127477 secs
/dev/bus/usb/001/011 test 21 --> 22 (Invalid argument)
/dev/bus/usb/001/011 test 24, 4.133763 secs
/dev/bus/usb/001/011 test 27, 2.140066 secs
/dev/bus/usb/001/011 test 28, 2.120713 secs
/dev/bus/usb/001/011 test 29, 0.507762 secs
Signed-off-by: Faizel K B <faizel.kb@dicortech.com>
Link: https://lore.kernel.org/r/20210902114444.15106-1-faizel.kb@dicortech.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 450907424d ]
Smatch checker reported the following error:
drivers/base/power/sysfs.c:833 dpm_sysfs_remove()
warn: sleeping in atomic context
With a calling sequence of:
efct_lio_npiv_drop_nport() <- disables preempt
-> fc_vport_terminate()
-> device_del()
-> dpm_sysfs_remove()
Issue is efct_lio_npiv_drop_nport() is making the fc_vport_terminate() call
while holding a lock w/ ipl raised.
It is unnecessary to hold the lock over this call, shift where the lock is
taken.
Link: https://lore.kernel.org/r/20210907165225.10821-1-jsmart2021@gmail.com
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Co-developed-by: Ram Vegesna <ram.vegesna@broadcom.com>
Signed-off-by: Ram Vegesna <ram.vegesna@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 197ae17722 ]
Device manager releases device-specific resources when a driver
disconnects from a device, devm_memunmap_pages and
devm_release_mem_region calls in svm_migrate_fini are redundant.
It causes below warning trace after patch "drm/amdgpu: Split
amdgpu_device_fini into early and late", so remove function
svm_migrate_fini.
BUG: https://gitlab.freedesktop.org/drm/amd/-/issues/1718
WARNING: CPU: 1 PID: 3646 at drivers/base/devres.c:795
devm_release_action+0x51/0x60
Call Trace:
? memunmap_pages+0x360/0x360
svm_migrate_fini+0x2d/0x60 [amdgpu]
kgd2kfd_device_exit+0x23/0xa0 [amdgpu]
amdgpu_amdkfd_device_fini_sw+0x1d/0x30 [amdgpu]
amdgpu_device_fini_sw+0x45/0x290 [amdgpu]
amdgpu_driver_release_kms+0x12/0x30 [amdgpu]
drm_dev_release+0x20/0x40 [drm]
release_nodes+0x196/0x1e0
device_release_driver_internal+0x104/0x1d0
driver_detach+0x47/0x90
bus_remove_driver+0x7a/0xd0
pci_unregister_driver+0x3d/0x90
amdgpu_exit+0x11/0x20 [amdgpu]
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 372d1f3e1b ]
The ext2_error() function syncs the filesystem so it sleeps. The caller
is holding a spinlock so it's not allowed to sleep.
ext2_statfs() <- disables preempt
-> ext2_count_free_blocks()
-> ext2_get_group_desc()
Fix this by using WARN() to print an error message and a stack trace
instead of using ext2_error().
Link: https://lore.kernel.org/r/20210921203233.GA16529@kili
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 3ede7f84c7 ]
When re-entering the main loop of xenvif_tx_check_gop() a 2nd time, the
special considerations for the head of the SKB no longer apply. Don't
mistakenly report ERROR to the frontend for the first entry in the list,
even if - from all I can tell - this shouldn't matter much as the overall
transmit will need to be considered failed anyway.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Paul Durrant <paul@xen.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit cf9579976f ]
MDIO-attached devices might have interrupts and other things that might
need quiesced when we kexec into a new kernel. Things are even more
creepy when those interrupt lines are shared, and in that case it is
absolutely mandatory to disable all interrupt sources.
Moreover, MDIO devices might be DSA switches, and DSA needs its own
shutdown method to unlink from the DSA master, which is a new
requirement that appeared after commit 2f1e8ea726 ("net: dsa: link
interfaces with the DSA master to get rid of lockdep warnings").
So introduce a ->shutdown method in the MDIO device driver structure.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 6b225baaba ]
When we get an error flushing one device, during a super block commit, we
record the error in the device structure, in the field 'last_flush_error'.
This is used to later check if we should error out the super block commit,
depending on whether the number of flush errors is greater than or equals
to the maximum tolerated device failures for a raid profile.
However if we get a transient device flush error, unmount the filesystem
and later try to mount it, we can fail the mount because we treat that
past error as critical and consider the device is missing. Even if it's
very likely that the error will happen again, as it's probably due to a
hardware related problem, there may be cases where the error might not
happen again. One example is during testing, and a test case like the
new generic/648 from fstests always triggers this. The test cases
generic/019 and generic/475 also trigger this scenario, but very
sporadically.
When this happens we get an error like this:
$ mount /dev/sdc /mnt
mount: /mnt wrong fs type, bad option, bad superblock on /dev/sdc, missing codepage or helper program, or other error.
$ dmesg
(...)
[12918.886926] BTRFS warning (device sdc): chunk 13631488 missing 1 devices, max tolerance is 0 for writable mount
[12918.888293] BTRFS warning (device sdc): writable mount is not allowed due to too many missing devices
[12918.890853] BTRFS error (device sdc): open_ctree failed
The failure happens because when btrfs_check_rw_degradable() is called at
mount time, or at remount from RO to RW time, is sees a non zero value in
a device's ->last_flush_error attribute, and therefore considers that the
device is 'missing'.
Fix this by setting a device's ->last_flush_error to zero when we close a
device, making sure the error is not seen on the next mount attempt. We
only need to track flush errors during the current mount, so that we never
commit a super block if such errors happened.
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit bbc9a6eb5e ]
There is a BUG_ON() in btrfs_csum_one_bio() to catch code logic error.
It has indeed caught several bugs during subpage development.
But the BUG_ON() itself will bring down the whole system which is
an overkill.
Replace it with a WARN() and exit gracefully, so that it won't crash the
whole system while we can still catch the code logic error.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 02579b2ff8 ]
When the back channel enters SEQ4_STATUS_CB_PATH_DOWN state, the client
recovers by sending BIND_CONN_TO_SESSION but the server fails to recover
the back channel and leaves it as NFSD4_CB_DOWN.
Fix by enhancing nfsd4_bind_conn_to_session to probe the back channel
by calling nfsd4_probe_callback.
Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 196159d278 ]
Add info for getting the firmware directly from the UEFI for the Chuwi Hi10
Plus (CWI527), so that the user does not need to manually install the
firmware in /lib/firmware/silead.
This change will make the touchscreen on these devices work OOTB,
without requiring any manual setup.
Also tweak the min and width/height values a bit for more accurate position
reporting.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Link: https://lore.kernel.org/r/20210905130210.32810-2-hdegoede@redhat.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 3bf1669b0e ]
Add touchscreen info for the Chuwi HiBook (CWI514) tablet. This includes
info for getting the firmware directly from the UEFI, so that the user does
not need to manually install the firmware in /lib/firmware/silead.
This change will make the touchscreen on these devices work OOTB,
without requiring any manual setup.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Link: https://lore.kernel.org/r/20210905130210.32810-1-hdegoede@redhat.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 3978d81652 ]
afs_d_revalidate() should only be validating the directory entry it is
given and the directory to which that belongs; it shouldn't be validating
the inode/vnode to which that dentry points. Besides, validation need to
be done even if we don't call afs_d_revalidate() - which might be the case
if we're starting from a file descriptor.
In order for afs_d_revalidate() to be fixed, validation points must be
added in some other places. Certain directory operations, such as
afs_unlink(), already check this, but not all and not all file operations
either.
Note that the validation of a vnode not only checks to see if the
attributes we have are correct, but also gets a promise from the server to
notify us if that file gets changed by a third party.
Add the following checks:
- Check the vnode we're going to make a hard link to.
- Check the vnode we're going to move/rename.
- Check the vnode we're going to read from.
- Check the vnode we're going to write to.
- Check the vnode we're going to sync.
- Check the vnode we're going to make a mapped page writable for.
Some of these aren't strictly necessary as we're going to perform a server
operation that might get the attributes anyway from which we can determine
if something changed - though it might not get us a callback promise.
Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: Markus Suvanto <markus.suvanto@gmail.com>
cc: linux-afs@lists.infradead.org
Link: https://lore.kernel.org/r/163111667354.283156.12720698333342917516.stgit@warthog.procyon.org.uk/
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 5457773ef9 ]
Previously zero length transfers submitted to the Rokchip SPI driver would
time out in the SPI layer. This happens because the SPI peripheral does
not trigger a transfer completion interrupt for zero length transfers.
Fix that by completing zero length transfers immediately at start of
transfer.
Signed-off-by: Tobias Schramm <t.schramm@manjaro.org>
Link: https://lore.kernel.org/r/20210827050357.165409-1-t.schramm@manjaro.org
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit 88a04049c0 upstream.
The cl_data field of a privdata must be allocated and updated before
using in amd_sfh_hid_client_init() function.
Hence handling NULL pointer cl_data accordingly.
Fixes: d46ef750ed ("HID: amd_sfh: Fix potential NULL pointer dereference")
Signed-off-by: Basavaraj Natikar <Basavaraj.Natikar@amd.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7fab1c12bd upstream.
The objtool warning that the kvm instruction emulation code triggered
wasn't very useful:
arch/x86/kvm/emulate.o: warning: objtool: __ex_table+0x4: don't know how to handle reloc symbol type: kvm_fastop_exception
in that it helpfully tells you which symbol name it had trouble figuring
out the relocation for, but it doesn't actually say what the unknown
symbol type was that triggered it all.
In this case it was because of missing type information (type 0, aka
STT_NOTYPE), but on the whole it really should just have printed that
out as part of the message.
Because if this warning triggers, that's very much the first thing you
want to know - why did reloc2sec_off() return failure for that symbol?
So rather than just saying you can't handle some type of symbol without
saying what the type _was_, just print out the type number too.
Fixes: 24ff652573 ("objtool: Teach get_alt_entry() about more relocation types")
Link: https://lore.kernel.org/lkml/CAHk-=wiZwq-0LknKhXN4M+T8jbxn_2i9mcKpO+OaBSSq_Eh7tg@mail.gmail.com/
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e9edc188fc upstream.
Syzbot was able to trigger the following warning [1]
No repro found by syzbot yet but I was able to trigger similar issue
by having 2 scripts running in parallel, changing conntrack hash sizes,
and:
for j in `seq 1 1000` ; do unshare -n /bin/true >/dev/null ; done
It would take more than 5 minutes for net_namespace structures
to be cleaned up.
This is because nf_ct_iterate_cleanup() has to restart everytime
a resize happened.
By adding a mutex, we can serialize hash resizes and cleanups
and also make get_next_corpse() faster by skipping over empty
buckets.
Even without resizes in the picture, this patch considerably
speeds up network namespace dismantles.
[1]
INFO: task syz-executor.0:8312 can't die for more than 144 seconds.
task:syz-executor.0 state:R running task stack:25672 pid: 8312 ppid: 6573 flags:0x00004006
Call Trace:
context_switch kernel/sched/core.c:4955 [inline]
__schedule+0x940/0x26f0 kernel/sched/core.c:6236
preempt_schedule_common+0x45/0xc0 kernel/sched/core.c:6408
preempt_schedule_thunk+0x16/0x18 arch/x86/entry/thunk_64.S:35
__local_bh_enable_ip+0x109/0x120 kernel/softirq.c:390
local_bh_enable include/linux/bottom_half.h:32 [inline]
get_next_corpse net/netfilter/nf_conntrack_core.c:2252 [inline]
nf_ct_iterate_cleanup+0x15a/0x450 net/netfilter/nf_conntrack_core.c:2275
nf_conntrack_cleanup_net_list+0x14c/0x4f0 net/netfilter/nf_conntrack_core.c:2469
ops_exit_list+0x10d/0x160 net/core/net_namespace.c:171
setup_net+0x639/0xa30 net/core/net_namespace.c:349
copy_net_ns+0x319/0x760 net/core/net_namespace.c:470
create_new_namespaces+0x3f6/0xb20 kernel/nsproxy.c:110
unshare_nsproxy_namespaces+0xc1/0x1f0 kernel/nsproxy.c:226
ksys_unshare+0x445/0x920 kernel/fork.c:3128
__do_sys_unshare kernel/fork.c:3202 [inline]
__se_sys_unshare kernel/fork.c:3200 [inline]
__x64_sys_unshare+0x2d/0x40 kernel/fork.c:3200
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7f63da68e739
RSP: 002b:00007f63d7c05188 EFLAGS: 00000246 ORIG_RAX: 0000000000000110
RAX: ffffffffffffffda RBX: 00007f63da792f80 RCX: 00007f63da68e739
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000040000000
RBP: 00007f63da6e8cc4 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00007f63da792f80
R13: 00007fff50b75d3f R14: 00007f63d7c05300 R15: 0000000000022000
Showing all locks held in the system:
1 lock held by khungtaskd/27:
#0: ffffffff8b980020 (rcu_read_lock){....}-{1:2}, at: debug_show_all_locks+0x53/0x260 kernel/locking/lockdep.c:6446
2 locks held by kworker/u4:2/153:
#0: ffff888010c69138 ((wq_completion)events_unbound){+.+.}-{0:0}, at: arch_atomic64_set arch/x86/include/asm/atomic64_64.h:34 [inline]
#0: ffff888010c69138 ((wq_completion)events_unbound){+.+.}-{0:0}, at: arch_atomic_long_set include/linux/atomic/atomic-long.h:41 [inline]
#0: ffff888010c69138 ((wq_completion)events_unbound){+.+.}-{0:0}, at: atomic_long_set include/linux/atomic/atomic-instrumented.h:1198 [inline]
#0: ffff888010c69138 ((wq_completion)events_unbound){+.+.}-{0:0}, at: set_work_data kernel/workqueue.c:634 [inline]
#0: ffff888010c69138 ((wq_completion)events_unbound){+.+.}-{0:0}, at: set_work_pool_and_clear_pending kernel/workqueue.c:661 [inline]
#0: ffff888010c69138 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x896/0x1690 kernel/workqueue.c:2268
#1: ffffc9000140fdb0 ((kfence_timer).work){+.+.}-{0:0}, at: process_one_work+0x8ca/0x1690 kernel/workqueue.c:2272
1 lock held by systemd-udevd/2970:
1 lock held by in:imklog/6258:
#0: ffff88807f970ff0 (&f->f_pos_lock){+.+.}-{3:3}, at: __fdget_pos+0xe9/0x100 fs/file.c:990
3 locks held by kworker/1:6/8158:
1 lock held by syz-executor.0/8312:
2 locks held by kworker/u4:13/9320:
1 lock held by syz-executor.5/10178:
1 lock held by syz-executor.4/10217:
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7661809d49 upstream.
'kvmalloc()' is a convenience function for people who want to do a
kmalloc() but fall back on vmalloc() if there aren't enough physically
contiguous pages, or if the allocation is larger than what kmalloc()
supports.
However, let's make sure it doesn't get _too_ easy to do crazy things
with it. In particular, don't allow big allocations that could be due
to integer overflow or underflow. So make sure the allocation size fits
in an 'int', to protect against trivial integer conversion issues.
Acked-by: Willy Tarreau <w@1wt.eu>
Cc: Kees Cook <keescook@chromium.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit dcb713d53e upstream.
There are two invocation sites of hso_free_net_device. After
refactoring hso_create_net_device, this parameter is useless.
Remove the bailout in the hso_free_net_device and change the invocation
sites of this function.
Signed-off-by: Dongliang Mu <mudongliangabcd@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
[Backport this cleanup patch to 5.10 and 5.14 in order to keep the
codebase consistent with the 4.14/4.19/5.4 patchseries]
Signed-off-by: Ovidiu Panait <ovidiu.panait@windriver.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9523b33cc3 upstream.
This is a nuisance when CONFIG_WERROR is set, so drop the variable
declaration since the code that used it was removed.
../arch/nios2/kernel/setup.c: In function 'setup_arch':
../arch/nios2/kernel/setup.c:152:13: warning: unused variable 'dram_start' [-Wunused-variable]
152 | int dram_start;
Fixes: 7f7bc20bc4 ("nios2: Don't use _end for calculating min_low_pfn")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reported-by: kernel test robot <lkp@intel.com>
Reviewed-by: Mike Rapoport <rppt@linux.ibm.com>
Cc: Andreas Oetken <andreas.oetken@siemens.com>
Signed-off-by: Dinh Nguyen <dinguyen@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a9f5970767 upstream.
up->corkflag field can be read or written without any lock.
Annotate accesses to avoid possible syzbot/KCSAN reports.
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 22d65765f2 upstream.
Since the actual_length calculation is performed unsigned, packets
shorter than 7 bytes (e.g. packets without data or otherwise truncated)
or non-received packets ("zero" bytes) can cause buffer overflow.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=214437
Fixes: 42337b9d4d958("HID: add driver for U2F Zero built-in LED and RNG")
Signed-off-by: Andrej Shadura <andrew.shadura@collabora.co.uk>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit bb9464e083 upstream.
The error path in ext4_fill_super forget to flush s_error_work before
journal destroy, and it may trigger the follow bug since
flush_stashed_error_work can run concurrently with journal destroy
without any protection for sbi->s_journal.
[32031.740193] EXT4-fs (loop66): get root inode failed
[32031.740484] EXT4-fs (loop66): mount failed
[32031.759805] ------------[ cut here ]------------
[32031.759807] kernel BUG at fs/jbd2/transaction.c:373!
[32031.760075] invalid opcode: 0000 [#1] SMP PTI
[32031.760336] CPU: 5 PID: 1029268 Comm: kworker/5:1 Kdump: loaded
4.18.0
[32031.765112] Call Trace:
[32031.765375] ? __switch_to_asm+0x35/0x70
[32031.765635] ? __switch_to_asm+0x41/0x70
[32031.765893] ? __switch_to_asm+0x35/0x70
[32031.766148] ? __switch_to_asm+0x41/0x70
[32031.766405] ? _cond_resched+0x15/0x40
[32031.766665] jbd2__journal_start+0xf1/0x1f0 [jbd2]
[32031.766934] jbd2_journal_start+0x19/0x20 [jbd2]
[32031.767218] flush_stashed_error_work+0x30/0x90 [ext4]
[32031.767487] process_one_work+0x195/0x390
[32031.767747] worker_thread+0x30/0x390
[32031.768007] ? process_one_work+0x390/0x390
[32031.768265] kthread+0x10d/0x130
[32031.768521] ? kthread_flush_work_fn+0x10/0x10
[32031.768778] ret_from_fork+0x35/0x40
static int start_this_handle(...)
BUG_ON(journal->j_flags & JBD2_UNMOUNT); <---- Trigger this
Besides, after we enable fast commit, ext4_fc_replay can add work to
s_error_work but return success, so the latter journal destroy in
ext4_load_journal can trigger this problem too.
Fix this problem with two steps:
1. Call ext4_commit_super directly in ext4_handle_error for the case
that called from ext4_fc_replay
2. Since it's hard to pair the init and flush for s_error_work, we'd
better add a extras flush_work before journal destroy in
ext4_fill_super
Besides, this patch will call ext4_commit_super in ext4_handle_error for
any nojournal case too. But it seems safe since the reason we call
schedule_work was that we should save error info to sb through journal
if available. Conversely, for the nojournal case, it seems useless delay
commit superblock to s_error_work.
Fixes: c92dc85684 ("ext4: defer saving error info from atomic context")
Fixes: 2d01ddc866 ("ext4: save error info to sb through journal if available")
Cc: stable@kernel.org
Signed-off-by: yangerkun <yangerkun@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Link: https://lore.kernel.org/r/20210924093917.1953239-1-yangerkun@huawei.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 42cb447410 upstream.
When ext4_htree_fill_tree() fails, ext4_dx_readdir() can run into an
infinite loop since if info->last_pos != ctx->pos this will reset the
directory scan and reread the failing entry. For example:
1. a dx_dir which has 3 block, block 0 as dx_root block, block 1/2 as
leaf block which own the ext4_dir_entry_2
2. block 1 read ok and call_filldir which will fill the dirent and update
the ctx->pos
3. block 2 read fail, but we has already fill some dirent, so we will
return back to userspace will a positive return val(see ksys_getdents64)
4. the second ext4_dx_readdir will reset the world since info->last_pos
!= ctx->pos, and will also init the curr_hash which pos to block 1
5. So we will read block1 too, and once block2 still read fail, we can
only fill one dirent because the hash of the entry in block1(besides
the last one) won't greater than curr_hash
6. this time, we forget update last_pos too since the read for block2
will fail, and since we has got the one entry, ksys_getdents64 can
return success
7. Latter we will trapped in a loop with step 4~6
Cc: stable@kernel.org
Signed-off-by: yangerkun <yangerkun@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Link: https://lore.kernel.org/r/20210914111415.3921954-1-yangerkun@huawei.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1fd95c05d8 upstream.
If the call to ext4_map_blocks() fails due to an corrupted file
system, ext4_ext_replay_set_iblocks() can get stuck in an infinite
loop. This could be reproduced by running generic/526 with a file
system that has inline_data and fast_commit enabled. The system will
repeatedly log to the console:
EXT4-fs warning (device dm-3): ext4_block_to_path:105: block 1074800922 > max in inode 131076
and the stack that it gets stuck in is:
ext4_block_to_path+0xe3/0x130
ext4_ind_map_blocks+0x93/0x690
ext4_map_blocks+0x100/0x660
skip_hole+0x47/0x70
ext4_ext_replay_set_iblocks+0x223/0x440
ext4_fc_replay_inode+0x29e/0x3b0
ext4_fc_replay+0x278/0x550
do_one_pass+0x646/0xc10
jbd2_journal_recover+0x14a/0x270
jbd2_journal_load+0xc4/0x150
ext4_load_journal+0x1f3/0x490
ext4_fill_super+0x22d4/0x2c00
With this patch, generic/526 still fails, but system is no longer
locking up in a tight loop. It's likely the root casue is that
fast_commit replay is corrupting file systems with inline_data, and we
probably need to add better error handling in the fast commit replay
code path beyond what is done here, which essentially just breaks the
infinite loop without reporting the to the higher levels of the code.
Fixes: 8016E29F4362 ("ext4: fast commit recovery path")
Cc: stable@kernel.org
Cc: Harshad Shirwadkar <harshadshirwadkar@gmail.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6fed83957f upstream.
When ext4_insert_delayed block receives and recovers from an error from
ext4_es_insert_delayed_block(), e.g., ENOMEM, it does not release the
space it has reserved for that block insertion as it should. One effect
of this bug is that s_dirtyclusters_counter is not decremented and
remains incorrectly elevated until the file system has been unmounted.
This can result in premature ENOSPC returns and apparent loss of free
space.
Another effect of this bug is that
/sys/fs/ext4/<dev>/delayed_allocation_blocks can remain non-zero even
after syncfs has been executed on the filesystem.
Besides, add check for s_dirtyclusters_counter when inode is going to be
evicted and freed. s_dirtyclusters_counter can still keep non-zero until
inode is written back in .evict_inode(), and thus the check is delayed
to .destroy_inode().
Fixes: 51865fda28 ("ext4: let ext4 maintain extent status tree")
Cc: stable@kernel.org
Suggested-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Signed-off-by: Jeffle Xu <jefflexu@linux.alibaba.com>
Reviewed-by: Eric Whitney <enwlinux@gmail.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Link: https://lore.kernel.org/r/20210823061358.84473-1-jefflexu@linux.alibaba.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a2c2f0826e upstream.
Now EXT4_FC_TAG_ADD_RANGE uses ext4_extent to track the
newly-added blocks, but the limit on the max value of
ee_len field is ignored, and it can lead to BUG_ON as
shown below when running command "fallocate -l 128M file"
on a fast_commit-enabled fs:
kernel BUG at fs/ext4/ext4_extents.h:199!
invalid opcode: 0000 [#1] SMP PTI
CPU: 3 PID: 624 Comm: fallocate Not tainted 5.14.0-rc6+ #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996)
RIP: 0010:ext4_fc_write_inode_data+0x1f3/0x200
Call Trace:
? ext4_fc_write_inode+0xf2/0x150
ext4_fc_commit+0x93b/0xa00
? ext4_fallocate+0x1ad/0x10d0
ext4_sync_file+0x157/0x340
? ext4_sync_file+0x157/0x340
vfs_fsync_range+0x49/0x80
do_fsync+0x3d/0x70
__x64_sys_fsync+0x14/0x20
do_syscall_64+0x3b/0xc0
entry_SYSCALL_64_after_hwframe+0x44/0xae
Simply fixing it by limiting the number of blocks
in one EXT4_FC_TAG_ADD_RANGE TLV.
Fixes: aa75f4d3da ("ext4: main fast-commit commit path")
Cc: stable@kernel.org
Signed-off-by: Hou Tao <houtao1@huawei.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Link: https://lore.kernel.org/r/20210820044505.474318-1-houtao1@huawei.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 75ca6ad408 upstream.
We should use unsigned long long rather than loff_t to avoid
overflow in ext4_max_bitmap_size() for comparison before returning.
w/o this patch sbi->s_bitmap_maxbytes was becoming a negative
value due to overflow of upper_limit (with has_huge_files as true)
Below is a quick test to trigger it on a 64KB pagesize system.
sudo mkfs.ext4 -b 65536 -O ^has_extents,^64bit /dev/loop2
sudo mount /dev/loop2 /mnt
sudo echo "hello" > /mnt/hello -> This will error out with
"echo: write error: File too large"
Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
Link: https://lore.kernel.org/r/594f409e2c543e90fd836b78188dfa5c575065ba.1622867594.git.riteshh@linux.ibm.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a89936cce8 upstream.
The tty driver name is used also after registering the driver and must
specifically not be allocated on the stack to avoid leaking information
to user space (or triggering an oops).
Drivers should not try to encode topology information in the tty device
name but this one snuck in through staging without anyone noticing and
another driver has since copied this malpractice.
Fixing the ABI is a separate issue, but this at least plugs the security
hole.
Fixes: ba4dc61fe8 ("Staging: ipack: add support for IP-OCTAL mezzanine board")
Cc: stable@vger.kernel.org # 3.5
Acked-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Signed-off-by: Johan Hovold <johan@kernel.org>
Link: https://lore.kernel.org/r/20210917114622.5412-2-johan@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2de9d8e0d2 upstream.
When we have a dependency of the form:
Device-A -> Device-C
Device-B
Device-C -> Device-B
Where,
* Indentation denotes "child of" parent in previous line.
* X -> Y denotes X is consumer of Y based on firmware (Eg: DT).
We have cyclic dependency: device-A -> device-C -> device-B -> device-A
fw_devlink current treats device-C -> device-B dependency as an invalid
dependency and doesn't enforce it but leaves the rest of the
dependencies as is.
While the current behavior is necessary, it is not sufficient if the
false dependency in this example is actually device-A -> device-C. When
this is the case, device-C will correctly probe defer waiting for
device-B to be added, but device-A will be incorrectly probe deferred by
fw_devlink waiting on device-C to probe successfully. Due to this, none
of the devices in the cycle will end up probing.
To fix this, we need to go relax all the dependencies in the cycle like
we already do in the other instances where fw_devlink detects cycles.
A real world example of this was reported[1] and analyzed[2].
[1] - https://lore.kernel.org/lkml/0a2c4106-7f48-2bb5-048e-8c001a7c3fda@samsung.com/
[2] - https://lore.kernel.org/lkml/CAGETcx8peaew90SWiux=TyvuGgvTQOmO4BFALz7aj0Za5QdNFQ@mail.gmail.com/
Fixes: f9aa460672 ("driver core: Refactor fw_devlink feature")
Cc: stable <stable@vger.kernel.org>
Reported-by: Marek Szyprowski <m.szyprowski@samsung.com>
Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Saravana Kannan <saravanak@google.com>
Link: https://lore.kernel.org/r/20210915170940.617415-2-saravanak@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9b2f72cc0a upstream.
In commit b212921b13 ("elf: don't use MAP_FIXED_NOREPLACE for elf
executable mappings") we still leave MAP_FIXED_NOREPLACE in place for
load_elf_interp.
Unfortunately, this will cause kernel to fail to start with:
1 (init): Uhuuh, elf segment at 00003ffff7ffd000 requested but the memory is mapped already
Failed to execute /init (error -17)
The reason is that the elf interpreter (ld.so) has overlapping segments.
readelf -l ld-2.31.so
Program Headers:
Type Offset VirtAddr PhysAddr
FileSiz MemSiz Flags Align
LOAD 0x0000000000000000 0x0000000000000000 0x0000000000000000
0x000000000002c94c 0x000000000002c94c R E 0x10000
LOAD 0x000000000002dae0 0x000000000003dae0 0x000000000003dae0
0x00000000000021e8 0x0000000000002320 RW 0x10000
LOAD 0x000000000002fe00 0x000000000003fe00 0x000000000003fe00
0x00000000000011ac 0x0000000000001328 RW 0x10000
The reason for this problem is the same as described in commit
ad55eac74f ("elf: enforce MAP_FIXED on overlaying elf segments").
Not only executable binaries, elf interpreters (e.g. ld.so) can have
overlapping elf segments, so we better drop MAP_FIXED_NOREPLACE and go
back to MAP_FIXED in load_elf_interp.
Fixes: 4ed2863951 ("fs, elf: drop MAP_FIXED usage from elf_map")
Cc: <stable@vger.kernel.org> # v4.19
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Michal Hocko <mhocko@suse.com>
Signed-off-by: Chen Jingwen <chenjingwen6@huawei.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 291073a566 ]
The recent change to make objtool aware of more symbol relocation types
(commit 24ff652573: "objtool: Teach get_alt_entry() about more
relocation types") also added another check, and resulted in this
objtool warning when building kvm on x86:
arch/x86/kvm/emulate.o: warning: objtool: __ex_table+0x4: don't know how to handle reloc symbol type: kvm_fastop_exception
The reason seems to be that kvm_fastop_exception() is marked as a global
symbol, which causes the relocation to ke kept around for objtool. And
at the same time, the kvm_fastop_exception definition (which is done as
an inline asm statement) doesn't actually set the type of the global,
which then makes objtool unhappy.
The minimal fix is to just not mark kvm_fastop_exception as being a
global symbol. It's only used in that one compilation unit anyway, so
it was always pointless. That's how all the other local exception table
labels are done.
I'm not entirely happy about the kinds of games that the kvm code plays
with doing its own exception handling, and the fact that it confused
objtool is most definitely a symptom of the code being a bit too subtle
and ad-hoc. But at least this trivial one-liner makes objtool no longer
upset about what is going on.
Fixes: 24ff652573 ("objtool: Teach get_alt_entry() about more relocation types")
Link: https://lore.kernel.org/lkml/CAHk-=wiZwq-0LknKhXN4M+T8jbxn_2i9mcKpO+OaBSSq_Eh7tg@mail.gmail.com/
Cc: Borislav Petkov <bp@suse.de>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Wanpeng Li <wanpengli@tencent.com>
Cc: Jim Mattson <jmattson@google.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 2630cde267 ]
Since commit a7b359fc6a ("sched/fair: Correctly insert cfs_rq's to
list on unthrottle") we add cfs_rqs with no runnable tasks but not fully
decayed into the load (leaf) list. We may ignore adding some ancestors
and therefore breaking tmp_alone_branch invariant. This broke LTP test
cfs_bandwidth01 and it was partially fixed in commit fdaba61ef8
("sched/fair: Ensure that the CFS parent is added after unthrottling").
I noticed the named test still fails even with the fix (but with low
probability, 1 in ~1000 executions of the test). The reason is when
bailing out of unthrottle_cfs_rq early, we may miss adding ancestors of
the unthrottled cfs_rq, thus, not joining tmp_alone_branch properly.
Fix this by adding ancestors if we notice the unthrottled cfs_rq was
added to the load list.
Fixes: a7b359fc6a ("sched/fair: Correctly insert cfs_rq's to list on unthrottle")
Signed-off-by: Michal Koutný <mkoutny@suse.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>
Reviewed-by: Odin Ugedal <odin@uged.al>
Link: https://lore.kernel.org/r/20210917153037.11176-1-mkoutny@suse.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 24ff652573 ]
Occasionally objtool encounters symbol (as opposed to section)
relocations in .altinstructions. Typically they are the alternatives
written by elf_add_alternative() as encountered on a noinstr
validation run on vmlinux after having already ran objtool on the
individual .o files.
Basically this is the counterpart of commit 44f6a7c075 ("objtool:
Fix seg fault with Clang non-section symbols"), because when these new
assemblers (binutils now also does this) strip the section symbols,
elf_add_reloc_to_insn() is forced to emit symbol based relocations.
As such, teach get_alt_entry() about different relocation types.
Fixes: 9bc0bb5072 ("objtool/x86: Rewrite retpoline thunk calls")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Reported-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Link: https://lore.kernel.org/r/YVWUvknIEVNkPvnP@hirez.programming.kicks-ass.net
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 35306eb238 ]
Jann Horn reported that SO_PEERCRED and SO_PEERGROUPS implementations
are racy, as af_unix can concurrently change sk_peer_pid and sk_peer_cred.
In order to fix this issue, this patch adds a new spinlock that needs
to be used whenever these fields are read or written.
Jann also pointed out that l2cap_sock_get_peer_pid_cb() is currently
reading sk->sk_peer_pid which makes no sense, as this field
is only possibly set by AF_UNIX sockets.
We will have to clean this in a separate patch.
This could be done by reverting b48596d1dc "Bluetooth: L2CAP: Add get_peer_pid callback"
or implementing what was truly expected.
Fixes: 109f6e39fa ("af_unix: Allow SO_PEERCRED to work across namespaces.")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Jann Horn <jannh@google.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Cc: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 656ed8b015 ]
When STMMAC is paired with Energy-Efficient Ethernet(EEE) capable PHY,
and the PHY is advertising EEE by default, we need to enable EEE on the
xPCS side too, instead of having user to manually trigger the enabling
config via ethtool.
Fixed this by adding xpcs_config_eee() call in stmmac_eee_init().
Fixes: 7617af3d1a ("net: pcs: Introducing support for DWC xpcs Energy Efficient Ethernet")
Cc: Michael Sit Wei Hong <michael.wei.hong.sit@intel.com>
Signed-off-by: Wong Vee Khee <vee.khee.wong@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit d88fd1b546 ]
When EEE support was added to the 28nm EPHY it was assumed that it would
be able to support the standard clause 45 over clause 22 register access
method. It turns out that the PHY does not support that, which is the
very reason for using the indirect shadow mode 2 bank 3 access method.
Implement {read,write}_mmd to allow the standard PHY library routines
pertaining to EEE querying and configuration to work correctly on these
PHYs. This forces us to implement a __phy_set_clr_bits() function that
does not grab the MDIO bus lock since the PHY driver's {read,write}_mmd
functions are always called with that lock held.
Fixes: 83ee102a69 ("net: phy: bcm7xxx: add support for 28nm EPHY")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 0178839ccc ]
Currently, the firmware compatible features are enabled in PF driver
initialization process, but they are not disabled in PF driver
deinitialization process and firmware keeps these features in enabled
status.
In this case, if load an old PF driver (for example, in VM) which not
support the firmware compatible features, firmware will still send mailbox
message to PF when link status changed and PF will print
"un-supported mailbox message, code = 201".
To fix this problem, disable these firmware compatible features in PF
driver deinitialization process.
Fixes: ed8fb4b262 ("net: hns3: add link change event report")
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 27bf4af69f ]
Currently, the rx vlan filter will always be disabled before selftest and
be enabled after selftest as the rx vlan filter feature is fixed on in
old device earlier than V3.
However, this feature is not fixed in some new devices and it can be
disabled by user. In this case, it is wrong if rx vlan filter is enabled
after selftest. So fix it.
Fixes: bcc26e8dc4 ("net: hns3: remove unused code in hns3_self_test()")
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 108b3c7810 ]
Currently, if function adds an existing unicast mac address, eventhough
driver will not add this address into hardware, but it will return 0 in
function hclge_add_uc_addr_common(). It will cause the state of this
unicast mac address is ACTIVE in driver, but it should be in TO-ADD state.
To fix this problem, function hclge_add_uc_addr_common() returns -EEXIST
if mac address is existing, and delete two error log to avoid printing
them all the time after this modification.
Fixes: 72110b5674 ("net: hns3: return 0 and print warning when hit duplicate MAC")
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 0472e95ffe ]
HCLGE_FLAG_MQPRIO_ENABLE is supposed to set when enable
multiple TCs with tc mqprio, and HCLGE_FLAG_DCB_ENABLE is
supposed to set when enable multiple TCs with ets. But
the driver mixed the flags when updating the tm configuration.
Furtherly, PFC should be available when HCLGE_FLAG_MQPRIO_ENABLE
too, so remove the unnecessary limitation.
Fixes: 5a5c909174 ("net: hns3: add support for tc mqprio offload")
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit d82650be60 ]
For destroy mqprio is irreversible in stack, so it's unnecessary
to rollback the tc configuration when destroy mqprio failed.
Otherwise, it may cause the configuration being inconsistent
between driver and netstack.
As the failure is usually caused by reset, and the driver will
restore the configuration after reset, so it can keep the
configuration being consistent between driver and hardware.
Fixes: 5a5c909174 ("net: hns3: add support for tc mqprio offload")
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit a8e76fefe3 ]
Currently, in function hns3_nic_set_real_num_queue(), the
driver doesn't report the queue count and offset for disabled
tc. If user enables multiple TCs, but only maps user
priorities to partial of them, it may cause the queue range
of the unmapped TC being displayed abnormally.
Fix it by removing the tc enable checking, ensure the queue
count is not zero.
With this change, the tc_en is useless now, so remove it.
Fixes: a75a8efa00 ("net: hns3: Fix tc setup when netdev is first up")
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 513e605d7a ]
The ixgbe driver currently generates a NULL pointer dereference with
some machine (online cpus < 63). This is due to the fact that the
maximum value of num_xdp_queues is nr_cpu_ids. Code is in
"ixgbe_set_rss_queues"".
Here's how the problem repeats itself:
Some machine (online cpus < 63), And user set num_queues to 63 through
ethtool. Code is in the "ixgbe_set_channels",
adapter->ring_feature[RING_F_FDIR].limit = count;
It becomes 63.
When user use xdp, "ixgbe_set_rss_queues" will set queues num.
adapter->num_rx_queues = rss_i;
adapter->num_tx_queues = rss_i;
adapter->num_xdp_queues = ixgbe_xdp_queues(adapter);
And rss_i's value is from
f = &adapter->ring_feature[RING_F_FDIR];
rss_i = f->indices = f->limit;
So "num_rx_queues" > "num_xdp_queues", when run to "ixgbe_xdp_setup",
for (i = 0; i < adapter->num_rx_queues; i++)
if (adapter->xdp_ring[i]->xsk_umem)
It leads to panic.
Call trace:
[exception RIP: ixgbe_xdp+368]
RIP: ffffffffc02a76a0 RSP: ffff9fe16202f8d0 RFLAGS: 00010297
RAX: 0000000000000000 RBX: 0000000000000020 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 000000000000001c RDI: ffffffffa94ead90
RBP: ffff92f8f24c0c18 R8: 0000000000000000 R9: 0000000000000000
R10: ffff9fe16202f830 R11: 0000000000000000 R12: ffff92f8f24c0000
R13: ffff9fe16202fc01 R14: 000000000000000a R15: ffffffffc02a7530
ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
7 [ffff9fe16202f8f0] dev_xdp_install at ffffffffa89fbbcc
8 [ffff9fe16202f920] dev_change_xdp_fd at ffffffffa8a08808
9 [ffff9fe16202f960] do_setlink at ffffffffa8a20235
10 [ffff9fe16202fa88] rtnl_setlink at ffffffffa8a20384
11 [ffff9fe16202fc78] rtnetlink_rcv_msg at ffffffffa8a1a8dd
12 [ffff9fe16202fcf0] netlink_rcv_skb at ffffffffa8a717eb
13 [ffff9fe16202fd40] netlink_unicast at ffffffffa8a70f88
14 [ffff9fe16202fd80] netlink_sendmsg at ffffffffa8a71319
15 [ffff9fe16202fdf0] sock_sendmsg at ffffffffa89df290
16 [ffff9fe16202fe08] __sys_sendto at ffffffffa89e19c8
17 [ffff9fe16202ff30] __x64_sys_sendto at ffffffffa89e1a64
18 [ffff9fe16202ff38] do_syscall_64 at ffffffffa84042b9
19 [ffff9fe16202ff50] entry_SYSCALL_64_after_hwframe at ffffffffa8c0008c
So I fix ixgbe_max_channels so that it will not allow a setting of queues
to be higher than the num_online_cpus(). And when run to ixgbe_xdp_setup,
take the smaller value of num_rx_queues and num_xdp_queues.
Fixes: 4a9b32f30f ("ixgbe: fix potential RX buffer starvation for AF_XDP")
Signed-off-by: Feng Zhou <zhoufeng.zf@bytedance.com>
Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 79a7482249 ]
Both cxgb4 and csiostor drivers run on their own independent Physical
Function. But when cxgb4 and csiostor are both being loaded in parallel via
modprobe, there is a race when firmware upgrade is attempted by both the
drivers.
When the cxgb4 driver initiates the firmware upgrade, it halts the firmware
and the chip until upgrade is complete. When the csiostor driver is coming
up in parallel, the firmware mailbox communication fails with timeouts and
the csiostor driver probe fails.
Add a module soft dependency on cxgb4 driver to ensure loading csiostor
triggers cxgb4 to load first when available to avoid the firmware upgrade
race.
Link: https://lore.kernel.org/r/1632759248-15382-1-git-send-email-rahul.lakkireddy@chelsio.com
Fixes: a3667aaed5 ("[SCSI] csiostor: Chelsio FCoE offload driver")
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit ebc69e897e ]
This reverts commit 2d52c58b9c.
We have had several folks complain that this causes hangs for them, which
is especially problematic as the commit has also hit stable already.
As no resolution seems to be forthcoming right now, revert the patch.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=214503
Fixes: 2d52c58b9c ("block, bfq: honor already-setup queue merges")
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit c23bb54f28 ]
Don't print stats for which we haven't reserved space as it can
cause nasty memory bashing and related bad behaviors.
Fixes: aa620993b1 ("ionic: pull per-q stats work out of queue loops")
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 51bb08dd04 ]
An object file cannot be built for both loadable module and built-in
use at the same time:
arm-linux-gnueabi-ld: drivers/net/ethernet/micrel/ks8851_common.o: in function `ks8851_probe_common':
ks8851_common.c:(.text+0xf80): undefined reference to `__this_module'
Change the ks8851_common code to be a standalone module instead,
and use Makefile logic to ensure this is built-in if at least one
of its two users is.
Fixes: 797047f875 ("net: ks8851: Implement Parallel bus operations")
Link: https://lore.kernel.org/netdev/20210125121937.3900988-1-arnd@kernel.org/
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Acked-by: Marek Vasut <marex@denx.de>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit ced185824c ]
Fix the case where the dst register maps to %rax as otherwise this produces
an incorrect mapping with the implementation in 981f94c3e9 ("bpf: Add
bitwise atomic instructions") as %rax is clobbered given it's part of the
cmpxchg as operand.
The issue is similar to b29dd96b90 ("bpf, x86: Fix BPF_FETCH atomic and/or/
xor with r0 as src") just that the case of dst register was missed.
Before, dst=r0 (%rax) src=r2 (%rsi):
[...]
c5: mov %rax,%r10
c8: mov 0x0(%rax),%rax <---+ (broken)
cc: mov %rax,%r11 |
cf: and %rsi,%r11 |
d2: lock cmpxchg %r11,0x0(%rax) <---+
d8: jne 0x00000000000000c8 |
da: mov %rax,%rsi |
dd: mov %r10,%rax |
[...] |
|
After, dst=r0 (%rax) src=r2 (%rsi): |
|
[...] |
da: mov %rax,%r10 |
dd: mov 0x0(%r10),%rax <---+ (fixed)
e1: mov %rax,%r11 |
e4: and %rsi,%r11 |
e7: lock cmpxchg %r11,0x0(%r10) <---+
ed: jne 0x00000000000000dd
ef: mov %rax,%rsi
f2: mov %r10,%rax
[...]
The remaining combinations were fine as-is though:
After, dst=r9 (%r15) src=r0 (%rax):
[...]
dc: mov %rax,%r10
df: mov 0x0(%r15),%rax
e3: mov %rax,%r11
e6: and %r10,%r11
e9: lock cmpxchg %r11,0x0(%r15)
ef: jne 0x00000000000000df _
f1: mov %rax,%r10 | (unneeded, but
f4: mov %r10,%rax _| not a problem)
[...]
After, dst=r9 (%r15) src=r4 (%rcx):
[...]
de: mov %rax,%r10
e1: mov 0x0(%r15),%rax
e5: mov %rax,%r11
e8: and %rcx,%r11
eb: lock cmpxchg %r11,0x0(%r15)
f1: jne 0x00000000000000e1
f3: mov %rax,%rcx
f6: mov %r10,%rax
[...]
The case of dst == src register is rejected by the verifier and
therefore not supported, but x86 JIT also handles this case just
fine.
After, dst=r0 (%rax) src=r0 (%rax):
[...]
eb: mov %rax,%r10
ee: mov 0x0(%r10),%rax
f2: mov %rax,%r11
f5: and %r10,%r11
f8: lock cmpxchg %r11,0x0(%r10)
fe: jne 0x00000000000000ee
100: mov %rax,%r10
103: mov %r10,%rax
[...]
Fixes: 981f94c3e9 ("bpf: Add bitwise atomic instructions")
Reported-by: Johan Almbladh <johan.almbladh@anyfinetworks.com>
Signed-off-by: Johan Almbladh <johan.almbladh@anyfinetworks.com>
Co-developed-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Brendan Jackman <jackmanb@google.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit d888eaac4f ]
When building bpf selftest with make -j, I'm randomly getting build failures
such as this one:
In file included from progs/bpf_flow.c:19:
[...]/tools/testing/selftests/bpf/tools/include/bpf/bpf_helpers.h:11:10: fatal error: 'bpf_helper_defs.h' file not found
#include "bpf_helper_defs.h"
^~~~~~~~~~~~~~~~~~~
The file that fails the build varies between runs but it's always in the
progs/ subdir.
The reason is a missing make dependency on libbpf for the .o files in
progs/. There was a dependency before commit 3ac2e20fba but that commit
removed it to prevent unneeded rebuilds. However, that only works if libbpf
has been built already; the 'wildcard' prerequisite does not trigger when
there's no bpf_helper_defs.h generated yet.
Keep the libbpf as an order-only prerequisite to satisfy both goals. It is
always built before the progs/ objects but it does not trigger unnecessary
rebuilds by itself.
Fixes: 3ac2e20fba ("selftests/bpf: BPF object files should depend only on libbpf headers")
Signed-off-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/ee84ab66436fba05a197f952af23c98d90eb6243.1632758415.git.jbenc@redhat.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit bcfd367c28 ]
When a BPF object is compiled without BTF info (without -g),
trying to link such objects using bpftool causes a SIGSEGV due to
btf__get_nr_types accessing obj->btf which is NULL. Fix this by
checking for the NULL pointer, and return error.
Reproducer:
$ cat a.bpf.c
extern int foo(void);
int bar(void) { return foo(); }
$ cat b.bpf.c
int foo(void) { return 0; }
$ clang -O2 -target bpf -c a.bpf.c
$ clang -O2 -target bpf -c b.bpf.c
$ bpftool gen obj out a.bpf.o b.bpf.o
Segmentation fault (core dumped)
After fix:
$ bpftool gen obj out a.bpf.o b.bpf.o
libbpf: failed to find BTF info for object 'a.bpf.o'
Error: failed to link 'a.bpf.o': Unknown error -22 (-22)
Fixes: a46349227c (libbpf: Add linker extern resolution support for functions and global variables)
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20210924023725.70228-1-memxor@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 8a98ae12fb ]
When introducing CAP_BPF, bpf_jit_charge_modmem() was not changed to treat
programs with CAP_BPF as privileged for the purpose of JIT memory allocation.
This means that a program without CAP_BPF can block a program with CAP_BPF
from loading a program.
Fix this by checking bpf_capable() in bpf_jit_charge_modmem().
Fixes: 2c78ee898d ("bpf: Implement CAP_BPF")
Signed-off-by: Lorenz Bauer <lmb@cloudflare.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20210922111153.19843-1-lmb@cloudflare.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 51032e6f17 ]
The e100_get_regs function is used to implement a simple register dump
for the e100 device. The data is broken into a couple of MAC control
registers, and then a series of PHY registers, followed by a memory dump
buffer.
The total length of the register dump is defined as (1 + E100_PHY_REGS)
* sizeof(u32) + sizeof(nic->mem->dump_buf).
The logic for filling in the PHY registers uses a convoluted inverted
count for loop which counts from E100_PHY_REGS (0x1C) down to 0, and
assigns the slots 1 + E100_PHY_REGS - i. The first loop iteration will
fill in [1] and the final loop iteration will fill in [1 + 0x1C]. This
is actually one more than the supposed number of PHY registers.
The memory dump buffer is then filled into the space at
[2 + E100_PHY_REGS] which will cause that memcpy to assign 4 bytes past
the total size.
The end result is that we overrun the total buffer size allocated by the
kernel, which could lead to a panic or other issues due to memory
corruption.
It is difficult to determine the actual total number of registers
here. The only 8255x datasheet I could find indicates there are 28 total
MDI registers. However, we're reading 29 here, and reading them in
reverse!
In addition, the ethtool e100 register dump interface appears to read
the first PHY register to determine if the device is in MDI or MDIx
mode. This doesn't appear to be documented anywhere within the 8255x
datasheet. I can only assume it must be in register 28 (the extra
register we're reading here).
Lets not change any of the intended meaning of what we copy here. Just
extend the space by 4 bytes to account for the extra register and
continue copying the data out in the same order.
Change the E100_PHY_REGS value to be the correct total (29) so that the
total register dump size is calculated properly. Fix the offset for
where we copy the dump buffer so that it doesn't overrun the total size.
Re-write the for loop to use counting up instead of the convoluted
down-counting. Correct the mdio_read offset to use the 0-based register
offsets, but maintain the bizarre reverse ordering so that we have the
ABI expected by applications like ethtool. This requires and additional
subtraction of 1. It seems a bit odd but it makes the flow of assignment
into the register buffer easier to follow.
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Reported-by: Felicitas Hetzelt <felicitashetzelt@gmail.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 4329c8dc11 ]
commit abf9b90205 ("e100: cleanup unneeded math") tried to simplify
e100_get_regs_len and remove a double 'divide and then multiply'
calculation that the e100_reg_regs_len function did.
This change broke the size calculation entirely as it failed to account
for the fact that the numbered registers are actually 4 bytes wide and
not 1 byte. This resulted in a significant under allocation of the
register buffer used by e100_get_regs.
Fix this by properly multiplying the register count by u32 first before
adding the size of the dump buffer.
Fixes: abf9b90205 ("e100: cleanup unneeded math")
Reported-by: Felicitas Hetzelt <felicitashetzelt@gmail.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b9c587fed6 ]
Same members of the Marvell Ethernet switches impose MTU restrictions
on ports used for connecting to the CPU or another switch for DSA. If
the MTU is set too low, tagged frames will be discarded. Ensure the
worst case tagger overhead is included in setting the MTU for DSA and
CPU ports.
Fixes: 1baf0fac10 ("net: dsa: mv88e6xxx: Use chip-wide max frame size for MTU")
Reported by: 曹煜 <cao88yu@gmail.com>
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b92ce2f54c ]
The MTU passed to the DSA driver is the payload size, typically 1500.
However, the switch uses the frame size when applying restrictions.
Adjust the MTU with the size of the Ethernet header and the frame
checksum. The VLAN header also needs to be included when the frame
size it per port, but not when it is global.
Fixes: 1baf0fac10 ("net: dsa: mv88e6xxx: Use chip-wide max frame size for MTU")
Reported by: 曹煜 <cao88yu@gmail.com>
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit fe23036192 ]
The datasheets suggests the 6161 uses a per port setting for jumbo
frames. Testing has however shown this is not correct, it uses the old
style chip wide MTU control. Change the ops in the 6161 structure to
reflect this.
Fixes: 1baf0fac10 ("net: dsa: mv88e6xxx: Use chip-wide max frame size for MTU")
Reported by: 曹煜 <cao88yu@gmail.com>
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit c83ff01864 ]
Currently we blow up in trace_dma_fence_init, when calling into
get_driver_name or get_timeline_name, since both the engine and context
might be NULL(or contain some garbage address) in the case of newly
allocated slab objects via the request ctor. Note that we also use
SLAB_TYPESAFE_BY_RCU here, which allows requests to be immediately
freed, but delay freeing the underlying page by an RCU grace period.
With this scheme requests can be re-allocated, at the same time as they
are also being read by some lockless RCU lookup mechanism.
In the ctor case, which is only called for new slab objects(i.e allocate
new page and call the ctor for each object) it's safe to reset the
context/engine prior to calling into dma_fence_init, since we can be
certain that no one is doing an RCU lookup which might depend on peeking
at the engine/context, like in active_engine(), since the object can't
yet be externally visible.
In the recycled case(which might also be externally visible) the request
refcount always transitions from 0->1 after we set the context/engine
etc, which should ensure it's valid to dereference the engine for
example, when doing an RCU list-walk, so long as we can also increment
the refcount first. If the refcount is already zero, then the request is
considered complete/released. If it's non-zero, then the request might
be in the process of being re-allocated, or potentially still in flight,
however after successfully incrementing the refcount, it's possible to
carefully inspect the request state, to determine if the request is
still what we were looking for. Note that all externally visible
requests returned to the cache must have zero refcount.
One possible fix then is to move dma_fence_init out from the request
ctor. Originally this was how it was done, but it was moved in:
commit 855e39e65c
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date: Mon Feb 3 09:41:48 2020 +0000
drm/i915: Initialise basic fence before acquiring seqno
where it looks like intel_timeline_get_seqno() relied on some of the
rq->fence state, but that is no longer the case since:
commit 12ca695d2c
Author: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Date: Tue Mar 23 16:49:50 2021 +0100
drm/i915: Do not share hwsp across contexts any more, v8.
intel_timeline_get_seqno() could also be cleaned up slightly by dropping
the request argument.
Moving dma_fence_init back out of the ctor, should ensure we have enough
of the request initialised in case of trace_dma_fence_init.
Functionally this should be the same, and is effectively what we were
already open coding before, except now we also assign the fence->lock
and fence->ops, but since these are invariant for recycled
requests(which might be externally visible), and will therefore already
hold the same value, it shouldn't matter.
An alternative fix, since we don't yet have a fully initialised request
when in the ctor, is just setting the context/engine as NULL, but this
does require adding some extra handling in get_driver_name etc.
v2(Daniel):
- Try to make the commit message less confusing
Fixes: 855e39e65c ("drm/i915: Initialise basic fence before acquiring seqno")
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Michael Mason <michael.w.mason@intel.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210921134202.3803151-1-matthew.auld@intel.com
(cherry picked from commit be988eaee1)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 5ab8a447bc ]
After commit 05b35e7eb9 ("smsc95xx: add phylib support"), link changes
are no longer propagated to usbnet. As a result, rx URB allocation won't
happen until there is a packet sent out first (this might never happen,
e.g. running just ssh server with a static IP). Fix by triggering usbnet
EVENT_LINK_CHANGE.
Fixes: 05b35e7eb9 ("smsc95xx: add phylib support")
Signed-off-by: Aaro Koskinen <aaro.koskinen@iki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 597aa16c78 ]
Multipath RTA_FLOW is embedded in nexthop. Dump it in fib_add_nexthop()
to get the length of rtnexthop correct.
Fixes: b0f6019363 ("ipv4: Refactor nexthop attributes in fib_dump_info")
Signed-off-by: Xiao Liang <shaw.leon@gmail.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 325fd36ae7 ]
The enetc phylink .mac_config handler intends to clear the IFMODE field
(bits 1:0) of the PM0_IF_MODE register, but incorrectly clears all the
other fields instead.
For normal operation, the bug was inconsequential, due to the fact that
we write the PM0_IF_MODE register in two stages, first in
phylink .mac_config (which incorrectly cleared out a bunch of stuff),
then we update the speed and duplex to the correct values in
phylink .mac_link_up.
Judging by the code (not tested), it looks like maybe loopback mode was
broken, since this is one of the settings in PM0_IF_MODE which is
incorrectly cleared.
Fixes: c76a97218d ("net: enetc: force the RGMII speed and duplex instead of operating in inband mode")
Reported-by: Pavel Machek (CIP) <pavel@denx.de>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 540effa7f2 ]
For both local and remote sensors all the supported ICs can report an
"undervoltage lockout" condition which means the conversion wasn't
properly performed due to insufficient power supply voltage and so the
measurement results can't be trusted.
Fixes: 9410700b88 ("hwmon: Add driver for Texas Instruments TMP421/422/423 sensor chips")
Signed-off-by: Paul Fertser <fercerpav@gmail.com>
Link: https://lore.kernel.org/r/20210924093011.26083-2-fercerpav@gmail.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 14351f08ed ]
gcc 8.3 and 5.4 throw this:
In function 'modify_qp_init_to_rtr',
././include/linux/compiler_types.h:322:38: error: call to '__compiletime_assert_1859' declared with attribute error: FIELD_PREP: value too large for the field
_compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)
[..]
drivers/infiniband/hw/hns/hns_roce_common.h:91:52: note: in expansion of macro 'FIELD_PREP'
*((__le32 *)ptr + (field_h) / 32) |= cpu_to_le32(FIELD_PREP( \
^~~~~~~~~~
drivers/infiniband/hw/hns/hns_roce_common.h:95:39: note: in expansion of macro '_hr_reg_write'
#define hr_reg_write(ptr, field, val) _hr_reg_write(ptr, field, val)
^~~~~~~~~~~~~
drivers/infiniband/hw/hns/hns_roce_hw_v2.c:4412:2: note: in expansion of macro 'hr_reg_write'
hr_reg_write(context, QPC_LP_PKTN_INI, lp_pktn_ini);
Because gcc has miscalculated the constantness of lp_pktn_ini:
mtu = ib_mtu_enum_to_int(ib_mtu);
if (WARN_ON(mtu < 0)) [..]
lp_pktn_ini = ilog2(MAX_LP_MSG_LEN / mtu);
Since mtu is limited to {256,512,1024,2048,4096} lp_pktn_ini is between 4
and 8 which is compatible with the 4 bit field in the FIELD_PREP.
Work around this broken compiler by adding a 'can never be true'
constraint on lp_pktn_ini's value which clears out the problem.
Fixes: f0cb411aad ("RDMA/hns: Use new interface to modify QP context")
Link: https://lore.kernel.org/r/0-v1-c773ecb137bc+11f-hns_gcc8_jgg@nvidia.com
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 3f4a08909e ]
current Linux refuses to change the 'backup' bit of MPTCP endpoints, i.e.
using MPTCP_PM_CMD_SET_FLAGS, unless it finds (at least) one subflow that
matches the endpoint address. There is no reason for that, so we can just
ignore the return value of mptcp_nl_addr_backup(). In this way, endpoints
can reconfigure their 'backup' flag even if no MPTCP sockets are open (or
more generally, in case the MP_PRIO message is not sent out).
Fixes: 0f9f696a50 ("mptcp: add set_flags command in PM netlink")
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit ea1300b9df ]
mptcp_token_get_sock() may return a mptcp socket that is in
a different net namespace than the socket that received the token value.
The mptcp syncookie code path had an explicit check for this,
this moves the test into mptcp_token_get_sock() function.
Eventually token.c should be converted to pernet storage, but
such change is not suitable for net tree.
Fixes: 2c5ebd001d ("mptcp: refactor token container")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f7e745f8e9 ]
We should always check if skb_header_pointer's return is NULL before
using it, otherwise it may cause null-ptr-deref, as syzbot reported:
KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
RIP: 0010:sctp_rcv_ootb net/sctp/input.c:705 [inline]
RIP: 0010:sctp_rcv+0x1d84/0x3220 net/sctp/input.c:196
Call Trace:
<IRQ>
sctp6_rcv+0x38/0x60 net/sctp/ipv6.c:1109
ip6_protocol_deliver_rcu+0x2e9/0x1ca0 net/ipv6/ip6_input.c:422
ip6_input_finish+0x62/0x170 net/ipv6/ip6_input.c:463
NF_HOOK include/linux/netfilter.h:307 [inline]
NF_HOOK include/linux/netfilter.h:301 [inline]
ip6_input+0x9c/0xd0 net/ipv6/ip6_input.c:472
dst_input include/net/dst.h:460 [inline]
ip6_rcv_finish net/ipv6/ip6_input.c:76 [inline]
NF_HOOK include/linux/netfilter.h:307 [inline]
NF_HOOK include/linux/netfilter.h:301 [inline]
ipv6_rcv+0x28c/0x3c0 net/ipv6/ip6_input.c:297
Fixes: 3acb50c18d ("sctp: delay as much as possible skb_linearize")
Reported-by: syzbot+581aff2ae6b860625116@syzkaller.appspotmail.com
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 04f41c68f1 ]
There are many instances of PHYs that depend on a switch to supply a
resource (Eg: interrupts). Switches also expects the PHYs to be probed
by their specific drivers as soon as they are added. If that doesn't
happen, then the switch would force the use of generic PHY drivers for
the PHY even if the PHY might have specific driver available.
fw_devlink=on by design can cause delayed probes of PHY. To avoid, this
we need to set the FWNODE_FLAG_NEEDS_CHILD_BOUND_ON_ADD for the switch's
fwnode before the PHYs are added. The most generic way to do this is to
set this flag for the parent of MDIO busses which is typically the
switch.
For more context:
https://lore.kernel.org/lkml/YTll0i6Rz3WAAYzs@lunn.ch/#t
Fixes: ea718c6990 ("Revert "Revert "driver core: Set fw_devlink=on by default""")
Suggested-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Saravana Kannan <saravanak@google.com>
Link: https://lore.kernel.org/r/20210915170940.617415-4-saravanak@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 5501765a02 ]
If a parent device is also a supplier to a child device, fw_devlink=on by
design delays the probe() of the child device until the probe() of the
parent finishes successfully.
However, some drivers of such parent devices (where parent is also a
supplier) expect the child device to finish probing successfully as soon as
they are added using device_add() and before the probe() of the parent
device has completed successfully. One example of such a case is discussed
in the link mentioned below.
Add a flag to make fw_devlink=on not enforce these supplier-consumer
relationships, so these drivers can continue working.
Link: https://lore.kernel.org/netdev/CAGETcx_uj0V4DChME-gy5HGKTYnxLBX=TH2rag29f_p=UcG+Tg@mail.gmail.com/
Fixes: ea718c6990 ("Revert "Revert "driver core: Set fw_devlink=on by default""")
Signed-off-by: Saravana Kannan <saravanak@google.com>
Link: https://lore.kernel.org/r/20210915170940.617415-3-saravanak@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 313bbd1990 ]
Thomas explained in https://lore.kernel.org/r/87mtoeb4hb.ffs@tglx
that our handling of the hrtimer here is wrong: If the timer fires
late (e.g. due to vCPU scheduling, as reported by Dmitry/syzbot)
then it tries to actually rearm the timer at the next deadline,
which might be in the past already:
1 2 3 N N+1
| | | ... | |
^ intended to fire here (1)
^ next deadline here (2)
^ actually fired here
The next time it fires, it's later, but will still try to schedule
for the next deadline (now 3), etc. until it catches up with N,
but that might take a long time, causing stalls etc.
Now, all of this is simulation, so we just have to fix it, but
note that the behaviour is wrong even per spec, since there's no
value then in sending all those beacons unaligned - they should be
aligned to the TBTT (1, 2, 3, ... in the picture), and if we're a
bit (or a lot) late, then just resume at that point.
Therefore, change the code to use hrtimer_forward_now() which will
ensure that the next firing of the timer would be at N+1 (in the
picture), i.e. the next interval point after the current time.
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Reported-by: syzbot+0e964fad69a9c462bc1e@syzkaller.appspotmail.com
Fixes: 01e59e467e ("mac80211_hwsim: hrtimer beacon")
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20210915112936.544f383472eb.I3f9712009027aa09244b65399bf18bf482a8c4f1@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit fe94bac626 ]
In ieee80211_amsdu_aggregate() set a pointer frag_tail point to the
end of skb_shinfo(head)->frag_list, and use it to bind other skb in
the end of this function. But when execute ieee80211_amsdu_aggregate()
->ieee80211_amsdu_realloc_pad()->pskb_expand_head(), the address of
skb_shinfo(head)->frag_list will be changed. However, the
ieee80211_amsdu_aggregate() not update frag_tail after call
pskb_expand_head(). That will cause the second skb can't bind to the
head skb appropriately.So we update the address of frag_tail to fix it.
Fixes: 6e0456b545 ("mac80211: add A-MSDU tx support")
Signed-off-by: Chih-Kang Chang <gary.chang@realtek.com>
Signed-off-by: Zong-Zhe Yang <kevin_yang@realtek.com>
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Link: https://lore.kernel.org/r/20210830073240.12736-1-pkshih@realtek.com
[reword comment]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 98d46b021f ]
This reverts commit d333322361 ("mac80211: do not use low data rates for
data frames with no ack flag").
Returning false early in rate_control_send_low breaks sending broadcast
packets, since rate control will not select a rate for it.
Before re-introducing a fixed version of this patch, we should probably also
make some changes to rate control to be more conservative in selecting rates
for no-ack packets and also prevent using probing rates on them, since we won't
get any feedback.
Fixes: d333322361 ("mac80211: do not use low data rates for data frames with no ack flag")
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Link: https://lore.kernel.org/r/20210906083559.9109-1-nbd@nbd.name
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b53deef054 ]
iptables/nftables has two types of log modules:
1. backend, e.g. nf_log_syslog, which implement the functionality
2. frontend, e.g. xt_LOG or nft_log, which call the functionality
provided by backend based on nf_tables or xtables rule set.
Problem is that the request_module() call to load the backed in
nf_logger_find_get() might happen with nftables transaction mutex held
in case the call path is via nf_tables/nft_compat.
This can cause deadlocks (see 'Fixes' tags for details).
The chosen solution as to let modprobe deal with this by adding 'pre: '
soft dep tag to xt_LOG (to load the syslog backend) and xt_NFLOG (to
load nflog backend).
Eric reports that this breaks on systems with older modprobe that
doesn't support softdeps.
Another, similar issue occurs when someone either insmods xt_(NF)LOG
directly or unloads the backend module (possible if no log frontend
is in use): because the frontend module is already loaded, modprobe is
not invoked again so the softdep isn't evaluated.
Add a workaround: If nf_logger_find_get() returns -ENOENT and call
is not via nft_compat, load the backend explicitly and try again.
Else, let nft_compat ask for deferred request_module via nf_tables
infra.
Softdeps are kept in-place, so with newer modprobe the dependencies
are resolved from userspace.
Fixes: cefa31a9d4 ("netfilter: nft_log: perform module load from nf_tables")
Fixes: a38b5b56d6 ("netfilter: nf_log: add module softdeps")
Reported-and-tested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit a499b03bf3 ]
syzbot reports following UAF:
BUG: KASAN: use-after-free in memcmp+0x18f/0x1c0 lib/string.c:955
nla_strcmp+0xf2/0x130 lib/nlattr.c:836
nft_table_lookup.part.0+0x1a2/0x460 net/netfilter/nf_tables_api.c:570
nft_table_lookup net/netfilter/nf_tables_api.c:4064 [inline]
nf_tables_getset+0x1b3/0x860 net/netfilter/nf_tables_api.c:4064
nfnetlink_rcv_msg+0x659/0x13f0 net/netfilter/nfnetlink.c:285
netlink_rcv_skb+0x153/0x420 net/netlink/af_netlink.c:2504
Problem is that all get operations are lockless, so the commit_mutex
held by nft_rcv_nl_event() isn't enough to stop a parallel GET request
from doing read-accesses to the table object even after synchronize_rcu().
To avoid this, unlink the table first and store the table objects in
on-stack scratch space.
Fixes: 6001a930ce ("netfilter: nftables: introduce table ownership")
Reported-and-tested-by: syzbot+f31660cf279b0557160c@syzkaller.appspotmail.com
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 5b1e985f76 ]
Due to duplicate reset flags, CQP commands are processed during reset.
This leads CQP failures such as below:
irdma0: [Delete Local MAC Entry Cmd Error][op_code=49] status=-27 waiting=1 completion_err=0 maj=0x0 min=0x0
Remove the redundant flag and set the correct reset flag so CPQ is paused
during reset
Fixes: 8498a30e1b ("RDMA/irdma: Register auxiliary driver and implement private channel OPs")
Link: https://lore.kernel.org/r/20210916191222.824-2-shiraz.saleem@intel.com
Reported-by: LiLiang <liali@redhat.com>
Signed-off-by: Sindhu Devale <sindhu.devale@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit e6fab7af6b ]
Fan speed minimum can be enforced from sysfs. For example, setting
current fan speed to 20 is used to enforce fan speed to be at 100%
speed, 19 - to be not below 90% speed, etcetera. This feature provides
ability to limit fan speed according to some system wise
considerations, like absence of some replaceable units or high system
ambient temperature.
Request for changing fan minimum speed is configuration request and can
be set only through 'sysfs' write procedure. In this situation value of
argument 'state' is above nominal fan speed maximum.
Return non-zero code in this case to avoid
thermal_cooling_device_stats_update() call, because in this case
statistics update violates thermal statistics table range.
The issues is observed in case kernel is configured with option
CONFIG_THERMAL_STATISTICS.
Here is the trace from KASAN:
[ 159.506659] BUG: KASAN: slab-out-of-bounds in thermal_cooling_device_stats_update+0x7d/0xb0
[ 159.516016] Read of size 4 at addr ffff888116163840 by task hw-management.s/7444
[ 159.545625] Call Trace:
[ 159.548366] dump_stack+0x92/0xc1
[ 159.552084] ? thermal_cooling_device_stats_update+0x7d/0xb0
[ 159.635869] thermal_zone_device_update+0x345/0x780
[ 159.688711] thermal_zone_device_set_mode+0x7d/0xc0
[ 159.694174] mlxsw_thermal_modules_init+0x48f/0x590 [mlxsw_core]
[ 159.700972] ? mlxsw_thermal_set_cur_state+0x5a0/0x5a0 [mlxsw_core]
[ 159.731827] mlxsw_thermal_init+0x763/0x880 [mlxsw_core]
[ 160.070233] RIP: 0033:0x7fd995909970
[ 160.074239] Code: 73 01 c3 48 8b 0d 28 d5 2b 00 f7 d8 64 89 01 48 83 c8 ff c3 66 0f 1f 44 00 00 83 3d 99 2d 2c 00 00 75 10 b8 01 00 00 00 0f 05 <48> 3d 01 f0 ff ..
[ 160.095242] RSP: 002b:00007fff54f5d938 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[ 160.103722] RAX: ffffffffffffffda RBX: 0000000000000013 RCX: 00007fd995909970
[ 160.111710] RDX: 0000000000000013 RSI: 0000000001906008 RDI: 0000000000000001
[ 160.119699] RBP: 0000000001906008 R08: 00007fd995bc9760 R09: 00007fd996210700
[ 160.127687] R10: 0000000000000073 R11: 0000000000000246 R12: 0000000000000013
[ 160.135673] R13: 0000000000000001 R14: 00007fd995bc8600 R15: 0000000000000013
[ 160.143671]
[ 160.145338] Allocated by task 2924:
[ 160.149242] kasan_save_stack+0x19/0x40
[ 160.153541] __kasan_kmalloc+0x7f/0xa0
[ 160.157743] __kmalloc+0x1a2/0x2b0
[ 160.161552] thermal_cooling_device_setup_sysfs+0xf9/0x1a0
[ 160.167687] __thermal_cooling_device_register+0x1b5/0x500
[ 160.173833] devm_thermal_of_cooling_device_register+0x60/0xa0
[ 160.180356] mlxreg_fan_probe+0x474/0x5e0 [mlxreg_fan]
[ 160.248140]
[ 160.249807] The buggy address belongs to the object at ffff888116163400
[ 160.249807] which belongs to the cache kmalloc-1k of size 1024
[ 160.263814] The buggy address is located 64 bytes to the right of
[ 160.263814] 1024-byte region [ffff888116163400, ffff888116163800)
[ 160.277536] The buggy address belongs to the page:
[ 160.282898] page:0000000012275840 refcount:1 mapcount:0 mapping:0000000000000000 index:0xffff888116167000 pfn:0x116160
[ 160.294872] head:0000000012275840 order:3 compound_mapcount:0 compound_pincount:0
[ 160.303251] flags: 0x200000000010200(slab|head|node=0|zone=2)
[ 160.309694] raw: 0200000000010200 ffffea00046f7208 ffffea0004928208 ffff88810004dbc0
[ 160.318367] raw: ffff888116167000 00000000000a0006 00000001ffffffff 0000000000000000
[ 160.327033] page dumped because: kasan: bad access detected
[ 160.333270]
[ 160.334937] Memory state around the buggy address:
[ 160.356469] >ffff888116163800: fc ..
Fixes: 65afb4c8e7 ("hwmon: (mlxreg-fan) Add support for Mellanox FAN driver")
Signed-off-by: Vadim Pasternak <vadimp@nvidia.com>
Link: https://lore.kernel.org/r/20210916183151.869427-1-vadimp@nvidia.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit ca465e1f1f ]
If cma_listen_on_all() fails it leaves the per-device ID still on the
listen_list but the state is not set to RDMA_CM_ADDR_BOUND.
When the cmid is eventually destroyed cma_cancel_listens() is not called
due to the wrong state, however the per-device IDs are still holding the
refcount preventing the ID from being destroyed, thus deadlocking:
task:rping state:D stack: 0 pid:19605 ppid: 47036 flags:0x00000084
Call Trace:
__schedule+0x29a/0x780
? free_unref_page_commit+0x9b/0x110
schedule+0x3c/0xa0
schedule_timeout+0x215/0x2b0
? __flush_work+0x19e/0x1e0
wait_for_completion+0x8d/0xf0
_destroy_id+0x144/0x210 [rdma_cm]
ucma_close_id+0x2b/0x40 [rdma_ucm]
__destroy_id+0x93/0x2c0 [rdma_ucm]
? __xa_erase+0x4a/0xa0
ucma_destroy_id+0x9a/0x120 [rdma_ucm]
ucma_write+0xb8/0x130 [rdma_ucm]
vfs_write+0xb4/0x250
ksys_write+0xb5/0xd0
? syscall_trace_enter.isra.19+0x123/0x190
do_syscall_64+0x33/0x40
entry_SYSCALL_64_after_hwframe+0x44/0xa9
Ensure that cma_listen_on_all() atomically unwinds its action under the
lock during error.
Fixes: c80a0c52d8 ("RDMA/cma: Add missing error handling of listen_id")
Link: https://lore.kernel.org/r/20210913093344.17230-1-thomas.liu@ucloud.cn
Signed-off-by: Tao Liu <thomas.liu@ucloud.cn>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 2cc74e1ee3 ]
ROCE uses IGMP for Multicast instead of the native Infiniband system where
joins are required in order to post messages on the Multicast group. On
Ethernet one can send Multicast messages to arbitrary addresses without
the need to subscribe to a group.
So ROCE correctly does not send IGMP joins during rdma_join_multicast().
F.e. in cma_iboe_join_multicast() we see:
if (addr->sa_family == AF_INET) {
if (gid_type == IB_GID_TYPE_ROCE_UDP_ENCAP) {
ib.rec.hop_limit = IPV6_DEFAULT_HOPLIMIT;
if (!send_only) {
err = cma_igmp_send(ndev, &ib.rec.mgid,
true);
}
}
} else {
So the IGMP join is suppressed as it is unnecessary.
However no such check is done in destroy_mc(). And therefore leaving a
sendonly multicast group will send an IGMP leave.
This means that the following scenario can lead to a multicast receiver
unexpectedly being unsubscribed from a MC group:
1. Sender thread does a sendonly join on MC group X. No IGMP join
is sent.
2. Receiver thread does a regular join on the same MC Group x.
IGMP join is sent and the receiver begins to get messages.
3. Sender thread terminates and destroys MC group X.
IGMP leave is sent and the receiver no longer receives data.
This patch adds the same logic for sendonly joins to destroy_mc() that is
also used in cma_iboe_join_multicast().
Fixes: ab15c95a17 ("IB/core: Support for CMA multicast join flags")
Link: https://lore.kernel.org/r/alpine.DEB.2.22.394.2109081340540.668072@gentwo.de
Signed-off-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 356ed64991 ]
Currently if a function ptr in struct_ops has a return value, its
caller will get a random return value from it, because the return
value of related BPF_PROG_TYPE_STRUCT_OPS prog is just dropped.
So adding a new flag BPF_TRAMP_F_RET_FENTRY_RET to tell bpf trampoline
to save and return the return value of struct_ops prog if ret_size of
the function ptr is greater than 0. Also restricting the flag to be
used alone.
Fixes: 85d33df357 ("bpf: Introduce BPF_MAP_TYPE_STRUCT_OPS")
Signed-off-by: Hou Tao <houtao1@huawei.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20210914023351.3664499-1-houtao1@huawei.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 69e73dbfda ]
ip_vs_conn_tab_bits may be provided by the user through the
conn_tab_bits module parameter. If this value is greater than 31, or
less than 0, the shift operator used to derive tab_size causes undefined
behaviour.
Fix this checking ip_vs_conn_tab_bits value to be in the range specified
in ipvs Kconfig. If not, simply use default value.
Fixes: 6f7edb4881 ("IPVS: Allow boot time change of hash size")
Reported-by: Yi Chen <yiche@redhat.com>
Signed-off-by: Andrea Claudi <aclaudi@redhat.com>
Acked-by: Julian Anastasov <ja@ssi.bg>
Acked-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit 98122e63a7 upstream.
On GFX9+, format modifiers are always enabled and ensure the
frame-buffers can be scanned out at ADDFB2 time.
On GFX8-, format modifiers are not supported and no other check
is performed. This means ADDFB2 IOCTLs will succeed even if the
tiling isn't supported for scan-out, and will result in garbage
displayed on screen [1].
Fix this by adding a check for tiling flags for GFX8 and older.
The check is taken from radeonsi in Mesa (see how is_displayable
is populated in gfx6_compute_surface).
Changes in v2: use drm_WARN_ONCE instead of drm_WARN (Michel)
[1]: https://github.com/swaywm/wlroots/issues/3185
Signed-off-by: Simon Ser <contact@emersion.fr>
Acked-by: Michel Dänzer <mdaenzer@redhat.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Harry Wentland <hwentlan@amd.com>
Cc: Nicholas Kazlauskas <Nicholas.Kazlauskas@amd.com>
Cc: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 26db706a6d upstream.
In the s2idle stress test sdma resume fail occasionally,in the
failed case GPU is in the gfxoff state.This issue may introduce
by firmware miss handle doorbell S/R and now temporary fix the issue
by forcing exit gfxoff for sdma resume.
Signed-off-by: Prike Liang <Prike.Liang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 083fa05bba upstream.
[Why]
ASSR is dependent on Signed PSP Verstage to enable Content
Protection for eDP panels. Unsigned PSP verstage is used
during development phase causing ASSR to FAIL.
As a result, link training is performed with
DP_PANEL_MODE_DEFAULT instead of DP_PANEL_MODE_EDP for
eDP panels that causes display flicker on some panels.
[How]
- Do not change panel mode, if ASSR is disabled
- Just report and continue to perform eDP link training
with right settings further.
Signed-off-by: Praful Swarnakar <Praful.Swarnakar@amd.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 41e76c6a3c upstream.
commit fad7cd3310 ("nbd: add the check to prevent overflow in
__nbd_ioctl()") raised an issue from the fallback helpers added in
commit f0907827a8 ("compiler.h: enable builtin overflow checkers and
add fallback code")
ERROR: modpost: "__divdi3" [drivers/block/nbd.ko] undefined!
As Stephen Rothwell notes:
The added check_mul_overflow() call is being passed 64 bit values.
COMPILER_HAS_GENERIC_BUILTIN_OVERFLOW is not set for this build (see
include/linux/overflow.h).
Specifically, the helpers for checking whether the results of a
multiplication overflowed (__unsigned_mul_overflow,
__signed_add_overflow) use the division operator when
!COMPILER_HAS_GENERIC_BUILTIN_OVERFLOW. This is problematic for 64b
operands on 32b hosts.
This was fixed upstream by
commit 76ae847497 ("Documentation: raise minimum supported version of
GCC to 5.1")
which is not suitable to be backported to stable.
Further, __builtin_mul_overflow() would emit a libcall to a
compiler-rt-only symbol when compiling with clang < 14 for 32b targets.
ld.lld: error: undefined symbol: __mulodi4
In order to keep stable buildable with GCC 4.9 and clang < 14, modify
struct nbd_config to instead track the number of bits of the block size;
reconstructing the block size using runtime checked shifts that are not
problematic for those compilers and in a ways that can be backported to
stable.
In nbd_set_size, we do validate that the value of blksize must be a
power of two (POT) and is in the range of [512, PAGE_SIZE] (both
inclusive).
This does modify the debugfs interface.
Cc: stable@vger.kernel.org
Cc: Arnd Bergmann <arnd@kernel.org>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Link: https://github.com/ClangBuiltLinux/linux/issues/1438
Link: https://lore.kernel.org/all/20210909182525.372ee687@canb.auug.org.au/
Link: https://lore.kernel.org/stable/CAHk-=whiQBofgis_rkniz8GBP9wZtSZdcDEffgSLO62BUGV3gg@mail.gmail.com/
Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org>
Reported-by: Nathan Chancellor <nathan@kernel.org>
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Suggested-by: Kees Cook <keescook@chromium.org>
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Suggested-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Link: https://lore.kernel.org/r/20210920232533.4092046-1-ndesaulniers@google.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 305d568b72 upstream.
The FSM can run in a circle allowing rdma_resolve_ip() to be called twice
on the same id_priv. While this cannot happen without going through the
work, it violates the invariant that the same address resolution
background request cannot be active twice.
CPU 1 CPU 2
rdma_resolve_addr():
RDMA_CM_IDLE -> RDMA_CM_ADDR_QUERY
rdma_resolve_ip(addr_handler) #1
process_one_req(): for #1
addr_handler():
RDMA_CM_ADDR_QUERY -> RDMA_CM_ADDR_BOUND
mutex_unlock(&id_priv->handler_mutex);
[.. handler still running ..]
rdma_resolve_addr():
RDMA_CM_ADDR_BOUND -> RDMA_CM_ADDR_QUERY
rdma_resolve_ip(addr_handler)
!! two requests are now on the req_list
rdma_destroy_id():
destroy_id_handler_unlock():
_destroy_id():
cma_cancel_operation():
rdma_addr_cancel()
// process_one_req() self removes it
spin_lock_bh(&lock);
cancel_delayed_work(&req->work);
if (!list_empty(&req->list)) == true
! rdma_addr_cancel() returns after process_on_req #1 is done
kfree(id_priv)
process_one_req(): for #2
addr_handler():
mutex_lock(&id_priv->handler_mutex);
!! Use after free on id_priv
rdma_addr_cancel() expects there to be one req on the list and only
cancels the first one. The self-removal behavior of the work only happens
after the handler has returned. This yields a situations where the
req_list can have two reqs for the same "handle" but rdma_addr_cancel()
only cancels the first one.
The second req remains active beyond rdma_destroy_id() and will
use-after-free id_priv once it inevitably triggers.
Fix this by remembering if the id_priv has called rdma_resolve_ip() and
always cancel before calling it again. This ensures the req_list never
gets more than one item in it and doesn't cost anything in the normal flow
that never uses this strange error path.
Link: https://lore.kernel.org/r/0-v1-3bc675b8006d+22-syz_cancel_uaf_jgg@nvidia.com
Cc: stable@vger.kernel.org
Fixes: e51060f08a ("IB: IP address based RDMA connection manager")
Reported-by: syzbot+dc3dfba010d7671e05f5@syzkaller.appspotmail.com
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit bc0bdc5afa upstream.
If the state is not idle then rdma_bind_addr() will immediately fail and
no change to global state should happen.
For instance if the state is already RDMA_CM_LISTEN then this will corrupt
the src_addr and would cause the test in cma_cancel_operation():
if (cma_any_addr(cma_src_addr(id_priv)) && !id_priv->cma_dev)
To view a mangled src_addr, eg with a IPv6 loopback address but an IPv4
family, failing the test.
This would manifest as this trace from syzkaller:
BUG: KASAN: use-after-free in __list_add_valid+0x93/0xa0 lib/list_debug.c:26
Read of size 8 at addr ffff8881546491e0 by task syz-executor.1/32204
CPU: 1 PID: 32204 Comm: syz-executor.1 Not tainted 5.12.0-rc8-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:79 [inline]
dump_stack+0x141/0x1d7 lib/dump_stack.c:120
print_address_description.constprop.0.cold+0x5b/0x2f8 mm/kasan/report.c:232
__kasan_report mm/kasan/report.c:399 [inline]
kasan_report.cold+0x7c/0xd8 mm/kasan/report.c:416
__list_add_valid+0x93/0xa0 lib/list_debug.c:26
__list_add include/linux/list.h:67 [inline]
list_add_tail include/linux/list.h:100 [inline]
cma_listen_on_all drivers/infiniband/core/cma.c:2557 [inline]
rdma_listen+0x787/0xe00 drivers/infiniband/core/cma.c:3751
ucma_listen+0x16a/0x210 drivers/infiniband/core/ucma.c:1102
ucma_write+0x259/0x350 drivers/infiniband/core/ucma.c:1732
vfs_write+0x28e/0xa30 fs/read_write.c:603
ksys_write+0x1ee/0x250 fs/read_write.c:658
do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
entry_SYSCALL_64_after_hwframe+0x44/0xae
Which is indicating that an rdma_id_private was destroyed without doing
cma_cancel_listens().
Instead of trying to re-use the src_addr memory to indirectly create an
any address build one explicitly on the stack and bind to that as any
other normal flow would do.
Link: https://lore.kernel.org/r/0-v1-9fbb33f5e201+2a-cma_listen_jgg@nvidia.com
Cc: stable@vger.kernel.org
Fixes: 732d41c545 ("RDMA/cma: Make the locking for automatic state transition more clear")
Reported-by: syzbot+6bb0528b13611047209c@syzkaller.appspotmail.com
Tested-by: Hao Sun <sunhao.th@gmail.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5c49d1850d upstream.
When updating the host's mask for its MSR_IA32_TSX_CTRL user return entry,
clear the mask in the found uret MSR instead of vmx->guest_uret_msrs[i].
Modifying guest_uret_msrs directly is completely broken as 'i' does not
point at the MSR_IA32_TSX_CTRL entry. In fact, it's guaranteed to be an
out-of-bounds accesses as is always set to kvm_nr_uret_msrs in a prior
loop. By sheer dumb luck, the fallout is limited to "only" failing to
preserve the host's TSX_CTRL_CPUID_CLEAR. The out-of-bounds access is
benign as it's guaranteed to clear a bit in a guest MSR value, which are
always zero at vCPU creation on both x86-64 and i386.
Cc: stable@vger.kernel.org
Fixes: 8ea8b8d6f8 ("KVM: VMX: Use common x86's uret MSR list as the one true list")
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210926015545.281083-1-zhenzhong.duan@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8d68bad6d8 upstream.
Windows Server 2022 with Hyper-V role enabled failed to boot on KVM when
enlightened VMCS is advertised. Debugging revealed there are two exposed
secondary controls it is not happy with: SECONDARY_EXEC_ENABLE_VMFUNC and
SECONDARY_EXEC_SHADOW_VMCS. These controls are known to be unsupported,
as there are no corresponding fields in eVMCSv1 (see the comment above
EVMCS1_UNSUPPORTED_2NDEXEC definition).
Previously, commit 31de3d2500 ("x86/kvm/hyper-v: move VMX controls
sanitization out of nested_enable_evmcs()") introduced the required
filtering mechanism for VMX MSRs but for some reason put only known
to be problematic (and not full EVMCS1_UNSUPPORTED_* lists) controls
there.
Note, Windows Server 2022 seems to have gained some sanity check for VMX
MSRs: it doesn't even try to launch a guest when there's something it
doesn't like, nested_evmcs_check_controls() mechanism can't catch the
problem.
Let's be bold this time and instead of playing whack-a-mole just filter out
all unsupported controls from VMX MSRs.
Fixes: 31de3d2500 ("x86/kvm/hyper-v: move VMX controls sanitization out of nested_enable_evmcs()")
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20210907163530.110066-1-vkuznets@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e8a747d088 upstream.
Check whether a CPUID entry's index is significant before checking for a
matching index to hack-a-fix an undefined behavior bug due to consuming
uninitialized data. RESET/INIT emulation uses kvm_cpuid() to retrieve
CPUID.0x1, which does _not_ have a significant index, and fails to
initialize the dummy variable that doubles as EBX/ECX/EDX output _and_
ECX, a.k.a. index, input.
Practically speaking, it's _extremely_ unlikely any compiler will yield
code that causes problems, as the compiler would need to inline the
kvm_cpuid() call to detect the uninitialized data, and intentionally hose
the kernel, e.g. insert ud2, instead of simply ignoring the result of
the index comparison.
Although the sketchy "dummy" pattern was introduced in SVM by commit
66f7b72e11 ("KVM: x86: Make register state after reset conform to
specification"), it wasn't actually broken until commit 7ff6c03503
("KVM: x86: Remove stateful CPUID handling") arbitrarily swapped the
order of operations such that "index" was checked before the significant
flag.
Avoid consuming uninitialized data by reverting to checking the flag
before the index purely so that the fix can be easily backported; the
offending RESET/INIT code has been refactored, moved, and consolidated
from vendor code to common x86 since the bug was introduced. A future
patch will directly address the bad RESET/INIT behavior.
The undefined behavior was detected by syzbot + KernelMemorySanitizer.
BUG: KMSAN: uninit-value in cpuid_entry2_find arch/x86/kvm/cpuid.c:68
BUG: KMSAN: uninit-value in kvm_find_cpuid_entry arch/x86/kvm/cpuid.c:1103
BUG: KMSAN: uninit-value in kvm_cpuid+0x456/0x28f0 arch/x86/kvm/cpuid.c:1183
cpuid_entry2_find arch/x86/kvm/cpuid.c:68 [inline]
kvm_find_cpuid_entry arch/x86/kvm/cpuid.c:1103 [inline]
kvm_cpuid+0x456/0x28f0 arch/x86/kvm/cpuid.c:1183
kvm_vcpu_reset+0x13fb/0x1c20 arch/x86/kvm/x86.c:10885
kvm_apic_accept_events+0x58f/0x8c0 arch/x86/kvm/lapic.c:2923
vcpu_enter_guest+0xfd2/0x6d80 arch/x86/kvm/x86.c:9534
vcpu_run+0x7f5/0x18d0 arch/x86/kvm/x86.c:9788
kvm_arch_vcpu_ioctl_run+0x245b/0x2d10 arch/x86/kvm/x86.c:10020
Local variable ----dummy@kvm_vcpu_reset created at:
kvm_vcpu_reset+0x1fb/0x1c20 arch/x86/kvm/x86.c:10812
kvm_apic_accept_events+0x58f/0x8c0 arch/x86/kvm/lapic.c:2923
Reported-by: syzbot+f3985126b746b3d59c9d@syzkaller.appspotmail.com
Reported-by: Alexander Potapenko <glider@google.com>
Fixes: 2a24be79b6 ("KVM: VMX: Set EDX at INIT with CPUID.0x1, Family-Model-Stepping")
Fixes: 7ff6c03503 ("KVM: x86: Remove stateful CPUID handling")
Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
Message-Id: <20210929222426.1855730-2-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 03a6e84069 upstream.
Explicitly zero the guest's CR3 and mark it available+dirty at RESET/INIT.
Per Intel's SDM and AMD's APM, CR3 is zeroed at both RESET and INIT. For
RESET, this is a nop as vcpu is zero-allocated. For INIT, the bug has
likely escaped notice because no firmware/kernel puts its page tables root
at PA=0, let alone relies on INIT to get the desired CR3 for such page
tables.
Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210921000303.400537-3-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2f9b68f57c upstream.
KASAN reports the following issue:
BUG: KASAN: stack-out-of-bounds in kvm_make_vcpus_request_mask+0x174/0x440 [kvm]
Read of size 8 at addr ffffc9001364f638 by task qemu-kvm/4798
CPU: 0 PID: 4798 Comm: qemu-kvm Tainted: G X --------- ---
Hardware name: AMD Corporation DAYTONA_X/DAYTONA_X, BIOS RYM0081C 07/13/2020
Call Trace:
dump_stack+0xa5/0xe6
print_address_description.constprop.0+0x18/0x130
? kvm_make_vcpus_request_mask+0x174/0x440 [kvm]
__kasan_report.cold+0x7f/0x114
? kvm_make_vcpus_request_mask+0x174/0x440 [kvm]
kasan_report+0x38/0x50
kasan_check_range+0xf5/0x1d0
kvm_make_vcpus_request_mask+0x174/0x440 [kvm]
kvm_make_scan_ioapic_request_mask+0x84/0xc0 [kvm]
? kvm_arch_exit+0x110/0x110 [kvm]
? sched_clock+0x5/0x10
ioapic_write_indirect+0x59f/0x9e0 [kvm]
? static_obj+0xc0/0xc0
? __lock_acquired+0x1d2/0x8c0
? kvm_ioapic_eoi_inject_work+0x120/0x120 [kvm]
The problem appears to be that 'vcpu_bitmap' is allocated as a single long
on stack and it should really be KVM_MAX_VCPUS long. We also seem to clear
the lower 16 bits of it with bitmap_zero() for no particular reason (my
guess would be that 'bitmap' and 'vcpu_bitmap' variables in
kvm_bitmap_or_dest_vcpus() caused the confusion: while the later is indeed
16-bit long, the later should accommodate all possible vCPUs).
Fixes: 7ee30bc132 ("KVM: x86: deliver KVM IOAPIC scan request to target vCPUs")
Fixes: 9a2ae9f6b6 ("KVM: x86: Zero the IOAPIC scan request dest vCPUs bitmap")
Reported-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Reviewed-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210827092516.1027264-7-vkuznets@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 773e89ab00 upstream.
hv_clock is preallocated to have only HVC_BOOT_ARRAY_SIZE (64) elements;
if the PTP_SYS_OFFSET_PRECISE ioctl is executed on vCPUs whose index is
64 of higher, retrieving the struct pvclock_vcpu_time_info pointer with
"src = &hv_clock[cpu].pvti" will result in an out-of-bounds access and
a wild pointer. Change it to "this_cpu_pvti()" which is guaranteed to
be valid.
Fixes: 95a3d4454b ("Switch kvmclock data to a PER_CPU variable")
Signed-off-by: Zelin Deng <zelin.deng@linux.alibaba.com>
Cc: <stable@vger.kernel.org>
Message-Id: <1632892429-101194-3-git-send-email-zelin.deng@linux.alibaba.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 943c15ac1b upstream.
If driver read val value sufficient for
(val & 0x08) && (!(val & 0x80)) && ((val & 0x7) == ((val >> 4) & 0x7))
from device then Null pointer dereference occurs.
(It is possible if tmp = 0b0xyz1xyz, where same literals mean same numbers)
Also lm75[] does not serve a purpose anymore after switching to
devm_i2c_new_dummy_device() in w83791d_detect_subclients().
The patch fixes possible NULL pointer dereference by removing lm75[].
Found by Linux Driver Verification project (linuxtesting.org).
Cc: stable@vger.kernel.org
Signed-off-by: Nadezda Lutovinova <lutovinova@ispras.ru>
Link: https://lore.kernel.org/r/20210921155153.28098-1-lutovinova@ispras.ru
[groeck: Dropped unnecessary continuation lines, fixed multi-line alignment]
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0f36b88173 upstream.
If driver read val value sufficient for
(val & 0x08) && (!(val & 0x80)) && ((val & 0x7) == ((val >> 4) & 0x7))
from device then Null pointer dereference occurs.
(It is possible if tmp = 0b0xyz1xyz, where same literals mean same numbers)
Also lm75[] does not serve a purpose anymore after switching to
devm_i2c_new_dummy_device() in w83791d_detect_subclients().
The patch fixes possible NULL pointer dereference by removing lm75[].
Found by Linux Driver Verification project (linuxtesting.org).
Cc: stable@vger.kernel.org
Signed-off-by: Nadezda Lutovinova <lutovinova@ispras.ru>
Link: https://lore.kernel.org/r/20210921155153.28098-2-lutovinova@ispras.ru
[groeck: Dropped unnecessary continuation lines, fixed multipline alignment]
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit dd4d747ef0 upstream.
If driver read tmp value sufficient for
(tmp & 0x08) && (!(tmp & 0x80)) && ((tmp & 0x7) == ((tmp >> 4) & 0x7))
from device then Null pointer dereference occurs.
(It is possible if tmp = 0b0xyz1xyz, where same literals mean same numbers)
Also lm75[] does not serve a purpose anymore after switching to
devm_i2c_new_dummy_device() in w83791d_detect_subclients().
The patch fixes possible NULL pointer dereference by removing lm75[].
Found by Linux Driver Verification project (linuxtesting.org).
Cc: stable@vger.kernel.org
Signed-off-by: Nadezda Lutovinova <lutovinova@ispras.ru>
Link: https://lore.kernel.org/r/20210921155153.28098-3-lutovinova@ispras.ru
[groeck: Dropped unnecessary continuation lines, fixed multi-line alignments]
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2938b2978a upstream.
Function i2c_smbus_read_byte_data() can return a negative error number
instead of the data read if I2C transaction failed for whatever reason.
Lack of error checking can lead to serious issues on production
hardware, e.g. errors treated as temperatures produce spurious critical
temperature-crossed-threshold errors in BMC logs for OCP server
hardware. The patch was tested with Mellanox OCP Mezzanine card
emulating TMP421 protocol for temperature sensing which sometimes leads
to I2C protocol error during early boot up stage.
Fixes: 9410700b88 ("hwmon: Add driver for Texas Instruments TMP421/422/423 sensor chips")
Cc: stable@vger.kernel.org
Signed-off-by: Paul Fertser <fercerpav@gmail.com>
Link: https://lore.kernel.org/r/20210924093011.26083-1-fercerpav@gmail.com
[groeck: dropped unnecessary line breaks]
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 80f6e3080b upstream.
If the file size is almost S64_MAX, the calculated number of Merkle tree
levels exceeds FS_VERITY_MAX_LEVELS, causing FS_IOC_ENABLE_VERITY to
fail. This is unintentional, since as the comment above the definition
of FS_VERITY_MAX_LEVELS states, it is enough for over U64_MAX bytes of
data using SHA-256 and 4K blocks. (Specifically, 4096*128**8 >= 2**64.)
The bug is actually that when the number of blocks in the first level is
calculated from i_size, there is a signed integer overflow due to i_size
being signed. Fix this by treating i_size as unsigned.
This was found by the new test "generic: test fs-verity EFBIG scenarios"
(https://lkml.kernel.org/r/b1d116cd4d0ea74b9cd86f349c672021e005a75c.1631558495.git.boris@bur.io).
This didn't affect ext4 or f2fs since those have a smaller maximum file
size, but it did affect btrfs which allows files up to S64_MAX bytes.
Reported-by: Boris Burkov <boris@bur.io>
Fixes: 3fda4c617e ("fs-verity: implement FS_IOC_ENABLE_VERITY ioctl")
Fixes: fd2d1acfca ("fs-verity: add the hook for file ->open()")
Cc: <stable@vger.kernel.org> # v5.4+
Reviewed-by: Boris Burkov <boris@bur.io>
Link: https://lore.kernel.org/r/20210916203424.113376-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f060db9937 upstream.
When ACPI NFIT table is failing to populate correct numa information
on arm64, dax_kmem will get NUMA_NO_NODE from the NFIT driver.
Without this patch, pmem can't be probed as RAM devices on arm64 guest:
$ndctl create-namespace -fe namespace0.0 --mode=devdax --map=dev -s 1g -a 128M
kmem dax0.0: rejecting DAX region [mem 0x240400000-0x2bfffffff] with invalid node: -1
kmem: probe of dax0.0 failed with error -22
Suggested-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Jia He <justin.he@arm.com>
Cc: <stable@vger.kernel.org>
Fixes: c221c0b030 ("device-dax: "Hotplug" persistent memory for use like normal RAM")
Link: https://lore.kernel.org/r/20210922152919.6940-1-justin.he@arm.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ad7cc2d41b upstream.
This patch initializes and enables speaker output on the Lenovo Legion 7i
15IMHG05, Yoga 7i 14ITL5/15ITL5, and 13s Gen2 series of laptops using the
HDA verb sequence specific to each model.
Speaker automute is suppressed for the Lenovo Legion 7i 15IMHG05 to avoid
breaking speaker output on resume and when devices are unplugged from its
headphone jack.
Thanks to: Andreas Holzer, Vincent Morel, sycxyc, Max Christian Pohle and
all others that helped.
[ minor coding style fixes by tiwai ]
BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=208555
Signed-off-by: Cameron Berkenpas <cam@neo-zeon.de>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20210913212627.339362-1-cam@neo-zeon.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit cb1bcf5ed5 upstream.
In MOTU protocol v2/v3, first two data chunks across 2nd and 3rd data
channels includes message bytes from device. The total size of message
is 48 bits per data block.
The 'data_block_message' tracepoints event produced by ALSA firewire-motu
driver exposes the sequence of messages to userspace in 64 bit storage,
however lower 32 bits are actually available since current implementation
truncates 16 bits in upper of the message as a result of bit shift
operation within 32 bit storage.
This commit fixes the bug by perform the bit shift in 64 bit storage.
Fixes: c6b0b9e65f ("ALSA: firewire-motu: add tracepoints for messages for unique protocol")
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp>
Link: https://lore.kernel.org/r/20210920110734.27161-1-o-takashi@sakamocchi.jp
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit f7d6779df6 ]
This gurantees no more work on the ring can be submitted
to hardware in suspend/resume case, otherwise a potential
race will occur and the ring will get no chance to stay
empty before suspend.
v2: Call drm_sched_resubmit_job before drm_sched_start to
restart jobs from the pending list.
Suggested-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Suggested-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 067f44c8b4 ]
In amdgpu_fence_driver_hw_fini, no need to call drm_sched_fini to stop
scheduler in s3 test, otherwise, fence related failure will arrive
after resume. To fix this and for a better clean up, move drm_sched_fini
from fence_hw_fini to fence_sw_fini, as it's part of driver shutdown, and
should never be called in hw_fini.
v2: rename amdgpu_fence_driver_init to amdgpu_fence_driver_sw_init,
to keep sw_init and sw_fini paired.
Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1668
Fixes: 8d35a25961 ("drm/amdgpu: adjust fence driver enable sequence")
Suggested-by: Christian König <christian.koenig@amd.com>
Tested-by: Mike Lothian <mike@fireburn.co.uk>
Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 8d35a25961 ]
Fence driver was enabled per ring when sw init on per IP block before.
Change to enable all the fence driver at the same time after
amdgpu_device_ip_init finished.
Rename some function related to fence to make it reasonable for read.
Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 3b0c406124 ]
This issue happens when a userspace program does an ioctl
FBIOPUT_VSCREENINFO passing the fb_var_screeninfo struct
containing only the fields xres, yres, and bits_per_pixel
with values.
If this struct is the same as the previous ioctl, the
vc_resize() detects it and doesn't call the resize_screen(),
leaving the fb_var_screeninfo incomplete. And this leads to
the updatescrollmode() calculates a wrong value to
fbcon_display->vrows, which makes the real_y() return a
wrong value of y, and that value, eventually, causes
the imageblit to access an out-of-bound address value.
To solve this issue I made the resize_screen() be called
even if the screen does not need any resizing, so it will
"fix and fill" the fb_var_screeninfo independently.
Cc: stable <stable@vger.kernel.org> # after 5.15-rc2 is out, give it time to bake
Reported-and-tested-by: syzbot+858dc7a2f7ef07c2c219@syzkaller.appspotmail.com
Signed-off-by: Igor Matheus Andrade Torrente <igormtorrente@gmail.com>
Link: https://lore.kernel.org/r/20210628134509.15895-1-igormtorrente@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 4da8b12188 ]
If the 'perf iostat' user specifies two or more iio_root_ports and also
specifies the cpu(s) by -C which is not *connected to all* the above iio
ports, the iostat_print_metric() will run into trouble:
For example:
$ perf iostat list
S0-uncore_iio_0<0000:16>
S1-uncore_iio_0<0000:97> # <--- CPU 1 is located in the socket S0
$ perf iostat 0000:16,0000:97 -C 1 -- ls
port Inbound Read(MB) Inbound Write(MB) Outbound Read(MB) Outbound
Write(MB) ../perf-iostat: line 12: 104418 Segmentation fault
(core dumped) perf stat --iostat$DELIMITER$*
The core-dump stack says, in the above corner case, the returned
(struct perf_counts_values *) count will be NULL, and the caller
iostat_print_metric() apparently doesn't not handle this case.
433 struct perf_counts_values *count = perf_counts(evsel->counts, die, 0);
434
435 if (count->run && count->ena) {
(gdb) p count
$1 = (struct perf_counts_values *) 0x0
The deeper reason is that there are actually no statistics from the user
specified pair "iostat 0000:X, -C (disconnected) Y ", but let's fix it with
minimum cost by adding a NULL check in the user space.
Fixes: f9ed693e8b ("perf stat: Enable iostat mode for x86 platforms")
Signed-off-by: Like Xu <likexu@tencent.com>
Cc: Alexander Antonov <alexander.antonov@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20210927081115.39568-2-likexu@tencent.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit e4fe5d7349 ]
An iostate use case like "perf iostat 0000:16,0000:97 -- ls" should be
implemented to work in system-wide mode to ensure that the output from
print_header() is consistent with the user documentation perf-iostat.txt,
rather than incorrectly assuming that the kernel does not support it:
Error:
The sys_perf_event_open() syscall returned with 22 (Invalid argument) \
for event (uncore_iio_0/event=0x83,umask=0x04,ch_mask=0xF,fc_mask=0x07/).
/bin/dmesg | grep -i perf may provide additional information.
This error is easily fixed by assigning system-wide mode by default
for IOSTAT_RUN only when the target cpu_list is unspecified.
Fixes: f07952b179 ("perf stat: Basic support for iostat in perf")
Signed-off-by: Like Xu <likexu@tencent.com>
Cc: Alexander Antonov <alexander.antonov@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20210927081115.39568-1-likexu@tencent.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit d46ef750ed ]
devm_add_action_or_reset() can suddenly invoke amd_mp2_pci_remove() at
registration that will cause NULL pointer dereference since
corresponding data is not initialized yet. The patch moves
initialization of data before devm_add_action_or_reset().
Found by Linux Driver Verification project (linuxtesting.org).
[jkosina@suse.cz: rebase]
Signed-off-by: Evgeny Novikov <novikov@ispras.ru>
Acked-by: Basavaraj Natikar <Basavaraj.Natikar@amd.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit adfc8f9d2f ]
SERIAL_CORE_CONSOLE depends on TTY so EARLY_PRINTK should also
depend on TTY so that it does not select SERIAL_CORE_CONSOLE
inadvertently.
WARNING: unmet direct dependencies detected for SERIAL_CORE_CONSOLE
Depends on [n]: TTY [=n] && HAS_IOMEM [=y]
Selected by [y]:
- EARLY_PRINTK [=y]
Fixes: e8bf5bc776 ("nios2: add early printk support")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Dinh Nguyen <dinguyen@kernel.org>
Signed-off-by: Dinh Nguyen <dinguyen@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 505d9dcb0f ]
There are three bugs in this code:
1) If we ccp_init_data() fails for &src then we need to free aad.
Use goto e_aad instead of goto e_ctx.
2) The label to free the &final_wa was named incorrectly as "e_tag" but
it should have been "e_final_wa". One error path leaked &final_wa.
3) The &tag was leaked on one error path. In that case, I added a free
before the goto because the resource was local to that block.
Fixes: 36cf515b9b ("crypto: ccp - Enable support for AES GCM on v5 CCPs")
Reported-by: "minihanshen(沈明航)" <minihanshen@tencent.com>
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: John Allen <john.allen@amd.com>
Tested-by: John Allen <john.allen@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit d2b59bd4b0 ]
Commit 0b9902c1fc ("s390/qeth: fix deadlock during recovery") removed
taking discipline_mutex inside qeth_do_reset(), fixing potential
deadlocks. An error path was missed though, that still takes
discipline_mutex and thus has the original deadlock potential.
Intermittent deadlocks were seen when a qeth channel path is configured
offline, causing a race between qeth_do_reset and ccwgroup_remove.
Call qeth_set_offline() directly in the qeth_do_reset() error case and
then a new variant of ccwgroup_set_offline(), without taking
discipline_mutex.
Fixes: b41b554c1e ("s390/qeth: fix locking for discipline setup / removal")
Signed-off-by: Alexandra Winter <wintera@linux.ibm.com>
Reviewed-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit ee909d0b1d ]
Problem: qeth_close_dev_handler is a worker that tries to acquire
card->discipline_mutex via drv->set_offline() in ccwgroup_set_offline().
Since commit b41b554c1e
("s390/qeth: fix locking for discipline setup / removal")
qeth_remove_discipline() is called under card->discipline_mutex and
cancels the work and waits for it to finish.
STOPLAN reception with reason code IPA_RC_VEPA_TO_VEB_TRANSITION is the
only situation that schedules close_dev_work. In that situation scheduling
qeth recovery will also result in an offline interface, when resetting the
isolation mode fails, if the external switch is still set to VEB.
And since commit 0b9902c1fc ("s390/qeth: fix deadlock during recovery")
qeth recovery does not aquire card->discipline_mutex anymore.
So we accept the longer pathlength of qeth_schedule_recovery in this
error situation and re-use the existing function.
As a side-benefit this changes the hwtrap to behave like during recovery
instead of like during a user-triggered set_offline.
Fixes: b41b554c1e ("s390/qeth: fix locking for discipline setup / removal")
Signed-off-by: Alexandra Winter <wintera@linux.ibm.com>
Acked-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 72a3c58d18 ]
Any link state change that's done prior to net device registration
isn't reflected on the state, thus the operational state is left
obsolete, with 'UNKNOWN' status.
To resolve the issue, query link state from FW upon open operations
to ensure operational state is updated.
Fixes: c27a02cd94 ("mlx4_en: Add driver for Mellanox ConnectX 10GbE NIC")
Signed-off-by: Lama Kayal <lkayal@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit d36a97736b ]
pmic_gpio_child_to_parent_hwirq() and
gpiochip_populate_parent_fwspec_fourcell() translate a pinctrl-
spmi-gpio irqspec to an SPMI controller irqspec. When they do
this, they use a fixed SPMI slave ID of 0 and a fixed GPIO
peripheral offset of 0xC0 (corresponding to SPMI address 0xC000).
This translation results in an incorrect irqspec for secondary
PMICs that don't have a slave ID of 0 as well as for PMIC chips
which have GPIO peripherals located at a base address other than
0xC000.
Correct this issue by passing the slave ID of the pinctrl-spmi-
gpio device's parent in the SPMI controller irqspec and by
calculating the peripheral ID base from the device tree 'reg'
property of the pinctrl-spmi-gpio device.
Signed-off-by: David Collins <collinsd@codeaurora.org>
Signed-off-by: satya priya <skakit@codeaurora.org>
Fixes: ca69e2d165 ("qcom: spmi-gpio: add support for hierarchical IRQ chip")
Reviewed-by: Stephen Boyd <swboyd@chromium.org>
Link: https://lore.kernel.org/r/1631798498-10864-2-git-send-email-skakit@codeaurora.org
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 96fafe7c65 ]
The kernel test robot flagged an warning for ".../efc_device.c:932:6:
warning: cast to smaller integer type 'enum efc_nport_topology' from 'void
*'"
For the topology events, the "arg" field is generically defined as a void *
and is used to pass different arguments. Most of the arguments are pointers
to data structures. But for the EFC_EVT_NPORT_TOPOLOGY_NOTIFY event, the
argument is an enum value, and the code is typecasting the void * to an
enum generating the warning.
Fix by converting the EFC_EVT_NPORT_TOPOLOGY_NOTIFY event to pass a pointer
to the enum, thus it's a straight-forward pointer dereference in the event
handler.
Link: https://lore.kernel.org/r/20210830231050.5951-1-jsmart2021@gmail.com
Fixes: 202bfdffae ("scsi: elx: libefc: FC node ELS and state handling")
Reported-by: kernel test robot <lkp@intel.com>
Co-developed-by: Ram Vegesna <ram.vegesna@broadcom.com>
Signed-off-by: Ram Vegesna <ram.vegesna@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit c590fa80b3 ]
There is no defer probe when adding platform component to
snd_soc_pcm_runtime(rtd), the code is in snd_soc_add_pcm_runtime()
snd_soc_register_card()
-> snd_soc_bind_card()
-> snd_soc_add_pcm_runtime()
-> adding cpu dai
-> adding codec dai
-> adding platform component.
So if the platform component is not ready at that time, then the
sound card still registered successfully, but platform component
is empty, the sound card can't be used.
As there is defer probe checking for cpu dai component, then register
platform component before cpu dai to avoid such issue.
Fixes: 2856448686 ("ASoC: fsl_xcvr: Add XCVR ASoC CPU DAI driver")
Signed-off-by: Shengjiu Wang <shengjiu.wang@nxp.com>
Link: https://lore.kernel.org/r/1630665006-31437-6-git-send-email-shengjiu.wang@nxp.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit ee8ccc2eb5 ]
There is no defer probe when adding platform component to
snd_soc_pcm_runtime(rtd), the code is in snd_soc_add_pcm_runtime()
snd_soc_register_card()
-> snd_soc_bind_card()
-> snd_soc_add_pcm_runtime()
-> adding cpu dai
-> adding codec dai
-> adding platform component.
So if the platform component is not ready at that time, then the
sound card still registered successfully, but platform component
is empty, the sound card can't be used.
As there is defer probe checking for cpu dai component, then register
platform component before cpu dai to avoid such issue.
Fixes: a2388a498a ("ASoC: fsl: Add S/PDIF CPU DAI driver")
Signed-off-by: Shengjiu Wang <shengjiu.wang@nxp.com>
Link: https://lore.kernel.org/r/1630665006-31437-5-git-send-email-shengjiu.wang@nxp.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 0adf292069 ]
There is no defer probe when adding platform component to
snd_soc_pcm_runtime(rtd), the code is in snd_soc_add_pcm_runtime()
snd_soc_register_card()
-> snd_soc_bind_card()
-> snd_soc_add_pcm_runtime()
-> adding cpu dai
-> adding codec dai
-> adding platform component.
So if the platform component is not ready at that time, then the
sound card still registered successfully, but platform component
is empty, the sound card can't be used.
As there is defer probe checking for cpu dai component, then register
platform component before cpu dai to avoid such issue.
Fixes: 47a70e6fc9 ("ASoC: Add MICFIL SoC Digital Audio Interface driver.")
Signed-off-by: Shengjiu Wang <shengjiu.wang@nxp.com>
Link: https://lore.kernel.org/r/1630665006-31437-4-git-send-email-shengjiu.wang@nxp.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f12ce92e98 ]
There is no defer probe when adding platform component to
snd_soc_pcm_runtime(rtd), the code is in snd_soc_add_pcm_runtime()
snd_soc_register_card()
-> snd_soc_bind_card()
-> snd_soc_add_pcm_runtime()
-> adding cpu dai
-> adding codec dai
-> adding platform component.
So if the platform component is not ready at that time, then the
sound card still registered successfully, but platform component
is empty, the sound card can't be used.
As there is defer probe checking for cpu dai component, then register
platform component before cpu dai to avoid such issue.
Fixes: 43d24e76b6 ("ASoC: fsl_esai: Add ESAI CPU DAI driver")
Signed-off-by: Shengjiu Wang <shengjiu.wang@nxp.com>
Link: https://lore.kernel.org/r/1630665006-31437-3-git-send-email-shengjiu.wang@nxp.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 9c3ad33b5a ]
There is no defer probe when adding platform component to
snd_soc_pcm_runtime(rtd), the code is in snd_soc_add_pcm_runtime()
snd_soc_register_card()
-> snd_soc_bind_card()
-> snd_soc_add_pcm_runtime()
-> adding cpu dai
-> adding codec dai
-> adding platform component.
So if the platform component is not ready at that time, then the
sound card still registered successfully, but platform component
is empty, the sound card can't be used.
As there is defer probe checking for cpu dai component, then register
platform component before cpu dai to avoid such issue.
Fixes: 4355082149 ("ASoC: Add SAI SoC Digital Audio Interface driver")
Signed-off-by: Shengjiu Wang <shengjiu.wang@nxp.com>
Link: https://lore.kernel.org/r/1630665006-31437-2-git-send-email-shengjiu.wang@nxp.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 3ad02c27d8 ]
The use of a macro named 'RST' conflicts with one of the same name
in arch/mips/include/asm/mach-rc32434/rb.h. This causes build
warnings on some MIPS builds.
Change the names of the JPEG marker constants to be in their own
namespace to fix these build warnings and to prevent other similar
problems in the future.
Fixes these build warnings:
In file included from ../drivers/media/platform/s5p-jpeg/jpeg-hw-exynos3250.c:14:
../drivers/media/platform/s5p-jpeg/jpeg-core.h:43: warning: "RST" redefined
43 | #define RST 0xd0
|
../arch/mips/include/asm/mach-rc32434/rb.h:13: note: this is the location of the previous definition
13 | #define RST (1 << 15)
In file included from ../drivers/media/platform/s5p-jpeg/jpeg-hw-s5p.c:13:
../drivers/media/platform/s5p-jpeg/jpeg-core.h:43: warning: "RST" redefined
43 | #define RST 0xd0
../arch/mips/include/asm/mach-rc32434/rb.h:13: note: this is the location of the previous definition
13 | #define RST (1 << 15)
In file included from ../drivers/media/platform/s5p-jpeg/jpeg-hw-exynos4.c:12:
../drivers/media/platform/s5p-jpeg/jpeg-core.h:43: warning: "RST" redefined
43 | #define RST 0xd0
../arch/mips/include/asm/mach-rc32434/rb.h:13: note: this is the location of the previous definition
13 | #define RST (1 << 15)
In file included from ../drivers/media/platform/s5p-jpeg/jpeg-core.c:31:
../drivers/media/platform/s5p-jpeg/jpeg-core.h:43: warning: "RST" redefined
43 | #define RST 0xd0
../arch/mips/include/asm/mach-rc32434/rb.h:13: note: this is the location of the previous definition
13 | #define RST (1 << 15)
Also update the kernel-doc so that the word "marker" is not
repeated.
Link: https://lore.kernel.org/linux-media/20210907044022.30602-1-rdunlap@infradead.org
Fixes: bb677f3ac4 ("[media] Exynos4 JPEG codec v4l2 driver")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reported-by: kernel test robot <lkp@intel.com>
Cc: Andrzej Pietrasiewicz <andrzejtp2010@gmail.com>
Cc: Jacek Anaszewski <jacek.anaszewski@gmail.com>
Cc: Sylwester Nawrocki <s.nawrocki@samsung.com>
Cc: linux-arm-kernel@lists.infradead.org
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 132c88614f ]
Tiled formats requires full rows being allocated (even for Chroma
planes). When the number of Luma tiles is odd, we need to round up
to twice the tile width in order to roundup the number of Chroma
tiles.
This was notice with a crash running BA1_FT_C compliance test using
sunxi tiles using GStreamer. Cedrus driver would allocate 9 rows for
Luma, but only 4.5 rows for Chroma, causing userspace to crash.
Signed-off-by: Nicolas Dufresne <nicolas.dufresne@collabora.com>
Fixes: 50e761516f ("media: platform: Add Cedrus VPU decoder driver")
Reviewed-by: Jernej Skrabec <jernej.skrabec@gmail.com>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit f0e8a206a2 upstream.
For Isochronous endpoints, the SS companion descriptor's
wBytesPerInterval field is required to reserve bus time in order
to transmit the required payload during the service interval.
If left at 0, the UAC2 function is unable to transact data on its
playback or capture endpoints in SuperSpeed mode.
Since f_uac2 currently does not support any bursting this value can
be exactly equal to the calculated wMaxPacketSize.
Tested with Windows 10 as a host.
Fixes: f8cb3d556b ("usb: f_uac2: adds support for SS and SSP")
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Jack Pham <jackp@codeaurora.org>
Link: https://lore.kernel.org/r/20210909174811.12534-3-jackp@codeaurora.org
[jackp: Backport to 5.14 with minor conflict resolution]
Signed-off-by: Jack Pham <jackp@codeaurora.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 595091a142 upstream.
The f_uac2 function fails to enumerate when connected in SuperSpeed
due to the feedback endpoint missing the companion descriptor.
Add a new ss_epin_fback_desc_comp descriptor and append it behind the
ss_epin_fback_desc both in the static definition of the ss_audio_desc
structure as well as its dynamic construction in setup_headers().
Fixes: 24f779dac8 ("usb: gadget: f_uac2/u_audio: add feedback endpoint support")
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Jack Pham <jackp@codeaurora.org>
Link: https://lore.kernel.org/r/20210909174811.12534-2-jackp@codeaurora.org
[jackp: Backport to 5.14 with minor conflict resolution]
Signed-off-by: Jack Pham <jackp@codeaurora.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0bd46e22c5 upstream.
This was intended to limit the number of characters printed from
"subsys->serial" to NVMET_SN_MAX_SIZE. But accidentally the width
specifier was used instead of the precision specifier so it only
affects the alignment and not the number of characters printed.
Fixes: f04064814c ("nvmet: fixup buffer overrun in nvmet_subsys_attr_serial()")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d5f6545934 upstream.
In commit b7213ffa0e ("qnx4: avoid stringop-overread errors") I tried
to teach gcc about how the directory entry structure can be two
different things depending on a status flag. It made the code clearer,
and it seemed to make gcc happy.
However, Arnd points to a gcc bug, where despite using two different
members of a union, gcc then gets confused, and uses the size of one of
the members to decide if a string overrun happens. And not necessarily
the rigth one.
End result: with some configurations, gcc-11 will still complain about
the source buffer size being overread:
fs/qnx4/dir.c: In function 'qnx4_readdir':
fs/qnx4/dir.c:76:32: error: 'strnlen' specified bound [16, 48] exceeds source size 1 [-Werror=stringop-overread]
76 | size = strnlen(name, size);
| ^~~~~~~~~~~~~~~~~~~
fs/qnx4/dir.c:26:22: note: source object declared here
26 | char de_name;
| ^~~~~~~
because gcc will get confused about which union member entry is actually
getting accessed, even when the source code is very clear about it. Gcc
internally will have combined two "redundant" pointers (pointing to
different union elements that are at the same offset), and takes the
size checking from one or the other - not necessarily the right one.
This is clearly a gcc bug, but we can work around it fairly easily. The
biggest thing here is the big honking comment about why we do what we
do.
Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99578#c6
Reported-and-tested-by: Arnd Bergmann <arnd@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c32dfec6c1 upstream.
Some CP2102 do not support event-insertion mode but return no error when
attempting to enable it.
This means that any event escape characters in the input stream will not
be escaped by the device and consequently regular data may be
interpreted as escape sequences and be removed from the stream by the
driver.
The reporter's device has batch number DCL00X etched into it and as
discovered by the SHA2017 Badge team, counterfeit devices with that
marking can be detected by sending malformed vendor requests. [1][2]
Tests confirm that the possibly counterfeit CP2102 returns a single byte
in response to a malformed two-byte part-number request, while an
original CP2102 returns two bytes. Assume that every CP2102 that behaves
this way also does not support event-insertion mode (e.g. cannot report
parity errors).
[1] https://mobile.twitter.com/sha2017badge/status/1167902087289532418
[2] https://hackaday.com/2017/08/14/hands-on-with-the-shacamp-2017-badge/#comment-3903376
Reported-by: Malte Di Donato <malte@neo-soft.org>
Tested-by: Malte Di Donato <malte@neo-soft.org>
Fixes: a7207e9835 ("USB: serial: cp210x: add support for line-status events")
Cc: stable@vger.kernel.org # 5.9
Link: https://lore.kernel.org/r/20210922113100.20888-1-johan@kernel.org
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8c8a3b5bd9 upstream.
This lets us avoid doing unnecessary work on hardware that does not
support MTE, and will allow us to freely use MTE instructions in the
code called by mte_thread_switch().
Since this would mean that we do a redundant check in
mte_check_tfsr_el1(), remove it and add two checks now required in its
callers. This also avoids an unnecessary DSB+ISB sequence on the syscall
exit path for hardware not supporting MTE.
Fixes: 65812c6921 ("arm64: mte: Enable async tag check fault")
Cc: <stable@vger.kernel.org> # 5.13.x
Signed-off-by: Peter Collingbourne <pcc@google.com>
Link: https://linux-review.googlesource.com/id/I02fd000d1ef2c86c7d2952a7f099b254ec227a5d
Link: https://lore.kernel.org/r/20210915190336.398390-1-pcc@google.com
[catalin.marinas@arm.com: adjust the commit log slightly]
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8b4bd25667 upstream.
After upgrading to Linux 5.13.3 I noticed my laptop would shutdown due
to overheat (when it should not). It turned out this was due to commit
fe6a6de669 ("thermal/drivers/int340x/processor_thermal: Fix tcc setting").
What happens is this drivers uses a global variable to keep track of the
tcc offset (tcc_offset_save) and uses it on resume. The issue is this
variable is initialized to 0, but is only set in
tcc_offset_degree_celsius_store, i.e. when the tcc offset is explicitly
set by userspace. If that does not happen, the resume path will set the
offset to 0 (in my case the h/w default being 3, the offset would become
too low after a suspend/resume cycle).
The issue did not arise before commit fe6a6de669, as the function
setting the offset would return if the offset was 0. This is no longer
the case (rightfully).
Fix this by not applying the offset if it wasn't saved before, reverting
back to the old logic. A better approach will come later, but this will
be easier to apply to stable kernels.
The logic to restore the offset after a resume was there long before
commit fe6a6de669, but as a value of 0 was considered invalid I'm
referencing the commit that made the issue possible in the Fixes tag
instead.
Fixes: fe6a6de669 ("thermal/drivers/int340x/processor_thermal: Fix tcc setting")
Cc: stable@vger.kernel.org
Cc: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Antoine Tenart <atenart@kernel.org>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Reviewed-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Tested-by: Srinivas Pandruvada <srinivas.pI andruvada@linux.intel.com>
Link: https://lore.kernel.org/r/20210909085613.5577-2-atenart@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8aa83e6395 upstream.
Commit in Fixes introduced early_reserve_memory() to do all needed
initial memblock_reserve() calls in one function. Unfortunately, the call
of early_reserve_memory() is done too late for Xen dom0, as in some
cases a Xen hook called by e820__memory_setup() will need those memory
reservations to have happened already.
Move the call of early_reserve_memory() before the call of
e820__memory_setup() in order to avoid such problems.
Fixes: a799c2bd29 ("x86/setup: Consolidate early memory reservations")
Reported-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Tested-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20210920120421.29276-1-jgross@suse.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit aba5daeb64 ]
FD uses xyarray__entry that may return NULL if an index is out of
bounds. If NULL is returned then a segv happens as FD unconditionally
dereferences the pointer. This was happening in a case of with perf
iostat as shown below. The fix is to make FD an "int*" rather than an
int and handle the NULL case as either invalid input or a closed fd.
$ sudo gdb --args perf stat --iostat list
...
Breakpoint 1, perf_evsel__alloc_fd (evsel=0x5555560951a0, ncpus=1, nthreads=1) at evsel.c:50
50 {
(gdb) bt
#0 perf_evsel__alloc_fd (evsel=0x5555560951a0, ncpus=1, nthreads=1) at evsel.c:50
#1 0x000055555585c188 in evsel__open_cpu (evsel=0x5555560951a0, cpus=0x555556093410,
threads=0x555556086fb0, start_cpu=0, end_cpu=1) at util/evsel.c:1792
#2 0x000055555585cfb2 in evsel__open (evsel=0x5555560951a0, cpus=0x0, threads=0x555556086fb0)
at util/evsel.c:2045
#3 0x000055555585d0db in evsel__open_per_thread (evsel=0x5555560951a0, threads=0x555556086fb0)
at util/evsel.c:2065
#4 0x00005555558ece64 in create_perf_stat_counter (evsel=0x5555560951a0,
config=0x555555c34700 <stat_config>, target=0x555555c2f1c0 <target>, cpu=0) at util/stat.c:590
#5 0x000055555578e927 in __run_perf_stat (argc=1, argv=0x7fffffffe4a0, run_idx=0)
at builtin-stat.c:833
#6 0x000055555578f3c6 in run_perf_stat (argc=1, argv=0x7fffffffe4a0, run_idx=0)
at builtin-stat.c:1048
#7 0x0000555555792ee5 in cmd_stat (argc=1, argv=0x7fffffffe4a0) at builtin-stat.c:2534
#8 0x0000555555835ed3 in run_builtin (p=0x555555c3f540 <commands+288>, argc=3,
argv=0x7fffffffe4a0) at perf.c:313
#9 0x0000555555836154 in handle_internal_command (argc=3, argv=0x7fffffffe4a0) at perf.c:365
#10 0x000055555583629f in run_argv (argcp=0x7fffffffe2ec, argv=0x7fffffffe2e0) at perf.c:409
#11 0x0000555555836692 in main (argc=3, argv=0x7fffffffe4a0) at perf.c:539
...
(gdb) c
Continuing.
Error:
The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (uncore_iio_0/event=0x83,umask=0x04,ch_mask=0xF,fc_mask=0x07/).
/bin/dmesg | grep -i perf may provide additional information.
Program received signal SIGSEGV, Segmentation fault.
0x00005555559b03ea in perf_evsel__close_fd_cpu (evsel=0x5555560951a0, cpu=1) at evsel.c:166
166 if (FD(evsel, cpu, thread) >= 0)
v3. fixes a bug in perf_evsel__run_ioctl where the sense of a branch was
backward.
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20210918054440.2350466-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit efafec27c5 ]
Without CONFIG_PM enabled, the SET_RUNTIME_PM_OPS() macro ends up being
empty, and the only use of tegra_slink_runtime_{resume,suspend} goes
away, resulting in
drivers/spi/spi-tegra20-slink.c:1200:12: error: ‘tegra_slink_runtime_resume’ defined but not used [-Werror=unused-function]
1200 | static int tegra_slink_runtime_resume(struct device *dev)
| ^~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/spi/spi-tegra20-slink.c:1188:12: error: ‘tegra_slink_runtime_suspend’ defined but not used [-Werror=unused-function]
1188 | static int tegra_slink_runtime_suspend(struct device *dev)
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~
mark the functions __maybe_unused to make the build happy.
This hits the alpha allmodconfig build (and others).
Reported-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 3c0d2a46c0 ]
tx timeout and slot time are currently specified in units of HZ. On
Alpha, HZ is defined as 1024. When building alpha:allmodconfig, this
results in the following error message.
drivers/net/hamradio/6pack.c: In function 'sixpack_open':
drivers/net/hamradio/6pack.c:71:41: error:
unsigned conversion from 'int' to 'unsigned char'
changes value from '256' to '0'
In the 6PACK protocol, tx timeout is specified in units of 10 ms and
transmitted over the wire:
https://www.linux-ax25.org/wiki/6PACK
Defining a value dependent on HZ doesn't really make sense, and
presumably comes from the (very historical) situation where HZ was
originally 100.
Note that the SIXP_SLOTTIME use explicitly is about 10ms granularity:
mod_timer(&sp->tx_t, jiffies + ((when + 1) * HZ) / 100);
and the SIXP_TXDELAY walue is sent as a byte over the wire.
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 35a3f4ef0a ]
Some drivers pass a pointer to volatile data to virt_to_bus() and
virt_to_phys(), and that works fine. One exception is alpha. This
results in a number of compile errors such as
drivers/net/wan/lmc/lmc_main.c: In function 'lmc_softreset':
drivers/net/wan/lmc/lmc_main.c:1782:50: error:
passing argument 1 of 'virt_to_bus' discards 'volatile'
qualifier from pointer target type
drivers/atm/ambassador.c: In function 'do_loader_command':
drivers/atm/ambassador.c:1747:58: error:
passing argument 1 of 'virt_to_bus' discards 'volatile'
qualifier from pointer target type
Declare the parameter of virt_to_phys and virt_to_bus as pointer to
volatile to fix the problem.
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit cd51a57eb5 ]
This patch allows panel orientation quirks from DRM core to be
used. They attach a DRM connector property "panel orientation"
which indicates in which direction the panel has been mounted.
Some machines have the internal screen mounted with a rotation.
Since the panel orientation quirks need the native mode from the
EDID, check for it in amdgpu_dm_connector_ddc_get_modes.
Signed-off-by: Simon Ser <contact@emersion.fr>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Harry Wentland <hwentlan@amd.com>
Cc: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 71ae30997a ]
[Why]
If link training is aborted, it shall be retried if sink is present.
[How]
Check hpd status to find out whether sink is present or not. If sink is
present, then link training shall be tried again with same settings.
Otherwise, link training shall be aborted.
Reviewed-by: Jimmy Kizito <Jimmy.Kizito@amd.com>
Acked-by: Mikita Lipski <mikita.lipski@amd.com>
Signed-off-by: Meenakshikumar Somasundaram <meenakshikumar.somasundaram@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 4e00a434a0 ]
[Why]
Intermittently, there presents two occurrences of 0 stream
commits in a single HPD event. Current HDCP sequence does
not consider such scenerio, and will thus disable HDCP.
[How]
Add condition check to include stream remove and re-enable
case for HDCP enable.
Reviewed-by: Bhawanpreet Lakha <bhawanpreet.lakha@amd.com>
Acked-by: Mikita Lipski <mikita.lipski@amd.com>
Signed-off-by: Qingqing Zhuo <qingqing.zhuo@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit fb932dfeb8 ]
On some GPUs the PCIe atomic requirement for KFD depends on the MEC
firmware version. Add a firmware version check for this. The minimum
firmware version that works without atomics can be updated in the
device_info structure for each GPU type.
Move PCIe atomic detection from kgd2kfd_probe into kgd2kfd_device_init
because the MEC firmware is not loaded yet at the probe stage.
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b7213ffa0e ]
The qnx4 directory entries are 64-byte blocks that have different
contents depending on the a status byte that is in the last byte of the
block.
In particular, a directory entry can be either a "link info" entry with
a 48-byte name and pointers to the real inode information, or an "inode
entry" with a smaller 16-byte name and the full inode information.
But the code was written to always just treat the directory name as if
it was part of that "inode entry", and just extend the name to the
longer case if the status byte said it was a link entry.
That work just fine and gives the right results, but now that gcc is
tracking data structure accesses much more, the code can trigger a
compiler error about using up to 48 bytes (the long name) in a structure
that only has that shorter name in it:
fs/qnx4/dir.c: In function ‘qnx4_readdir’:
fs/qnx4/dir.c:51:32: error: ‘strnlen’ specified bound 48 exceeds source size 16 [-Werror=stringop-overread]
51 | size = strnlen(de->di_fname, size);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from fs/qnx4/qnx4.h:3,
from fs/qnx4/dir.c:16:
include/uapi/linux/qnx4_fs.h:45:25: note: source object declared here
45 | char di_fname[QNX4_SHORT_NAME_MAX];
| ^~~~~~~~
which is because the source code doesn't really make this whole "one of
two different types" explicit.
Fix this by introducing a very explicit union of the two types, and
basically explaining to the compiler what is really going on.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit fc7c028dcd ]
The sparc mdesc code does pointer games with 'struct mdesc_hdr', but
didn't describe to the compiler how that header is then followed by the
data that the header describes.
As a result, gcc is now unhappy since it does stricter pointer range
tracking, and doesn't understand about how these things work. This
results in various errors like:
arch/sparc/kernel/mdesc.c: In function ‘mdesc_node_by_name’:
arch/sparc/kernel/mdesc.c:647:22: error: ‘strcmp’ reading 1 or more bytes from a region of size 0 [-Werror=stringop-overread]
647 | if (!strcmp(names + ep[ret].name_offset, name))
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
which are easily avoided by just describing 'struct mdesc_hdr' better,
and making the node_block() helper function look into that unsized
data[] that follows the header.
This makes the sparc64 build happy again at least for my cross-compiler
version (gcc version 11.2.1).
Link: https://lore.kernel.org/lkml/CAHk-=wi4NW3NC0xWykkw=6LnjQD6D_rtRtxY9g8gQAJXtQMi8A@mail.gmail.com/
Cc: Guenter Roeck <linux@roeck-us.net>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit dff2d13114 ]
gcc 11.x reports the following compiler warning/error.
drivers/net/ethernet/i825xx/82596.c: In function 'i82596_probe':
arch/m68k/include/asm/string.h:72:25: error:
'__builtin_memcpy' reading 6 bytes from a region of size 0 [-Werror=stringop-overread]
Use absolute_pointer() to work around the problem.
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f6b5f1a569 ]
absolute_pointer() disassociates a pointer from its originating symbol
type and context. Use it to prevent compiler warnings/errors such as
drivers/net/ethernet/i825xx/82596.c: In function 'i82596_probe':
arch/m68k/include/asm/string.h:72:25: error:
'__builtin_memcpy' reading 6 bytes from a region of size 0 [-Werror=stringop-overread]
Such warnings may be reported by gcc 11.x for string and memory
operations on fixed addresses.
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 3df49967f6 ]
When the integrity profile is unregistered there can still be integrity
reads queued up which could see a NULL verify_fn as shown by the race
window below:
CPU0 CPU1
process_one_work nvme_validate_ns
bio_integrity_verify_fn nvme_update_ns_info
nvme_update_disk_info
blk_integrity_unregister
---set queue->integrity as 0
bio_integrity_process
--access bi->profile->verify_fn(bi is a pointer of queue->integity)
Before calling blk_integrity_unregister in nvme_update_disk_info, we must
make sure that there is no work item in the kintegrityd_wq. Just call
blk_flush_integrity to flush the work queue so the bug can be resolved.
Signed-off-by: Lihong Kou <koulihong@huawei.com>
[hch: split up and shortened the changelog]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Link: https://lore.kernel.org/r/20210914070657.87677-3-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 7bbee36d71 ]
In amdgpu_dm_atomic_check, dc_validate_global_state is called. On
failure this logs a warning to the kernel journal. However warnings
shouldn't be used for atomic test-only commit failures: user-space
might be perfoming a lot of atomic test-only commits to find the
best hardware configuration.
Downgrade the log to a regular DRM atomic message. While at it, use
the new device-aware logging infrastructure.
This fixes error messages in the kernel when running gamescope [1].
[1]: https://github.com/Plagman/gamescope/issues/245
Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Simon Ser <contact@emersion.fr>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Harry Wentland <hwentlan@amd.com>
Cc: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 59583f7476 ]
Commit 53b7670e57 ("sparc: factor the dma coherent mapping into
helper") lost the page align for the calls to dma_make_coherent and
srmmu_unmapiorange. The latter cannot handle a non page aligned len
argument.
Signed-off-by: Andreas Larsson <andreas@gaisler.com>
Reviewed-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 9817d763db ]
We should always destroy cm_id before destroy qp to avoid to get cma
event after qp was destroyed, which may lead to use after free.
In RDMA connection establishment error flow, don't destroy qp in cm
event handler.Just report cm_error to upper level, qp will be destroy
in nvme_rdma_alloc_queue() after destroy cm id.
Signed-off-by: Ruozhu Li <liruozhu@huawei.com>
Reviewed-by: Max Gurtovoy <mgurtovoy@nvidia.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 79f528afa9 ]
nvme_update_ana_state() has a deficiency that results in a failure to
properly update the ana state for a namespace in the following case:
NSIDs in ctrl->namespaces: 1, 3, 4
NSIDs in desc->nsids: 1, 2, 3, 4
Loop iteration 0:
ns index = 0, n = 0, ns->head->ns_id = 1, nsid = 1, MATCH.
Loop iteration 1:
ns index = 1, n = 1, ns->head->ns_id = 3, nsid = 2, NO MATCH.
Loop iteration 2:
ns index = 2, n = 2, ns->head->ns_id = 4, nsid = 4, MATCH.
Where the update to the ANA state of NSID 3 is missed. To fix this
increment n and retry the update with the same ns when ns->head->ns_id is
higher than nsid,
Signed-off-by: Anton Eidelman <anton@lightbitslabs.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 8480ed9c2b ]
Today the Xen ballooning is done via delayed work in a workqueue. This
might result in workqueue hangups being reported in case of large
amounts of memory are being ballooned in one go (here 16GB):
BUG: workqueue lockup - pool cpus=6 node=0 flags=0x0 nice=0 stuck for 64s!
Showing busy workqueues and worker pools:
workqueue events: flags=0x0
pwq 12: cpus=6 node=0 flags=0x0 nice=0 active=2/256 refcnt=3
in-flight: 229:balloon_process
pending: cache_reap
workqueue events_freezable_power_: flags=0x84
pwq 12: cpus=6 node=0 flags=0x0 nice=0 active=1/256 refcnt=2
pending: disk_events_workfn
workqueue mm_percpu_wq: flags=0x8
pwq 12: cpus=6 node=0 flags=0x0 nice=0 active=1/256 refcnt=2
pending: vmstat_update
pool 12: cpus=6 node=0 flags=0x0 nice=0 hung=64s workers=3 idle: 2222 43
This can easily be avoided by using a dedicated kernel thread for doing
the ballooning work.
Reported-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Link: https://lore.kernel.org/r/20210827123206.15429-1-jgross@suse.com
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit d9a7e9df73 ]
If HWP has been already been enabled by BIOS, it may be
necessary to override some kernel command line parameters.
Once it has been enabled it requires a reset to be disabled.
Suggested-by: Rafael J. Wysocki <rafael@kernel.org>
Signed-off-by: Doug Smythies <dsmythies@telus.net>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 510e1a724a ]
For some drivers, that use the DMA API. This error message can be reached
several millions of times per second, causing spam to the kernel's printk
buffer and bringing the CPU usage up to 100% (so, it should be rate
limited). However, since there is at least one driver that is in the
mainline and suffers from the error condition, it is more useful to
err_printk() here instead of just rate limiting the error message (in hopes
that it will make it easier for other drivers that suffer from this issue
to be spotted).
Link: https://lkml.kernel.org/r/fd67fbac-64bf-f0ea-01e1-5938ccfab9d0@arm.com
Reported-by: Jeremy Linton <jeremy.linton@arm.com>
Signed-off-by: Hamza Mahfooz <someguy@effective-light.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b1a89856fb ]
m68k builds fail widely with errors such as
arch/m68k/include/asm/raw_io.h:20:19: error:
cast to pointer from integer of different size
arch/m68k/include/asm/raw_io.h:30:32: error:
cast to pointer from integer of different size [-Werror=int-to-p
On m68k, io functions are defined as macros. The problem is seen if the
macro parameter variable size differs from the size of a pointer. Cast
the parameter of all io macros to unsigned long before casting it to
a pointer to fix the problem.
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Link: https://lore.kernel.org/r/20210907060729.2391992-1-linux@roeck-us.net
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 67f3b2f822 ]
blk-mq can't run allocating driver tag and updating ->rqs[tag]
atomically, meantime blk-mq doesn't clear ->rqs[tag] after the driver
tag is released.
So there is chance to iterating over one stale request just after the
tag is allocated and before updating ->rqs[tag].
scsi_host_busy_iter() calls scsi_host_check_in_flight() to count scsi
in-flight requests after scsi host is blocked, so no new scsi command can
be marked as SCMD_STATE_INFLIGHT. However, driver tag allocation still can
be run by blk-mq core. One request is marked as SCMD_STATE_INFLIGHT,
but this request may have been kept in another slot of ->rqs[], meantime
the slot can be allocated out but ->rqs[] isn't updated yet. Then this
in-flight request is counted twice as SCMD_STATE_INFLIGHT. This way causes
trouble in handling scsi error.
Fixes the issue by not iterating over stale request.
Cc: linux-scsi@vger.kernel.org
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Reported-by: luojiaxing <luojiaxing@huawei.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20210906065003.439019-1-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 08dad2f4d5 ]
The Synopsys Ethernet IP uses the CSR clock as a base clock for MDC.
The divisor used is set in the MAC_MDIO_Address register field CR
(Clock Rate)
The divisor is there to change the CSR clock into a clock that falls
below the IEEE 802.3 specified max frequency of 2.5MHz.
If the CSR clock is 300MHz, the code falls back to using the reset
value in the MAC_MDIO_Address register, as described in the comment
above this code.
However, 300MHz is actually an allowed value and the proper divider
can be estimated quite easily (it's just 1Hz difference!)
A CSR frequency of 300MHz with the maximum clock rate value of 0x5
(STMMAC_CSR_250_300M, a divisor of 124) gives somewhere around
~2.42MHz which is below the IEEE 802.3 specified maximum.
For the ARTPEC-8 SoC, the CSR clock is this problematic 300MHz,
and unfortunately, the reset-value of the MAC_MDIO_Address CR field
is 0x0.
This leads to a clock rate of zero and a divisor of 42, and gives an
MDC frequency of ~7.14MHz.
Allow CSR clock of 300MHz by making the comparison inclusive.
Signed-off-by: Jesper Nilsson <jesper.nilsson@axis.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit d82d5303c4 ]
plat_dev->dev->platform_data is released by platform_device_unregister(),
use of pclk and hclk is a use-after-free. Since device unregister won't
need a clk device we adjust the function call sequence to fix this issue.
[ 31.261225] BUG: KASAN: use-after-free in macb_remove+0x77/0xc6 [macb_pci]
[ 31.275563] Freed by task 306:
[ 30.276782] platform_device_release+0x25/0x80
Suggested-by: Nicolas Ferre <Nicolas.Ferre@microchip.com>
Signed-off-by: Tong Zhang <ztong0001@gmail.com>
Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit ea269a6f72 ]
Currently changes to the advertising state via ethtool do not cause any
reselection of the configured interface mode after the SFP is already
inserted and initially configured.
While it is not typical to change the advertised link modes for an
interface using an SFP in certain use cases it is desirable. In the case
of a SFP port that is capable of handling both SFP and SFP+ modules it
will automatically select between 1G and 10G modes depending on the
supported mode of the SFP. However if the SFP module is capable of
working in multiple modes (e.g. a SFP+ DAC that can operate at 1G or
10G), one end of the cable may be attached to a SFP 1000base-x port thus
the SFP+ end must be manually configured to the 1000base-x mode in order
for the link to be established.
This change causes the ethtool setting of advertised mode changes to
reselect the interface mode so that the link can be established.
Additionally when a module is inserted the advertising mode is reset to
match the supported modes of the module.
Signed-off-by: Nathan Rossi <nathan.rossi@digi.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit cdb31c29d3 ]
There's no reason to punt it unconditionally, we just need to ensure that
the submit lock grabbing is conditional.
Fixes: 05f3fb3c53 ("io_uring: avoid ring quiesce for fixed file set unregister and update")
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 9990da93d2 ]
For each provided buffer, we allocate a struct io_buffer to hold the
data associated with it. As a large number of buffers can be provided,
account that data with memcg.
Fixes: ddf0322db7 ("io_uring: add IORING_OP_PROVIDE_BUFFERS")
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit bd99c71bd1 ]
If poll arming and poll completion runs in parallel, there maybe races.
For instance, run io_poll_add in iowq and io_poll_task_func in original
context, then:
iowq original context
io_poll_add
vfs_poll
(interruption happens
tw queued to original
context) io_poll_task_func
generate cqe
del from cancel_hash[]
if !poll.done
insert to cancel_hash[]
The entry left in cancel_hash[], similar case for fast poll.
Fix it by set poll.done = true when del from cancel_hash[].
Fixes: 5082620fb2 ("io_uring: terminate multishot poll for CQ ring overflow")
Signed-off-by: Hao Xu <haoxu@linux.alibaba.com>
Link: https://lore.kernel.org/r/20210922101238.7177-2-haoxu@linux.alibaba.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 7df835a32a ]
Commit b0140891a8 ("md: Fix race when creating a new md device.")
not only moved assigning mddev->gendisk before calling add_disk, which
fixes the races described in the commit log, but also added a
mddev->open_mutex critical section over add_disk and creation of the
md kobj. Adding a kobject after add_disk is racy vs deleting the gendisk
right after adding it, but md already prevents against that by holding
a mddev->active reference.
On the other hand taking this lock added a lock order reversal with what
is not disk->open_mutex (used to be bdev->bd_mutex when the commit was
added) for partition devices, which need that lock for the internal open
for the partition scan, and a recent commit also takes it for
non-partitioned devices, leading to further lockdep splatter.
Fixes: b0140891a8 ("md: Fix race when creating a new md device.")
Fixes: d626338735 ("block: support delayed holder registration")
Reported-by: syzbot+fadc0aaf497e6a493b9f@syzkaller.appspotmail.com
Signed-off-by: Christoph Hellwig <hch@lst.de>
Tested-by: syzbot+fadc0aaf497e6a493b9f@syzkaller.appspotmail.com
Reviewed-by: NeilBrown <neilb@suse.de>
Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit e946d3c887 ]
The problem is the mismatched types between "ctx->total_len" which is
an unsigned int, "rc" which is an int, and "ctx->rc" which is a
ssize_t. The code does:
ctx->rc = (rc == 0) ? ctx->total_len : rc;
We want "ctx->rc" to store the negative "rc" error code. But what
happens is that "rc" is type promoted to a high unsigned int and
'ctx->rc" will store the high positive value instead of a negative
value.
The fix is to change "rc" from an int to a ssize_t.
Fixes: c610c4b619 ("CIFS: Add asynchronous write support through kernel AIO")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 1bb30b20b4 ]
After printing the list of thermal governors, then this function prints
a newline character. The problem is that "size" has not been updated
after printing the last governor. This means that it can write one
character (the NUL terminator) beyond the end of the buffer.
Get rid of the "size" variable and just use "PAGE_SIZE - count" directly.
Fixes: 1b4f48494e ("thermal: core: group functions related to governor handling")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20210916131342.GB25094@kili
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 298ba0e3d4 ]
Various places in the nvme code that rely on ctrl->namespace to be
ordered. Ensure that the namespae is inserted into the list at the
right position from the start instead of sorting it after the fact.
Fixes: 540c801c65 ("NVMe: Implement namespace list scanning")
Reported-by: Anton Eidelman <anton.eidelman@gmail.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit e371af033c ]
When the controller sends us multiple r2t PDUs in a single
request we need to account for it correctly as our send/recv
context run concurrently (i.e. we get a new r2t with r2t_offset
before we updated our iterator and req->data_sent marker). This
can cause wrong offsets to be sent to the controller.
To fix that, we will first know that this may happen only in
the send sequence of the last page, hence we will take
the r2t_offset to the h2c PDU data_offset, and in
nvme_tcp_try_send_data loop, we make sure to increment
the request markers also when we completed a PDU but
we are expecting more r2t PDUs as we still did not send
the entire data of the request.
Fixes: 825619b09a ("nvme-tcp: fix possible use-after-completion")
Reported-by: Nowak, Lukasz <Lukasz.Nowak@Dell.com>
Tested-by: Nowak, Lukasz <Lukasz.Nowak@Dell.com>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit d4ffd5df9d ]
The function __bad_area_nosemaphore() calls kernelmode_fixup_or_oops()
with the parameter @signal being actually @pkey, which will send a
signal numbered with the argument in @pkey.
This bug can be triggered when the kernel fails to access user-given
memory pages that are protected by a pkey, so it can go down the
do_user_addr_fault() path and pass the !user_mode() check in
__bad_area_nosemaphore().
Most cases will simply run the kernel fixup code to make an -EFAULT. But
when another condition current->thread.sig_on_uaccess_err is met, which
is only used to emulate vsyscall, the kernel will generate the wrong
signal.
Add a new parameter @pkey to kernelmode_fixup_or_oops() to fix this.
[ bp: Massage commit message, fix build error as reported by the 0day
bot: https://lkml.kernel.org/r/202109202245.APvuT8BX-lkp@intel.com ]
Fixes: 5042d40a26 ("x86/fault: Bypass no_context() for implicit kernel faults from usermode")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Jiashuo Liang <liangjs@pku.edu.cn>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
Link: https://lkml.kernel.org/r/20210730030152.249106-1-liangjs@pku.edu.cn
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 34331739e1 ]
Earlier successes leave 'ret' in a non error state, so these errors are
not reported. Set ret to -EINVAL before going to the error handler.
This addresses two issues reported by smatch:
drivers/fpga/machxo2-spi.c:229 machxo2_write_init()
warn: missing error code 'ret'
drivers/fpga/machxo2-spi.c:316 machxo2_write_complete()
warn: missing error code 'ret'
[mdf@kernel.org: Reworded commit message]
Fixes: 88fb3a0023 ("fpga: lattice machxo2: Add Lattice MachXO2 support")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Tom Rix <trix@redhat.com>
Signed-off-by: Moritz Fischer <mdf@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 06e49073df ]
'set_signals()' in synclink_gt.c conflicts with an exported symbol
in arch/um/, so change set_signals() to set_gtsignals(). Keep
the function names similar by also changing get_signals() to
get_gtsignals().
../drivers/tty/synclink_gt.c:442:13: error: conflicting types for ‘set_signals’
static void set_signals(struct slgt_info *info);
^~~~~~~~~~~
In file included from ../include/linux/irqflags.h:16:0,
from ../include/linux/spinlock.h:58,
from ../include/linux/mm_types.h:9,
from ../include/linux/buildid.h:5,
from ../include/linux/module.h:14,
from ../drivers/tty/synclink_gt.c:46:
../arch/um/include/asm/irqflags.h:6:5: note: previous declaration of ‘set_signals’ was here
int set_signals(int enable);
^~~~~~~~~~~
Fixes: 705b6c7b34 ("[PATCH] new driver synclink_gt")
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Jiri Slaby <jirislaby@kernel.org>
Cc: Paul Fulghum <paulkf@microgate.com>
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Link: https://lore.kernel.org/r/20210902003806.17054-1-rdunlap@infradead.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit ef7ae7f746 ]
Commit 356ba2a8bc ("scsi: target: tcmu: Make pgr_support and alua_support
attributes writable") introduced support for changeable alua_support and
pgr_support target attributes. These can only be changed if the backstore
is user-backed, otherwise the kernel returns -EINVAL.
This triggers a warning in the targetcli/rtslib code when performing a
target restore that includes non-userbacked backstores:
# targetctl restore
Storage Object block/storage1: Cannot set attribute alua_support:
[Errno 22] Invalid argument, skipped
Storage Object block/storage1: Cannot set attribute pgr_support:
[Errno 22] Invalid argument, skipped
Fix this warning by returning an error code only if we are really going to
flip the PGR/ALUA bit in the transport_flags field, otherwise we will do
nothing and return success.
Return ENOSYS instead of EINVAL if the pgr/alua attributes can not be
changed, this way it will be possible for userspace to understand if the
operation failed because an invalid value has been passed to strtobool() or
because the attributes are fixed.
Fixes: 356ba2a8bc ("scsi: target: tcmu: Make pgr_support and alua_support attributes writable")
Link: https://lore.kernel.org/r/20210906151809.52811-1-mlombard@redhat.com
Reviewed-by: Bodo Stroesser <bostroesser@gmail.com>
Signed-off-by: Maurizio Lombardi <mlombard@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f63251184a ]
For xnack off, restore work dma unmap previous system memory page, and
dma map the updated system memory page to update GPU mapping, this is
not dma mapping leaking, remove the WARN_ONCE for dma mapping leaking.
prange->dma_addr store the VRAM page pfn after the range migrated to
VRAM, should not dma unmap VRAM page when updating GPU mapping or
remove prange. Add helper svm_is_valid_dma_mapping_addr to check VRAM
page and error cases.
Mask out SVM_RANGE_VRAM_DOMAIN flag in dma_addr before calling amdgpu vm
update to avoid BUG_ON(*addr & 0xFFFF00000000003FULL), and set it again
immediately after. This flag is used to know the type of page later to
dma unmapping system memory page.
Fixes: 1d5dbfe6c0 ("drm/amdkfd: classify and map mixed svm range pages in GPU")
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 2f617f4df8 ]
Restore retry fault or prefetch range, or restore svm range after
eviction to map range to GPU with correct read or write access
permission.
Range may includes multiple VMAs, update GPU page table with offset of
prange, number of pages for each VMA according VMA access permission.
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 4d88c339c4 ]
After fixing hibernation resume flow, another usecase was found which
should be explicitly handled - resume when device is in "down" state.
Invoke aq_nic_init jointly with aq_nic_start only if ndev was already
up during suspend/hibernate. We still need to perform nic_deinit() if
caller requests for it, to handle the freeze/resume scenarios.
Fixes: 57f780f1c4 ("atlantic: Fix driver resume flow.")
Signed-off-by: Sudarsana Reddy Kalluru <skalluru@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit fdbccea419 ]
Driver doesn't support aRFS for encapsulated packets, return early error
in such a case.
Fixes: 1eb8c695bd ("net/mlx4_en: Add accelerated RFS support")
Signed-off-by: Aya Levin <ayal@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit acc64f52af ]
The blamed commit made the fatally incorrect assumption that ports which
aren't in the FORWARDING STP state should not have packets forwarded
towards them, and that is all that needs to be done.
However, that logic alone permits BLOCKING ports to forward to
FORWARDING ports, which of course allows packet storms to occur when
there is an L2 loop.
The ocelot_get_bridge_fwd_mask should not only ask "what can the bridge
do for you", but "what can you do for the bridge". This way, only
FORWARDING ports forward to the other FORWARDING ports from the same
bridging domain, and we are still compatible with the idea of multiple
bridges.
Fixes: df291e54cc ("net: ocelot: support multiple bridges")
Suggested-by: Colin Foster <colin.foster@in-advantage.com>
Reported-by: Colin Foster <colin.foster@in-advantage.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Colin Foster <colin.foster@in-advantage.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit e68daf61ed ]
Sometimes multiple CLS_REPLACE calls are issued for the same connection.
rhashtable_insert_fast does not check for these duplicates, so multiple
hardware flow entries can be created.
Fix this by checking for an existing entry early
Fixes: 502e84e238 ("net: ethernet: mtk_eth_soc: add flow offloading support")
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Ilya Lipnitskiy <ilya.lipnitskiy@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 31339440b2 ]
Currently autoloading for SPI devices does not use the DT ID table, it uses
SPI modalises. Supporting OF modalises is going to be difficult if not
impractical, an attempt was made but has been reverted, so ensure that
module autoloading works for this driver by adding the part name used in
the compatible to the list of SPI IDs.
Fixes: 96c8395e21 ("spi: Revert modalias changes")
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 3106a08475 ]
syzkaller discovered memory leaks [1] that can be reduced to the
following commands:
# ip nexthop add id 1 blackhole
# devlink dev reload pci/0000:06:00.0
As part of the reload flow, mlxsw will unregister its netdevs and then
unregister from the nexthop notification chain. Before unregistering
from the notification chain, mlxsw will receive delete notifications for
nexthop objects using netdevs registered by mlxsw or their uppers. mlxsw
will not receive notifications for nexthops using netdevs that are not
dismantled as part of the reload flow. For example, the blackhole
nexthop above that internally uses the loopback netdev as its nexthop
device.
One way to fix this problem is to have listeners flush their nexthop
tables after unregistering from the notification chain. This is
error-prone as evident by this patch and also not symmetric with the
registration path where a listener receives a dump of all the existing
nexthops.
Therefore, fix this problem by replaying delete notifications for the
listener being unregistered. This is symmetric to the registration path
and also consistent with the netdev notification chain.
The above means that unregister_nexthop_notifier(), like
register_nexthop_notifier(), will have to take RTNL in order to iterate
over the existing nexthops and that any callers of the function cannot
hold RTNL. This is true for mlxsw and netdevsim, but not for the VXLAN
driver. To avoid a deadlock, change the latter to unregister its nexthop
listener without holding RTNL, making it symmetric to the registration
path.
[1]
unreferenced object 0xffff88806173d600 (size 512):
comm "syz-executor.0", pid 1290, jiffies 4295583142 (age 143.507s)
hex dump (first 32 bytes):
41 9d 1e 60 80 88 ff ff 08 d6 73 61 80 88 ff ff A..`......sa....
08 d6 73 61 80 88 ff ff 01 00 00 00 00 00 00 00 ..sa............
backtrace:
[<ffffffff81a6b576>] kmemleak_alloc_recursive include/linux/kmemleak.h:43 [inline]
[<ffffffff81a6b576>] slab_post_alloc_hook+0x96/0x490 mm/slab.h:522
[<ffffffff81a716d3>] slab_alloc_node mm/slub.c:3206 [inline]
[<ffffffff81a716d3>] slab_alloc mm/slub.c:3214 [inline]
[<ffffffff81a716d3>] kmem_cache_alloc_trace+0x163/0x370 mm/slub.c:3231
[<ffffffff82e8681a>] kmalloc include/linux/slab.h:591 [inline]
[<ffffffff82e8681a>] kzalloc include/linux/slab.h:721 [inline]
[<ffffffff82e8681a>] mlxsw_sp_nexthop_obj_group_create drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c:4918 [inline]
[<ffffffff82e8681a>] mlxsw_sp_nexthop_obj_new drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c:5054 [inline]
[<ffffffff82e8681a>] mlxsw_sp_nexthop_obj_event+0x59a/0x2910 drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c:5239
[<ffffffff813ef67d>] notifier_call_chain+0xbd/0x210 kernel/notifier.c:83
[<ffffffff813f0662>] blocking_notifier_call_chain kernel/notifier.c:318 [inline]
[<ffffffff813f0662>] blocking_notifier_call_chain+0x72/0xa0 kernel/notifier.c:306
[<ffffffff8384b9c6>] call_nexthop_notifiers+0x156/0x310 net/ipv4/nexthop.c:244
[<ffffffff83852bd8>] insert_nexthop net/ipv4/nexthop.c:2336 [inline]
[<ffffffff83852bd8>] nexthop_add net/ipv4/nexthop.c:2644 [inline]
[<ffffffff83852bd8>] rtm_new_nexthop+0x14e8/0x4d10 net/ipv4/nexthop.c:2913
[<ffffffff833e9a78>] rtnetlink_rcv_msg+0x448/0xbf0 net/core/rtnetlink.c:5572
[<ffffffff83608703>] netlink_rcv_skb+0x173/0x480 net/netlink/af_netlink.c:2504
[<ffffffff833de032>] rtnetlink_rcv+0x22/0x30 net/core/rtnetlink.c:5590
[<ffffffff836069de>] netlink_unicast_kernel net/netlink/af_netlink.c:1314 [inline]
[<ffffffff836069de>] netlink_unicast+0x5ae/0x7f0 net/netlink/af_netlink.c:1340
[<ffffffff83607501>] netlink_sendmsg+0x8e1/0xe30 net/netlink/af_netlink.c:1929
[<ffffffff832fde84>] sock_sendmsg_nosec net/socket.c:704 [inline]
[<ffffffff832fde84>] sock_sendmsg net/socket.c:724 [inline]
[<ffffffff832fde84>] ____sys_sendmsg+0x874/0x9f0 net/socket.c:2409
[<ffffffff83304a44>] ___sys_sendmsg+0x104/0x170 net/socket.c:2463
[<ffffffff83304c01>] __sys_sendmsg+0x111/0x1f0 net/socket.c:2492
[<ffffffff83304d5d>] __do_sys_sendmsg net/socket.c:2501 [inline]
[<ffffffff83304d5d>] __se_sys_sendmsg net/socket.c:2499 [inline]
[<ffffffff83304d5d>] __x64_sys_sendmsg+0x7d/0xc0 net/socket.c:2499
Fixes: 2a014b200b ("mlxsw: spectrum_router: Add support for nexthop objects")
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 1ea7812326 ]
If the HW device is during recovery, the HW resources will never return,
hence we shouldn't wait for the CID (HW context ID) bitmaps to clear.
This fix speeds up the error recovery flow.
Fixes: 64515dc899 ("qed: Add infrastructure for error detection and recovery")
Signed-off-by: Michal Kalderon <mkalderon@marvell.com>
Signed-off-by: Ariel Elior <aelior@marvell.com>
Signed-off-by: Shai Malin <smalin@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit cef0d022f5 ]
Commit 8dcb7a15a5 ("gpiolib: acpi: Take into account debounce settings")
made the gpiolib-acpi code call gpio_set_debounce_timeout() when requesting
GPIOs.
This in itself is fine, but it also made gpio_set_debounce_timeout()
errors fatal, causing the requesting of the GPIO to fail. This is causing
regressions. E.g. on a HP ElitePad 1000 G2 various _AEI specified GPIO
ACPI event sources specify a debouncy timeout of 20 ms, but the
pinctrl-baytrail.c only supports certain fixed values, the closest
ones being 12 or 24 ms and pinctrl-baytrail.c responds with -EINVAL
when specified a value which is not one of the fixed values.
This is causing the acpi_request_own_gpiod() call to fail for 3
ACPI event sources on the HP ElitePad 1000 G2, which in turn is causing
e.g. the battery charging vs discharging status to never get updated,
even though a charger has been plugged-in or unplugged.
Make gpio_set_debounce_timeout() errors non fatal, warning about the
failure instead, to fix this regression.
Note we should probably also fix various pinctrl drivers to just
pick the first bigger discrete value rather then returning -EINVAL but
this will need to be done on a per driver basis, where as this fix
at least gets us back to where things were before and thus restores
functionality on devices where this was lost due to
gpio_set_debounce_timeout() errors.
Fixes: 8dcb7a15a5 ("gpiolib: acpi: Take into account debounce settings")
Depends-on: 2e2b496ceb ("gpiolib: acpi: Extract acpi_request_own_gpiod() helper")
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Acked-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 248f064af2 ]
When qeth_set_online() calls qeth_clear_working_pool_list() to roll
back after an error exit from qeth_hardsetup_card(), we are at risk of
accessing card->qdio.in_q before it was allocated by
qeth_alloc_qdio_queues() via qeth_mpc_initialize().
qeth_clear_working_pool_list() then dereferences NULL, and by writing to
queue->bufs[i].pool_entry scribbles all over the CPU's lowcore.
Resulting in a crash when those lowcore areas are used next (eg. on
the next machine-check interrupt).
Such a scenario would typically happen when the device is first set
online and its queues aren't allocated yet. An early IO error or certain
misconfigs (eg. mismatched transport mode, bad portno) then cause us to
error out from qeth_hardsetup_card() with card->qdio.in_q still being
NULL.
Fix it by checking the pointer for NULL before accessing it.
Note that we also have (rare) paths inside qeth_mpc_initialize() where
a configuration change can cause us to free the existing queues,
expecting that subsequent code will allocate them again. If we then
error out before that re-allocation happens, the same bug occurs.
Fixes: eff73e16ee ("s390/qeth: tolerate pre-filled RX buffer")
Reported-by: Stefan Raspl <raspl@linux.ibm.com>
Root-caused-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Reviewed-by: Alexandra Winter <wintera@linux.ibm.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 96c8395e21 ]
During the v5.13 cycle we updated the SPI subsystem to generate OF style
modaliases for SPI devices, replacing the old Linux style modalises we
used to generate based on spi_device_id which are the DT style name with
the vendor removed. Unfortunately this means that we start only
reporting OF style modalises and not the old ones and there is nothing
that ensures that drivers list every possible OF compatible string in
their OF ID table. The result is that there are systems which have been
relying on loading modules based on the old style that are now broken,
as found by Russell King with spi-nor on Macchiatobin.
spi-nor is a particularly problematic case for this, it only lists a
single generic DT compatible jedec,spi-nor in the driver but supports a
huge raft of device specific compatibles, with a large set of part
numbers many of which are offered by multiple vendors. Russell's
searches of upstream device trees has turned up examples with vendor
names written in non-standard ways too. To make matters worse up until
8ff16cf77c ("Documentation: devicetree: m25p80: add "nor-jedec"
binding") the generic compatible was not part of the binding so there
are device trees out there written to that binding version which don't
list it all. The sheer number of parts supported together with our
previous approach of ignoring the vendor ID makes robustly fixing this
by adding compatibles to the spi-nor driver seem problematic, the
current DT binding document does not list all the parts supported by the
driver at the minute (further patches will fix this).
I've also investigated supporting both formats of modalias
simultaneously but that doesn't seem possible, especially without
breaking our userspace ABI which is obviously not viable.
Instead revert the relevant changes for now:
e09f2ab8ee ("spi: update modalias_show after of_device_uevent_modalias support")
3ce6c9e261 ("spi: add of_device_uevent_modalias support")
This will unfortunately mean that any system which had started having
modules autoload based on the OF compatibles for drivers that list
things there but not in the spi_device_ids will now not have those
modules load which is itself a regression. Since it affects a narrower
time window and the particularly problematic spi-nor driver may be
critical to system boot on smaller systems this seems the best of a
series of bad options. I will start an audit of SPI drivers to identify
and fix cases where things won't autoload using spi_device_id, this is
not great but seems to be the best way forward that anyone has been able
to identify.
Thanks to Russell for both his report and the additional diagnostic and
analysis work he has done here, the detailed research above was his
work.
Fixes: e09f2ab8ee ("spi: update modalias_show after of_device_uevent_modalias support")
Fixes: 3ce6c9e261 ("spi: add of_device_uevent_modalias support")
Reported-by: Russell King (Oracle) <linux@armlinux.org.uk>
Suggested-by: Russell King (Oracle) <linux@armlinux.org.uk>
Signed-off-by: Mark Brown <broonie@kernel.org>
Tested-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Cc: Andreas Schwab <schwab@suse.de>
Cc: Marco Felsch <m.felsch@pengutronix.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 0e3dbf765f ]
During initialization of a signal testcase, features declared as required
are properly checked against the running system but no action is then taken
to effectively skip such a testcase.
Fix core signals test logic to abort initialization and report such a
testcase as skipped to the KSelfTest framework.
Fixes: f96bf43403 ("kselftest: arm64: mangle_pstate_invalid_compat_toggle and common utils")
Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20210920121228.35368-1-cristian.marussi@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 74b6d7d133 ]
The Linux device model permits both the ->shutdown and ->remove driver
methods to get called during a shutdown procedure. Example: a DSA switch
which sits on an SPI bus, and the SPI bus driver calls this on its
->shutdown method:
spi_unregister_controller
-> device_for_each_child(&ctlr->dev, NULL, __unregister);
-> spi_unregister_device(to_spi_device(dev));
-> device_del(&spi->dev);
So this is a simple pattern which can theoretically appear on any bus,
although the only other buses on which I've been able to find it are
I2C:
i2c_del_adapter
-> device_for_each_child(&adap->dev, NULL, __unregister_client);
-> i2c_unregister_device(client);
-> device_unregister(&client->dev);
The implication of this pattern is that devices on these buses can be
unregistered after having been shut down. The drivers for these devices
might choose to return early either from ->remove or ->shutdown if the
other callback has already run once, and they might choose that the
->shutdown method should only perform a subset of the teardown done by
->remove (to avoid unnecessary delays when rebooting).
So in other words, the device driver may choose on ->remove to not
do anything (therefore to not unregister an MDIO bus it has registered
on ->probe), because this ->remove is actually triggered by the
device_shutdown path, and its ->shutdown method has already run and done
the minimally required cleanup.
This used to be fine until the blamed commit, but now, the following
BUG_ON triggers:
void mdiobus_free(struct mii_bus *bus)
{
/* For compatibility with error handling in drivers. */
if (bus->state == MDIOBUS_ALLOCATED) {
kfree(bus);
return;
}
BUG_ON(bus->state != MDIOBUS_UNREGISTERED);
bus->state = MDIOBUS_RELEASED;
put_device(&bus->dev);
}
In other words, there is an attempt to free an MDIO bus which was not
unregistered. The attempt to free it comes from the devres release
callbacks of the SPI device, which are executed after the device is
unregistered.
I'm not saying that the fact that MDIO buses allocated using devres
would automatically get unregistered wasn't strange. I'm just saying
that the commit didn't care about auditing existing call paths in the
kernel, and now, the following code sequences are potentially buggy:
(a) devm_mdiobus_alloc followed by plain mdiobus_register, for a device
located on a bus that unregisters its children on shutdown. After
the blamed patch, either both the alloc and the register should use
devres, or none should.
(b) devm_mdiobus_alloc followed by plain mdiobus_register, and then no
mdiobus_unregister at all in the remove path. After the blamed
patch, nobody unregisters the MDIO bus anymore, so this is even more
buggy than the previous case which needs a specific bus
configuration to be seen, this one is an unconditional bug.
In this case, the Realtek drivers fall under category (b). To solve it,
we can register the MDIO bus under devres too, which restores the
previous behavior.
Fixes: ac3a68d566 ("net: phy: don't abuse devres in devm_mdiobus_register()")
Reported-by: Lino Sanfilippo <LinoSanfilippo@gmx.de>
Reported-by: Alvin Šipraga <alsi@bang-olufsen.dk>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 5135e96a3d ]
The Linux device model permits both the ->shutdown and ->remove driver
methods to get called during a shutdown procedure. Example: a DSA switch
which sits on an SPI bus, and the SPI bus driver calls this on its
->shutdown method:
spi_unregister_controller
-> device_for_each_child(&ctlr->dev, NULL, __unregister);
-> spi_unregister_device(to_spi_device(dev));
-> device_del(&spi->dev);
So this is a simple pattern which can theoretically appear on any bus,
although the only other buses on which I've been able to find it are
I2C:
i2c_del_adapter
-> device_for_each_child(&adap->dev, NULL, __unregister_client);
-> i2c_unregister_device(client);
-> device_unregister(&client->dev);
The implication of this pattern is that devices on these buses can be
unregistered after having been shut down. The drivers for these devices
might choose to return early either from ->remove or ->shutdown if the
other callback has already run once, and they might choose that the
->shutdown method should only perform a subset of the teardown done by
->remove (to avoid unnecessary delays when rebooting).
So in other words, the device driver may choose on ->remove to not
do anything (therefore to not unregister an MDIO bus it has registered
on ->probe), because this ->remove is actually triggered by the
device_shutdown path, and its ->shutdown method has already run and done
the minimally required cleanup.
This used to be fine until the blamed commit, but now, the following
BUG_ON triggers:
void mdiobus_free(struct mii_bus *bus)
{
/* For compatibility with error handling in drivers. */
if (bus->state == MDIOBUS_ALLOCATED) {
kfree(bus);
return;
}
BUG_ON(bus->state != MDIOBUS_UNREGISTERED);
bus->state = MDIOBUS_RELEASED;
put_device(&bus->dev);
}
In other words, there is an attempt to free an MDIO bus which was not
unregistered. The attempt to free it comes from the devres release
callbacks of the SPI device, which are executed after the device is
unregistered.
I'm not saying that the fact that MDIO buses allocated using devres
would automatically get unregistered wasn't strange. I'm just saying
that the commit didn't care about auditing existing call paths in the
kernel, and now, the following code sequences are potentially buggy:
(a) devm_mdiobus_alloc followed by plain mdiobus_register, for a device
located on a bus that unregisters its children on shutdown. After
the blamed patch, either both the alloc and the register should use
devres, or none should.
(b) devm_mdiobus_alloc followed by plain mdiobus_register, and then no
mdiobus_unregister at all in the remove path. After the blamed
patch, nobody unregisters the MDIO bus anymore, so this is even more
buggy than the previous case which needs a specific bus
configuration to be seen, this one is an unconditional bug.
In this case, DSA falls into category (a), it tries to be helpful and
registers an MDIO bus on behalf of the switch, which might be on such a
bus. I've no idea why it does it under devres.
It does this on probe:
if (!ds->slave_mii_bus && ds->ops->phy_read)
alloc and register mdio bus
and this on remove:
if (ds->slave_mii_bus && ds->ops->phy_read)
unregister mdio bus
I _could_ imagine using devres because the condition used on remove is
different than the condition used on probe. So strictly speaking, DSA
cannot determine whether the ds->slave_mii_bus it sees on remove is the
ds->slave_mii_bus that _it_ has allocated on probe. Using devres would
have solved that problem. But nonetheless, the existing code already
proceeds to unregister the MDIO bus, even though it might be
unregistering an MDIO bus it has never registered. So I can only guess
that no driver that implements ds->ops->phy_read also allocates and
registers ds->slave_mii_bus itself.
So in that case, if unregistering is fine, freeing must be fine too.
Stop using devres and free the MDIO bus manually. This will make devres
stop attempting to free a still registered MDIO bus on ->shutdown.
Fixes: ac3a68d566 ("net: phy: don't abuse devres in devm_mdiobus_register()")
Reported-by: Lino Sanfilippo <LinoSanfilippo@gmx.de>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Tested-by: Lino Sanfilippo <LinoSanfilippo@gmx.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit e5845aa0ea ]
Since the blamed commit, dsa_tree_teardown_switches() was split into two
smaller functions, dsa_tree_teardown_switches and dsa_tree_teardown_ports.
However, the error path of dsa_tree_setup stopped calling dsa_tree_teardown_ports.
Fixes: a57d8c217a ("net: dsa: flush switchdev workqueue before tearing down CPU/DSA ports")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit a18cee4791 ]
The abort_work is scheduled when a connection was detected to be
out-of-sync after a link failure. The work calls smc_conn_kill(),
which calls smc_close_active_abort() and that might end up calling
smc_close_cancel_work().
smc_close_cancel_work() cancels any pending close_work and tx_work but
needs to release the sock_lock before and acquires the sock_lock again
afterwards. So when the sock_lock was NOT acquired before then it may
be held after the abort_work completes. Thats why the sock_lock is
acquired before the call to smc_conn_kill() in __smc_lgr_terminate(),
but this is missing in smc_conn_abort_work().
Fix that by acquiring the sock_lock first and release it after the
call to smc_conn_kill().
Fixes: b286a0651e ("net/smc: handle incoming CDC validation message")
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 5126b9d3d4 ]
hclge_get_reset_status() should return the tqp reset status.
However, if the CMDQ fails, the caller will take it as tqp reset
success status by mistake. Therefore, uses a parameters to get
the tqp reset status instead.
Fixes: 46a3df9f97 ("net: hns3: Add HNS3 Acceleration Engine & Compatibility Layer Support")
Signed-off-by: Yufeng Mo <moyufeng@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit ef39d63260 ]
The input parameters may not be reliable, so check the vlan id before
using it, otherwise may set wrong vlan id into hardware.
Fixes: dc8131d846 ("net: hns3: Fix for packet loss due wrong filter config in VLAN tbls")
Signed-off-by: liaoguojia <liaoguojia@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 63b1279d99 ]
The input parameters may not be reliable. Before using the
queue id, we should check this parameter. Otherwise, memory
overwriting may occur.
Fixes: d341001846 ("net: hns3: refactor the mailbox message between PF and VF")
Signed-off-by: Yufeng Mo <moyufeng@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 311c0aaa9b ]
vport_id include PF and VFs, vport_id = 0 means PF, other values mean VFs.
So the actual vf id is equal to vport_id minus 1.
Some VF print logs are actually vport, and logs of vf id actually use
vport id, so this patch fixes them.
Fixes: ac887be5b0 ("net: hns3: change print level of RAS error log from warning to error")
Fixes: adcf738b80 ("net: hns3: cleanup some print format warning")
Signed-off-by: Jiaran Zhang <zhangjiaran@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 91bc0d5272 ]
The vf id from ethtool is added 1 before configured to driver.
So it's necessary to minus 1 when printing it, in order to
keep consistent with user's configuration.
Fixes: dd74f815dd ("net: hns3: Add support for rule add/delete for flow director")
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit e184cec5e2 ]
When user change rss 'hfunc' without set rss 'hkey' by ethtool
-X command, the driver will ignore the 'hfunc' for the hkey is
NULL. It's unreasonable. So fix it.
Fixes: 46a3df9f97 ("net: hns3: Add HNS3 Acceleration Engine & Compatibility Layer Support")
Fixes: 374ad29176 ("net: hns3: Add RSS general configuration support for VF")
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 5bed8b0704 ]
The smallest TX ring size we support must fit a TX SKB with MAX_SKB_FRAGS
+ 1. Because the first TX BD for a packet is always a long TX BD, we
need an extra TX BD to fit this packet. Define BNXT_MIN_TX_DESC_CNT with
this value to make this more clear. The current code uses a minimum
that is off by 1. Fix it using this constant.
The tx_wake_thresh to determine when to wake up the TX queue is half the
ring size but we must have at least BNXT_MIN_TX_DESC_CNT for the next
packet which may have maximum fragments. So the comparison of the
available TX BDs with tx_wake_thresh should be >= instead of > in the
current code. Otherwise, at the smallest ring size, we will never wake
up the TX queue and will cause TX timeout.
Fixes: c0c050c58d ("bnxt_en: New Broadcom ethernet driver.")
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadocm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 3765996e4f ]
The process will cause napi.state to contain NAPI_STATE_SCHED and
not in the poll_list, which will cause napi_disable() to get stuck.
The prefix "NAPI_STATE_" is removed in the figure below, and
NAPI_STATE_HASHED is ignored in napi.state.
CPU0 | CPU1 | napi.state
===============================================================================
napi_disable() | | SCHED | NPSVC
napi_enable() | |
{ | |
smp_mb__before_atomic(); | |
clear_bit(SCHED, &n->state); | | NPSVC
| napi_schedule_prep() | SCHED | NPSVC
| napi_poll() |
| napi_complete_done() |
| { |
| if (n->state & (NPSVC | | (1)
| _BUSY_POLL))) |
| return false; |
| ................ |
| } | SCHED | NPSVC
| |
clear_bit(NPSVC, &n->state); | | SCHED
} | |
| |
napi_schedule_prep() | | SCHED | MISSED (2)
(1) Here return direct. Because of NAPI_STATE_NPSVC exists.
(2) NAPI_STATE_SCHED exists. So not add napi.poll_list to sd->poll_list
Since NAPI_STATE_SCHED already exists and napi is not in the
sd->poll_list queue, NAPI_STATE_SCHED cannot be cleared and will always
exist.
1. This will cause this queue to no longer receive packets.
2. If you encounter napi_disable under the protection of rtnl_lock, it
will cause the entire rtnl_lock to be locked, affecting the overall
system.
This patch uses cmpxchg to implement napi_enable(), which ensures that
there will be no race due to the separation of clear two bits.
Fixes: 2d8bff1269 ("netpoll: Close race condition between poll_one_napi and napi_disable")
Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Reviewed-by: Dust Li <dust.li@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 029497e66b ]
Due to the inclusion of nvmem handling into the mac-address getter
function of_get_mac_address() by
commit d01f449c00 ("of_net: add NVMEM support to of_get_mac_address")
it is now possible to get a -EPROBE_DEFER return code. Which did cause
bgmac to assign a random ethernet address.
This exact issue happened on my Meraki MR32. The nvmem provider is
an EEPROM (at24c64) which gets instantiated once the module
driver is loaded... This happens once the filesystem becomes available.
With this patch, bgmac_probe() will propagate the -EPROBE_DEFER error.
Then the driver subsystem will reschedule the probe at a later time.
Cc: Petr Štetiar <ynezz@true.cz>
Cc: Michael Walle <michael@walle.cc>
Fixes: d01f449c00 ("of_net: add NVMEM support to of_get_mac_address")
Signed-off-by: Christian Lamparter <chunkeey@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit fd292c189a ]
Commit 86f8b1c01a ("net: dsa: Do not make user port errors fatal")
decided it was fine to ignore errors on certain ports that fail to
probe, and go on with the ports that do probe fine.
Commit fb6ec87f72 ("net: dsa: Fix type was not set for devlink port")
noticed that devlink_port_type_eth_set(dlp, dp->slave); does not get
called, and devlink notices after a timeout of 3600 seconds and prints a
WARN_ON. So it went ahead to unregister the devlink port. And because
there exists an UNUSED port flavour, we actually re-register the devlink
port as UNUSED.
Commit 08156ba430 ("net: dsa: Add devlink port regions support to
DSA") added devlink port regions, which are set up by the driver and not
by DSA.
When we trigger the devlink port deregistration and reregistration as
unused, devlink now prints another WARN_ON, from here:
devlink_port_unregister:
WARN_ON(!list_empty(&devlink_port->region_list));
So the port still has regions, which makes sense, because they were set
up by the driver, and the driver doesn't know we're unregistering the
devlink port.
Somebody needs to tear them down, and optionally (actually it would be
nice, to be consistent) set them up again for the new devlink port.
But DSA's layering stays in our way quite badly here.
The options I've considered are:
1. Introduce a function in devlink to just change a port's type and
flavour. No dice, devlink keeps a lot of state, it really wants the
port to not be registered when you set its parameters, so changing
anything can only be done by destroying what we currently have and
recreating it.
2. Make DSA cache the parameters passed to dsa_devlink_port_region_create,
and the region returned, keep those in a list, then when the devlink
port unregister needs to take place, the existing devlink regions are
destroyed by DSA, and we replay the creation of new regions using the
cached parameters. Problem: mv88e6xxx keeps the region pointers in
chip->ports[port].region, and these will remain stale after DSA frees
them. There are many things DSA can do, but updating mv88e6xxx's
private pointers is not one of them.
3. Just let the driver do it (i.e. introduce a very specific method
called ds->ops->port_reinit_as_unused, which unregisters its devlink
port devlink regions, then the old devlink port, then registers the
new one, then the devlink port regions for it). While it does work,
as opposed to the others, it's pretty horrible from an API
perspective and we can do better.
4. Introduce a new pair of methods, ->port_setup and ->port_teardown,
which in the case of mv88e6xxx must register and unregister the
devlink port regions. Call these 2 methods when the port must be
reinitialized as unused.
Naturally, I went for the 4th approach.
Fixes: 08156ba430 ("net: dsa: Add devlink port regions support to DSA")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 9f7afa05c9 ]
The only struct dim_sample member that does not get
initialized by dim_update_sample() is comp_ctr. (There
is special API to initialize comp_ctr:
dim_update_sample_with_comps(), and it is currently used
only for RDMA.) comp_ctr is used to compute curr_stats->cmps
and curr_stats->cpe_ratio (see dim_calc_stats()) which in
turn are consumed by the rdma_dim_*() API. Therefore,
functionally, the net_dim*() API consumers are not affected.
Nevertheless, fix the computation of statistics based
on an uninitialized variable, even if the mentioned statistics
are not used at the moment.
Fixes: ae0e6a5d16 ("enetc: Add adaptive interrupt coalescing")
Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 7237a494de ]
irq_set_affinity_hit() stores a reference to the cpumask_t
parameter in the irq descriptor, and that reference can be
accessed later from irq_affinity_hint_proc_show(). Since
the cpu_mask parameter passed to irq_set_affinity_hit() has
only temporary storage (it's on the stack memory), later
accesses to it are illegal. Thus reads from the corresponding
procfs affinity_hint file can result in paging request oops.
The issue is fixed by the get_cpu_mask() helper, which provides
a permanent storage for the cpumask_t parameter.
Fixes: d4fd0404c1 ("enetc: Introduce basic PF and VF ENETC ethernet drivers")
Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit afd92d82c9 ]
We try to use build_skb() if we had sufficient tailroom. But we forget
to release the unused pages chained via private in big mode which will
leak pages. Fixing this by release the pages after building the skb in
big mode.
Cc: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Fixes: fb32856b16 ("virtio-net: page_to_skb() use build_skb when there's sufficient tailroom")
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 89c485c7a3 ]
Dai Ngo reports that, since the XDR overhaul, the NLM server crashes
when the TEST procedure wants to return NLM_DENIED. There is a bug
in svcxdr_encode_owner() that none of our standard test cases found.
Replace the open-coded function with a call to an appropriate
pre-fabricated XDR helper.
Reported-by: Dai Ngo <Dai.Ngo@oracle.com>
Fixes: a6a63ca565 ("lockd: Common NLM XDR helpers")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 9d37e1cab2 ]
When an afs file or directory is modified locally such that the total file
size is extended, i_blocks needs to be recalculated too.
Fix this by making afs_write_end() and afs_edit_dir_add() call
afs_set_i_size() rather than setting inode->i_size directly as that also
recalculates inode->i_blocks.
This can be tested by creating and writing into directories and files and
then examining them with du. Without this change, directories show a 4
blocks (they start out at 2048 bytes) and files show 0 blocks; with this
change, they should show a number of blocks proportional to the file size
rounded up to 1024.
Fixes: 31143d5d51 ("AFS: implement basic file write support")
Fixes: 63a4681ff3 ("afs: Locally edit directory data for mkdir/create/unlink/...")
Reported-by: Markus Suvanto <markus.suvanto@gmail.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Marc Dionne <marc.dionne@auristor.com>
Tested-by: Markus Suvanto <markus.suvanto@gmail.com>
cc: linux-afs@lists.infradead.org
Link: https://lore.kernel.org/r/163113612442.352844.11162345591911691150.stgit@warthog.procyon.org.uk/
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b537a3c217 ]
AFS-3 has two data fetch RPC variants, FS.FetchData and FS.FetchData64, and
Linux's afs client switches between them when talking to a non-YFS server
if the read size, the file position or the sum of the two have the upper 32
bits set of the 64-bit value.
This is a problem, however, since the file position and length fields of
FS.FetchData are *signed* 32-bit values.
Fix this by capturing the capability bits obtained from the fileserver when
it's sent an FS.GetCapabilities RPC, rather than just discarding them, and
then picking out the VICED_CAPABILITY_64BITFILES flag. This can then be
used to decide whether to use FS.FetchData or FS.FetchData64 - and also
FS.StoreData or FS.StoreData64 - rather than using upper_32_bits() to
switch on the parameter values.
This capabilities flag could also be used to limit the maximum size of the
file, but all servers must be checked for that.
Note that the issue does not exist with FS.StoreData - that uses *unsigned*
32-bit values. It's also not a problem with Auristor servers as its
YFS.FetchData64 op uses unsigned 64-bit values.
This can be tested by cloning a git repo through an OpenAFS client to an
OpenAFS server and then doing "git status" on it from a Linux afs
client[1]. Provided the clone has a pack file that's in the 2G-4G range,
the git status will show errors like:
error: packfile .git/objects/pack/pack-5e813c51d12b6847bbc0fcd97c2bca66da50079c.pack does not match index
error: packfile .git/objects/pack/pack-5e813c51d12b6847bbc0fcd97c2bca66da50079c.pack does not match index
This can be observed in the server's FileLog with something like the
following appearing:
Sun Aug 29 19:31:39 2021 SRXAFS_FetchData, Fid = 2303380852.491776.3263114, Host 192.168.11.201:7001, Id 1001
Sun Aug 29 19:31:39 2021 CheckRights: len=0, for host=192.168.11.201:7001
Sun Aug 29 19:31:39 2021 FetchData_RXStyle: Pos 18446744071815340032, Len 3154
Sun Aug 29 19:31:39 2021 FetchData_RXStyle: file size 2400758866
...
Sun Aug 29 19:31:40 2021 SRXAFS_FetchData returns 5
Note the file position of 18446744071815340032. This is the requested file
position sign-extended.
Fixes: b9b1f8d593 ("AFS: write support fixes")
Reported-by: Markus Suvanto <markus.suvanto@gmail.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Marc Dionne <marc.dionne@auristor.com>
Tested-by: Markus Suvanto <markus.suvanto@gmail.com>
cc: linux-afs@lists.infradead.org
cc: openafs-devel@openafs.org
Link: https://bugzilla.kernel.org/show_bug.cgi?id=214217#c9 [1]
Link: https://lore.kernel.org/r/951332.1631308745@warthog.procyon.org.uk/
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 63d49d843e ]
The AFS filesystem is currently triggering the silly-rename cleanup from
afs_d_revalidate() when it sees that a dentry has been changed by a third
party[1]. It should not be doing this as the cleanup includes deleting the
silly-rename target file on iput.
Fix this by removing the places in the d_revalidate handling that validate
anything other than the directory and the dirent. It probably should not
be looking to validate the target inode of the dentry also.
This includes removing the point in afs_d_revalidate() where the inode that
a dentry used to point to was marked as being deleted (AFS_VNODE_DELETED).
We don't know it got deleted. It could have been renamed or it could have
hard links remaining.
This was reproduced by cloning a git repo onto an afs volume on one
machine, switching to another machine and doing "git status", then
switching back to the first and doing "git status". The second status
would show weird output due to ".git/index" getting deleted by the above
mentioned mechanism.
A simpler way to do it is to do:
machine 1: touch a
machine 2: touch b; mv -f b a
machine 1: stat a
on an afs volume. The bug shows up as the stat failing with ENOENT and the
file server log showing that machine 1 deleted "a".
Fixes: 79ddbfa500 ("afs: Implement sillyrename for unlink and rename")
Reported-by: Markus Suvanto <markus.suvanto@gmail.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: Markus Suvanto <markus.suvanto@gmail.com>
cc: linux-afs@lists.infradead.org
Link: https://bugzilla.kernel.org/show_bug.cgi?id=214217#c4 [1]
Link: https://lore.kernel.org/r/163111668100.283156.3851669884664475428.stgit@warthog.procyon.org.uk/
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 581b2027af ]
There's a loop in afs_extend_writeback() that adds extra pages to a write
we want to make to improve the efficiency of the writeback by making it
larger. This loop stops, however, if we hit a page we can't write back
from immediately, but it doesn't get rid of the page ref we speculatively
acquired.
This was caused by the removal of the cleanup loop when the code switched
from using find_get_pages_contig() to xarray scanning as the latter only
gets a single page at a time, not a batch.
Fix this by putting the page on a ref on an early break from the loop.
Unfortunately, we can't just add that page to the pagevec we're employing
as we'll go through that and add those pages to the RPC call.
This was found by the generic/074 test. It leaks ~4GiB of RAM each time it
is run - which can be observed with "top".
Fixes: e87b03f583 ("afs: Prepare for use of THPs")
Reported-by: Marc Dionne <marc.dionne@auristor.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-and-tested-by: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org
Link: https://lore.kernel.org/r/163111666635.283156.177701903478910460.stgit@warthog.procyon.org.uk/
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit 22b70e6f2d upstream.
A noted side-effect of commit 0c6c2d3615 ("arm64: Generate cpucaps.h")
is that cpucaps are now sorted, changing the enumeration order. This
assumed no dependencies between cpucaps, which turned out not to be true
in one case. UNMAP_KERNEL_AT_EL0 currently needs to be processed after
WORKAROUND_CAVIUM_27456. ThunderX systems are incompatible with KPTI, so
unmap_kernel_at_el0() bails if WORKAROUND_CAVIUM_27456 is set. But because
of the sorting, WORKAROUND_CAVIUM_27456 will not yet have been considered
when unmap_kernel_at_el0() checks for it, so the kernel tries to
run w/ KPTI - and quickly falls over.
Because all ThunderX implementations have homogeneous CPUs, we can remove
this dependency by just checking the current CPU for the erratum.
Fixes: 0c6c2d3615 ("arm64: Generate cpucaps.h")
Cc: <stable@vger.kernel.org> # 5.13.x
Signed-off-by: dann frazier <dann.frazier@canonical.com>
Suggested-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210923145002.3394558-1-dann.frazier@canonical.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3c3c8e88c8 upstream.
There have been reports of approximately a 0.9%-1.7% failure rate in SMU
communication timeouts with s0i3 entry on some OEM designs. Currently
the design in amd-pmc is to try every 100us for up to 20ms.
However the GPU driver which also communicates with the SMU using a
mailbox register which the driver polls every 1us for up to 2000ms.
In the GPU driver this was increased by commit 055162645a ("drm/amd/pm:
increase time out value when sending msg to SMU")
Increase the maximum timeout used by amd-pmc to 2000ms to match this
behavior. This has been shown to improve the stability for machines
that randomly have failures.
Cc: stable@kernel.org
Reported-by: Julian Sikorski <belegdol@gmail.com>
BugLink: https://gitlab.freedesktop.org/drm/amd/-/issues/1629
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Acked-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com>
Link: https://lore.kernel.org/r/20210914020115.655-1-mario.limonciello@amd.com
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e8f69b16ee upstream.
If resource allocation and registration fail for a muxed tty device
(e.g. if there are no more minor numbers) the driver should not try to
deregister the never-registered (or already-deregistered) tty.
Fix up the error handling to avoid dereferencing a NULL pointer when
attempting to remove the character device.
Fixes: 72dc1c096c ("HSO: add option hso driver")
Cc: stable@vger.kernel.org # 2.6.27
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ab39d3cef5 upstream.
Update the current state as boot state during dpm initialization.
During the subsequent initialization, set_power_state gets called to
transition to the final power state. set_power_state refers to values
from the current state and without current state populated, it could
result in NULL pointer dereference.
For ex: on platforms where PCI speed change is supported through ACPI
ATCS method, the link speed of current state needs to be queried before
deciding on changing to final power state's link speed. The logic to query
ATCS-support was broken on certain platforms. The issue became visible
when broken ATCS-support logic got fixed with commit
f9b7f3703f ("drm/amdgpu/acpi: make ATPX/ATCS structures global (v2)").
Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1698
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7215e90981 upstream.
Reporting zones on a SCSI device sometimes fail with the following error:
[76248.516390] ata16.00: invalid transfer count 131328
[76248.523618] sd 15:0:0:0: [sda] REPORT ZONES start lba 536870912 failed
The error (from drivers/ata/libata-scsi.c:ata_scsi_zbc_in_xlat()) indicates
that buffer size is not aligned to SECTOR_SIZE.
This happens when the __vmalloc() failed. Consider we are reporting 4096
zones, then we will have "bufsize = roundup((4096 + 1) * 64,
SECTOR_SIZE)" = (513 * 512) = 262656. Then, __vmalloc() failure halves
the bufsize to 131328, which is no longer aligned to SECTOR_SIZE.
Use rounddown() to ensure the size is always aligned to SECTOR_SIZE and fix
the comment as well.
Link: https://lore.kernel.org/r/20210906140642.2267569-1-naohiro.aota@wdc.com
Fixes: 23a50861ad ("scsi: sd_zbc: Cleanup sd_zbc_alloc_report_buffer()")
Cc: stable@vger.kernel.org # 5.5+
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 74e1eb3b4a upstream.
Driver's tx_empty callback should signal when the transmit shift register
is empty. So when the last character has been sent.
STAT_TX_FIFO_EMP bit signals only that HW transmit FIFO is empty, which
happens when the last byte is loaded into transmit shift register.
STAT_TX_EMP bit signals when the both HW transmit FIFO and transmit shift
register are empty.
So replace STAT_TX_FIFO_EMP check by STAT_TX_EMP in mvebu_uart_tx_empty()
callback function.
Fixes: 30530791a7 ("serial: mvebu-uart: initial support for Armada-3700 serial port")
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Pali Rohár <pali@kernel.org>
Link: https://lore.kernel.org/r/20210911132017.25505-1-pali@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 563f23b002 upstream.
The resilient nexthop group torture tests in fib_nexthop.sh exposed a
possible division by zero while replacing a resilient group [1]. The
division by zero occurs when the data path sees a resilient nexthop
group with zero buckets.
The tests replace a resilient nexthop group in a loop while traffic is
forwarded through it. The tests do not specify the number of buckets
while performing the replacement, resulting in the kernel allocating a
stub resilient table (i.e, 'struct nh_res_table') with zero buckets.
This table should never be visible to the data path, but the old nexthop
group (i.e., 'oldg') might still be used by the data path when the stub
table is assigned to it.
Fix this by only assigning the stub table to the old nexthop group after
making sure the group is no longer used by the data path.
Tested with fib_nexthops.sh:
Tests passed: 222
Tests failed: 0
[1]
divide error: 0000 [#1] PREEMPT SMP KASAN
CPU: 0 PID: 1850 Comm: ping Not tainted 5.14.0-custom-10271-ga86eb53057fe #1107
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-4.fc34 04/01/2014
RIP: 0010:nexthop_select_path+0x2d2/0x1a80
[...]
Call Trace:
fib_select_multipath+0x79b/0x1530
fib_select_path+0x8fb/0x1c10
ip_route_output_key_hash_rcu+0x1198/0x2da0
ip_route_output_key_hash+0x190/0x340
ip_route_output_flow+0x21/0x120
raw_sendmsg+0x91d/0x2e10
inet_sendmsg+0x9e/0xe0
__sys_sendto+0x23d/0x360
__x64_sys_sendto+0xe1/0x1b0
do_syscall_64+0x35/0x80
entry_SYSCALL_64_after_hwframe+0x44/0xae
Cc: stable@vger.kernel.org
Fixes: 283a72a559 ("nexthop: Add implementation of resilient next-hop groups")
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8646e53633 upstream.
Invoke rseq's NOTIFY_RESUME handler when processing the flag prior to
transferring to a KVM guest, which is roughly equivalent to an exit to
userspace and processes many of the same pending actions. While the task
cannot be in an rseq critical section as the KVM path is reachable only
by via ioctl(KVM_RUN), the side effects that apply to rseq outside of a
critical section still apply, e.g. the current CPU needs to be updated if
the task is migrated.
Clearing TIF_NOTIFY_RESUME without informing rseq can lead to segfaults
and other badness in userspace VMMs that use rseq in combination with KVM,
e.g. due to the CPU ID being stale after task migration.
Fixes: 72c3c0fe54 ("x86/kvm: Use generic xfer to guest work function")
Reported-by: Peter Foley <pefoley@google.com>
Bisected-by: Doug Evans <dje@google.com>
Acked-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210901203030.1292304-2-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 58877b0824 upstream.
It has been observed with certain PCIe USB cards (like Inateck connected
to AM64 EVM or J7200 EVM) that as soon as the primary roothub is
registered, port status change is handled even before xHC is running
leading to cold plug USB devices not detected. For such cases, registering
both the root hubs along with the second HCD is required. Add support for
deferring roothub registration in usb_add_hcd(), so that both primary and
secondary roothubs are registered along with the second HCD.
CC: stable@vger.kernel.org # 5.4+
Suggested-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Tested-by: Chris Chiu <chris.chiu@canonical.com>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
Link: https://lore.kernel.org/r/20210909064200.16216-2-kishon@ti.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8cfac9a674 upstream.
After we start to do core soft reset while usb role switch,
the phy init is invoked at every switch to device mode, but
its counter part de-init is missing, this causes the actual
phy init can not be done when we really want to re-init phy
like system resume, because the counter maintained by phy
core is not 0. considering phy init is actually redundant for
role switch, so move out the phy init from core soft reset to
dwc3 core init where is the only place required.
Fixes: f88359e158 ("usb: dwc3: core: Do core softreset when switch mode")
Cc: <stable@vger.kernel.org>
Tested-by: faqiang.zhu <faqiang.zhu@nxp.com>
Tested-by: John Stultz <john.stultz@linaro.org> #HiKey960
Acked-by: Felipe Balbi <balbi@kernel.org>
Signed-off-by: Li Jun <jun.li@nxp.com>
Link: https://lore.kernel.org/r/1631068099-13559-1-git-send-email-jun.li@nxp.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 41f6731838 upstream.
When polling for a setup or clear of a register field we were sleeping
in atomic context but using a very tight sleep interval.
Since the use cases for this poll mechanism are only in setup and
stop paths, and in practice this poll is not used most of the times
but needs to be there to comply to hardware setup times, remove the
sleep time and make the poll loop tighter.
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Rui Miguel Silva <rui.silva@linaro.org>
Link: https://lore.kernel.org/r/20210727100516.4190681-3-rui.silva@linaro.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b564171ade upstream.
Currently cgroup freezer is used to freeze the application threads, and
BINDER_FREEZE is used to freeze the corresponding binder interface.
There's already a mechanism in ioctl(BINDER_FREEZE) to wait for any
existing transactions to drain out before actually freezing the binder
interface.
But freezing an app requires 2 steps, freezing the binder interface with
ioctl(BINDER_FREEZE) and then freezing the application main threads with
cgroupfs. This is not an atomic operation. The following race issue
might happen.
1) Binder interface is frozen by ioctl(BINDER_FREEZE);
2) Main thread A initiates a new sync binder transaction to process B;
3) Main thread A is frozen by "echo 1 > cgroup.freeze";
4) The response from process B reaches the frozen thread, which will
unexpectedly fail.
This patch provides a mechanism to check if there's any new pending
transaction happening between ioctl(BINDER_FREEZE) and freezing the
main thread. If there's any, the main thread freezing operation can
be rolled back to finish the pending transaction.
Furthermore, the response might reach the binder driver before the
rollback actually happens. That will still cause failed transaction.
As the other process doesn't wait for another response of the response,
the response transaction failure can be fixed by treating the response
transaction like an oneway/async one, allowing it to reach the frozen
thread. And it will be consumed when the thread gets unfrozen later.
NOTE: This patch reuses the existing definition of struct
binder_frozen_status_info but expands the bit assignments of __u32
member sync_recv.
To ensure backward compatibility, bit 0 of sync_recv still indicates
there's an outstanding sync binder transaction. This patch adds new
information to bit 1 of sync_recv, indicating the binder transaction
happens exactly when there's a race.
If an existing userspace app runs on a new kernel, a sync binder call
will set bit 0 of sync_recv so ioctl(BINDER_GET_FROZEN_INFO) still
return the expected value (true). The app just doesn't check bit 1
intentionally so it doesn't have the ability to tell if there's a race.
This behavior is aligned with what happens on an old kernel which
doesn't set bit 1 at all.
A new userspace app can 1) check bit 0 to know if there's a sync binder
transaction happened when being frozen - same as before; and 2) check
bit 1 to know if that sync binder transaction happened exactly when
there's a race - a new information for rollback decision.
the same time, confirmed the pending transactions succeeded.
Fixes: 432ff1e916 ("binder: BINDER_FREEZE ioctl")
Acked-by: Todd Kjos <tkjos@google.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Li Li <dualli@google.com>
Test: stress test with apps being frozen and initiating binder calls at
Link: https://lore.kernel.org/r/20210910164210.2282716-2-dualli@chromium.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5fdb55c1ac upstream.
During BC_FREE_BUFFER processing, the BINDER_TYPE_FDA object
cleanup may close 1 or more fds. The close operations are
completed using the task work mechanism -- which means the thread
needs to return to userspace or the file object may never be
dereferenced -- which can lead to hung processes.
Force the binder thread back to userspace if an fd is closed during
BC_FREE_BUFFER handling.
Fixes: 80cd795630 ("binder: fix use-after-free due to ksys_close() during fdget()")
Cc: stable <stable@vger.kernel.org>
Reviewed-by: Martijn Coenen <maco@android.com>
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Todd Kjos <tkjos@google.com>
Link: https://lore.kernel.org/r/20210830195146.587206-1-tkjos@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d91adc5322 upstream.
This reverts commit f3de5d857b.
That commit broke USB on all routers that have USB always powered on and
don't require toggling any GPIO. It's a majority of devices actually.
The original code worked and seemed safe: vcc GPIO is optional and
bcma_hci_platform_power_gpio() takes care of checking the pointer before
using it.
This revert fixes:
[ 10.801127] bcma_hcd: probe of bcma0:11 failed with error -2
Fixes: f3de5d857b ("USB: bcma: Add a check for devm_gpiod_get")
Cc: stable <stable@vger.kernel.org>
Cc: Chuhong Yuan <hslester96@gmail.com>
Signed-off-by: Rafał Miłecki <rafal@milecki.pl>
Link: https://lore.kernel.org/r/20210831065419.18371-1-zajec5@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b55d37ef6b upstream.
ScanLogic SL11R-IDE with firmware older than 2.6c (the latest one) has
broken tag handling, preventing the device from working at all:
usb 1-1: new full-speed USB device number 2 using uhci_hcd
usb 1-1: New USB device found, idVendor=04ce, idProduct=0002, bcdDevice= 2.60
usb 1-1: New USB device strings: Mfr=1, Product=1, SerialNumber=0
usb 1-1: Product: USB Device
usb 1-1: Manufacturer: USB Device
usb-storage 1-1:1.0: USB Mass Storage device detected
scsi host2: usb-storage 1-1:1.0
usbcore: registered new interface driver usb-storage
usb 1-1: reset full-speed USB device number 2 using uhci_hcd
usb 1-1: reset full-speed USB device number 2 using uhci_hcd
usb 1-1: reset full-speed USB device number 2 using uhci_hcd
usb 1-1: reset full-speed USB device number 2 using uhci_hcd
Add US_FL_BULK_IGNORE_TAG to fix it. Also update my e-mail address.
2.6c is the only firmware that claims Linux compatibility.
The firmware can be upgraded using ezotgdbg utility:
https://github.com/asciilifeform/ezotgdbg
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Ondrej Zary <linux@zary.sk>
Cc: stable <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20210913210106.12717-1-linux@zary.sk
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0594c58161 upstream.
The initial observation was that in PV mode under Xen 32-bit user space
didn't work anymore. Attempts of system calls ended in #GP(0x402). All
of the sudden the vector 0x80 handler was not in place anymore. As it
turns out up to 5.13 redundant initialization did occur: Once from
cpu_initialize_context() (through its VCPUOP_initialise hypercall) and a
2nd time while each CPU was brought fully up. This 2nd initialization is
now gone, uncovering that the 1st one was flawed: Unlike for the
set_trap_table hypercall, a full virtual IDT needs to be specified here;
the "vector" fields of the individual entries are of no interest. With
many (kernel) IDT entries still(?) (i.e. at that point at least) empty,
the syscall vector 0x80 ended up in slot 0x20 of the virtual IDT, thus
becoming the domain's handler for vector 0x20.
Make xen_convert_trap_info() fit for either purpose, leveraging the fact
that on the xen_copy_trap_info() path the table starts out zero-filled.
This includes moving out the writing of the sentinel, which would also
have lead to a buffer overrun in the xen_copy_trap_info() case if all
(kernel) IDT entries were populated. Convert the writing of the sentinel
to clearing of the entire table entry rather than just the address
field.
(I didn't bother trying to identify the commit which uncovered the issue
in 5.14; the commit named below is the one which actually introduced the
bad code.)
Fixes: f87e4cac4f ("xen: SMP guest support")
Cc: stable@vger.kernel.org
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Link: https://lore.kernel.org/r/7a266932-092e-b68f-f2bb-1473b61adc6e@suse.com
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9ed38fd4a1 upstream.
Although very unlikely that the tlink pointer would be null in this case,
get_next_mid function can in theory return null (but not an error)
so need to check for null (not for IS_ERR, which can not be returned
here).
Address warning:
fs/smbfs_client/connect.c:2392 cifs_match_super()
warn: 'tlink' isn't an ERR_PTR
Pointed out by Dan Carpenter via smatch code analysis tool
CC: stable@vger.kernel.org
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 91bb163e1e upstream.
According USB spec each ISOC transaction should be performed in a
designated for that transaction interval. On bus errors or delays
in operating system scheduling of client software can result in no
packet being transferred for a (micro)frame. An error indication
should be returned as status to the client software in such a case.
Current implementation in case of missed/dropped interval send same
data in next possible interval instead of reporting missed isoc.
This fix complete requests with -ENODATA if interval elapsed.
HSOTG core in BDMA and Slave modes haven't HW support for
(micro)frames tracking, this is why SW should care about tracking
of (micro)frames. Because of that method and consider operating
system scheduling delays, added few additional checking's of elapsed
target (micro)frame:
1. Immediately before enabling EP to start transfer.
2. With any transfer completion interrupt.
3. With incomplete isoc in/out interrupt.
4. With EP disabled interrupt because of incomplete transfer.
5. With OUT token received while EP disabled interrupt (for OUT
transfers).
6. With NAK replied to IN token interrupt (for IN transfers).
As part of ISOC flow, additionally fixed 'current' and 'target' frame
calculation functions. In HS mode SOF limits provided by DSTS register
is 0x3fff, but in non HS mode this limit is 0x7ff.
Tested by internal tool which also using for dwc3 testing.
Signed-off-by: Minas Harutyunyan <Minas.Harutyunyan@synopsys.com>
Cc: stable <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/95d1423adf4b0f68187c9894820c4b7e964a3f7f.1631175721.git.Minas.Harutyunyan@synopsys.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit bcbda81020 upstream.
We get an unexpected value of /proc/sys/vm/overcommit_memory after
running the following program:
int main()
{
int fd = open("/proc/sys/vm/overcommit_memory", O_RDWR);
write(fd, "1", 1);
write(fd, "2", 1);
close(fd);
}
write(fd, "2", 1) will pass *ppos = 1 to proc_dointvec_minmax.
proc_dointvec_minmax will return 0 without setting new_policy.
t.data = &new_policy;
ret = proc_dointvec_minmax(&t, write, buffer, lenp, ppos)
-->do_proc_dointvec
-->__do_proc_dointvec
if (write) {
if (proc_first_pos_non_zero_ignore(ppos, table))
goto out;
sysctl_overcommit_memory = new_policy;
so sysctl_overcommit_memory will be set to an uninitialized value.
Check whether new_policy has been changed by proc_dointvec_minmax.
Link: https://lkml.kernel.org/r/20210923020524.13289-1-chenjun102@huawei.com
Fixes: 56f3547bfa ("mm: adjust vm_committed_as_batch according to vm overcommit policy")
Signed-off-by: Chen Jun <chenjun102@huawei.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Feng Tang <feng.tang@intel.com>
Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: Rui Xiang <rui.xiang@huawei.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e8f71f8923 upstream.
nvkm test builds fail with the following error.
drivers/gpu/drm/nouveau/nvkm/engine/device/ctrl.c: In function 'nvkm_control_mthd_pstate_info':
drivers/gpu/drm/nouveau/nvkm/engine/device/ctrl.c:60:35: error: overflow in conversion from 'int' to '__s8' {aka 'signed char'} changes value from '-251' to '5'
The code builds on most architectures, but fails on parisc where ENOSYS
is defined as 251.
Replace the error code with -ENODEV (-19). The actual error code does
not really matter and is not passed to userspace - it just has to be
negative.
Fixes: 7238eca4cf ("drm/nouveau: expose pstate selection per-power source in sysfs")
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Cc: Ben Skeggs <bskeggs@redhat.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a3727a8bac upstream.
Jann Horn reported a problem with commit eb1231f73c ("selinux:
clarify task subjective and objective credentials") where some LSM
hooks were attempting to access the subjective credentials of a task
other than the current task. Generally speaking, it is not safe to
access another task's subjective credentials and doing so can cause
a number of problems.
Further, while looking into the problem, I realized that Smack was
suffering from a similar problem brought about by a similar commit
1fb057dcde ("smack: differentiate between subjective and objective
task credentials").
This patch addresses this problem by restoring the use of the task's
objective credentials in those cases where the task is other than the
current executing task. Not only does this resolve the problem
reported by Jann, it is arguably the correct thing to do in these
cases.
Cc: stable@vger.kernel.org
Fixes: eb1231f73c ("selinux: clarify task subjective and objective credentials")
Fixes: 1fb057dcde ("smack: differentiate between subjective and objective task credentials")
Reported-by: Jann Horn <jannh@google.com>
Acked-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 9351590f51 ]
Cached root file was not being completely invalidated sometimes.
Reproducing:
- With a DFS share with 2 targets, one disabled and one enabled
- start some I/O on the mount
# while true; do ls /mnt/dfs; done
- at the same time, disable the enabled target and enable the disabled
one
- wait for DFS cache to expire
- on reconnect, the previous cached root handle should be invalid, but
open_cached_dir_by_dentry() will still try to use it, but throws a
use-after-free warning (kref_get())
Make smb2_close_cached_fid() invalidate all fields every time, but only
send an SMB2_close() when the entry is still valid.
Signed-off-by: Enzo Matsumiya <ematsumiya@suse.de>
Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 9848417926 ]
The intel powerclamp driver will setup a per-CPU worker with RT
priority. The worker will then invoke play_idle() in which it remains in
the idle poll loop until it is stopped by the timer it started earlier.
That timer needs to expire in hard interrupt context on PREEMPT_RT.
Otherwise the timer will expire in ksoftirqd as a SOFT timer but that task
won't be scheduled on the CPU because its priority is lower than the
priority of the worker which is in the idle loop.
Always expire the idle timer in hard interrupt context.
Reported-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20210906113034.jgfxrjdvxnjqgtmc@linutronix.de
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 884f0e84f1 ]
The pending timer has been set up in blk_throtl_init(). However, the
timer is not deleted in blk_throtl_exit(). This means that the timer
handler may still be running after freeing the timer, which would
result in a use-after-free.
Fix by calling del_timer_sync() to delete the timer in blk_throtl_exit().
Signed-off-by: Li Jinlin <lijinlin3@huawei.com>
Link: https://lore.kernel.org/r/20210907121242.2885564-1-lijinlin3@huawei.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f04064814c ]
The serial number is copied into the buffer via memcpy_and_pad()
with the length NVMET_SN_MAX_SIZE. So when printing out we also
need to take just that length as anything beyond that will be
uninitialized.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit d44084c934 ]
A consumer is expected to disable a PWM before calling pwm_put(). And if
they didn't there is hopefully a good reason (or the consumer needs
fixing). Also if disabling an enabled PWM was the right thing to do,
this should better be done in the framework instead of in each low level
driver.
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Thierry Reding <thierry.reding@gmail.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 9d768cd7fd ]
A consumer is expected to disable a PWM before calling pwm_put(). And if
they didn't there is hopefully a good reason (or the consumer needs
fixing). Also if disabling an enabled PWM was the right thing to do,
this should better be done in the framework instead of in each low level
driver.
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Thierry Reding <thierry.reding@gmail.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit c68eb29c8e ]
A consumer is expected to disable a PWM before calling pwm_put(). And if
they didn't there is hopefully a good reason (or the consumer needs
fixing). Also if disabling an enabled PWM was the right thing to do,
this should better be done in the framework instead of in each low level
driver.
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Thierry Reding <thierry.reding@gmail.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 71731090ab ]
On init, the disabled state is cleared right before hw_init and that
causes the device to report on "Operational" state before the device
initialization is finished. Although the char device is not yet exposed
to the user at this stage, the sysfs entries are exposed.
This can cause errors in monitoring applications that use the sysfs
entries.
In order to avoid this, a new state "in device creation" is introduced
to ne reported when the device is not disabled but is still in init
flow.
Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 09ae43043c ]
The address resolution via debugfs was not taking into consideration the
page offset, resulting in a wrong address.
Signed-off-by: Yuri Nudelman <ynudelman@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit a6c849012b ]
Currently there is no validity check for event ID received from F/W,
Thus exposing driver to memory overrun.
Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit d7eff46c21 ]
Get process vm root BO ref in case process is exiting and root BO is
freed, to avoid NULL pointer dereference backtrace:
BUG: unable to handle kernel NULL pointer dereference at
0000000000000000
Call Trace:
amdgpu_show_fdinfo+0xfe/0x2a0 [amdgpu]
seq_show+0x12c/0x180
seq_read+0x153/0x410
vfs_read+0x91/0x140[ 3427.206183] ksys_read+0x4f/0xb0
do_syscall_64+0x5b/0x1a0
entry_SYSCALL_64_after_hwframe+0x65/0xca
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit a6a355a22f ]
1) Generalize the function--if the user didn't set
i2c_address, still return true/false to
indicate whether VBIOS contains the RAS EEPROM
address. This function shouldn't evaluate
whether the user set the i2c_address pointer or
not.
2) Don't touch the caller's i2c_address, unless
you have to--this function shouldn't have side
effects.
3) Correctly set the function comment as a
kernel-doc comment.
Cc: John Clements <john.clements@amd.com>
Cc: Hawking Zhang <Hawking.Zhang@amd.com>
Cc: Alex Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Reviewed-by: Alex Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 89aad770d6 ]
In case of host-resident MMU, when the page tables pool is destroyed,
its pointer is not nullified correctly.
As a result, on a device fini which happens after a failing reset, the
already destroyed pool is accessed, which leads to a kernel panic.
The patch fixes the setting of the pool pointer to NULL.
Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 5f5dec07ac ]
Patch series "nilfs2: fix incorrect usage of kobject".
This patchset from Nanyong Sun fixes memory leak issues and a NULL
pointer dereference issue caused by incorrect usage of kboject in nilfs2
sysfs implementation.
This patch (of 6):
Reported by syzkaller:
BUG: memory leak
unreferenced object 0xffff888100ca8988 (size 8):
comm "syz-executor.1", pid 1930, jiffies 4294745569 (age 18.052s)
hex dump (first 8 bytes):
6c 6f 6f 70 31 00 ff ff loop1...
backtrace:
kstrdup+0x36/0x70 mm/util.c:60
kstrdup_const+0x35/0x60 mm/util.c:83
kvasprintf_const+0xf1/0x180 lib/kasprintf.c:48
kobject_set_name_vargs+0x56/0x150 lib/kobject.c:289
kobject_add_varg lib/kobject.c:384 [inline]
kobject_init_and_add+0xc9/0x150 lib/kobject.c:473
nilfs_sysfs_create_device_group+0x150/0x7d0 fs/nilfs2/sysfs.c:986
init_nilfs+0xa21/0xea0 fs/nilfs2/the_nilfs.c:637
nilfs_fill_super fs/nilfs2/super.c:1046 [inline]
nilfs_mount+0x7b4/0xe80 fs/nilfs2/super.c:1316
legacy_get_tree+0x105/0x210 fs/fs_context.c:592
vfs_get_tree+0x8e/0x2d0 fs/super.c:1498
do_new_mount fs/namespace.c:2905 [inline]
path_mount+0xf9b/0x1990 fs/namespace.c:3235
do_mount+0xea/0x100 fs/namespace.c:3248
__do_sys_mount fs/namespace.c:3456 [inline]
__se_sys_mount fs/namespace.c:3433 [inline]
__x64_sys_mount+0x14b/0x1f0 fs/namespace.c:3433
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x3b/0x90 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x44/0xae
If kobject_init_and_add return with error, then the cleanup of kobject
is needed because memory may be allocated in kobject_init_and_add
without freeing.
And the place of cleanup_dev_kobject should use kobject_put to free the
memory associated with the kobject. As the section "Kobject removal" of
"Documentation/core-api/kobject.rst" says, kobject_del() just makes the
kobject "invisible", but it is not cleaned up. And no more cleanup will
do after cleanup_dev_kobject, so kobject_put is needed here.
Link: https://lkml.kernel.org/r/1625651306-10829-1-git-send-email-konishi.ryusuke@gmail.com
Link: https://lkml.kernel.org/r/1625651306-10829-2-git-send-email-konishi.ryusuke@gmail.com
Reported-by: Hulk Robot <hulkci@huawei.com>
Link: https://lkml.kernel.org/r/20210629022556.3985106-2-sunnanyong@huawei.com
Signed-off-by: Nanyong Sun <sunnanyong@huawei.com>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 3fa421dedb ]
When removing the device we call blkdev_put() on the device once we've
removed it, and because we have an EXCL open we need to take the
->open_mutex on the block device to clean it up. Unfortunately during
device remove we are holding the sb writers lock, which results in the
following lockdep splat:
======================================================
WARNING: possible circular locking dependency detected
5.14.0-rc2+ #407 Not tainted
------------------------------------------------------
losetup/11595 is trying to acquire lock:
ffff973ac35dd138 ((wq_completion)loop0){+.+.}-{0:0}, at: flush_workqueue+0x67/0x5e0
but task is already holding lock:
ffff973ac9812c68 (&lo->lo_mutex){+.+.}-{3:3}, at: __loop_clr_fd+0x41/0x660 [loop]
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #4 (&lo->lo_mutex){+.+.}-{3:3}:
__mutex_lock+0x7d/0x750
lo_open+0x28/0x60 [loop]
blkdev_get_whole+0x25/0xf0
blkdev_get_by_dev.part.0+0x168/0x3c0
blkdev_open+0xd2/0xe0
do_dentry_open+0x161/0x390
path_openat+0x3cc/0xa20
do_filp_open+0x96/0x120
do_sys_openat2+0x7b/0x130
__x64_sys_openat+0x46/0x70
do_syscall_64+0x38/0x90
entry_SYSCALL_64_after_hwframe+0x44/0xae
-> #3 (&disk->open_mutex){+.+.}-{3:3}:
__mutex_lock+0x7d/0x750
blkdev_put+0x3a/0x220
btrfs_rm_device.cold+0x62/0xe5
btrfs_ioctl+0x2a31/0x2e70
__x64_sys_ioctl+0x80/0xb0
do_syscall_64+0x38/0x90
entry_SYSCALL_64_after_hwframe+0x44/0xae
-> #2 (sb_writers#12){.+.+}-{0:0}:
lo_write_bvec+0xc2/0x240 [loop]
loop_process_work+0x238/0xd00 [loop]
process_one_work+0x26b/0x560
worker_thread+0x55/0x3c0
kthread+0x140/0x160
ret_from_fork+0x1f/0x30
-> #1 ((work_completion)(&lo->rootcg_work)){+.+.}-{0:0}:
process_one_work+0x245/0x560
worker_thread+0x55/0x3c0
kthread+0x140/0x160
ret_from_fork+0x1f/0x30
-> #0 ((wq_completion)loop0){+.+.}-{0:0}:
__lock_acquire+0x10ea/0x1d90
lock_acquire+0xb5/0x2b0
flush_workqueue+0x91/0x5e0
drain_workqueue+0xa0/0x110
destroy_workqueue+0x36/0x250
__loop_clr_fd+0x9a/0x660 [loop]
block_ioctl+0x3f/0x50
__x64_sys_ioctl+0x80/0xb0
do_syscall_64+0x38/0x90
entry_SYSCALL_64_after_hwframe+0x44/0xae
other info that might help us debug this:
Chain exists of:
(wq_completion)loop0 --> &disk->open_mutex --> &lo->lo_mutex
Possible unsafe locking scenario:
CPU0 CPU1
---- ----
lock(&lo->lo_mutex);
lock(&disk->open_mutex);
lock(&lo->lo_mutex);
lock((wq_completion)loop0);
*** DEADLOCK ***
1 lock held by losetup/11595:
#0: ffff973ac9812c68 (&lo->lo_mutex){+.+.}-{3:3}, at: __loop_clr_fd+0x41/0x660 [loop]
stack backtrace:
CPU: 0 PID: 11595 Comm: losetup Not tainted 5.14.0-rc2+ #407
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.13.0-2.fc32 04/01/2014
Call Trace:
dump_stack_lvl+0x57/0x72
check_noncircular+0xcf/0xf0
? stack_trace_save+0x3b/0x50
__lock_acquire+0x10ea/0x1d90
lock_acquire+0xb5/0x2b0
? flush_workqueue+0x67/0x5e0
? lockdep_init_map_type+0x47/0x220
flush_workqueue+0x91/0x5e0
? flush_workqueue+0x67/0x5e0
? verify_cpu+0xf0/0x100
drain_workqueue+0xa0/0x110
destroy_workqueue+0x36/0x250
__loop_clr_fd+0x9a/0x660 [loop]
? blkdev_ioctl+0x8d/0x2a0
block_ioctl+0x3f/0x50
__x64_sys_ioctl+0x80/0xb0
do_syscall_64+0x38/0x90
entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7fc21255d4cb
So instead save the bdev and do the put once we've dropped the sb
writers lock in order to avoid the lockdep recursion.
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 8f96a5bfa1 ]
We update the ctime/mtime of a block device when we remove it so that
blkid knows the device changed. However we do this by re-opening the
block device and calling filp_update_time. This is more correct because
it'll call the inode->i_op->update_time if it exists, but the block dev
inodes do not do this. Instead call generic_update_time() on the
bd_inode in order to avoid the blkdev_open path and get rid of the
following lockdep splat:
======================================================
WARNING: possible circular locking dependency detected
5.14.0-rc2+ #406 Not tainted
------------------------------------------------------
losetup/11596 is trying to acquire lock:
ffff939640d2f538 ((wq_completion)loop0){+.+.}-{0:0}, at: flush_workqueue+0x67/0x5e0
but task is already holding lock:
ffff939655510c68 (&lo->lo_mutex){+.+.}-{3:3}, at: __loop_clr_fd+0x41/0x660 [loop]
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #4 (&lo->lo_mutex){+.+.}-{3:3}:
__mutex_lock+0x7d/0x750
lo_open+0x28/0x60 [loop]
blkdev_get_whole+0x25/0xf0
blkdev_get_by_dev.part.0+0x168/0x3c0
blkdev_open+0xd2/0xe0
do_dentry_open+0x161/0x390
path_openat+0x3cc/0xa20
do_filp_open+0x96/0x120
do_sys_openat2+0x7b/0x130
__x64_sys_openat+0x46/0x70
do_syscall_64+0x38/0x90
entry_SYSCALL_64_after_hwframe+0x44/0xae
-> #3 (&disk->open_mutex){+.+.}-{3:3}:
__mutex_lock+0x7d/0x750
blkdev_get_by_dev.part.0+0x56/0x3c0
blkdev_open+0xd2/0xe0
do_dentry_open+0x161/0x390
path_openat+0x3cc/0xa20
do_filp_open+0x96/0x120
file_open_name+0xc7/0x170
filp_open+0x2c/0x50
btrfs_scratch_superblocks.part.0+0x10f/0x170
btrfs_rm_device.cold+0xe8/0xed
btrfs_ioctl+0x2a31/0x2e70
__x64_sys_ioctl+0x80/0xb0
do_syscall_64+0x38/0x90
entry_SYSCALL_64_after_hwframe+0x44/0xae
-> #2 (sb_writers#12){.+.+}-{0:0}:
lo_write_bvec+0xc2/0x240 [loop]
loop_process_work+0x238/0xd00 [loop]
process_one_work+0x26b/0x560
worker_thread+0x55/0x3c0
kthread+0x140/0x160
ret_from_fork+0x1f/0x30
-> #1 ((work_completion)(&lo->rootcg_work)){+.+.}-{0:0}:
process_one_work+0x245/0x560
worker_thread+0x55/0x3c0
kthread+0x140/0x160
ret_from_fork+0x1f/0x30
-> #0 ((wq_completion)loop0){+.+.}-{0:0}:
__lock_acquire+0x10ea/0x1d90
lock_acquire+0xb5/0x2b0
flush_workqueue+0x91/0x5e0
drain_workqueue+0xa0/0x110
destroy_workqueue+0x36/0x250
__loop_clr_fd+0x9a/0x660 [loop]
block_ioctl+0x3f/0x50
__x64_sys_ioctl+0x80/0xb0
do_syscall_64+0x38/0x90
entry_SYSCALL_64_after_hwframe+0x44/0xae
other info that might help us debug this:
Chain exists of:
(wq_completion)loop0 --> &disk->open_mutex --> &lo->lo_mutex
Possible unsafe locking scenario:
CPU0 CPU1
---- ----
lock(&lo->lo_mutex);
lock(&disk->open_mutex);
lock(&lo->lo_mutex);
lock((wq_completion)loop0);
*** DEADLOCK ***
1 lock held by losetup/11596:
#0: ffff939655510c68 (&lo->lo_mutex){+.+.}-{3:3}, at: __loop_clr_fd+0x41/0x660 [loop]
stack backtrace:
CPU: 1 PID: 11596 Comm: losetup Not tainted 5.14.0-rc2+ #406
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.13.0-2.fc32 04/01/2014
Call Trace:
dump_stack_lvl+0x57/0x72
check_noncircular+0xcf/0xf0
? stack_trace_save+0x3b/0x50
__lock_acquire+0x10ea/0x1d90
lock_acquire+0xb5/0x2b0
? flush_workqueue+0x67/0x5e0
? lockdep_init_map_type+0x47/0x220
flush_workqueue+0x91/0x5e0
? flush_workqueue+0x67/0x5e0
? verify_cpu+0xf0/0x100
drain_workqueue+0xa0/0x110
destroy_workqueue+0x36/0x250
__loop_clr_fd+0x9a/0x660 [loop]
? blkdev_ioctl+0x8d/0x2a0
block_ioctl+0x3f/0x50
__x64_sys_ioctl+0x80/0xb0
do_syscall_64+0x38/0x90
entry_SYSCALL_64_after_hwframe+0x44/0xae
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 88b604263f ]
current_stack_pointer() simply returns current value of %r15. If
current_stack_pointer() caller allocates stack (which is the case in
unwind code) %r15 points to a stack frame allocated for callees, meaning
current_stack_pointer() caller (e.g. stack_trace_save) will end up in
the stacktrace. This is not expected by stack_trace_save*() callers and
causes problems.
current_frame_address() on the other hand returns function stack frame
address, which matches %r15 upon function invocation. Using it in
get_stack_pointer() makes it more aligned with x86 implementation
(according to BACKTRACE_SELF_TEST output) and meets stack_trace_save*()
caller's expectations, notably KCSAN.
Also make sure unwind_start is always inlined.
Reported-by: Nathan Chancellor <nathan@kernel.org>
Suggested-by: Marco Elver <elver@google.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Tested-by: Marco Elver <elver@google.com>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Link: https://lore.kernel.org/r/patch.git-04dd26be3043.your-ad-here.call-01630504868-ext-6188@work.hours
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit a6d37ccdd2 ]
capsnaps will take inode references via ihold when queueing to flush.
When force unmounting, the client will just close the sessions and
may never get a flush reply, causing a leak and inode ref leak.
Fix this by removing the capsnaps for an inode when removing the caps.
URL: https://tracker.ceph.com/issues/52295
Signed-off-by: Xiubo Li <xiubli@redhat.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b11ed50346 ]
The current code will update the mtime and then try to get caps to
handle the write. If we end up having to request caps from the MDS, then
the mtime in the cap grant will clobber the updated mtime and it'll be
lost.
This is most noticable when two clients are alternately writing to the
same file. Fw caps are continually being granted and revoked, and the
mtime ends up stuck because the updated mtimes are always being
overwritten with the old one.
Fix this by changing the order of operations in ceph_write_iter to get
the caps before updating the times. Also, make sure we check the pool
full conditions before even getting any caps or uninlining.
URL: https://tracker.ceph.com/issues/46574
Reported-by: Jozef Kováč <kovac@firma.zoznam.sk>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Xiubo Li <xiubli@redhat.com>
Reviewed-by: Luis Henriques <lhenriques@suse.de>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 2ad32cf09b ]
If we hit a decoding error late in the frame, then we might exit the
function without putting the pool_ns string. Ensure that we always put
that reference on the way out of the function.
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit fa209644a7 ]
It was reported that on "HP ENVY x360" that power LED does not come
back, certain keys like brightness controls do not work, and the fan
never spins up, even under load on 5.14 final.
In analysis of the SSDT it's clear that the Microsoft UUID doesn't
provide functional support, but rather the AMD UUID should be
supporting this system.
Because this is a gap in the expected logic, we checked back with
internal team. The conclusion was that on Windows AMD uPEP *does*
run even when Microsoft UUID present, but most OEM systems have
adopted value of "0x3" for supported functions and hence nothing
runs.
Henceforth add support for running both Microsoft and AMD methods.
This approach will also allow the same logic on Intel systems if
desired at a future time as well by pulling the evaluation of
`lps0_dsm_func_mask_microsoft` out of the `if` block for
`acpi_s2idle_vendor_amd`.
Link: https://gitlab.freedesktop.org/drm/amd/uploads/9fbcd7ec3a385cc6949c9bacf45dc41b/acpi-f.20.bin
BugLink: https://gitlab.freedesktop.org/drm/amd/-/issues/1691
Reported-by: Maxwell Beck <max@ryt.one>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
[ rjw: Edits of the new comments ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 5f939f4977 ]
commit 63f2f9cceb ("ASoC: audio-graph: remove Platform support")
removed Platform support from audio-graph, because it doesn't have
"plat" support on DT (simple-card has).
But, Platform support is needed if user is using
snd_dmaengine_pcm_register() which adds generic DMA as Platform.
And this Platform dev is using CPU dev.
Without this patch, at least STM32MP15 audio sound card is no more
functional (v5.13 or later). This patch respawn Platform Support on
audio-graph again.
Reported-by: Olivier MOYSAN <olivier.moysan@foss.st.com>
Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Tested-by: Olivier MOYSAN <olivier.moysan@foss.st.com>
Link: https://lore.kernel.org/r/878s0jzrpf.wl-kuninori.morimoto.gx@renesas.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 436fc4feea ]
kmemleak with enabled auto scanning reports that our stack allocation is
lost. This is because we're saving the pointer + STACK_INIT_OFFSET to
lowcore. When kmemleak now scans the objects, it thinks that this one is
lost because it can't find a corresponding pointer.
Reported-by: Marc Hartmayer <mhartmay@linux.ibm.com>
Signed-off-by: Sven Schnelle <svens@linux.ibm.com>
Tested-by: Marc Hartmayer <mhartmay@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit aac6c0f907 ]
The xilinx dma driver uses the consistent allocations, so for correct
operation also set the DMA mask for coherent APIs. It fixes the below
kernel crash with dmatest client when DMA IP is configured with 64-bit
address width and linux is booted from high (>4GB) memory.
Call trace:
[ 489.531257] dma_alloc_from_pool+0x8c/0x1c0
[ 489.535431] dma_direct_alloc+0x284/0x330
[ 489.539432] dma_alloc_attrs+0x80/0xf0
[ 489.543174] dma_pool_alloc+0x160/0x2c0
[ 489.547003] xilinx_cdma_prep_memcpy+0xa4/0x180
[ 489.551524] dmatest_func+0x3cc/0x114c
[ 489.555266] kthread+0x124/0x130
[ 489.558486] ret_from_fork+0x10/0x3c
[ 489.562051] ---[ end trace 248625b2d596a90a ]---
Signed-off-by: Radhey Shyam Pandey <radhey.shyam.pandey@xilinx.com>
Reviewed-by: Harini Katakam <harini.katakam@xilinx.com>
Link: https://lore.kernel.org/r/1629363528-30347-1-git-send-email-radhey.shyam.pandey@xilinx.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 9cc238c7a5 ]
In preparation for moving cxl_memdev allocation to the core, introduce
cdevm_file_operations to coordinate file operations shutdown relative to
driver data release.
The motivation for moving cxl_memdev allocation to the core (beyond
better file organization of sysfs attributes in core/ and drivers in
cxl/), is that device lifetime is longer than module lifetime. The cxl_pci
module should be free to come and go without needing to coordinate with
devices that need the text associated with cxl_memdev_release() to stay
resident. The move will fix a use after free bug when looping driver
load / unload with CONFIG_DEBUG_KOBJECT_RELEASE=y.
Another motivation for passing in file_operations to the core cxl_memdev
creation flow is to allow for alternate drivers, like unit test code, to
define their own ioctl backends.
Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/162792539962.368511.2962268954245340288.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit cbba178708 ]
Currently, nothing is output on the serial console, unless
"console=ttyS0,115200n8" or "earlycon" are appended to the kernel
command line. Enable automatic console selection using
chosen/stdout-path by adding a proper alias, and configure the expected
serial rate.
While at it, add aliases for the other three serial ports, which are
provided on the same micro-USB connector as the first one.
Fixes: 0fa6107eca ("RISC-V: Initial DTS for Microchip ICICLE board")
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Reviewed-by: Bin Meng <bmeng.cn@gmail.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 70982eef4d ]
The ret value might be -EBUSY, caller will think lru lock is still
locked but actually NOT. So return -ENOSPC instead. Otherwise we hit
list corruption.
ttm_bo_cleanup_refs might fail too if BO is not idle. If we return 0,
caller(ttm_tt_populate -> ttm_global_swapout ->ttm_device_swapout) will
be stuck as we actually did not free any BO memory. This usually happens
when the fence is not signaled for a long time.
Signed-off-by: xinhui pan <xinhui.pan@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Fixes: ebd59851c7 ("drm/ttm: move swapout logic around v3")
Link: https://patchwork.freedesktop.org/patch/msgid/20210907040832.1107747-1-xinhui.pan@amd.com
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 88053ec8cb ]
KVM in nVHE mode divides up its VA space into two equal halves, and
picks the half that does not conflict with the HYP ID map to map its
linear region. This worked fine when the kernel's linear map itself was
guaranteed to cover precisely as many bits of VA space, but this was
changed by commit f4693c2716 ("arm64: mm: extend linear region for
52-bit VA configurations").
The result is that, depending on the placement of the ID map, kernel-VA
to hyp-VA translations may produce addresses that either conflict with
other HYP mappings (including the ID map itself) or generate addresses
outside of the 52-bit addressable range, neither of which is likely to
lead to anything useful.
Given that 52-bit capable cores are guaranteed to implement VHE, this
only affects configurations such as pKVM where we opt into non-VHE mode
even if the hardware is VHE capable. So just for these configurations,
let's limit the kernel linear map to 51 bits and work around the
problem.
Fixes: f4693c2716 ("arm64: mm: extend linear region for 52-bit VA configurations")
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20210826165613.60774-1-ardb@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 6ef0505158 ]
pasid_mutex and dev->iommu->param->lock are held while unbinding mm is
flushing IO page fault workqueue and waiting for all page fault works to
finish. But an in-flight page fault work also need to hold the two locks
while unbinding mm are holding them and waiting for the work to finish.
This may cause an ABBA deadlock issue as shown below:
idxd 0000:00:0a.0: unbind PASID 2
======================================================
WARNING: possible circular locking dependency detected
5.14.0-rc7+ #549 Not tainted [ 186.615245] ----------
dsa_test/898 is trying to acquire lock:
ffff888100d854e8 (¶m->lock){+.+.}-{3:3}, at:
iopf_queue_flush_dev+0x29/0x60
but task is already holding lock:
ffffffff82b2f7c8 (pasid_mutex){+.+.}-{3:3}, at:
intel_svm_unbind+0x34/0x1e0
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #2 (pasid_mutex){+.+.}-{3:3}:
__mutex_lock+0x75/0x730
mutex_lock_nested+0x1b/0x20
intel_svm_page_response+0x8e/0x260
iommu_page_response+0x122/0x200
iopf_handle_group+0x1c2/0x240
process_one_work+0x2a5/0x5a0
worker_thread+0x55/0x400
kthread+0x13b/0x160
ret_from_fork+0x22/0x30
-> #1 (¶m->fault_param->lock){+.+.}-{3:3}:
__mutex_lock+0x75/0x730
mutex_lock_nested+0x1b/0x20
iommu_report_device_fault+0xc2/0x170
prq_event_thread+0x28a/0x580
irq_thread_fn+0x28/0x60
irq_thread+0xcf/0x180
kthread+0x13b/0x160
ret_from_fork+0x22/0x30
-> #0 (¶m->lock){+.+.}-{3:3}:
__lock_acquire+0x1134/0x1d60
lock_acquire+0xc6/0x2e0
__mutex_lock+0x75/0x730
mutex_lock_nested+0x1b/0x20
iopf_queue_flush_dev+0x29/0x60
intel_svm_drain_prq+0x127/0x210
intel_svm_unbind+0xc5/0x1e0
iommu_sva_unbind_device+0x62/0x80
idxd_cdev_release+0x15a/0x200 [idxd]
__fput+0x9c/0x250
____fput+0xe/0x10
task_work_run+0x64/0xa0
exit_to_user_mode_prepare+0x227/0x230
syscall_exit_to_user_mode+0x2c/0x60
do_syscall_64+0x48/0x90
entry_SYSCALL_64_after_hwframe+0x44/0xae
other info that might help us debug this:
Chain exists of:
¶m->lock --> ¶m->fault_param->lock --> pasid_mutex
Possible unsafe locking scenario:
CPU0 CPU1
---- ----
lock(pasid_mutex);
lock(¶m->fault_param->lock);
lock(pasid_mutex);
lock(¶m->lock);
*** DEADLOCK ***
2 locks held by dsa_test/898:
#0: ffff888100cc1cc0 (&group->mutex){+.+.}-{3:3}, at:
iommu_sva_unbind_device+0x53/0x80
#1: ffffffff82b2f7c8 (pasid_mutex){+.+.}-{3:3}, at:
intel_svm_unbind+0x34/0x1e0
stack backtrace:
CPU: 2 PID: 898 Comm: dsa_test Not tainted 5.14.0-rc7+ #549
Hardware name: Intel Corporation Kabylake Client platform/KBL S
DDR4 UD IMM CRB, BIOS KBLSE2R1.R00.X050.P01.1608011715 08/01/2016
Call Trace:
dump_stack_lvl+0x5b/0x74
dump_stack+0x10/0x12
print_circular_bug.cold+0x13d/0x142
check_noncircular+0xf1/0x110
__lock_acquire+0x1134/0x1d60
lock_acquire+0xc6/0x2e0
? iopf_queue_flush_dev+0x29/0x60
? pci_mmcfg_read+0xde/0x240
__mutex_lock+0x75/0x730
? iopf_queue_flush_dev+0x29/0x60
? pci_mmcfg_read+0xfd/0x240
? iopf_queue_flush_dev+0x29/0x60
mutex_lock_nested+0x1b/0x20
iopf_queue_flush_dev+0x29/0x60
intel_svm_drain_prq+0x127/0x210
? intel_pasid_tear_down_entry+0x22e/0x240
intel_svm_unbind+0xc5/0x1e0
iommu_sva_unbind_device+0x62/0x80
idxd_cdev_release+0x15a/0x200
pasid_mutex protects pasid and svm data mapping data. It's unnecessary
to hold pasid_mutex while flushing the workqueue. To fix the deadlock
issue, unlock pasid_pasid during flushing the workqueue to allow the works
to be handled.
Fixes: d5b9e4bfe0 ("iommu/vt-d: Report prq to io-pgfault framework")
Reported-and-tested-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Link: https://lore.kernel.org/r/20210826215918.4073446-1-fenghua.yu@intel.com
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210828070622.2437559-3-baolu.lu@linux.intel.com
[joro: Removed timing information from kernel log messages]
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit c3811a50ad ]
Currently, iommu_init_ga() checks and disables IOMMU VAPIC support
(i.e. AMD AVIC support in IOMMU) when GAMSup feature bit is not set.
However it forgets to clear IRQ_POSTING_CAP from the previously set
amd_iommu_irq_ops.capability.
This triggers an invalid page fault bug during guest VM warm reboot
if AVIC is enabled since the irq_remapping_cap(IRQ_POSTING_CAP) is
incorrectly set, and crash the system with the following kernel trace.
BUG: unable to handle page fault for address: 0000000000400dd8
RIP: 0010:amd_iommu_deactivate_guest_mode+0x19/0xbc
Call Trace:
svm_set_pi_irte_mode+0x8a/0xc0 [kvm_amd]
? kvm_make_all_cpus_request_except+0x50/0x70 [kvm]
kvm_request_apicv_update+0x10c/0x150 [kvm]
svm_toggle_avic_for_irq_window+0x52/0x90 [kvm_amd]
svm_enable_irq_window+0x26/0xa0 [kvm_amd]
vcpu_enter_guest+0xbbe/0x1560 [kvm]
? avic_vcpu_load+0xd5/0x120 [kvm_amd]
? kvm_arch_vcpu_load+0x76/0x240 [kvm]
? svm_get_segment_base+0xa/0x10 [kvm_amd]
kvm_arch_vcpu_ioctl_run+0x103/0x590 [kvm]
kvm_vcpu_ioctl+0x22a/0x5d0 [kvm]
__x64_sys_ioctl+0x84/0xc0
do_syscall_64+0x33/0x40
entry_SYSCALL_64_after_hwframe+0x44/0xae
Fixes by moving the initializing of AMD IOMMU interrupt remapping mode
(amd_iommu_guest_ir) earlier before setting up the
amd_iommu_irq_ops.capability with appropriate IRQ_POSTING_CAP flag.
[joro: Squashed the two patches and limited
check_features_on_all_iommus() to CONFIG_IRQ_REMAP
to fix a compile warning.]
Signed-off-by: Wei Huang <wei.huang2@amd.com>
Co-developed-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Link: https://lore.kernel.org/r/20210820202957.187572-2-suravee.suthikulpanit@amd.com
Link: https://lore.kernel.org/r/20210820202957.187572-3-suravee.suthikulpanit@amd.com
Fixes: 8bda0cfbdc ("iommu/amd: Detect and initialize guest vAPIC log")
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 907872baa9 ]
parisc build test images fail to compile with the following error.
drivers/parisc/dino.c:160:12: error:
'pci_dev_is_behind_card_dino' defined but not used
Move the function just ahead of its only caller to avoid the error.
Fixes: 5fa1659105 ("parisc: Disable HP HSC-PCI Cards to prevent kernel crash")
Cc: Helge Deller <deller@gmx.de>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 4b92d4add5 ]
DEFINE_SMP_CALL_CACHE_FUNCTION() was usefel before the CPU hotplug rework
to ensure that the cache related functions are called on the upcoming CPU
because the notifier itself could run on any online CPU.
The hotplug state machine guarantees that the callbacks are invoked on the
upcoming CPU. So there is no need to have this SMP function call
obfuscation. That indirection was missed when the hotplug notifiers were
converted.
This also solves the problem of ARM64 init_cache_level() invoking ACPI
functions which take a semaphore in that context. That's invalid as SMP
function calls run with interrupts disabled. Running it just from the
callback in context of the CPU hotplug thread solves this.
Fixes: 8571890e15 ("arm64: Add support for ACPI based firmware tables")
Reported-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Acked-by: Will Deacon <will@kernel.org>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/871r69ersb.ffs@tglx
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b3dc549986 ]
Due to high latency in PCIE clock switching on RKL platforms,
switching the PCIE clock dynamically at runtime can lead to HDMI/DP
audio problems. On newer asics this is handled in the SMU firmware.
For SMU7-based asics, disable PCIE clock switching to avoid the issue.
AMD provide a parameter to disable PICE_DPM.
modprobe amdgpu ppfeaturemask=0xfff7bffb
It's better to contorl PCIE_DPM in amd gpu driver,
switch PCI_DPM by determining intel RKL platform for SMU7-based asics.
Fixes: 1a31474cdb ("drm/amd/pm: workaround for audio noise issue")
Ref: https://lists.freedesktop.org/archives/amd-gfx/2021-August/067413.html
Signed-off-by: Koba Ko <koba.ko@canonical.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit fb83610762 ]
There are two pairs of declarations for thermal_cooling_device_register()
and thermal_of_cooling_device_register(), and only one set was changed
in a recent patch, so the other one now causes a compile-time warning:
drivers/net/wireless/mediatek/mt76/mt7915/init.c: In function 'mt7915_thermal_init':
drivers/net/wireless/mediatek/mt76/mt7915/init.c:134:48: error: passing argument 1 of 'thermal_cooling_device_register' discards 'const' qualifier from pointer target type [-Werror=discarded-qualifiers]
134 | cdev = thermal_cooling_device_register(wiphy_name(wiphy), phy,
| ^~~~~~~~~~~~~~~~~
In file included from drivers/net/wireless/mediatek/mt76/mt7915/init.c:7:
include/linux/thermal.h:407:39: note: expected 'char *' but argument is of type 'const char *'
407 | thermal_cooling_device_register(char *type, void *devdata,
| ~~~~~~^~~~
Change the dummy helper functions to have the same arguments as the
normal version.
Fixes: f991de53a8 ("thermal: make device_register's type argument const")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Jean-Francois Dagenais <jeff.dagenais@gmail.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20210722090717.1116748-1-arnd@kernel.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit cfd799837d ]
Since the commit e5efaeb8a8 ("bootconfig: Support mixing
a value and subkeys under a key") allows to co-exist a value
node and key nodes under a node, xbc_node_for_each_child()
is not only returning key node but also a value node.
In the boot-time tracing using xbc_node_for_each_child() to
iterate the events, groups and instances, but those must be
key nodes. Thus it must use xbc_node_for_each_subkey().
Link: https://lkml.kernel.org/r/163112988361.74896.2267026262061819145.stgit@devnote2
Fixes: e5efaeb8a8 ("bootconfig: Support mixing a value and subkeys under a key")
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b234ed6d62 ]
Currently, usermodehelper is enabled right before PID1 starts going
through the initcalls. However, any call of a usermodehelper from a
pure_, core_, postcore_, arch_, subsys_ or fs_ initcall is futile, as
there is no filesystem contents yet.
Up until commit e7cb072eb9 ("init/initramfs.c: do unpacking
asynchronously"), such calls, whether via some request_module(), a
legacy uevent "/sbin/hotplug" notification or something else, would
just fail silently with (presumably) -ENOENT from
kernel_execve(). However, that commit introduced the
wait_for_initramfs() synchronization hook which must be called from
the usermodehelper exec path right before the kernel_execve, in order
that request_module() et al done from *after* rootfs_initcall()
time (i.e. device_ and late_ initcalls) would continue to find a
populated initramfs as they used to.
Any call of wait_for_initramfs() done before the unpacking has been
scheduled (i.e. before rootfs_initcall time) must just return
immediately [and let the caller find an empty file system] in order
not to deadlock the machine. I mistakenly thought, and my limited
testing confirmed, that there were no such calls, so I added a
pr_warn_once() in wait_for_initramfs(). It turns out that one can
indeed hit request_module() as well as kobject_uevent_env() during
those early init calls, leading to a user-visible warning in the
kernel log emitted consistently for certain configurations.
We could just remove the pr_warn_once(), but I think it's better to
postpone enabling the usermodehelper framework until there is at least
some chance of finding the executable. That is also a little more
efficient in that a lot of work done in umh.c will be elided. However,
it does change the error seen by those early callers from -ENOENT to
-EBUSY, so there is a risk of a regression if any caller care about
the exact error value.
Link: https://lkml.kernel.org/r/20210728134638.329060-1-linux@rasmusvillemoes.dk
Fixes: e7cb072eb9 ("init/initramfs.c: do unpacking asynchronously")
Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Reported-by: Alexander Egorenkov <egorenar@linux.ibm.com>
Reported-by: Bruno Goncalves <bgoncalv@redhat.com>
Reported-by: Heiner Kallweit <hkallweit1@gmail.com>
Cc: Luis Chamberlain <mcgrof@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit e38b3f2005 ]
alloc_pages_bulk_array() attempts to allocate at least one page based on
the provided pages, and then opportunistically allocates more if that
can be done without dropping the spinlock.
So if it returns fewer than requested, that could just mean that it
needed to drop the lock. In that case, try again immediately.
Only pause for a time if no progress could be made.
Reported-and-tested-by: Mike Javorski <mike.javorski@gmail.com>
Reported-and-tested-by: Lothar Paltins <lopa@mailbox.org>
Fixes: f6e70aab9d ("SUNRPC: refresh rq_pages using a bulk page allocator")
Signed-off-by: NeilBrown <neilb@suse.de>
Acked-by: Mel Gorman <mgorman@suse.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 15256194ef ]
Make the oklabel within the CHKSTG macro local. This makes sure that
tools like objdump and the crash debugging tool still disassemble full
functions where the macro has been used instead of stopping half way
where such a global label is used and one has to guess how to
disassemble the rest of such a function:
E.g.:
0000000000cb0270 <mcck_int_handler>:
cb0270: b2 05 03 20 stck 800
...
cb0354: a7 74 00 97 jne cb0482 <oklabel270+0xe2>
0000000000cb0358 <oklabel243>:
cb0358: c0 e0 00 22 4e 8f larl %r14,10fa076 <opcode+0x2558>
...
Fixes: d35925b349 ("s390/mcck: move storage error checks to assembler")
Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit d453ceb654 ]
Add trace event to report samples and their timestamp coming from the
EC. It allows to check if the timestamps are correct and the filter is
working correctly without introducing too much latency.
To enable these events:
cd /sys/kernel/debug/tracing/
echo 1 > events/cros_ec/enable
echo 0 > events/cros_ec/cros_ec_request_start/enable
echo 0 > events/cros_ec/cros_ec_request_done/enable
echo 1 > tracing_on
cat trace_pipe
Observe event flowing:
irq/105-chromeo-95 [000] .... 613.659758: cros_ec_sensorhub_timestamp: ...
irq/105-chromeo-95 [000] .... 613.665219: cros_ec_sensorhub_filter: dx: ...
Signed-off-by: Gwendal Grignou <gwendal@chromium.org>
Signed-off-by: Enric Balletbo i Serra <enric.balletbo@collabora.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit 020162d6f4 upstream.
This fixes a race condition: After pwmchip_add() is called there might
already be a consumer and then modifying the hardware behind the
consumer's back is bad. So reset before calling pwmchip_add().
Note that reseting the hardware isn't the right thing to do if the PWM
is already running as it might e.g. disable (or even enable) a backlight
that is supposed to be on (or off).
Fixes: 4dce82c1e8 ("pwm: add pwm-mxs support")
Cc: Sascha Hauer <s.hauer@pengutronix.de>
Cc: Shawn Guo <shawnguo@kernel.org>
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Thierry Reding <thierry.reding@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3d2813fb17 upstream.
This fixes a race condition: After pwmchip_add() is called there might
already be a consumer and then modifying the hardware behind the
consumer's back is bad. So set the default before.
(Side-note: I don't know what this register setting actually does, if
this modifies the polarity there is an inconsistency because the
inversed polarity isn't considered if the PWM is already running during
.probe().)
Fixes: acfd92fdfb ("pwm: lpc32xx: Set PWM_PIN_LEVEL bit to default value")
Cc: Sylvain Lemieux <slemieux@tycoint.com>
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Thierry Reding <thierry.reding@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b4002173b7 upstream.
The first thing metric_delayed_work does is check mdsc->stopping,
and then return immediately if it's set. That's good since we would
have already torn down the metric structures at this point, otherwise,
but there is no locking around mdsc->stopping.
It's possible that the ceph_metric_destroy call could race with the
delayed_work, in which case we could end up with the delayed_work
accessing destroyed percpu variables.
At this point in the mdsc teardown, the "stopping" flag has already been
set, so there's no benefit to flushing the work. Move the work
cancellation in ceph_metric_destroy ahead of the percpu variable
destruction, and eliminate the flush_delayed_work call in
ceph_mdsc_destroy.
Fixes: 18f473b384 ("ceph: periodically send perf metrics to MDSes")
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Xiubo Li <xiubli@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 70ee251ded upstream.
adc_tm5_register_tzd() registers the thermal zone sensors for all
channels of the thermal monitor. If the registration of one channel
fails the function skips the processing of the remaining channels
and returns an error, which results in _probe() being aborted.
One of the reasons the registration could fail is that none of the
thermal zones is using the channel/sensor, which hardly is a critical
error (if it is an error at all). If this case is detected emit a
warning and continue with processing the remaining channels.
Fixes: ca66dca5ed ("thermal: qcom: add support for adc-tm5 PMIC thermal monitor")
Signed-off-by: Matthias Kaehlcke <mka@chromium.org>
Reported-by: Stephen Boyd <swboyd@chromium.org>
Reviewed-by: Stephen Boyd <swboyd@chromium.org>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20210823134726.1.I1dd23ddf77e5b3568625d80d6827653af071ce19@changeid
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4a9344cd0a upstream.
There are variables(power.may_skip_resume and dev->power.must_resume)
and DPM_FLAG_MAY_SKIP_RESUME flags to control the resume of devices after
a system wide suspend transition.
Setting the DPM_FLAG_MAY_SKIP_RESUME flag means that the driver allows
its "noirq" and "early" resume callbacks to be skipped if the device
can be left in suspend after a system-wide transition into the working
state. PM core determines that the driver's "noirq" and "early" resume
callbacks should be skipped or not with dev_pm_skip_resume() function by
checking power.may_skip_resume variable.
power.must_resume variable is getting set to false in __device_suspend()
function without checking device's DPM_FLAG_MAY_SKIP_RESUME settings.
In problematic scenario, where all the devices in the suspend_late
stage are successful and some device can fail to suspend in
suspend_noirq phase. So some devices successfully suspended in suspend_late
stage are not getting chance to execute __device_suspend_noirq()
to set dev->power.must_resume variable to true and not getting
resumed in early_resume phase.
Add a check for device's DPM_FLAG_MAY_SKIP_RESUME flag before
setting power.must_resume variable in __device_suspend function.
Fixes: 6e176bf8d4 ("PM: sleep: core: Do not skip callbacks in the resume phase")
Signed-off-by: Prasad Sodagudi <psodagud@codeaurora.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 98e2e409e7 upstream.
When the refcount is decreased to 0, the resource reclamation branch is
entered. Before CPU0 reaches the race point (1), CPU1 may obtain the
spinlock and traverse the rbtree to find 'root', see
nilfs_lookup_root().
Although CPU1 will call refcount_inc() to increase the refcount, it is
obviously too late. CPU0 will release 'root' directly, CPU1 then
accesses 'root' and triggers UAF.
Use refcount_dec_and_lock() to ensure that both the operations of
decrease refcount to 0 and link deletion are lock protected eliminates
this risk.
CPU0 CPU1
nilfs_put_root():
<-------- (1)
spin_lock(&nilfs->ns_cptree_lock);
rb_erase(&root->rb_node, &nilfs->ns_cptree);
spin_unlock(&nilfs->ns_cptree_lock);
kfree(root);
<-------- use-after-free
refcount_t: underflow; use-after-free.
WARNING: CPU: 2 PID: 9476 at lib/refcount.c:28 \
refcount_warn_saturate+0x1cf/0x210 lib/refcount.c:28
Modules linked in:
CPU: 2 PID: 9476 Comm: syz-executor.0 Not tainted 5.10.45-rc1+ #3
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), ...
RIP: 0010:refcount_warn_saturate+0x1cf/0x210 lib/refcount.c:28
... ...
Call Trace:
__refcount_sub_and_test include/linux/refcount.h:283 [inline]
__refcount_dec_and_test include/linux/refcount.h:315 [inline]
refcount_dec_and_test include/linux/refcount.h:333 [inline]
nilfs_put_root+0xc1/0xd0 fs/nilfs2/the_nilfs.c:795
nilfs_segctor_destroy fs/nilfs2/segment.c:2749 [inline]
nilfs_detach_log_writer+0x3fa/0x570 fs/nilfs2/segment.c:2812
nilfs_put_super+0x2f/0xf0 fs/nilfs2/super.c:467
generic_shutdown_super+0xcd/0x1f0 fs/super.c:464
kill_block_super+0x4a/0x90 fs/super.c:1446
deactivate_locked_super+0x6a/0xb0 fs/super.c:335
deactivate_super+0x85/0x90 fs/super.c:366
cleanup_mnt+0x277/0x2e0 fs/namespace.c:1118
__cleanup_mnt+0x15/0x20 fs/namespace.c:1125
task_work_run+0x8e/0x110 kernel/task_work.c:151
tracehook_notify_resume include/linux/tracehook.h:188 [inline]
exit_to_user_mode_loop kernel/entry/common.c:164 [inline]
exit_to_user_mode_prepare+0x13c/0x170 kernel/entry/common.c:191
syscall_exit_to_user_mode+0x16/0x30 kernel/entry/common.c:266
do_syscall_64+0x45/0x80 arch/x86/entry/common.c:56
entry_SYSCALL_64_after_hwframe+0x44/0xa9
There is no reproduction program, and the above is only theoretical
analysis.
Link: https://lkml.kernel.org/r/1629859428-5906-1-git-send-email-konishi.ryusuke@gmail.com
Fixes: ba65ae4729 ("nilfs2: add checkpoint tree to nilfs object")
Link: https://lkml.kernel.org/r/20210723012317.4146-1-thunder.leizhen@huawei.com
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e1fbbd0731 upstream.
Keno Fischer reported that when a binray loaded via ld-linux-x the
prctl(PR_SET_MM_MAP) doesn't allow to setup brk value because it lays
before mm:end_data.
For example a test program shows
| # ~/t
|
| start_code 401000
| end_code 401a15
| start_stack 7ffce4577dd0
| start_data 403e10
| end_data 40408c
| start_brk b5b000
| sbrk(0) b5b000
and when executed via ld-linux
| # /lib64/ld-linux-x86-64.so.2 ~/t
|
| start_code 7fc25b0a4000
| end_code 7fc25b0c4524
| start_stack 7fffcc6b2400
| start_data 7fc25b0ce4c0
| end_data 7fc25b0cff98
| start_brk 55555710c000
| sbrk(0) 55555710c000
This of course prevent criu from restoring such programs. Looking into
how kernel operates with brk/start_brk inside brk() syscall I don't see
any problem if we allow to setup brk/start_brk without checking for
end_data. Even if someone pass some weird address here on a purpose then
the worst possible result will be an unexpected unmapping of existing vma
(own vma, since prctl works with the callers memory) but test for
RLIMIT_DATA is still valid and a user won't be able to gain more memory in
case of expanding VMAs via new values shipped with prctl call.
Link: https://lkml.kernel.org/r/20210121221207.GB2174@grain
Fixes: bbdc6076d2 ("binfmt_elf: move brk out of mmap when doing direct loader exec")
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Reported-by: Keno Fischer <keno@juliacomputing.com>
Acked-by: Andrey Vagin <avagin@gmail.com>
Tested-by: Andrey Vagin <avagin@gmail.com>
Cc: Dmitry Safonov <0x7f454c46@gmail.com>
Cc: Kirill Tkhai <ktkhai@virtuozzo.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
Cc: Alexander Mikhalitsyn <alexander.mikhalitsyn@virtuozzo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit eb41f33458 upstream.
The assumption that lead to commit 5e5da1e9fb ("pwm: ab8500:
Explicitly allocate pwm chip base dynamically") was wrong: The
pwm-ab8500 devices are not directly instantiated from device tree, but
from the ab8500 mfd driver. So the pdev->id isn't -1, but a number
between 1 and 3. Now that pwmchip ids are always allocated dynamically,
this cannot easily be reverted.
Introduce a new member in the driver data struct that tracks the
hardware id and use this to calculate the register offset.
Side-note: Using chip->base to calculate the offset was never robust
because if there was already a PWM with id 1 at the time ab8500-pwm.1
was probed, the associated pwmchip would get assigned chip->base = 2 (or
something bigger).
Fixes: 5e5da1e9fb ("pwm: ab8500: Explicitly allocate pwm chip base dynamically")
Fixes: 6173f8f4ed ("pwm: Move AB8500 PWM driver to PWM framework")
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Thierry Reding <thierry.reding@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4a86d41404 upstream.
Currently perf saves a build-id with size but old versions assumes the
size of 20. In case the build-id is less than 20 (like for MD5), it'd
fill the rest with 0s.
I saw a problem when old version of perf record saved a binary in the
build-id cache and new version of perf reads the data. The symbols
should be read from the build-id cache (as the path no longer has the
same binary) but it failed due to mismatch in the build-id.
symsrc__init: build id mismatch for /home/namhyung/.debug/.build-id/53/e4c2f42a4c61a2d632d92a72afa08f00000000/elf.
The build-id event in the data has 20 byte build-ids, but it saw a
different size (16) when it reads the build-id of the elf file in the
build-id cache.
$ readelf -n ~/.debug/.build-id/53/e4c2f42a4c61a2d632d92a72afa08f00000000/elf
Displaying notes found in: .note.gnu.build-id
Owner Data size Description
GNU 0x00000010 NT_GNU_BUILD_ID (unique build ID bitstring)
Build ID: 53e4c2f42a4c61a2d632d92a72afa08f
Let's fix this by allowing trailing zeros if the size is different.
Fixes: 39be8d0115 ("perf tools: Pass build_id object to dso__build_id_equal()")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210910224630.1084877-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d2930ede52 upstream.
Instead of using the file offset in the debug file.
This fixes a regression from 00a3423492 ("perf symbols: Make
dso__load_bfd_symbols() load PE files from debug cache only"), causing
incorrect symbol resolution when debug file have been stripped from
non-debug sections (in which case its .text section is empty and doesn't
have any file position).
The debug files could also be created with a different file alignment,
and have different file positions from the mmap-ed binary, or have the
section reordered.
This instead looks for the file image base, using the corresponding bfd
*ABS* symbols. As PE symbols only have 4 bytes, it also needs to keep
.text section vma high bits.
Signed-off-by: Remi Bernon <rbernon@codeweavers.com>
Fixes: 00a3423492 ("perf symbols: Make dso__load_bfd_symbols() load PE files from debug cache only")
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Nicholas Fraser <nfraser@codeweavers.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210909192637.4139125-1-rbernon@codeweavers.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit adf9ae0d15 upstream.
In commit 9f0b4807a4 ("um: rework userspace stubs to not hard-code
stub location") I changed stub_segv_handler() to do a calculation with
a pointer to a stack variable to find the data page that we're using
for the stack and the rest of the data. This same commit was meant to
do it as well for stub_clone_handler(), but the change inadvertently
went into commit 84b2789d61 ("um: separate child and parent errors
in clone stub") instead.
This was reported to not be compiled correctly by gcc 5, causing the
code to crash here. I'm not sure why, perhaps it's UB because the var
isn't initialized? In any case, this trick always seemed bad, so just
create a new inline function that does the calculation in assembly.
Reported-by: subashab@codeaurora.org
Fixes: 9f0b4807a4 ("um: rework userspace stubs to not hard-code stub location")
Fixes: 84b2789d61 ("um: separate child and parent errors in clone stub")
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 099ec97ac9 upstream.
clang warns:
drivers/staging/rtl8192u/r8192U_core.c:4268:20: warning: bitwise and of
boolean expressions; did you mean logical and? [-Wbool-operation-and]
bpacket_toself = bpacket_match_bssid &
^~~~~~~~~~~~~~~~~~~~~
&&
1 warning generated.
Replace the bitwise AND with a logical one to clear up the warning, as
that is clearly what was intended.
Fixes: 8fc8598e61 ("Staging: Added Realtek rtl8192u driver to staging")
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Link: https://lore.kernel.org/r/20210814235625.1780033-1-nathan@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3a2b2eb556 upstream.
The Linux console's VT102 implementation already consumes OSC
("Operating System Command") sequences, probably because that's how
palette changes are transmitted.
In addition to OSC, there are three other major clases of ANSI control
strings: APC ("Application Program Command"), PM ("Privacy Message"),
and DCS ("Device Control String"). They are handled similarly to OSC in
terms of termination.
Source: vt100.net
Add three new enumerated states, one for each of these types. All three
are handled the same way right now--they simply consume input until
terminated. I hope to expand upon this firmament in the future. Add
new predicate ansi_control_string(), returning true for any of these
states. Replace explicit checks against ESosc with calls to this
function. Transition to these states appropriately from the escape
initiation (ESesc) state.
This was motivated by the following Notcurses bugs:
https://github.com/dankamongmen/notcurses/issues/2050https://github.com/dankamongmen/notcurses/issues/1828https://github.com/dankamongmen/notcurses/issues/2069
where standard VT sequences are not consumed by the Linux console. It's
not necessary that the Linux console *support* these sequences, but it
ought *consume* these well-specified classes of sequences.
Tested by sending a variety of escape sequences to the console, and
verifying that they still worked, or were now properly consumed.
Verified that the escapes were properly terminated at a generic level.
Verified that the Notcurses tools continued to show expected output on
the Linux console, except now without escape bleedthrough.
Link: https://lore.kernel.org/lkml/YSydL0q8iaUfkphg@schwarzgerat.orthanc/
Signed-off-by: nick black <dankamongmen@gmail.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Jiri Slaby <jirislaby@kernel.org>
Cc: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1511df6f5e upstream.
EMIT6_PCREL() macro assumes that the previous pass generated 6 bytes
of code, which is not the case if branch shortening took place. Fix by
using jit->prg, like all the other EMIT6_PCREL_*() macros.
Reported-by: Johan Almbladh <johan.almbladh@anyfinetworks.com>
Fixes: 4e9b4a6883 ("s390/bpf: Use relative long branches")
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6e61dc9da0 upstream.
The JIT uses agfi for subtracting constants, but -(-0x80000000) cannot
be represented as a 32-bit signed binary integer. Fix by using algfi in
this particular case.
Reported-by: Johan Almbladh <johan.almbladh@anyfinetworks.com>
Fixes: 0546231057 ("s390/bpf: Add s390x eBPF JIT compiler backend")
Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit db7bee6538 upstream.
Currently the JIT completely removes things like `reg32 += 0`,
however, the BPF_ALU semantics requires the target register to be
zero-extended in such cases.
Fix by optimizing out only the arithmetic operation, but not the
subsequent zero-extension.
Reported-by: Johan Almbladh <johan.almbladh@anyfinetworks.com>
Fixes: 0546231057 ("s390/bpf: Add s390x eBPF JIT compiler backend")
Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 02319bf15a upstream.
After d12e1c4649 ("net: dsa: b53: Set correct number of ports in the
DSA struct") we stopped setting dsa_switch::num_ports to DSA_MAX_PORTS,
which created an off by one error between the statically allocated
bcm_sf2_priv::port_sts array (of size DSA_MAX_PORTS). When
dsa_is_cpu_port() is used, we end-up accessing an out of bounds member
and causing a NPD.
Fix this by iterating with the appropriate port count using
ds->num_ports.
Fixes: d12e1c4649 ("net: dsa: b53: Set correct number of ports in the DSA struct")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit eca4cf12ac upstream.
The recent patch has introduced a regression by not reading the reset
count in the ERROR_RECOVERY async event handler. We may have just
gone through a reset and the reset count has just incremented. If
we don't update the reset count in the ERROR_RECOVERY event handler,
the health check timer will see that the reset count has changed and
will initiate an unintended reset.
Restore the unconditional update of the reset count in
bnxt_async_event_process() if error recovery watchdog is enabled.
Also, update the reset count at the end of the reset sequence to
make it even more robust.
Fixes: 1b2b918319 ("bnxt_en: Fix possible unintended driver initiated error recovery")
Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 0341d5e3d1 ]
The cur_tx counter must be incremented after TACT bit of
txdesc->status was set. However, a CPU is possible to reorder
instructions and/or memory accesses between cur_tx and
txdesc->status. And then, if TX interrupt happened at such a
timing, the sh_eth_tx_free() may free the descriptor wrongly.
So, add wmb() before cur_tx++.
Otherwise NETDEV WATCHDOG timeout is possible to happen.
Fixes: 86a74ff21a ("net: sh_eth: add support for Renesas SuperH Ethernet")
Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit cdff1eda69 ]
One MIPS platform (mach-rc32434) defines GPIOBASE. This macro
conflicts with one of the same name in lpc_sch.c. Rename the latter one
to prevent the build error.
../drivers/mfd/lpc_sch.c:25: error: "GPIOBASE" redefined [-Werror]
25 | #define GPIOBASE 0x44
../arch/mips/include/asm/mach-rc32434/rb.h:32: note: this is the location of the previous definition
32 | #define GPIOBASE 0x050000
Cc: Denis Turischev <denis@compulab.co.il>
Fixes: e82c60ae7d ("mfd: Introduce lpc_sch for Intel SCH LPC bridge")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 1b2b918319 ]
If error recovery is already enabled, bnxt_timer() will periodically
check the heartbeat register and the reset counter. If we get an
error recovery async. notification from the firmware (e.g. change in
primary/secondary role), we will immediately read and update the
heartbeat register and the reset counter. If the timer for the next
health check expires soon after this, we may read the heartbeat register
again in quick succession and find that it hasn't changed. This will
trigger error recovery unintentionally.
The likelihood is small because we also reset fw_health->tmr_counter
which will reset the interval for the next health check. But the
update is not protected and bnxt_timer() can miss the update and
perform the health check without waiting for the full interval.
Fix it by only reading the heartbeat register and reset counter in
bnxt_async_event_process() if error recovery is trasitioning to the
enabled state. Also add proper memory barriers so that when enabling
for the first time, bnxt_timer() will see the tmr_counter interval and
perform the health check after the full interval has elapsed.
Fixes: 7e914027f7 ("bnxt_en: Enable health monitoring.")
Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 6fdab8a3ad ]
The current asic.rev is incomplete and does not include the metal
revision. Add the metal revision and decode the complete asic
revision into the more common and readable form (A0, B0, etc).
Fixes: 7154917a12 ("bnxt_en: Refactor bnxt_dl_info_get().")
Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 63f8428b40 ]
Broadcom's b53 switches have one IMP (Inband Management Port) that needs
to be programmed using its own designed register. IMP port may be
different than CPU port - especially on devices with multiple CPU ports.
For that reason it's required to explicitly note IMP port index and
check for it when choosing a register to use.
This commit fixes BCM5301x support. Those switches use CPU port 5 while
their IMP port is 8. Before this patch b53 was trying to program port 5
with B53_PORT_OVERRIDE_CTRL instead of B53_GMII_PORT_OVERRIDE_CTRL(5).
It may be possible to also replace "cpu_port" usages with
dsa_is_cpu_port() but that is out of the scope of thix BCM5301x fix.
Fixes: 967dd82ffc ("net: dsa: b53: Add support for Broadcom RoboSwitch")
Signed-off-by: Rafał Miłecki <rafal@milecki.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 8a0ed250f9 ]
The GRE tunnel device can pull existing outer headers in ipge_xmit.
This is a rare path, apparently unique to this device. The below
commit ensured that pulling does not move skb->data beyond csum_start.
But it has a false positive if ip_summed is not CHECKSUM_PARTIAL and
thus csum_start is irrelevant.
Refine to exclude this. At the same time simplify and strengthen the
test.
Simplify, by moving the check next to the offending pull, making it
more self documenting and removing an unnecessary branch from other
code paths.
Strengthen, by also ensuring that the transport header is correct and
therefore the inner headers will be after skb_reset_inner_headers.
The transport header is set to csum_start in skb_partial_csum_set.
Link: https://lore.kernel.org/netdev/YS+h%2FtqCJJiQei+W@shredder/
Fixes: 1d011c4803 ("ip_gre: add validation for csum_start")
Reported-by: Ido Schimmel <idosch@idosch.org>
Suggested-by: Alexander Duyck <alexander.duyck@gmail.com>
Signed-off-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 9ddbc2a00d ]
Previous commit 68233c583a removes the qlcnic_rom_lock()
in qlcnic_pinit_from_rom(), but remains its corresponding
unlock function, which is odd. I'm not very sure whether the
lock is missing, or the unlock is redundant. This bug is
suggested by a static analysis tool, please advise.
Fixes: 68233c583a ("qlcnic: updated reset sequence")
Signed-off-by: Dinghao Liu <dinghao.liu@zju.edu.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit c7c5e6ff53 ]
syzbot found that forcing a big quantum attribute would crash hosts fast,
essentially using this:
tc qd replace dev eth0 root fq_codel quantum 4294967295
This is because fq_codel_dequeue() would have to loop
~2^31 times in :
if (flow->deficit <= 0) {
flow->deficit += q->quantum;
list_move_tail(&flow->flowchain, &q->old_flows);
goto begin;
}
SFQ max quantum is 2^19 (half a megabyte)
Lets adopt a max quantum of one megabyte for FQ_CODEL.
Fixes: 4b549a2ef4 ("fq_codel: Fair Queue Codel AQM")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 730affed24 ]
Bug reported by KASAN:
BUG: KASAN: use-after-scope in inet6_ehashfn (net/ipv6/inet6_hashtables.c:40)
Call Trace:
(...)
inet6_ehashfn (net/ipv6/inet6_hashtables.c:40)
(...)
nf_sk_lookup_slow_v6 (net/ipv6/netfilter/nf_socket_ipv6.c:91
net/ipv6/netfilter/nf_socket_ipv6.c:146)
It seems that this bug has already been fixed by Eric Dumazet in the
past in:
commit 78296c97ca ("netfilter: xt_socket: fix a stack corruption bug")
But a variant of the same issue has been introduced in
commit d64d80a2cd ("netfilter: x_tables: don't extract flow keys on early demuxed sks in socket match")
`daddr` and `saddr` potentially hold a reference to ipv6_var that is no
longer in scope when the call to `nf_socket_get_sock_v6` is made.
Fixes: d64d80a2cd ("netfilter: x_tables: don't extract flow keys on early demuxed sks in socket match")
Acked-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Benjamin Hesmans <benjamin.hesmans@tessares.net>
Reviewed-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit d12e1c4649 ]
Setting DSA_MAX_PORTS caused DSA to call b53 callbacks (e.g.
b53_disable_port() during dsa_register_switch()) for invalid
(non-existent) ports. That made b53 modify unrelated registers and is
one of reasons for a broken BCM5301x support.
This problem exists for years but DSA_MAX_PORTS usage has changed few
times. It seems the most accurate to reference commit dropping
dsa_switch_alloc() in the Fixes tag.
Fixes: 7e99e34701 ("net: dsa: remove dsa_switch_alloc helper")
Signed-off-by: Rafał Miłecki <rafal@milecki.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit cdb067d31c ]
It isn't true that CPU port is always the last one. Switches BCM5301x
have 9 ports (port 6 being inactive) and they use port 5 as CPU by
default (depending on design some other may be CPU ports too).
A more reliable way of determining number of ports is to check for the
last set bit in the "enabled_ports" bitfield.
This fixes b53 internal state, it will allow providing accurate info to
the DSA and is required to fix BCM5301x support.
Fixes: 967dd82ffc ("net: dsa: b53: Add support for Broadcom RoboSwitch")
Signed-off-by: Rafał Miłecki <rafal@milecki.pl>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit ecdc28defc ]
If the network devices connected to the system beyond
HSO_MAX_NET_DEVICES. add_net_device() in hso_create_net_device()
will be failed for the network_table is full. It will lead to
business failure which rely on network_table, for example,
hso_suspend() and hso_resume(). It will also lead to memory leak
because resource release process can not search the hso_device
object from network_table in hso_free_interface().
Add failure handler for add_net_device() in hso_create_net_device()
to solve the above problems.
Fixes: 72dc1c096c ("HSO: add option hso driver")
Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit bfd862a7e9 ]
'$cin' and '$sin' variables are local to a function: they are then not
available from the cleanup trap.
Instead, we need to use '$large' and '$small' that are not local and
defined just before setting the trap.
Without this patch, running this script in a loop might cause a:
write: No space left on device
issue.
Fixes: 1a418cb8e8 ("mptcp: simult flow self-tests")
Acked-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 1094c6fe72 ]
Florian noted that if mptcp_alloc_tx_skb() allocation fails
in __mptcp_push_pending(), we can end-up invoking
mptcp_push_release()/tcp_push() with a zero mss, causing
a divide by 0 error.
This change addresses the issue refactoring the skb allocation
code checking if skb collapsing will happen for sure and doing
the skb allocation only after such check. Skb allocation will
now happen only after the call to tcp_send_mss() which
correctly initializes mss_now.
As side bonuses we now fill the skb tx cache only when needed,
and this also clean-up a bit the output path.
v1 -> v2:
- use lockdep_assert_held_once() - Jakub
- fix indentation - Jakub
Reported-by: Florian Westphal <fw@strlen.de>
Fixes: 724cfd2ee8 ("mptcp: allocate TX skbs in msk context")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 8af52e6977 ]
Currently the clean target when using O= isn't cleaning the feature
detect output. This is because O= and OUTPUT= are set to canonical
paths. For example in tools/perf/Makefile:
FULL_O := $(shell cd $(PWD); readlink -f $(O) || echo $(O))
This means that OUTPUT ends in a / and most usages prepend it to a file
without adding an extra /. This line that was changed adds an extra /
before the 'feature' folder but not to the end, resulting in a clean
command like this:
rm -f /tmp/build//featuretest-all.bin ...
After the change the clean command looks like this:
rm -f /tmp/build/feature/test-all.bin ...
Fixes: 762323eb39 ("perf build: Move feature cleanup under tools/build")
Signed-off-by: James Clark <james.clark@arm.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lore.kernel.org/lkml/20210816130705.1331868-1-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 0e90dfa7a8 ]
I noticed that only port 0 worked on the RTL8366RB since we
started to use custom tags.
It turns out that the format of egress custom tags is actually
different from ingress custom tags. While the lower bits just
contain the port number in ingress tags, egress tags need to
indicate destination port by setting the bit for the
corresponding port.
It was working on port 0 because port 0 added 0x00 as port
number in the lower bits, and if you do this the packet appears
at all ports, including the intended port. Ooops.
Fix this and all ports work again. Use the define for shifting
the "type A" into place while we're at it.
Tested on the D-Link DIR-685 by sending traffic to each of
the ports in turn. It works.
Fixes: 86dd9868b8 ("net: dsa: tag_rtl4_a: Support also egress tags")
Cc: DENG Qingfang <dqfext@gmail.com>
Cc: Mauri Sandberg <sandberg@mailfence.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 7db304375e ]
In case of buffered reading from block device, when short read happens,
we should retry to read more, otherwise the IO will be completed
partially, for example, the following fio expects to read 2MB, but it
can only read 1M or less bytes:
fio --name=onessd --filename=/dev/nvme0n1 --filesize=2M \
--rw=randread --bs=2M --direct=0 --overwrite=0 --numjobs=1 \
--iodepth=1 --time_based=0 --runtime=2 --ioengine=io_uring \
--registerfiles --fixedbufs --gtod_reduce=1 --group_reporting
Fix the issue by allowing short read retry for block device, which sets
FMODE_BUF_RASYNC really.
Fixes: 9a173346bd ("io_uring: fix short read retries for non-reg files")
Cc: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/20210821150751.1290434-1-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 889a1b3f35 ]
If an error occurs after a 'gpiochip_add_data()' call it must be undone by
a corresponding 'gpiochip_remove()' as already done in the remove function.
To simplify the code a fix a leak in the error handling path of the probe,
use the managed version instead (i.e. 'devm_gpiochip_add_data()')
Fixes: 698b8eeaed ("gpio/mpc8xxx: change irq handler from chained to normal")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 7d6588931c ]
Commit 76c47d1449 ("gpio: mpc8xxx: Add ACPI support") has switched to a
managed version when dealing with 'mpc8xxx_gc->regs'. So the corresponding
'iounmap()' call in the error handling path and in the remove should be
removed to avoid a double unmap.
This also allows some simplification in the probe. All the error handling
paths related to managed resources can be direct returns and a NULL check
in what remains in the error handling path can be removed.
Fixes: 76c47d1449 ("gpio: mpc8xxx: Add ACPI support")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 555bda42b0 ]
Commit 698b8eeaed ("gpio/mpc8xxx: change irq handler from chained to normal")
has introduced a new 'goto err;' at the very end of the function, but has
not updated the error handling path accordingly.
Add the now missing 'irq_domain_remove()' call which balances a previous
'irq_domain_create_linear() call.
Fixes: 698b8eeaed ("gpio/mpc8xxx: change irq handler from chained to normal")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit edf7b4a2d8 ]
The build on fedora:35 and fedora:rawhide with clang is failing with:
49 41.00 fedora:35 : FAIL clang version 13.0.0 (Fedora 13.0.0~rc1-1.fc35)
bench/inject-buildid.c:351:6: error: variable 'len' set but not used [-Werror,-Wunused-but-set-variable]
u64 len = 0;
^
1 error generated.
make[3]: *** [/git/perf-5.14.0-rc7/tools/build/Makefile.build:139: bench] Error 2
50 41.11 fedora:rawhide : FAIL clang version 13.0.0 (Fedora 13.0.0~rc1-1.fc35)
bench/inject-buildid.c:351:6: error: variable 'len' set but not used [-Werror,-Wunused-but-set-variable]
u64 len = 0;
^
1 error generated.
make[3]: *** [/git/perf-5.14.0-rc7/tools/build/Makefile.build:139: bench] Error 2
That 'len' variable is not used at all, so just make sure all the
synthesize_RECORD() routines return ssize_t to propagate the writen()
return, as it may fail, ditch the 'ret' var and bail out if those
routines fail.
Fixes: 0bf02a0d80 ("perf bench: Add build-id injection benchmark")
Acked-by: Namhyung Kim <namhyung@kernel.org>
Link: http://lore.kernel.org/lkml/CAM9d7cgEZNSor+B+7Y2C+QYGme_v5aH0Zn0RLfxoQ+Fy83EHrg@mail.gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 261f491133 ]
Acaict, perf_home_perfconfig() is supposed to cache the result of
home_perfconfig, which returns the default location of perfconfig for
the user, given the HOME environment variable.
However, the current implementation calls home_perfconfig every time
perf_home_perfconfig() is called (so no caching is actually performed),
replacing the previous pointer, thus also causing a memory leak.
This patch adds a check of whether either config or failed is set and,
in that case, directly returns config without calling home_perfconfig at
each invocation.
Fixes: f5f03e19ce ("perf config: Add perf_home_perfconfig function")
Signed-off-by: Riccardo Mancini <rickyman7@gmail.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Link: http://lore.kernel.org/lkml/20210820130817.740536-1-rickyman7@gmail.com
[ Removed needless double check for the 'failed' variable ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 6b5ff0405e ]
0day bot reports a build error:
ERROR: modpost: "clear_user_page" [drivers/media/v4l2-core/videobuf-dma-sg.ko] undefined!
so export it in arch/arc/ to fix the build error.
In most ARCHes, clear_user_page() is a macro. OTOH, in a few
ARCHes it is a function and needs to be exported.
PowerPC exported it in 2004. It looks like nds32 and nios2
still need to have it exported.
Fixes: 4102b53392 ("ARC: [mm] Aliasing VIPT dcache support 2/4")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reported-by: kernel test robot <lkp@intel.com>
Cc: Guenter Roeck <linux@roeck-us.net>
Cc: linux-snps-arc@lists.infradead.org
Signed-off-by: Vineet Gupta <vgupta@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 6826c6849b ]
The CPU_ON PSCI call takes a payload that KVM uses to configure a
destination vCPU to run. This payload is non-architectural state and not
exposed through any existing UAPI. Effectively, we have a race between
CPU_ON and userspace saving/restoring a guest: if the target vCPU isn't
ran again before the VMM saves its state, the requested PC and context
ID are lost. When restored, the target vCPU will be runnable and start
executing at its old PC.
We can avoid this race by making sure the reset payload is serviced
before userspace can access a vCPU's state.
Fixes: 358b28f09f ("arm/arm64: KVM: Allow a VCPU to fully reset itself")
Signed-off-by: Oliver Upton <oupton@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210818202133.1106786-3-oupton@google.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 6654f9dfcb ]
KVM correctly serializes writes to a vCPU's reset state, however since
we do not take the KVM lock on the read side it is entirely possible to
read state from two different reset requests.
Cure the race for now by taking the KVM lock when reading the
reset_state structure.
Fixes: 358b28f09f ("arm/arm64: KVM: Allow a VCPU to fully reset itself")
Signed-off-by: Oliver Upton <oupton@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210818202133.1106786-2-oupton@google.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit a89d69a44e ]
Since 2431c4f5b4 ("mtd: Implement mtd_{read,write}() as wrappers
around mtd_{read,write}_oob()") don't allow _write|_read and
_write_oob|_read_oob existing at the same time, we should check the
existence of callbacks "_read and _write" from subdev's master device
(We can trust master device since it has been registered) before
assigning, otherwise following warning occurs while making
concatenated device:
WARNING: CPU: 2 PID: 6728 at drivers/mtd/mtdcore.c:595
add_mtd_device+0x7f/0x7b0
Fixes: 2431c4f5b4 ("mtd: Implement mtd_{read,write}() around ...")
Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Link: https://lore.kernel.org/linux-mtd/20210817114857.2784825-3-chengzhihao1@huawei.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit a946506c48 ]
The driver was registering IRQ 0 when no IRQ was set. This leads to
warnings with newer kernels.
Clear the resource flags, so no resource is registered at all in this
case.
Fixes: 2f17dd34ff ("mfd: tqmx86: IO controller with I2C, Wachdog and GPIO")
Signed-off-by: Matthias Schiffer <matthias.schiffer@ew.tq-group.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit e3245a7b7b ]
Syzbot hit use-after-free in nf_tables_dump_sets. The problem was in
missing lock protection for nft_ct_pcpu_template_refcnt.
Before commit f102d66b33 ("netfilter: nf_tables: use dedicated
mutex to guard transactions") all transactions were serialized by global
mutex, but then global mutex was changed to local per netnamespace
commit_mutex.
This change causes use-after-free bug, when 2 netnamespaces concurently
changing nft_ct_pcpu_template_refcnt without proper locking. Fix it by
adding nft_ct_pcpu_mutex and protect all nft_ct_pcpu_template_refcnt
changes with it.
Fixes: f102d66b33 ("netfilter: nf_tables: use dedicated mutex to guard transactions")
Reported-and-tested-by: syzbot+649e339fa6658ee623d3@syzkaller.appspotmail.com
Signed-off-by: Pavel Skripkin <paskripkin@gmail.com>
Acked-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 9f1168cf26 ]
The Intel IXP4xx PCI controller is only present on Intel IXP4xx
XScale-based network processor SoCs.
Add a dependency on ARCH_IXP4XX, to prevent asking the user about this
driver when configuring a kernel without support for the XScale
processor family.
Link: https://lore.kernel.org/r/6a88e55fe58fc280f4ff1ca83c154e4895b6dcbf.1624972789.git.geert+renesas@glider.be
Fixes: f7821b4934 ("PCI: ixp4xx: Add a new driver for IXP4xx")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
[lorenzo.pieralisi@arm.com: commit log]
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit daa3736151 ]
Remove interrupt disablement during backlight setting. It is
way to dangerous and makes platforms instable by having it
miss vblank IRQs leading to the graphics derailing.
The code is using ndelay() which is not available on
platforms such as ARM and will result in 32 * udelay(1)
which is substantial.
Add some code to detect if an interrupt occurs during the
tight loop and in that case just redo it from the top.
Fixes: 5317f37e48 ("backlight: Add Kinetic KTD253 backlight driver")
Cc: Stephan Gerhold <stephan@gerhold.net>
Reported-by: newbyte@disroot.org
Reviewed-by: Daniel Thompson <daniel.thompson@linaro.org>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f949a9ebce ]
On Cherry Trail devices with an AXP288 PMIC the external SD-card slot
used the AXP's DLDO2 as card-voltage and either DLDO3 or GPIO1LDO
(GPIO1 pin in low noise LDO mode) as signal-voltage.
These regulators are turned on/off and in case of the signal-voltage
also have their output-voltage changed by the _PS0 and _PS3 power-
management ACPI methods on the MMC-controllers ACPI fwnode as well as
by the _DSM ACPI method for changing the signal voltage.
The AML code implementing these methods is directly accessing the
PMIC through ACPI I2C OpRegion accesses, instead of using the special
PMIC OpRegion handled by drivers/acpi/pmic/intel_pmic_xpower.c .
This means that the contents of the involved PMIC registers can change
without the change being made through the regmap interface, so regmap
should not cache the contents of these registers.
Mark the regulator power on/off, the regulator voltage control and the
GPIO1 control registers as volatile, to avoid regmap caching them.
Specifically this fixes an issue on some models where the i915 driver
toggles another LDO using the same on/off register on/off through
MIPI sequences (through intel_soc_pmic_exec_mipi_pmic_seq_element())
which then writes back a cached on/off register-value where the
card-voltage is off causing the external sdcard slot to stop working
when the screen goes blank, or comes back on again.
The regulator register-range now marked volatile also includes the
buck regulator control registers. This is done on purpose these are
normally not touched by the AML code, but they are updated directly
by the SoC's PUNIT which means that they may also change without going
through regmap.
Note the AXP288 PMIC is only used on Bay- and Cherry-Trail platforms,
so even though this is an ACPI specific problem there is no need to
make the new volatile ranges conditional since these platforms always
use ACPI.
Fixes: dc91c3b6fe ("mfd: axp20x: Mark AXP20X_VBUS_IPSOUT_MGMT as volatile")
Fixes: cd53216625 ("mfd: axp20x: Fix axp288 volatile ranges")
Reported-and-tested-by: Clamshell <clamfly@163.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Chen-Yu Tsai <wens@csie.org>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f97493657c ]
Joakim Zhang reports that Wake-on-Lan with the stmmac ethernet driver broke
when moving the incorrect handling of mac link state out of mac_config().
This reason this breaks is because the stmmac's WoL is handled by the MAC
rather than the PHY, and phylink doesn't cater for that scenario.
This patch adds the necessary phylink code to handle suspend/resume events
according to whether the MAC still needs a valid link or not. This is the
barest minimum for this support.
Reported-by: Joakim Zhang <qiangqing.zhang@nxp.com>
Tested-by: Joakim Zhang <qiangqing.zhang@nxp.com>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Joakim Zhang <qiangqing.zhang@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 0097ae5f7a ]
When the function IS_ALIGNED() returns false, the value of ret is 0.
So, we set ret to -EINVAL to indicate this error.
Clean up smatch warning:
drivers/ntb/test/ntb_perf.c:602 perf_setup_inbuf() warn: missing error
code 'ret'.
Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Yang Li <yang.lee@linux.alibaba.com>
Reviewed-by: Serge Semin <fancer.lancer@gmail.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 319f83ac98 ]
When the value of nm->isr_ctx is false, the value of ret is 0.
So, we set ret to -ENOMEM to indicate this error.
Clean up smatch warning:
drivers/ntb/test/ntb_msi_test.c:373 ntb_msit_probe() warn: missing
error code 'ret'.
Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Yang Li <yang.lee@linux.alibaba.com>
Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 7db8263a12 ]
When adapter->registered_device_map is NULL, the value of err is
uncertain, we set err to -EINVAL to avoid ambiguity.
Clean up smatch warning:
drivers/net/ethernet/chelsio/cxgb/cxgb2.c:1114 init_one() warn: missing
error code 'err'
Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Yang Li <yang.lee@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 1c500ad706 ]
syzbot is reporting circular locking problem at __loop_clr_fd() [1], for
commit a160c6159d ("block: add an optional probe callback to
major_names") is calling the module's probe function with major_names_lock
held.
Fortunately, since commit 990e78116d ("block: loop: fix deadlock
between open and remove") stopped holding loop_ctl_mutex in lo_open(),
current role of loop_ctl_mutex is to serialize access to loop_index_idr
and loop_add()/loop_remove(); in other words, management of id for IDR.
To avoid holding loop_ctl_mutex during whole add/remove operation, use
a bool flag to indicate whether the loop device is ready for use.
loop_unregister_transfer() which is called from cleanup_cryptoloop()
currently has possibility of use-after-free problem due to lack of
serialization between kfree() from loop_remove() from loop_control_remove()
and mutex_lock() from unregister_transfer_cb(). But since lo->lo_encryption
should be already NULL when this function is called due to module unload,
and commit 222013f9ac ("cryptoloop: add a deprecation warning")
indicates that we will remove this function shortly, this patch updates
this function to emit warning instead of checking lo->lo_encryption.
Holding loop_ctl_mutex in loop_exit() is pointless, for all users must
close /dev/loop-control and /dev/loop$num (in order to drop module's
refcount to 0) before loop_exit() starts, and nobody can open
/dev/loop-control or /dev/loop$num afterwards.
Link: https://syzkaller.appspot.com/bug?id=7bb10e8b62f83e4d445cdf4c13d69e407e629558 [1]
Reported-by: syzbot <syzbot+f61766d5763f9e7a118f@syzkaller.appspotmail.com>
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/adb1e792-fc0e-ee81-7ea0-0906fc36419d@i-love.sakura.ne.jp
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 2d52c58b9c ]
The function bfq_setup_merge prepares the merging between two
bfq_queues, say bfqq and new_bfqq. To this goal, it assigns
bfqq->new_bfqq = new_bfqq. Then, each time some I/O for bfqq arrives,
the process that generated that I/O is disassociated from bfqq and
associated with new_bfqq (merging is actually a redirection). In this
respect, bfq_setup_merge increases new_bfqq->ref in advance, adding
the number of processes that are expected to be associated with
new_bfqq.
Unfortunately, the stable-merging mechanism interferes with this
setup. After bfqq->new_bfqq has been set by bfq_setup_merge, and
before all the expected processes have been associated with
bfqq->new_bfqq, bfqq may happen to be stably merged with a different
queue than the current bfqq->new_bfqq. In this case, bfqq->new_bfqq
gets changed. So, some of the processes that have been already
accounted for in the ref counter of the previous new_bfqq will not be
associated with that queue. This creates an unbalance, because those
references will never be decremented.
This commit fixes this issue by reestablishing the previous, natural
behaviour: once bfqq->new_bfqq has been set, it will not be changed
until all expected redirections have occurred.
Signed-off-by: Davide Zini <davidezini2@gmail.com>
Signed-off-by: Paolo Valente <paolo.valente@linaro.org>
Link: https://lore.kernel.org/r/20210802141352.74353-2-paolo.valente@linaro.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit aabbdc67f3 ]
Add quirk CDC_MBIM_FLAG_AVOID_ALTSETTING_TOGGLE for Telit LN920
0x1061 composition in order to avoid bind error.
Signed-off-by: Daniele Palmas <dnlplm@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b9edbfe1ad ]
Commit 3df98d7921 ("lsm,selinux: pass flowi_common instead of flowi
to the LSM hooks") introduced flowi{4,6}_to_flowi_common() functions which
cause UBSAN warning when building with LLVM 11.0.1 on Ubuntu 21.04.
================================================================================
UBSAN: object-size-mismatch in ./include/net/flow.h:197:33
member access within address ffffc9000109fbd8 with insufficient space
for an object of type 'struct flowi'
CPU: 2 PID: 7410 Comm: systemd-resolve Not tainted 5.14.0 #51
Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 02/27/2020
Call Trace:
dump_stack_lvl+0x103/0x171
ubsan_type_mismatch_common+0x1de/0x390
__ubsan_handle_type_mismatch_v1+0x41/0x50
udp_sendmsg+0xda2/0x1300
? ip_skb_dst_mtu+0x1f0/0x1f0
? sock_rps_record_flow+0xe/0x200
? inet_send_prepare+0x2d/0x90
sock_sendmsg+0x49/0x80
____sys_sendmsg+0x269/0x370
__sys_sendmsg+0x15e/0x1d0
? syscall_enter_from_user_mode+0xf0/0x1b0
do_syscall_64+0x3d/0xb0
entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7f7081a50497
Code: 0c 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 89 54 24 1c 48 89 74 24 10
RSP: 002b:00007ffc153870f8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 000000000000000c RCX: 00007f7081a50497
RDX: 0000000000000000 RSI: 00007ffc15387140 RDI: 000000000000000c
RBP: 00007ffc15387140 R08: 0000563f29a5e4fc R09: 000000000000cd28
R10: 0000563f29a68a30 R11: 0000000000000246 R12: 000000000000000c
R13: 0000000000000001 R14: 0000563f29a68a30 R15: 0000563f29a5e50c
================================================================================
I don't think we need to call flowi{4,6}_to_flowi() from these functions
because the first member of "struct flowi4" and "struct flowi6" is
struct flowi_common __fl_common;
while the first member of "struct flowi" is
union {
struct flowi_common __fl_common;
struct flowi4 ip4;
struct flowi6 ip6;
struct flowidn dn;
} u;
which should point to the same address without access to "struct flowi".
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 21274aa178 ]
Check one more time before exiting the API with an error.
Fix API to poll at least twice, in case there are other high priority
tasks and this API doesn't get CPU cycles for multiple jiffies update.
In addition, increase timeout from usecs_to_jiffies(10000) to
usecs_to_jiffies(20000), to prevent the case that for CONFIG_100HZ
timeout will be a single jiffies.
A single jiffies results actual timeout that can be any time between
1usec and 10msec. To solve this, a value of usecs_to_jiffies(20000)
ensures that timeout is 2 jiffies.
Signed-off-by: Smadar Fuks <smadarf@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit dbe80cf471 ]
We must not pet a running watchdog when handle_boot_enabled is off
because this will kick off automatic triggering before userland is
running, defeating the purpose of the handle_boot_enabled control.
Furthermore, don't ping in case watchdog_set_last_hw_keepalive was
called incorrectly when the hardware watchdog is actually not running.
Fixed: cef9572e9a ("watchdog: add support for adjusting last known HW keepalive time")
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Link: https://lore.kernel.org/r/93d56386-6e37-060b-55ce-84de8cde535f@web.de
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 32837d8a8f ]
Some Cavium endpoints are implemented as multi-function devices without ACS
capability, but they actually don't support peer-to-peer transactions.
Add ACS quirks to declare DMA isolation for the following devices:
- BGX device found on Octeon-TX (8xxx)
- CGX device found on Octeon-TX2 (9xxx)
- RPM device found on Octeon-TX3 (10xxx)
Link: https://lore.kernel.org/r/20210810122425.1115156-1-george.cherian@marvell.com
Signed-off-by: George Cherian <george.cherian@marvell.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f1de58802f ]
J7200 has the same PCIe IP as in J721E with minor changes in the
wrapper. J7200 allows byte access of bridge configuration space
registers and the register field for LINK_DOWN interrupt is different.
J7200 also requires "quirk_detect_quiet_flag" to be set. Configure these
changes as part of driver data applicable only to J7200.
Link: https://lore.kernel.org/r/20210811123336.31357-4-kishon@ti.com
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 09c24094b2 ]
PCIe fails to link up if SERDES lanes not used by PCIe are assigned to
another protocol. For example, link training fails if lanes 2 and 3 are
assigned to another protocol while lanes 0 and 1 are used for PCIe to
form a two lane link. This failure is due to an incorrect tie-off on an
internal status signal indicating electrical idle.
Status signals going from SERDES to PCIe Controller are tied-off when a
lane is not assigned to PCIe. Signal indicating electrical idle is
incorrectly tied-off to a state that indicates non-idle. As a result,
PCIe sees unused lanes to be out of electrical idle and this causes
LTSSM to exit Detect.Quiet state without waiting for 12ms timeout to
occur. If a receiver is not detected on the first receiver detection
attempt in Detect.Active state, LTSSM goes back to Detect.Quiet and
again moves forward to Detect.Active state without waiting for 12ms as
required by PCIe base specification. Since wait time in Detect.Quiet is
skipped, multiple receiver detect operations are performed back-to-back
without allowing time for capacitance on the transmit lines to
discharge. This causes subsequent receiver detection to always fail even
if a receiver gets connected eventually.
Add a quirk flag "quirk_detect_quiet_flag" to program the minimum
time the LTSSM should wait on entering Detect.Quiet state here.
This has to be set for J7200 as it has an incorrect tie-off on unused
lanes.
Link: https://lore.kernel.org/r/20210811123336.31357-3-kishon@ti.com
Signed-off-by: Nadeem Athani <nadeem@cadence.com>
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 8e242060c6 ]
Since kprobe_events and uprobe_events only check whether the
other same-type probe event has the same name or not, if the
user gives the same name of the existing tracepoint event (or
the other type of probe events), it silently fails to create
the tracefs entry (but registered.) as below.
/sys/kernel/tracing # ls events/task/task_rename
enable filter format hist id trigger
/sys/kernel/tracing # echo p:task/task_rename vfs_read >> kprobe_events
[ 113.048508] Could not create tracefs 'task_rename' directory
/sys/kernel/tracing # cat kprobe_events
p:task/task_rename vfs_read
To fix this issue, check whether the existing events have the
same name or not in trace_probe_register_event_call(). If exists,
it rejects to register the new event.
Link: https://lkml.kernel.org/r/162936876189.187130.17558311387542061930.stgit@devnote2
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit ccac969772 ]
When protected mode is enabled, the host is unable to access most parts
of the EL2 hypervisor image, including 'hyp_physvirt_offset' and the
contents of the hypervisor's '.rodata.str' section. Unfortunately,
nvhe_hyp_panic_handler() tries to read from both of these locations when
handling a BUG() triggered at EL2; the former for converting the ELR to
a physical address and the latter for displaying the name of the source
file where the BUG() occurred.
Hack the EL2 panic asm to pass both physical and virtual ELR values to
the host and utilise the newly introduced CONFIG_NVHE_EL2_DEBUG so that
we disable stage-2 protection for the host before returning to the EL1
panic handler. If the debug option is not enabled, display the address
instead of the source file:line information.
Cc: Andrew Scull <ascull@google.com>
Cc: Quentin Perret <qperret@google.com>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210813130336.8139-1-will@kernel.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit fb31f0a499 ]
RISCV uses a global variable pfn_base for page/pfn translation. But this
is a common name and will be used elsewhere. In those cases, the
page-pfn macros which refer to this name will be referred to the
local/input variable instead. (such as in vfio_pin_pages_remote). This
make everything wrong.
This patch changes the name from pfn_base to riscv_pfn_base to fix
this problem.
Signed-off-by: Kenneth Lee <liguozhu@hisilicon.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 9ff80e2de3 ]
Although irq_create_mapping() is able to deal with duplicate
mappings, it really isn't supposed to be a substitute for
irq_find_mapping(), and can result in allocations that take place
in atomic context if the mapping didn't exist.
Fix the handful of MFD drivers that use irq_create_mapping() in
interrupt context by using irq_find_mapping() instead.
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: Lee Jones <lee.jones@linaro.org>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit e1e71c1688 ]
There is a potential race between fuse_read_interrupt() and
fuse_request_end().
TASK1
in fuse_read_interrupt(): delete req->intr_entry (while holding
fiq->lock)
TASK2
in fuse_request_end(): req->intr_entry is empty -> skip fiq->lock
wake up TASK3
TASK3
request is freed
TASK1
in fuse_read_interrupt(): dereference req->in.h.unique ***BAM***
Fix by always grabbing fiq->lock if the request was ever interrupted
(FR_INTERRUPTED set) thereby serializing with concurrent
fuse_read_interrupt() calls.
FR_INTERRUPTED is set before the request is queued on fiq->interrupts.
Dequeing the request is done with list_del_init() but FR_INTERRUPTED is not
cleared in this case.
Reported-by: lijiazi <lijiazi@xiaomi.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit ec343111c0 ]
These are the actual frequencies reported by the PLL, so let's
report these. The roundoffs are inappropriate, we should round
to the frequency that the clock will later report.
Drop some whitespace at the same time.
Cc: phone-devel@vger.kernel.org
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 1fcef985c8 ]
The remoteproc driver is split between the responsibilities of getting
the SoC-internal ARM core up and running and the external RF (aka
"Iris") part configured.
In order to satisfy the regulator framework's need of a struct device *
to look up supplies this was implemented as two different drivers, using
of_platform_populate() in the remoteproc part to probe the iris part.
Unfortunately it's possible that the iris part probe defers on yet not
available regulators and an attempt to start the remoteproc will have to
be rejected, until this has been resolved. But there's no useful
mechanism of knowing when this would be.
Instead replace the of_platform_populate() and the iris probe with a
function that rolls its own struct device, with the relevant of_node
associated that is enough to acquire regulators and clocks specified in
the DT node and that may propagate the EPROBE_DEFER back to the wcnss
device's probe.
Acked-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Reported-by: Anibal Limon <anibal.limon@linaro.org>
Reported-by: Loic Poulain <loic.poulain@linaro.org>
Tested-by: Anibal Limon <anibal.limon@linaro.org>
Link: https://lore.kernel.org/r/20210312002251.3273013-1-bjorn.andersson@linaro.org
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit ee8a9600b5 ]
The network interface managed by the mlxbf_gige driver can
get into a problem state where traffic does not flow.
In this state, the interface will be up and enabled, but
will stop processing received packets. This problem state
will happen if three specific conditions occur:
1) driver has received more than (N * RxRingSize) packets but
less than (N+1 * RxRingSize) packets, where N is an odd number
Note: the command "ethtool -g <interface>" will display the
current receive ring size, which currently defaults to 128
2) the driver's interface was disabled via "ifconfig oob_net0 down"
during the window described in #1.
3) the driver's interface is re-enabled via "ifconfig oob_net0 up"
This patch ensures that the driver's "valid_polarity" field is
cleared during the open() method so that it always matches the
receive polarity used by hardware. Without this fix, the driver
needs to be unloaded and reloaded to correct this problem state.
Fixes: f92e1869d7 ("Add Mellanox BlueField Gigabit Ethernet driver")
Reviewed-by: Asmaa Mnebhi <asmaa@nvidia.com>
Signed-off-by: David Thompson <davthompson@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit a57d8c217a ]
Sometimes when unbinding the mv88e6xxx driver on Turris MOX, these error
messages appear:
mv88e6085 d0032004.mdio-mii:12: port 1 failed to delete be:79:b4:9e:9e:96 vid 1 from fdb: -2
mv88e6085 d0032004.mdio-mii:12: port 1 failed to delete be:79:b4:9e:9e:96 vid 0 from fdb: -2
mv88e6085 d0032004.mdio-mii:12: port 1 failed to delete d8:58:d7:00:ca:6d vid 100 from fdb: -2
mv88e6085 d0032004.mdio-mii:12: port 1 failed to delete d8:58:d7:00:ca:6d vid 1 from fdb: -2
mv88e6085 d0032004.mdio-mii:12: port 1 failed to delete d8:58:d7:00:ca:6d vid 0 from fdb: -2
(and similarly for other ports)
What happens is that DSA has a policy "even if there are bugs, let's at
least not leak memory" and dsa_port_teardown() clears the dp->fdbs and
dp->mdbs lists, which are supposed to be empty.
But deleting that cleanup code, the warnings go away.
=> the FDB and MDB lists (used for refcounting on shared ports, aka CPU
and DSA ports) will eventually be empty, but are not empty by the time
we tear down those ports. Aka we are deleting them too soon.
The addresses that DSA complains about are host-trapped addresses: the
local addresses of the ports, and the MAC address of the bridge device.
The problem is that offloading those entries happens from a deferred
work item scheduled by the SWITCHDEV_FDB_DEL_TO_DEVICE handler, and this
races with the teardown of the CPU and DSA ports where the refcounting
is kept.
In fact, not only it races, but fundamentally speaking, if we iterate
through the port list linearly, we might end up tearing down the shared
ports even before we delete a DSA user port which has a bridge upper.
So as it turns out, we need to first tear down the user ports (and the
unused ones, for no better place of doing that), then the shared ports
(the CPU and DSA ports). In between, we need to ensure that all work
items scheduled by our switchdev handlers (which only run for user
ports, hence the reason why we tear them down first) have finished.
Fixes: 161ca59d39 ("net: dsa: reference count the MDB entries at the cross-chip notifier level")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Link: https://lore.kernel.org/r/20210914134726.2305133-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 9edceaf430 ]
When we remove the siblings entry, we update ns->head->list, hence we
can't separate the removal and test for being empty. They have to be
in the same critical section to avoid a race.
To avoid breaking the refcounting imbalance again, add a list empty
check to nvme_find_ns_head.
Fixes: 5396fdac56 ("nvme: fix refcounting imbalance when all paths are down")
Signed-off-by: Daniel Wagner <dwagner@suse.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Tested-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 472430a7b0 ]
When the command for querying imp info is issued to the firmware,
if the firmware does not support the command, the returned value
of bd num is 0.
Add protection mechanism before alloc memory to prevent apply for
0-length memory.
Fixes: 0b198b0d80 ("net: hns3: refactor dump m7 info of debugfs")
Signed-off-by: Jiaran Zhang <zhangjiaran@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 111b64e35e ]
The delay is especially needed by the xRX300 and xRX330 SoCs. Without
this patch, some phys are sometimes not properly detected.
The patch was tested on BT Home Hub 5A and D-Link DWR-966.
Fixes: a09d042b08 ("net: dsa: lantiq: allow to use all GPHYs on xRX300 and xRX330")
Signed-off-by: Aleksander Jan Bajkowski <olek2@wp.pl>
Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Acked-by: Hauke Mehrtens <hauke@hauke-m.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit ce062a0adb ]
When the mdio legacy mapping is used the mii_bus priv registered by DSA
refer to the dsa switch struct instead of the qca8k_priv struct and
causes a kernel panic. Create dedicated function when the internal
dedicated mdio driver is used to properly handle the 2 different
implementation.
Fixes: 759bafb8a3 ("net: dsa: qca8k: add support for internal phy and internal mdio")
Signed-off-by: Ansuel Smith <ansuelsmth@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit bfe8443509 ]
There are two cases where the current PF does not support RDMA
functionality. The first is if the NVM loaded on the device is set
to not support RDMA (common_caps.rdma is false). The second is if
the kernel bonding driver has included the current PF in an active
link aggregate.
When the driver has determined that this PF does not support RDMA, then
auxiliary devices should not be created on the auxiliary bus. Without
a device on the auxiliary bus, even if the irdma driver is present, there
will be no RDMA activity attempted on this PF.
Currently, in the reset flow, an attempt to create auxiliary devices is
performed without regard to the ability of the PF. There needs to be a
check in ice_aux_plug_dev (as the central point that creates auxiliary
devices) to see if the PF is in a state to support the functionality.
When disabling and re-enabling RDMA due to the inclusion/removal of the PF
in a link aggregate, we also need to set/clear the bit which controls
auxiliary device creation so that a reset recovery in a link aggregate
situation doesn't try to create auxiliary devices when it shouldn't.
Fixes: f9f5301e7e ("ice: Register auxiliary device to provide RDMA")
Reported-by: Yongxin Liu <yongxin.liu@windriver.com>
Signed-off-by: Dave Ertman <david.m.ertman@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit c91c1da72b ]
Some profiles of the driver don't support a dedicated PTP-RQ, hence can't
support HW TS and CQE compression simultaneously. When HW TS is enabled
the COE compression is disabled, and should be restored when the HW TS
is turned off. Add rx_filter as an input to modifying CQE compression to
enforce this restriction.
Fixes: 256f79d13c ("net/mlx5e: Fix HW TS with CQE compression according to profile")
Signed-off-by: Aya Levin <ayal@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f1940d4e9c ]
The following crash happens when a never-used device is unbound from
uio_hv_generic driver:
kernel BUG at mm/slub.c:321!
invalid opcode: 0000 [#1] SMP PTI
CPU: 0 PID: 4001 Comm: bash Kdump: loaded Tainted: G X --------- --- 5.14.0-0.rc2.23.el9.x86_64 #1
Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS 090008 12/07/2018
RIP: 0010:__slab_free+0x1d5/0x3d0
...
Call Trace:
? pick_next_task_fair+0x18e/0x3b0
? __cond_resched+0x16/0x40
? vunmap_pmd_range.isra.0+0x154/0x1c0
? __vunmap+0x22d/0x290
? hv_ringbuffer_cleanup+0x36/0x40 [hv_vmbus]
kfree+0x331/0x380
? hv_uio_remove+0x43/0x60 [uio_hv_generic]
hv_ringbuffer_cleanup+0x36/0x40 [hv_vmbus]
vmbus_free_ring+0x21/0x60 [hv_vmbus]
hv_uio_remove+0x4f/0x60 [uio_hv_generic]
vmbus_remove+0x23/0x30 [hv_vmbus]
__device_release_driver+0x17a/0x230
device_driver_detach+0x3c/0xa0
unbind_store+0x113/0x130
...
The problem appears to be that we free 'ring_info->pkt_buffer' twice:
first, when the device is unbound from in-kernel driver (netvsc in this
case) and second from hv_uio_remove(). Normally, ring buffer is supposed
to be re-initialized from hv_uio_open() but this happens when UIO device
is being opened and this is not guaranteed to happen.
Generally, it is OK to call hv_ringbuffer_cleanup() twice for the same
channel (which is being handed over between in-kernel drivers and UIO) even
if we didn't call hv_ringbuffer_init() in between. We, however, need to
avoid kfree() call for an already freed pointer.
Fixes: adae1e931a ("Drivers: hv: vmbus: Copy packets sent by Hyper-V out of the ring buffer")
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Andrea Parri <parri.andrea@gmail.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/20210831143916.144983-1-vkuznets@redhat.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit 2a48d96fd5 upstream.
Use __maybe_unused for noirq_suspend()/noirq_resume() hooks to avoid
build warning with !CONFIG_PM_SLEEP:
>> drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c:796:12: error: 'stmmac_pltfr_noirq_resume' defined but not used [-Werror=unused-function]
796 | static int stmmac_pltfr_noirq_resume(struct device *dev)
| ^~~~~~~~~~~~~~~~~~~~~~~~~
>> drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c:775:12: error: 'stmmac_pltfr_noirq_suspend' defined but not used [-Werror=unused-function]
775 | static int stmmac_pltfr_noirq_suspend(struct device *dev)
| ^~~~~~~~~~~~~~~~~~~~~~~~~~
cc1: all warnings being treated as errors
Fixes: 276aae3772 ("net: stmmac: fix system hang caused by eee_ctrl_timer during suspend/resume")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Joakim Zhang <qiangqing.zhang@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 427900d27d upstream.
Currently, the VF does not clear the interrupt source immediately after
receiving the interrupt. As a result, if the second interrupt task is
triggered when processing the first interrupt task, clearing the
interrupt source before exiting will clear the interrupt sources of the
two tasks at the same time. As a result, no interrupt is triggered for
the second task. The VF detects the missed message only when the next
interrupt is generated.
Clearing it immediately after executing check_evt_cause ensures that:
1. Even if two interrupt tasks are triggered at the same time, they can
be processed.
2. If the second task is triggered during the processing of the first
task and the interrupt source is not cleared, the interrupt is reported
after vector0 is enabled.
Fixes: b90fcc5bd9 ("net: hns3: add reset handling for VF when doing Core/Global/IMP reset")
Signed-off-by: Jiaran Zhang <zhangjiaran@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b81d894874 upstream.
The firmware will not disable mac in flr process. Therefore, the driver
needs to proactively disable mac during flr, which is the same as the
function reset.
Fixes: 35d93a3004 ("net: hns3: adjust the process of PF reset")
Signed-off-by: Yufeng Mo <moyufeng@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1dc839ec09 upstream.
Currently, affinity_mask is set to a single cpu. As a result,
irqbalance becomes invalid in SUBSET or EXACT mode. To solve
this problem, change affinity_mask to numa node range. In this
way, irqbalance can be performed on the cpu of the numa node.
Fixes: 0812545487 ("net: hns3: add interrupt affinity support for misc interrupt")
Signed-off-by: Yufeng Mo <moyufeng@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d18e81183b upstream.
The hardware cannot handle short tunnel frames below 65 bytes,
and will cause vlan tag missing problem. So pads packet size to
65 bytes for tunnel frames to fix this bug.
Fixes: 3db084d28dc0("net: hns3: Fix for vxlan tx checksum bug")
Signed-off-by: Yufeng Mo <moyufeng@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1affc01fdc upstream.
The call to bnxt_free_mem(..., false) in the bnxt_half_open_nic() error
path will deallocate ring descriptor memory via bnxt_free_?x_rings(),
but because irq_re_init is false, the ring info itself is not freed.
To simplify error paths, deallocation functions have generally been
written to be safe when called on unallocated memory. It should always
be safe to call dev_close(), which calls bnxt_free_skbs() a second time,
even in this semi- allocated ring state.
Calling bnxt_free_skbs() a second time with the rings already freed will
cause NULL pointer dereference. Fix it by checking the rings are valid
before proceeding in bnxt_free_tx_skbs() and
bnxt_free_one_rx_ring_skbs().
Fixes: 975bc99a4a ("bnxt_en: Refactor bnxt_free_rx_skbs().")
Signed-off-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a8b92b8c1e upstream.
We should not walk/touch page tables outside of VMA boundaries when
holding only the mmap sem in read mode. Evil user space can modify the
VMA layout just before this function runs and e.g., trigger races with
page table removal code since commit dd2283f260 ("mm: mmap: zap pages
with read mmap_sem in munmap").
find_vma() does not check if the address is >= the VMA start address;
use vma_lookup() instead.
Reviewed-by: Niklas Schnelle <schnelle@linux.ibm.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Fixes: dd2283f260 ("mm: mmap: zap pages with read mmap_sem in munmap")
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3a1e92d089 upstream.
We queue an irq work for deferred processing of mce event in realmode
mce handler, where translation is disabled. Queuing of the work may
result in accessing memory outside RMO region, such access needs the
translation to be enabled for an LPAR running with hash mmu else the
kernel crashes.
After enabling translation in mce_handle_error() we used to leave it
enabled to avoid crashing here, but now with the commit
74c3354bc1 ("powerpc/pseries/mce: restore msr before returning from
handler") we are restoring the MSR to disable translation.
Hence to fix this enable the translation before queuing the work.
Without this change following trace is seen on injecting SLB multihit in
an LPAR running with hash mmu.
Oops: Kernel access of bad area, sig: 11 [#1]
LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries
CPU: 5 PID: 1883 Comm: insmod Tainted: G OE 5.14.0-mce+ #137
NIP: c000000000735d60 LR: c000000000318640 CTR: 0000000000000000
REGS: c00000001ebff9a0 TRAP: 0300 Tainted: G OE (5.14.0-mce+)
MSR: 8000000000001003 <SF,ME,RI,LE> CR: 28008228 XER: 00000001
CFAR: c00000000031863c DAR: c00000027fa8fe08 DSISR: 40000000 IRQMASK: 0
...
NIP llist_add_batch+0x0/0x40
LR __irq_work_queue_local+0x70/0xc0
Call Trace:
0xc00000001ebffc0c (unreliable)
irq_work_queue+0x40/0x70
machine_check_queue_event+0xbc/0xd0
machine_check_early_common+0x16c/0x1f4
Fixes: 74c3354bc1 ("powerpc/pseries/mce: restore msr before returning from handler")
Signed-off-by: Ganesh Goudar <ganeshgr@linux.ibm.com>
[mpe: Fix comment formatting, trim oops in change log for readability]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210909064330.312432-1-ganeshgr@linux.ibm.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ae7aaecc3f upstream.
The rfscv instruction does not work correctly with the fake-suspend mode
in POWER9, which can end up with the hypervisor restoring an incorrect
checkpoint.
Work around this by setting the _TIF_RESTOREALL flag if a system call
returns to a transaction active state, causing rfid to be used instead
of rfscv to return, which will do the right thing. The contents of the
registers are irrelevant because they will be overwritten in this case
anyway.
Fixes: 7fa95f9ada ("powerpc/64s: system call support for scv/rfscv instructions")
Reported-by: Eirik Fuller <efuller@redhat.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210908101718.118522-1-npiggin@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 267cdfa213 upstream.
POWER9 DD2.2 and 2.3 hardware implements a "fake-suspend" mode where
certain TM instructions executed in HV=0 mode cause softpatch interrupts
so the hypervisor can emulate them and prevent problematic processor
conditions. In this fake-suspend mode, the treclaim. instruction does
not modify registers.
Unfortunately the rfscv instruction executed by the guest do not
generate softpatch interrupts, which can cause the hypervisor to lose
track of the fake-suspend mode, and it can execute this treclaim. while
not in fake-suspend mode. This modifies GPRs and crashes the hypervisor.
It's not trivial to disable scv in the guest with HFSCR now, because
they assume a POWER9 has scv available. So this fix saves and restores
checkpointed registers across the treclaim.
Fixes: 7854f7545b ("KVM: PPC: Book3S: Rework TM save/restore code and make it C-callable")
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210908101718.118522-2-npiggin@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 273c29e944 upstream.
If a failover occurs before a login response is received, the login
response buffer maybe undefined. Check that there was no failover
before accessing the login response buffer.
Fixes: 032c5e8284 ("Driver for IBM System i/p VNIC protocol")
Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e50e711351 upstream.
Turn udp_tunnel_nic work-queue to an ordered work-queue. This queue
holds the UDP-tunnel configuration commands of the different netdevs.
When the netdevs are functions of the same NIC the order of
execution may be crucial.
Problem example:
NIC with 2 PFs, both PFs declare offload quota of up to 3 UDP-ports.
$ifconfig eth2 1.1.1.1/16 up
$ip link add eth2_19503 type vxlan id 5049 remote 1.1.1.2 dev eth2 dstport 19053
$ip link set dev eth2_19503 up
$ip link add eth2_19504 type vxlan id 5049 remote 1.1.1.3 dev eth2 dstport 19054
$ip link set dev eth2_19504 up
$ip link add eth2_19505 type vxlan id 5049 remote 1.1.1.4 dev eth2 dstport 19055
$ip link set dev eth2_19505 up
$ip link add eth2_19506 type vxlan id 5049 remote 1.1.1.5 dev eth2 dstport 19056
$ip link set dev eth2_19506 up
NIC RX port offload infrastructure offloads the first 3 UDP-ports (on
all devices which sets NETIF_F_RX_UDP_TUNNEL_PORT feature) and not
UDP-port 19056. So both PFs gets this offload configuration.
$ip link set dev eth2_19504 down
This triggers udp-tunnel-core to remove the UDP-port 19504 from
offload-ports-list and offload UDP-port 19056 instead.
In this scenario it is important that the UDP-port of 19504 will be
removed from both PFs before trying to add UDP-port 19056. The NIC can
stop offloading a UDP-port only when all references are removed.
Otherwise the NIC may report exceeding of the offload quota.
Fixes: cc4e3835ef ("udp_tunnel: add central NIC RX port offload infrastructure")
Signed-off-by: Aya Levin <ayal@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 20e100f527 upstream.
Handle MFW (management FW) error response in order to avoid a crash
during recovery flows.
Changes from v1:
- Add "Fixes tag".
Fixes: tag 5e7ba042fd ("qed: Fix reading stale configuration information")
Signed-off-by: Ariel Elior <aelior@marvell.com>
Signed-off-by: Shai Malin <smalin@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1b704b27be upstream.
If altname deletion of the short alternative name fails, the error
message printed is: "Failed to add short alternative name".
This is obviously a typo, as we are testing altname deletion.
Fix this using a proper error message.
Fixes: f95e6c9c46 ("selftest: net: add alternative names test")
Signed-off-by: Andrea Claudi <aclaudi@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4f884f3962 upstream.
Commit 10d3be5692 ("tcp-tso: do not split TSO packets at retransmit
time") may directly retrans a multiple segments TSO/GSO packet without
split, Since this commit, we can no longer assume that a retransmitted
packet is a single segment.
This patch fixes the tp->undo_retrans accounting in tcp_sacktag_one()
that use the actual segments(pcount) of the retransmitted packet.
Before that commit (10d3be5692), the assumption underlying the
tp->undo_retrans-- seems correct.
Fixes: 10d3be5692 ("tcp-tso: do not split TSO packets at retransmit time")
Signed-off-by: zhenggy <zhenggy@chinatelecom.cn>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Acked-by: Yuchung Cheng <ycheng@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a69ae291e1 upstream.
Commit 865c50e1d2 ("x86/uaccess: utilize CONFIG_CC_HAS_ASM_GOTO_OUTPUT")
added an optimised version of __get_user_asm() for x86 using 'asm goto'.
Like the non-optimised code, the 32-bit implementation of 64-bit
get_user() expands to a pair of 32-bit accesses. Unlike the
non-optimised code, the _original_ pointer is incremented to copy the
high word instead of loading through a new pointer explicitly
constructed to point at a 32-bit type. Consequently, if the pointer
points at a 64-bit type then we end up loading the wrong data for the
upper 32-bits.
This was observed as a mount() failure in Android targeting i686 after
b0cfcdd9b9 ("d_path: make 'prepend()' fill up the buffer exactly on
overflow") because the call to copy_from_kernel_nofault() from
prepend_copy() ends up in __get_kernel_nofault() and casts the source
pointer to a 'u64 __user *'. An attempt to mount at "/debug_ramdisk"
therefore ends up failing trying to mount "/debumdismdisk".
Use the existing '__gu_ptr' source pointer to unsigned int for 32-bit
__get_user_asm_u64() instead of the original pointer.
Cc: Bill Wendling <morbo@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Reported-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Fixes: 865c50e1d2 ("x86/uaccess: utilize CONFIG_CC_HAS_ASM_GOTO_OUTPUT")
Signed-off-by: Will Deacon <will@kernel.org>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Tested-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6a52e73368 upstream.
DSA supports connecting to a phy-handle, and has a fallback to a non-OF
based method of connecting to an internal PHY on the switch's own MDIO
bus, if no phy-handle and no fixed-link nodes were present.
The -ENODEV error code from the first attempt (phylink_of_phy_connect)
is what triggers the second attempt (phylink_connect_phy).
However, when the first attempt returns a different error code than
-ENODEV, this results in an unbalance of calls to phylink_create and
phylink_destroy by the time we exit the function. The phylink instance
has leaked.
There are many other error codes that can be returned by
phylink_of_phy_connect. For example, phylink_validate returns -EINVAL.
So this is a practical issue too.
Fixes: aab9c4067d ("net: dsa: Plug in PHYLINK support")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://lore.kernel.org/r/20210914134331.2303380-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 04f08eb44b upstream.
syzbot reported another data-race in af_unix [1]
Lets change __skb_insert() to use WRITE_ONCE() when changing
skb head qlen.
Also, change unix_dgram_poll() to use lockless version
of unix_recvq_full()
It is verry possible we can switch all/most unix_recvq_full()
to the lockless version, this will be done in a future kernel version.
[1] HEAD commit: 8596e589b7
BUG: KCSAN: data-race in skb_queue_tail / unix_dgram_poll
write to 0xffff88814eeb24e0 of 4 bytes by task 25815 on cpu 0:
__skb_insert include/linux/skbuff.h:1938 [inline]
__skb_queue_before include/linux/skbuff.h:2043 [inline]
__skb_queue_tail include/linux/skbuff.h:2076 [inline]
skb_queue_tail+0x80/0xa0 net/core/skbuff.c:3264
unix_dgram_sendmsg+0xff2/0x1600 net/unix/af_unix.c:1850
sock_sendmsg_nosec net/socket.c:703 [inline]
sock_sendmsg net/socket.c:723 [inline]
____sys_sendmsg+0x360/0x4d0 net/socket.c:2392
___sys_sendmsg net/socket.c:2446 [inline]
__sys_sendmmsg+0x315/0x4b0 net/socket.c:2532
__do_sys_sendmmsg net/socket.c:2561 [inline]
__se_sys_sendmmsg net/socket.c:2558 [inline]
__x64_sys_sendmmsg+0x53/0x60 net/socket.c:2558
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x3d/0x90 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x44/0xae
read to 0xffff88814eeb24e0 of 4 bytes by task 25834 on cpu 1:
skb_queue_len include/linux/skbuff.h:1869 [inline]
unix_recvq_full net/unix/af_unix.c:194 [inline]
unix_dgram_poll+0x2bc/0x3e0 net/unix/af_unix.c:2777
sock_poll+0x23e/0x260 net/socket.c:1288
vfs_poll include/linux/poll.h:90 [inline]
ep_item_poll fs/eventpoll.c:846 [inline]
ep_send_events fs/eventpoll.c:1683 [inline]
ep_poll fs/eventpoll.c:1798 [inline]
do_epoll_wait+0x6ad/0xf00 fs/eventpoll.c:2226
__do_sys_epoll_wait fs/eventpoll.c:2238 [inline]
__se_sys_epoll_wait fs/eventpoll.c:2233 [inline]
__x64_sys_epoll_wait+0xf6/0x120 fs/eventpoll.c:2233
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x3d/0x90 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x44/0xae
value changed: 0x0000001b -> 0x00000001
Reported by Kernel Concurrency Sanitizer on:
CPU: 1 PID: 25834 Comm: syz-executor.1 Tainted: G W 5.14.0-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Fixes: 86b18aaa2b ("skbuff: fix a data race in skb_queue_len()")
Cc: Qian Cai <cai@lca.pw>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3c4cea8fa7 upstream.
If the sendmsg() call in vhost_tx_batch() fails, both the 'batched_xdp'
and 'done_idx' indexes are left unchanged. If such failure happens
when batched_xdp == VHOST_NET_BATCH, the next call to
vhost_net_build_xdp() will access and write memory outside the xdp
buffers area.
Since sendmsg() can only error with EBADFD, this change addresses the
issue explicitly freeing the XDP buffers batch on error.
Fixes: 0a0be13b8f ("vhost_net: batch submitting XDP buffers to underlayer sockets")
Suggested-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ec783c7cb2 upstream.
We need to import the 'sys' package since the script has called
sys.exit() method.
Fixes: 6ad7cbc015 ("Makefile: Add clang-tidy and static analyzer support to makefile")
Signed-off-by: Kortan <kortanzh@gmail.com>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b5c102238c upstream.
There is an off-by-one problem in ipa_table_init_add(), when
initializing filter tables.
In that function, the number of filter table entries is determined
based on the number of set bits in the filter map. However that
count does *not* include the extra "slot" in the filter table that
holds the filter map itself. Meanwhile, ipa_table_addr() *does*
include the filter map in the memory it returns, but because the
count it's provided doesn't include it, it includes one too few
table entries.
Fix this by including the extra slot for the filter map in the count
computed in ipa_table_init_add().
Note: ipa_filter_reset_table() does not have this problem; it resets
filter table entries one by one, but does not overwrite the filter
bitmap.
Fixes: 2b9feef2b6 ("soc: qcom: ipa: filter and routing tables")
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b89a05b21f upstream.
In perf_event_addr_filters_apply, the task associated with
the event (event->ctx->task) is read using READ_ONCE at the beginning
of the function, checked, and then re-read from event->ctx->task,
voiding all guarantees of the checks. Reuse the value that was read by
READ_ONCE to ensure the consistency of the task struct throughout the
function.
Fixes: 375637bc52 ("perf/core: Introduce address range filtering")
Signed-off-by: Baptiste Lepers <baptiste.lepers@gmail.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20210906015310.12802-1-baptiste.lepers@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b871895b14 upstream.
If a system call is made with a transaction active, the kernel
immediately aborts it and returns. scv system calls disable irqs even
earlier in their interrupt handler, and tabort_syscall does not fix this
up.
This can result in irq soft-mask state being messed up on the next
kernel entry, and crashing at BUG_ON(arch_irq_disabled_regs(regs)) in
the kernel exit handlers, or possibly worse.
This can't easily be fixed in asm because at this point an async irq may
have hit, which is soft-masked and marked pending. The pending interrupt
has to be replayed before returning to userspace. The fix is to move the
tabort_syscall code to C in the main syscall handler, and just skip the
system call but otherwise return as usual, which will take care of the
pending irqs. This also does a bunch of other things including possible
signal delivery to the process, but the doomed transaction should still
be aborted when it is eventually returned to.
The sc system call path is changed to use the new C function as well to
reduce code and path differences. This slows down how quickly system
calls are aborted when called while a transaction is active, which could
potentially impact TM performance. But making any system call is already
bad for performance, and TM is on the way out, so go with simpler over
faster.
Fixes: 7fa95f9ada ("powerpc/64s: system call support for scv/rfscv instructions")
Reported-by: Eirik Fuller <efuller@redhat.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
[mpe: Use #ifdef rather than IS_ENABLED() to fix build error on 32-bit]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210903125707.1601269-1-npiggin@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 70f437fb43 upstream.
Dispatching requests inline with the .queue_rq() call may block while
holding the send_mutex. If the tcp io_work also happens to schedule, it
may see the req_list is non-empty, leaving "pending" true and remaining
in TASK_RUNNING. Since io_work is of higher scheduling priority, the
.queue_rq task may not get a chance to run, blocking forward progress
and leading to io timeouts.
Instead of checking for pending requests within io_work, let the queueing
restart io_work outside the send_mutex lock if there is more work to be
done.
Fixes: a0fdd14180 ("nvme-tcp: rerun io_work if req_list is not empty")
Reported-by: Samuel Jones <sjones@kalrayinc.com>
Signed-off-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 40ee363c84 upstream.
Checking tunnel offloading, it turns out that offloading doesn't work
as expected. The following script allows to reproduce the issue.
Call it as `testscript DEVICE LOCALIP REMOTEIP NETMASK'
=== SNIP ===
if [ $# -ne 4 ]
then
echo "Usage $0 DEVICE LOCALIP REMOTEIP NETMASK"
exit 1
fi
DEVICE="$1"
LOCAL_ADDRESS="$2"
REMOTE_ADDRESS="$3"
NWMASK="$4"
echo "Driver: $(ethtool -i ${DEVICE} | awk '/^driver:/{print $2}') "
ethtool -k "${DEVICE}" | grep tx-udp
echo
echo "Set up NIC and tunnel..."
ip addr add "${LOCAL_ADDRESS}/${NWMASK}" dev "${DEVICE}"
ip link set "${DEVICE}" up
sleep 2
ip link add vxlan1 type vxlan id 42 \
remote "${REMOTE_ADDRESS}" \
local "${LOCAL_ADDRESS}" \
dstport 0 \
dev "${DEVICE}"
ip addr add fc00::1/64 dev vxlan1
ip link set vxlan1 up
sleep 2
rm -f vxlan.pcap
echo "Running tcpdump and iperf3..."
( nohup tcpdump -i any -w vxlan.pcap >/dev/null 2>&1 ) &
sleep 2
iperf3 -c fc00::2 >/dev/null
pkill tcpdump
echo
echo -n "Max. Paket Size: "
tcpdump -r vxlan.pcap -nnle 2>/dev/null \
| grep "${LOCAL_ADDRESS}.*> ${REMOTE_ADDRESS}.*OTV" \
| awk '{print $8}' | awk -F ':' '{print $1}' \
| sort -n | tail -1
echo
ip link del vxlan1
ip addr del ${LOCAL_ADDRESS}/${NWMASK} dev "${DEVICE}"
=== SNAP ===
The expected outcome is
Max. Paket Size: 64904
This is what you see on igb, the code igc has been taken from.
However, on igc the output is
Max. Paket Size: 1516
so the GSO aggregate packets are segmented by the kernel before calling
igc_xmit_frame. Inside the subsequent call to igc_tso, the check for
skb_is_gso(skb) fails and the function returns prematurely.
It turns out that this occurs because the feature flags aren't set
entirely correctly in igc_probe. In contrast to the original code
from igb_probe, igc_probe neglects to set the flags required to allow
tunnel offloading.
Setting the same flags as igb fixes the issue on igc.
Fixes: 34428dff36 ("igc: Add GSO partial support")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Tested-by: Corinna Vinschen <vinschen@redhat.com>
Acked-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Nechama Kraus <nechamax.kraus@linux.intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 276aae3772 upstream.
commit 5f58591323 ("net: stmmac: delete the eee_ctrl_timer after
napi disabled"), this patch tries to fix system hang caused by eee_ctrl_timer,
unfortunately, it only can resolve it for system reboot stress test. System
hang also can be reproduced easily during system suspend/resume stess test
when mount NFS on i.MX8MP EVK board.
In stmmac driver, eee feature is combined to phylink framework. When do
system suspend, phylink_stop() would queue delayed work, it invokes
stmmac_mac_link_down(), where to deactivate eee_ctrl_timer synchronizly.
In above commit, try to fix issue by deactivating eee_ctrl_timer obviously,
but it is not enough. Looking into eee_ctrl_timer expire callback
stmmac_eee_ctrl_timer(), it could enable hareware eee mode again. What is
unexpected is that LPI interrupt (MAC_Interrupt_Enable.LPIEN bit) is always
asserted. This interrupt has chance to be issued when LPI state entry/exit
from the MAC, and at that time, clock could have been already disabled.
The result is that system hang when driver try to touch register from
interrupt handler.
The reason why above commit can fix system hang issue in stmmac_release()
is that, deactivate eee_ctrl_timer not just after napi disabled, further
after irq freed.
In conclusion, hardware would generate LPI interrupt when clock has been
disabled during suspend or resume, since hardware is in eee mode and LPI
interrupt enabled.
Interrupts from MAC, MTL and DMA level are enabled and never been disabled
when system suspend, so postpone clocks management from suspend stage to
noirq suspend stage should be more safe.
Fixes: 5f58591323 ("net: stmmac: delete the eee_ctrl_timer after napi disabled")
Signed-off-by: Joakim Zhang <qiangqing.zhang@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ee27e330a9 upstream.
Fixes the below flow of sleeping in atomic context by releasing
the RCU lock before calling to free_match_list.
build_match_list() <- disables preempt
-> free_match_list()
-> tree_put_node()
-> down_write_ref_node() <- take write lock
Fixes: 693c6883bb ("net/mlx5: Add hash table for flow groups in flow table")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Maor Gottlieb <maorg@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 57f0ff059e upstream.
It's later supposed to be either a correct address or NULL. Without the
initialization, it may contain an undefined value which results in the
following segmentation fault:
# perf top --sort comm -g --ignore-callees=do_idle
terminates with:
#0 0x00007ffff56b7685 in __strlen_avx2 () from /lib64/libc.so.6
#1 0x00007ffff55e3802 in strdup () from /lib64/libc.so.6
#2 0x00005555558cb139 in hist_entry__init (callchain_size=<optimized out>, sample_self=true, template=0x7fffde7fb110, he=0x7fffd801c250) at util/hist.c:489
#3 hist_entry__new (template=template@entry=0x7fffde7fb110, sample_self=sample_self@entry=true) at util/hist.c:564
#4 0x00005555558cb4ba in hists__findnew_entry (hists=hists@entry=0x5555561d9e38, entry=entry@entry=0x7fffde7fb110, al=al@entry=0x7fffde7fb420,
sample_self=sample_self@entry=true) at util/hist.c:657
#5 0x00005555558cba1b in __hists__add_entry (hists=hists@entry=0x5555561d9e38, al=0x7fffde7fb420, sym_parent=<optimized out>, bi=bi@entry=0x0, mi=mi@entry=0x0,
sample=sample@entry=0x7fffde7fb4b0, sample_self=true, ops=0x0, block_info=0x0) at util/hist.c:288
#6 0x00005555558cbb70 in hists__add_entry (sample_self=true, sample=0x7fffde7fb4b0, mi=0x0, bi=0x0, sym_parent=<optimized out>, al=<optimized out>, hists=0x5555561d9e38)
at util/hist.c:1056
#7 iter_add_single_cumulative_entry (iter=0x7fffde7fb460, al=<optimized out>) at util/hist.c:1056
#8 0x00005555558cc8a4 in hist_entry_iter__add (iter=iter@entry=0x7fffde7fb460, al=al@entry=0x7fffde7fb420, max_stack_depth=<optimized out>, arg=arg@entry=0x7fffffff7db0)
at util/hist.c:1231
#9 0x00005555557cdc9a in perf_event__process_sample (machine=<optimized out>, sample=0x7fffde7fb4b0, evsel=<optimized out>, event=<optimized out>, tool=0x7fffffff7db0)
at builtin-top.c:842
#10 deliver_event (qe=<optimized out>, qevent=<optimized out>) at builtin-top.c:1202
#11 0x00005555558a9318 in do_flush (show_progress=false, oe=0x7fffffff80e0) at util/ordered-events.c:244
#12 __ordered_events__flush (oe=oe@entry=0x7fffffff80e0, how=how@entry=OE_FLUSH__TOP, timestamp=timestamp@entry=0) at util/ordered-events.c:323
#13 0x00005555558a9789 in __ordered_events__flush (timestamp=<optimized out>, how=<optimized out>, oe=<optimized out>) at util/ordered-events.c:339
#14 ordered_events__flush (how=OE_FLUSH__TOP, oe=0x7fffffff80e0) at util/ordered-events.c:341
#15 ordered_events__flush (oe=oe@entry=0x7fffffff80e0, how=how@entry=OE_FLUSH__TOP) at util/ordered-events.c:339
#16 0x00005555557cd631 in process_thread (arg=0x7fffffff7db0) at builtin-top.c:1114
#17 0x00007ffff7bb817a in start_thread () from /lib64/libpthread.so.0
#18 0x00007ffff5656dc3 in clone () from /lib64/libc.so.6
If you look at the frame #2, the code is:
488 if (he->srcline) {
489 he->srcline = strdup(he->srcline);
490 if (he->srcline == NULL)
491 goto err_rawdata;
492 }
If he->srcline is not NULL (it is not NULL if it is uninitialized rubbish),
it gets strdupped and strdupping a rubbish random string causes the problem.
Also, if you look at the commit 1fb7d06a50, it adds the srcline property
into the struct, but not initializing it everywhere needed.
Committer notes:
Now I see, when using --ignore-callees=do_idle we end up here at line
2189 in add_callchain_ip():
2181 if (al.sym != NULL) {
2182 if (perf_hpp_list.parent && !*parent &&
2183 symbol__match_regex(al.sym, &parent_regex))
2184 *parent = al.sym;
2185 else if (have_ignore_callees && root_al &&
2186 symbol__match_regex(al.sym, &ignore_callees_regex)) {
2187 /* Treat this symbol as the root,
2188 forgetting its callees. */
2189 *root_al = al;
2190 callchain_cursor_reset(cursor);
2191 }
2192 }
And the al that doesn't have the ->srcline field initialized will be
copied to the root_al, so then, back to:
1211 int hist_entry_iter__add(struct hist_entry_iter *iter, struct addr_location *al,
1212 int max_stack_depth, void *arg)
1213 {
1214 int err, err2;
1215 struct map *alm = NULL;
1216
1217 if (al)
1218 alm = map__get(al->map);
1219
1220 err = sample__resolve_callchain(iter->sample, &callchain_cursor, &iter->parent,
1221 iter->evsel, al, max_stack_depth);
1222 if (err) {
1223 map__put(alm);
1224 return err;
1225 }
1226
1227 err = iter->ops->prepare_entry(iter, al);
1228 if (err)
1229 goto out;
1230
1231 err = iter->ops->add_single_entry(iter, al);
1232 if (err)
1233 goto out;
1234
That al at line 1221 is what hist_entry_iter__add() (called from
sample__resolve_callchain()) saw as 'root_al', and then:
iter->ops->add_single_entry(iter, al);
will go on with al->srcline with a bogus value, I'll add the above
sequence to the cset and apply, thanks!
Signed-off-by: Michael Petlan <mpetlan@redhat.com>
CC: Milian Wolff <milian.wolff@kdab.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Fixes: 1fb7d06a50 ("perf report Use srcline from callchain for hist entries")
Link: https //lore.kernel.org/r/20210719145332.29747-1-mpetlan@redhat.com
Reported-by: Juri Lelli <jlelli@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 040b8907cc upstream.
With the new static annotation, the compiler warns when the functions
are actually unused:
drivers/gpu/drm/rockchip/cdn-dp-core.c:1123:12: error: 'cdn_dp_resume' defined but not used [-Werror=unused-function]
1123 | static int cdn_dp_resume(struct device *dev)
| ^~~~~~~~~~~~~
Mark them __maybe_unused to suppress that warning as well.
[ Not so 'new' static annotations any more, and I removed the part of
the patch that added __maybe_unused to cdn_dp_suspend(), because it's
used by the shutdown/remove code.
So only the resume function ends up possibly unused if CONFIG_PM isn't
set - Linus ]
Fixes: 7c49abb4c2 ("drm/rockchip: cdn-dp-core: Make cdn_dp_core_suspend/resume static")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Enric Balletbo i Serra <enric.balletbo@collabora.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f4bb62e64c upstream.
In tipc_sk_enqueue() we use hardcoded 2 jiffies to extract
socket buffer from generic queue to particular socket.
The 2 jiffies is too short in case there are other high priority
tasks get CPU cycles for multiple jiffies update. As result, no
buffer could be enqueued to particular socket.
To solve this, we switch to use constant timeout 20msecs.
Then, the function will be expired between 2 jiffies (CONFIG_100HZ)
and 20 jiffies (CONFIG_1000HZ).
Fixes: c637c10355 ("tipc: resolve race problem at unicast message reception")
Acked-by: Jon Maloy <jmaloy@redhat.com>
Signed-off-by: Hoang Le <hoang.h.le@dektech.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e3f0cc1a94 upstream.
A number of users have reported that they were not able to get the PHY
to successfully link up, especially after commit c36757eb9d ("net:
phy: consider AN_RESTART status when reading link status") where we
stopped reading just BMSR, but we also read BMCR to determine the link
status.
Andrius at NetBSD did a wonderful job at debugging the problem
and found out that the MDIO bus clock frequency would be incorrectly set
back to its default value which would prevent the MDIO bus controller
from reading PHY registers properly. Back when we only read BMSR, if we
read all 1s, we could falsely indicate a link status, though in general
there is a cable plugged in, so this went unnoticed. After a second read
of BMCR was added, a wrong read will lead to the inability to determine
a link UP condition which is when it started to be visibly broken, even
if it was long before that.
The fix consists in restoring the value of the MD_CSR register that was
set prior to the MAC reset.
Link: http://gnats.netbsd.org/cgi-bin/query-pr-single.pl?number=53494
Fixes: 90f750a81a ("r6040: consolidate MAC reset to its own function")
Reported-by: Andrius V <vezhlys@gmail.com>
Reported-by: Darek Strugacz <darek.strugacz@op.pl>
Tested-by: Darek Strugacz <darek.strugacz@op.pl>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9b6ff7eb66 upstream.
The reference count leak issue may take place in an error handling
path. If both conditions of tunnel->version == L2TP_HDR_VER_3 and the
return value of l2tp_v3_ensure_opt_in_linear is nonzero, the function
would directly jump to label invalid, without decrementing the reference
count of the l2tp_session object session increased earlier by
l2tp_tunnel_get_session(). This may result in refcount leaks.
Fix this issue by decrease the reference count before jumping to the
label invalid.
Fixes: 4522a70db7 ("l2tp: fix reading optional fields of L2TPv3")
Signed-off-by: Xiyu Yang <xiyuyang19@fudan.edu.cn>
Signed-off-by: Xin Xiong <xiongx18@fudan.edu.cn>
Signed-off-by: Xin Tan <tanxin.ctf@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d9ea761fdd upstream.
Commit 2677d20677 ("dccp: don't free ccid2_hc_tx_sock ...") fixed
a UAF but reintroduced CVE-2017-6074.
When the sock is cloned, two dccps_hc_tx_ccid will reference to the
same ccid. So one can free the ccid object twice from two socks after
cloning.
This issue was found by "Hadar Manor" as well and assigned with
CVE-2020-16119, which was fixed in Ubuntu's kernel. So here I port
the patch from Ubuntu to fix it.
The patch prevents cloned socks from referencing the same ccid.
Fixes: 2677d20677 ("dccp: don't free ccid2_hc_tx_sock ...")
Signed-off-by: Zhenpeng Lin <zplin@psu.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7366c23ff4 upstream.
Building dp83640.c on arch/parisc/ produces a build warning for
PAGE0 being redefined. Since the macro is not used in the dp83640
driver, just make it a comment for documentation purposes.
In file included from ../drivers/net/phy/dp83640.c:23:
../drivers/net/phy/dp83640_reg.h:8: warning: "PAGE0" redefined
8 | #define PAGE0 0x0000
from ../drivers/net/phy/dp83640.c:11:
../arch/parisc/include/asm/page.h:187: note: this is the location of the previous definition
187 | #define PAGE0 ((struct zeropage *)__PAGE_OFFSET)
Fixes: cb646e2b02 ("ptp: Added a clock driver for the National Semiconductor PHYTER.")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Richard Cochran <richard.cochran@omicron.at>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Heiner Kallweit <hkallweit1@gmail.com>
Cc: Russell King <linux@armlinux.org.uk>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20210913220605.19682-1-rdunlap@infradead.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7c3a0a018e upstream.
Remove the assert from the callback priv lookup function since it does
not require RTNL lock and is already protected by flow_indr_block_lock.
This will avoid warnings from being emitted to dmesg if the driver
registers its callback after an ingress qdisc was created for a
netdevice.
The warnings started after the following patch was merged:
commit 74fc4f8287 ("net: Fix offloading indirect devices dependency on qdisc order creation")
Signed-off-by: Eli Cohen <elic@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9756e44fd4 upstream.
The commit 733c99ee8b ("net: fix NULL pointer reference in
cipso_v4_doi_free") was merged by a mistake, this patch try
to cleanup the mess.
And we already have the commit e842cb60e8 ("net: fix NULL
pointer reference in cipso_v4_doi_free") which fixed the root
cause of the issue mentioned in it's description.
Suggested-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: Michael Wang <yun.wang@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit cc19862ffe upstream.
syzbot reported an use-after-free crash:
BUG: KASAN: use-after-free in tipc_recvmsg+0xf77/0xf90 net/tipc/socket.c:1979
Call Trace:
tipc_recvmsg+0xf77/0xf90 net/tipc/socket.c:1979
sock_recvmsg_nosec net/socket.c:943 [inline]
sock_recvmsg net/socket.c:961 [inline]
sock_recvmsg+0xca/0x110 net/socket.c:957
tipc_conn_rcv_from_sock+0x162/0x2f0 net/tipc/topsrv.c:398
tipc_conn_recv_work+0xeb/0x190 net/tipc/topsrv.c:421
process_one_work+0x98d/0x1630 kernel/workqueue.c:2276
worker_thread+0x658/0x11f0 kernel/workqueue.c:2422
As Hoang pointed out, it was caused by skb_cb->bytes_read still accessed
after calling tsk_advance_rx_queue() to free the skb in tipc_recvmsg().
This patch is to fix it by accessing skb_cb->bytes_read earlier than
calling tsk_advance_rx_queue().
Fixes: f4919ff59c ("tipc: keep the skb in rcv queue until the whole data is read")
Reported-by: syzbot+e6741b97d5552f97c24d@syzkaller.appspotmail.com
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Jon Maloy <jmaloy@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 81065b35e2 upstream.
There are two cases for machine check recovery:
1) The machine check was triggered by ring3 (application) code.
This is the simpler case. The machine check handler simply queues
work to be executed on return to user. That code unmaps the page
from all users and arranges to send a SIGBUS to the task that
triggered the poison.
2) The machine check was triggered in kernel code that is covered by
an exception table entry. In this case the machine check handler
still queues a work entry to unmap the page, etc. but this will
not be called right away because the #MC handler returns to the
fix up code address in the exception table entry.
Problems occur if the kernel triggers another machine check before the
return to user processes the first queued work item.
Specifically, the work is queued using the ->mce_kill_me callback
structure in the task struct for the current thread. Attempting to queue
a second work item using this same callback results in a loop in the
linked list of work functions to call. So when the kernel does return to
user, it enters an infinite loop processing the same entry for ever.
There are some legitimate scenarios where the kernel may take a second
machine check before returning to the user.
1) Some code (e.g. futex) first tries a get_user() with page faults
disabled. If this fails, the code retries with page faults enabled
expecting that this will resolve the page fault.
2) Copy from user code retries a copy in byte-at-time mode to check
whether any additional bytes can be copied.
On the other side of the fence are some bad drivers that do not check
the return value from individual get_user() calls and may access
multiple user addresses without noticing that some/all calls have
failed.
Fix by adding a counter (current->mce_count) to keep track of repeated
machine checks before task_work() is called. First machine check saves
the address information and calls task_work_add(). Subsequent machine
checks before that task_work call back is executed check that the address
is in the same page as the first machine check (since the callback will
offline exactly one page).
Expected worst case is four machine checks before moving on (e.g. one
user access with page faults disabled, then a repeat to the same address
with page faults enabled ... repeat in copy tail bytes). Just in case
there is some code that loops forever enforce a limit of 10.
[ bp: Massage commit message, drop noinstr, fix typo, extend panic
messages. ]
Fixes: 5567d11c21 ("x86/mce: Send #MC singal from task work")
Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: <stable@vger.kernel.org>
Link: https://lkml.kernel.org/r/YT/IJ9ziLqmtqEPu@agluck-desk2.amr.corp.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 34b1999da9 upstream.
Jiri Olsa reported a fault when running:
# cat /proc/kallsyms | grep ksys_read
ffffffff8136d580 T ksys_read
# objdump -d --start-address=0xffffffff8136d580 --stop-address=0xffffffff8136d590 /proc/kcore
/proc/kcore: file format elf64-x86-64
Segmentation fault
general protection fault, probably for non-canonical address 0xf887ffcbff000: 0000 [#1] SMP PTI
CPU: 12 PID: 1079 Comm: objdump Not tainted 5.14.0-rc5qemu+ #508
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.14.0-4.fc34 04/01/2014
RIP: 0010:kern_addr_valid
Call Trace:
read_kcore
? rcu_read_lock_sched_held
? rcu_read_lock_sched_held
? rcu_read_lock_sched_held
? trace_hardirqs_on
? rcu_read_lock_sched_held
? lock_acquire
? lock_acquire
? rcu_read_lock_sched_held
? lock_acquire
? rcu_read_lock_sched_held
? rcu_read_lock_sched_held
? rcu_read_lock_sched_held
? lock_release
? _raw_spin_unlock
? __handle_mm_fault
? rcu_read_lock_sched_held
? lock_acquire
? rcu_read_lock_sched_held
? lock_release
proc_reg_read
? vfs_read
vfs_read
ksys_read
do_syscall_64
entry_SYSCALL_64_after_hwframe
The fault happens because kern_addr_valid() dereferences existent but not
present PMD in the high kernel mappings.
Such PMDs are created when free_kernel_image_pages() frees regions larger
than 2Mb. In this case, a part of the freed memory is mapped with PMDs and
the set_memory_np_noalias() -> ... -> __change_page_attr() sequence will
mark the PMD as not present rather than wipe it completely.
Have kern_addr_valid() check whether higher level page table entries are
present before trying to dereference them to fix this issue and to avoid
similar issues in the future.
Stable backporting note:
------------------------
Note that the stable marking is for all active stable branches because
there could be cases where pagetable entries exist but are not valid -
see 9a14aefc1d ("x86: cpa, fix lookup_address"), for example. So make
sure to be on the safe side here and use pXY_present() accessors rather
than pXY_none() which could #GP when accessing pages in the direct map.
Also see:
c40a56a781 ("x86/mm/init: Remove freed kernel image areas from alias mapping")
for more info.
Reported-by: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: David Hildenbrand <david@redhat.com>
Acked-by: Dave Hansen <dave.hansen@intel.com>
Tested-by: Jiri Olsa <jolsa@redhat.com>
Cc: <stable@vger.kernel.org> # 4.4+
Link: https://lkml.kernel.org/r/20210819132717.19358-1-rppt@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit aeef8b5089 upstream.
The end address passed to memtype_reserve() is handed directly to
sanitize_phys(). However, end is exclusive and sanitize_phys() expects
an inclusive address. If end falls at the end of the physical address
space, sanitize_phys() will return 0. This can result in drivers
failing to load, and the following warning:
WARNING: CPU: 26 PID: 749 at arch/x86/mm/pat.c:354 reserve_memtype+0x262/0x450
reserve_memtype failed: [mem 0x3ffffff00000-0xffffffffffffffff], req uncached-minus
Call Trace:
[<ffffffffa427b1f2>] reserve_memtype+0x262/0x450
[<ffffffffa42764aa>] ioremap_nocache+0x1a/0x20
[<ffffffffc04620a1>] mpt3sas_base_map_resources+0x151/0xa60 [mpt3sas]
[<ffffffffc0465555>] mpt3sas_base_attach+0xf5/0xa50 [mpt3sas]
---[ end trace 6d6eea4438db89ef ]---
ioremap reserve_memtype failed -22
mpt3sas_cm0: unable to map adapter memory! or resource not found
mpt3sas_cm0: failure at drivers/scsi/mpt3sas/mpt3sas_scsih.c:10597/_scsih_probe()!
Fix this by passing the inclusive end address to sanitize_phys().
Fixes: 510ee090ab ("x86/mm/pat: Prepare {reserve, free}_memtype() for "decoy" addresses")
Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/x49o8a3pu5i.fsf@segfault.boston.devel.redhat.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f2faea8b64 upstream.
When we forcefully evict a mapping from the the address space and thus the
MMU context, the MMU context is leaked, as the mapping no longer points to
it, so it doesn't get freed when the GEM object is destroyed. Add the
mssing context put to fix the leak.
Cc: stable@vger.kernel.org # 5.4
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Tested-by: Michael Walle <michael@walle.cc>
Tested-by: Marek Vasut <marex@denx.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d6408538f0 upstream.
Move the refcount manipulation of the MMU context to the point where the
hardware state is programmed. At that point it is also known if a previous
MMU state is still there, or the state needs to be reprogrammed with a
potentially different context.
Cc: stable@vger.kernel.org # 5.4
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Tested-by: Michael Walle <michael@walle.cc>
Tested-by: Marek Vasut <marex@denx.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8f3eea9d01 upstream.
The MMU state may be kept across a runtime suspend/resume cycle, as we
avoid a full hardware reset to keep the latency of the runtime PM small.
Don't pretend that the MMU state is lost in driver state. The MMU
context is pushed out when new HW jobs with a different context are
coming in. The only exception to this is when the GPU is unbound, in
which case we need to make sure to also free the last active context.
Cc: stable@vger.kernel.org # 5.4
Reported-by: Michael Walle <michael@walle.cc>
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Tested-by: Michael Walle <michael@walle.cc>
Tested-by: Marek Vasut <marex@denx.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 23e0f5a57d upstream.
While the DMA frontend can only be active when the MMU context is set, the
reverse isn't necessarily true, as the frontend can be stopped while the
MMU state is kept. Stop treating mmu_context being set as a indication that
the frontend is running and instead add a explicit property.
Cc: stable@vger.kernel.org # 5.4
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Tested-by: Michael Walle <michael@walle.cc>
Tested-by: Marek Vasut <marex@denx.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit cda7532916 upstream.
The prev context is the MMU context at the time of the job
queueing in hardware. As a job might be queued multiple times
due to recovery after a GPU hang, we need to make sure to put
the stale prev MMU context from a prior queuing, to avoid the
reference and thus the MMU context leaking.
Cc: stable@vger.kernel.org # 5.4
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Tested-by: Michael Walle <michael@walle.cc>
Tested-by: Marek Vasut <marex@denx.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5d329e1286 upstream.
A common complaint is that using O_NONBLOCK files with io_uring can be a
bit of a pain. Be a bit nicer and allow normal retry IFF the file does
support async behavior. This makes it possible to use io_uring more
reliably with O_NONBLOCK files, for use cases where it either isn't
possible or feasible to modify the file flags.
Cc: stable@vger.kernel.org
Reported-and-tested-by: Dan Melnic <dmm@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8b514e898e upstream.
Current RUNPM mechanism relies on PMFW to master the timing for BACO
in/exit. And that needs cooperation from sound driver for dstate
change notification for function 1(audio). Otherwise(on sound driver
missing), BACO cannot be kicked in correctly and hang will be observed
on RUNPM exit.
By switching back to legacy message way on sound driver missing,
we are able to fix the runpm hang observed for the scenario below:
amdgpu driver loaded -> runpm suspend kicked -> sound driver loaded
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reported-and-tested-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a70939851f upstream.
[Why]
The "base_addr_is_mc_addr" field was added for dcn3.1 support but
pa_config was never updated to set it to false.
Uninitialized memory causes it to be set to true which results in
address mistranslation and white screen.
[How]
Use memset to ensure all fields are initialized to 0 by default.
Fixes: 64b1d0e8d5 ("drm/amd/display: Add DCN3.1 HWSEQ")
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Aaron Liu <aaron.liu@amd.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 90517c9838 upstream.
[Why]
call stack of amdgpu dsc mst pbn, slot num calculation is as below:
-compute_bpp_x16_from_target_bandwidth
-decide_dsc_target_bpp_x16
-setup_dsc_config
-dc_dsc_compute_bandwidth_range
-compute_mst_dsc_configs_for_link
-compute_mst_dsc_configs_for_state
from pbn -> dsc target bpp_x16
bpp_x16 is calulated by compute_bpp_x16_from_target_bandwidth.
Beside pixel clock and bpp, num_slices_h and bpp_increment_div
will also affect bpp_x16.
from dsc target bpp_x16 -> pbn
within dm_update_mst_vcpi_slots_for_dsc,
pbn = drm_dp_calc_pbn_mode(clock, bpp_x16, true);
drm_dp_calc_pbn_mode(int clock, int bpp, bool dsc)
{
return DIV_ROUND_UP_ULL(mul_u32_u32(clock * (bpp / 16), 64 * 1006),
8 * 54 * 1000 * 1000);
}
bpp / 16 trunc digits after decimal point. This will cause calculation
delta. drm_dp_calc_pbn_mode does not have other informations,
like num_slices_h, bpp_increment_div. therefore, it does not do revese
calcuation properly from bpp_x16 to pbn.
pbn from drm_dp_calc_pbn_mode is less than pbn from
compute_mst_dsc_configs_for_state. This cause not enough mst slot
allocated to display. display could not visually light up.
[How]
pass pbn from compute_mst_dsc_configs_for_state to
dm_update_mst_vcpi_slots_for_dsc
Cc: stable@vger.kernel.org
Reviewed-by: Scott Foster <Scott.Foster@amd.com>
Acked-by: Mikita Lipski <mikita.lipski@amd.com>
Signed-off-by: Hersen Wu <hersenwu@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9987fbb368 upstream.
On Carrizo/Stoney systems we set backlight through panel_cntl, i.e.
directly via the PWM registers, if DMCU is not initialized. We
always read it back through ABM registers which leads to a
mismatch and forces atomic_commit to program the backlight
each time.
Instead make sure we use the same logic for backlight readback,
i.e. read it from panel_cntl if DMCU is not initialized.
We also need to remove some extraneous and incorrect calculations
at the end of dce_get_16_bit_backlight_from_pwm.
Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1666
Cc: stable@vger.kernel.org
Reviewed-by: Josip Pavic <josip.pavic@amd.com>
Acked-by: Mikita Lipski <mikita.lipski@amd.com>
Signed-off-by: Harry Wentland <harry.wentland@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e35ac9d0b5 upstream.
When we need a buffer for SVE register state we call sve_alloc() to make
sure that one is there. In order to avoid repeated allocations and frees
we keep the buffer around unless we change vector length and just memset()
it to ensure a clean register state. The function that deals with this
takes the task to operate on as an argument, however in the case where we
do a memset() we initialise using the SVE state size for the current task
rather than the task passed as an argument.
This is only an issue in the case where we are setting the register state
for a task via ptrace and the task being configured has a different vector
length to the task tracing it. In the case where the buffer is larger in
the traced process we will leak old state from the traced process to
itself, in the case where the buffer is smaller in the traced process we
will overflow the buffer and corrupt memory.
Fixes: bc0ee47603 ("arm64/sve: Core task context handling")
Cc: <stable@vger.kernel.org> # 4.15.x
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20210909165356.10675-1-broonie@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 16c8d2df7e upstream.
When setting up the next segment, we check what type the iter is and
handle it accordingly. However, when incrementing and processed amount
we do not, and both iter advance and addr/len are adjusted, regardless
of type. Split the increment side just like we do on the setup side.
Fixes: 4017eb91a9 ("io_uring: make loop_rw_iter() use original user supplied pointers")
Cc: stable@vger.kernel.org
Reported-by: Valentina Palmiotti <vpalmiotti@gmail.com>
Reviewed-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 90702dcd19 upstream.
We can reproduce this issue with below steps:
1) enable WoL on the host
2) host system suspended
3) remote client send out wakeup packets
We can see that host system resume back, but can't work, such as ping failed.
After a bit digging, this issue is introduced by the commit 46f69ded98
("net: stmmac: Use resolved link config in mac_link_up()"), which use
the finalised link parameters in mac_link_up() rather than the
parameters in mac_config().
There are two scenarios for MAC suspend/resume in STMMAC driver:
1) MAC suspend with WoL inactive, stmmac_suspend() call
phylink_mac_change() to notify phylink machine that a change in MAC
state, then .mac_link_down callback would be invoked. Further, it will
call phylink_stop() to stop the phylink instance. When MAC resume back,
firstly phylink_start() is called to start the phylink instance, then
call phylink_mac_change() which will finally trigger phylink machine to
invoke .mac_config and .mac_link_up callback. All is fine since
configuration in these two callbacks will be initialized, that means MAC
can restore the state.
2) MAC suspend with WoL active, phylink_mac_change() will put link
down, but there is no phylink_stop() to stop the phylink instance, so it
will link up again, that means .mac_config and .mac_link_up would be
invoked before system suspended. After system resume back, it will do
DMA initialization and SW reset which let MAC lost the hardware setting
(i.e MAC_Configuration register(offset 0x0) is reset). Since link is up
before system suspended, so .mac_link_up would not be invoked after
system resume back, lead to there is no chance to initialize the
configuration in .mac_link_up callback, as a result, MAC can't work any
longer.
After discussed with Russell King [1], we confirm that phylink framework
have not take WoL into consideration yet. This patch calls
phylink_suspend()/phylink_resume() functions which is newly introduced
by Russell King to fix this issue.
[1] https://lore.kernel.org/netdev/20210901090228.11308-1-qiangqing.zhang@nxp.com/
Fixes: 46f69ded98 ("net: stmmac: Use resolved link config in mac_link_up()")
Signed-off-by: Joakim Zhang <qiangqing.zhang@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b5fab34565 upstream.
In lock_region, simplify the calculation of the region_width parameter.
This field is the size, but encoded as ceil(log2(size)) - 1.
ceil(log2(size)) may be computed directly as fls(size - 1). However, we
want to use the 64-bit versions as the amount to lock can exceed
32-bits.
This avoids undefined (and completely wrong) behaviour when locking all
memory (size ~0). In this case, the old code would "round up" ~0 to the
nearest page, overflowing to 0. Since fls(0) == 0, this would calculate
a region width of 10 + 0 = 10. But then the code would shift by
(region_width - 11) = -1. As shifting by a negative number is undefined,
UBSAN flags the bug. Of course, even if it were defined the behaviour is
wrong, instead of locking all memory almost none would get locked.
The new form of the calculation corrects this special case and avoids
the undefined behaviour.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reported-and-tested-by: Chris Morgan <macromorgan@hotmail.com>
Fixes: f3ba91228e ("drm/panfrost: Add initial panfrost driver")
Cc: <stable@vger.kernel.org>
Reviewed-by: Steven Price <steven.price@arm.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: Steven Price <steven.price@arm.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210824173028.7528-2-alyssa.rosenzweig@collabora.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5bccb945f3 upstream.
Add safe lut configuration for all the targets in dpu
driver as per QOS recommendation.
Issue reported on SC7280:
With wait-for-safe feature in smmu enabled, RT client
buffer levels are checked to be safe before smmu invalidation.
Since display was always set to unsafe it was delaying the
invalidaiton process thus impacting the performance on NRT clients
such as eMMC and NVMe.
Validated this change on SC7280, With this change eMMC performance
has improved significantly.
Changes in v2:
- Add fixes tag (Sai)
- CC stable kernel (Dimtry)
Changes in v3:
- Correct fixes tag with appropriate hash (stephen)
- Resend patch adding reviewed by tag
- Resend patch adding correct format for pushing into stable tree (Greg)
Fixes: 591e34a091 ("drm/msm/disp/dpu1: add support for display for SC7280 target")
Cc: stable@vger.kernel.org
Signed-off-by: Kalyan Thota <kalyan_t@codeaurora.org>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Tested-by: Sai Prakash Ranjan <saiprakash.ranjan@codeaurora.org> (sc7280, sc7180)
Link: https://lore.kernel.org/r/1628070028-2616-1-git-send-email-kalyan_t@codeaurora.org
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Signed-off-by: Rob Clark <robdclark@chromium.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 92bd92c44d upstream.
Commit 2f015ec6ea ("drm/dp_mst: Add sideband down request tracing +
selftests") added some debug code for sideband message tracing. But
it seems to have unintentionally changed the behavior on sideband message
failure. It catches and returns failure only if DRM_UT_DP is enabled.
Otherwise it ignores the error code and returns success. So on an MST
unplug, the caller is unaware that the clear payload message failed and
ends up waiting for 4 seconds for the response. Fixes the issue by
returning the proper error code.
Changes in V2:
-- Revise commit text as review comment
-- add Fixes text
Changes in V3:
-- remove "unlikely" optimization
Fixes: 2f015ec6ea ("drm/dp_mst: Add sideband down request tracing + selftests")
Cc: <stable@vger.kernel.org> # v5.5+
Signed-off-by: Rajkumar Subbiah <rsubbia@codeaurora.org>
Signed-off-by: Kuogee Hsieh <khsieh@codeaurora.org>
Reviewed-by: Stephen Boyd <swboyd@chromium.org>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
Signed-off-by: Lyude Paul <lyude@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1625585434-9562-1-git-send-email-khsieh@codeaurora.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 81d0885d68 upstream.
tx_done is not used for napi_complete_done(). Thus, NAPI busy polling
mechanism by gro_flush_timeout and napi_defer_hard_irqs will not able
be triggered after a packet is transmitted when there is no receive
packet.
Fix this by taking the maximum value between tx_done and rx_done as
overall budget completed by the rxtx NAPI poll to ensure XDP Tx ZC
operation is continuously polling for next Tx frame. This gives
benefit of lower packet submission processing latency and jitter
under XDP Tx ZC mode.
Performance of tx-only using xdp-sock on Intel ADL-S platform is
the same with and without this patch.
root@intel-corei7-64:~# ./xdpsock -i enp0s30f4 -t -z -q 1 -n 10
sock0@enp0s30f4:1 txonly xdp-drv
pps pkts 10.00
rx 0 0
tx 511630 8659520
sock0@enp0s30f4:1 txonly xdp-drv
pps pkts 10.00
rx 0 0
tx 511625 13775808
sock0@enp0s30f4:1 txonly xdp-drv
pps pkts 10.00
rx 0 0
tx 511619 18892032
Fixes: 132c32ee5b ("net: stmmac: Add TX via XDP zero-copy socket")
Cc: <stable@vger.kernel.org> # 5.13.x
Co-developed-by: Ong Boon Leong <boon.leong.ong@intel.com>
Signed-off-by: Ong Boon Leong <boon.leong.ong@intel.com>
Signed-off-by: Song Yoong Siang <yoong.siang.song@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 552799f8b3 upstream.
Currently, outgoing packets larger than 1496 bytes are dropped when
tagged VLAN is used on a switch port.
Add the frame check sequence length to the value of the register
GSWIP_MAC_FLEN to fix this. This matches the lantiq_ppa vendor driver,
which uses a value consisting of 1518 bytes for the MAC frame, plus the
lengths of special tag and VLAN tags.
Fixes: 14fceff477 ("net: dsa: Add Lantiq / Intel DSA driver for vrx200")
Cc: stable@vger.kernel.org
Signed-off-by: Jan Hoffmann <jan@3e8.eu>
Acked-by: Hauke Mehrtens <hauke@hauke-m.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 32b2397c1e upstream.
There is a use after free crash when the pmem driver tears down its
mapping while I/O is still inbound.
This is triggered by driver unbind, "ndctl destroy-namespace", while I/O
is in flight.
Fix the sequence of blk_cleanup_queue() vs memunmap().
The crash signature is of the form:
BUG: unable to handle page fault for address: ffffc90080200000
CPU: 36 PID: 9606 Comm: systemd-udevd
Call Trace:
? pmem_do_bvec+0xf9/0x3a0
? xas_alloc+0x55/0xd0
pmem_rw_page+0x4b/0x80
bdev_read_page+0x86/0xb0
do_mpage_readpage+0x5d4/0x7a0
? lru_cache_add+0xe/0x10
mpage_readpages+0xf9/0x1c0
? bd_link_disk_holder+0x1a0/0x1a0
blkdev_readpages+0x1d/0x20
read_pages+0x67/0x1a0
ndctl Call Trace in vmcore:
PID: 23473 TASK: ffff88c4fbbe8000 CPU: 1 COMMAND: "ndctl"
__schedule
schedule
blk_mq_freeze_queue_wait
blk_freeze_queue
blk_cleanup_queue
pmem_release_queue
devm_action_release
release_nodes
devres_release_all
device_release_driver_internal
device_driver_detach
unbind_store
Cc: <stable@vger.kernel.org>
Signed-off-by: sumiyawang <sumiyawang@tencent.com>
Reviewed-by: yongduan <yongduan@tencent.com>
Link: https://lore.kernel.org/r/1629632949-14749-1-git-send-email-sumiyawang@tencent.com
Fixes: 50f44ee724 ("mm/devm_memremap_pages: fix final page put race")
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit fab827dbee upstream.
Commit 5d097056c9 ("kmemcg: account certain kmem allocations to memcg")
enabled memcg accounting for pids allocated from init_pid_ns.pid_cachep,
but forgot to adjust the setting for nested pid namespaces. As a result,
pid memory is not accounted exactly where it is really needed, inside
memcg-limited containers with their own pid namespaces.
Pid was one the first kernel objects enabled for memcg accounting.
init_pid_ns.pid_cachep marked by SLAB_ACCOUNT and we can expect that any
new pids in the system are memcg-accounted.
Though recently I've noticed that it is wrong. nested pid namespaces
creates own slab caches for pid objects, nested pids have increased size
because contain id both for all parent and for own pid namespaces. The
problem is that these slab caches are _NOT_ marked by SLAB_ACCOUNT, as a
result any pids allocated in nested pid namespaces are not
memcg-accounted.
Pid struct in nested pid namespace consumes up to 500 bytes memory, 100000
such objects gives us up to ~50Mb unaccounted memory, this allow container
to exceed assigned memcg limits.
Link: https://lkml.kernel.org/r/8b6de616-fd1a-02c6-cbdb-976ecdcfa604@virtuozzo.com
Fixes: 5d097056c9 ("kmemcg: account certain kmem allocations to memcg")
Cc: stable@vger.kernel.org
Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
Reviewed-by: Michal Koutný <mkoutny@suse.com>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Acked-by: Roman Gushchin <guro@fb.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 276aeee1c5 upstream.
Servers happened below panic:
Kernel version:5.4.56
BUG: unable to handle page fault for address: 0000000000002c48
RIP: 0010:__next_zones_zonelist+0x1d/0x40
Call Trace:
__alloc_pages_nodemask+0x277/0x310
alloc_page_interleave+0x13/0x70
handle_mm_fault+0xf99/0x1390
__do_page_fault+0x288/0x500
do_page_fault+0x30/0x110
page_fault+0x3e/0x50
The reason for the panic is that MAX_NUMNODES is passed in the third
parameter in __alloc_pages_nodemask(preferred_nid). So access to
zonelist->zoneref->zone_idx in __next_zones_zonelist will cause a panic.
In offset_il_node(), first_node() returns nid from pol->v.nodes, after
this other threads may chang pol->v.nodes before next_node(). This race
condition will let next_node return MAX_NUMNODES. So put pol->nodes in
a local variable.
The race condition is between offset_il_node and cpuset_change_task_nodemask:
CPU0: CPU1:
alloc_pages_vma()
interleave_nid(pol,)
offset_il_node(pol,)
first_node(pol->v.nodes) cpuset_change_task_nodemask
//nodes==0xc mpol_rebind_task
mpol_rebind_policy
mpol_rebind_nodemask(pol,nodes)
//nodes==0x3
next_node(nid, pol->v.nodes)//return MAX_NUMNODES
Link: https://lkml.kernel.org/r/20210906034658.48721-1-yanghui.def@bytedance.com
Signed-off-by: yanghui <yanghui.def@bytedance.com>
Reviewed-by: Muchun Song <songmuchun@bytedance.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 32d4f4b782 upstream.
Commit f56ce412a5 ("mm: memcontrol: fix occasional OOMs due to
proportional memory.low reclaim") introduced a divide by zero corner
case when oomd is being used in combination with cgroup memory.low
protection.
When oomd decides to kill a cgroup, it will force the cgroup memory to
be reclaimed after killing the tasks, by writing to the memory.max file
for that cgroup, forcing the remaining page cache and reclaimable slab
to be reclaimed down to zero.
Previously, on cgroups with some memory.low protection that would result
in the memory being reclaimed down to the memory.low limit, or likely
not at all, having the page cache reclaimed asynchronously later.
With f56ce412a5 the oomd write to memory.max tries to reclaim all the
way down to zero, which may race with another reclaimer, to the point of
ending up with the divide by zero below.
This patch implements the obvious fix.
Link: https://lkml.kernel.org/r/20210826220149.058089c6@imladris.surriel.com
Fixes: f56ce412a5 ("mm: memcontrol: fix occasional OOMs due to proportional memory.low reclaim")
Signed-off-by: Rik van Riel <riel@surriel.com>
Acked-by: Roman Gushchin <guro@fb.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Chris Down <chris@chrisdown.name>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 09a26e8327 upstream.
Guillaume Morin reported hitting the following WARNING followed by GPF or
NULL pointer deference either in cgroups_destroy or in the kill_css path.:
percpu ref (css_release) <= 0 (-1) after switching to atomic
WARNING: CPU: 23 PID: 130 at lib/percpu-refcount.c:196 percpu_ref_switch_to_atomic_rcu+0x127/0x130
CPU: 23 PID: 130 Comm: ksoftirqd/23 Kdump: loaded Tainted: G O 5.10.60 #1
RIP: 0010:percpu_ref_switch_to_atomic_rcu+0x127/0x130
Call Trace:
rcu_core+0x30f/0x530
rcu_core_si+0xe/0x10
__do_softirq+0x103/0x2a2
run_ksoftirqd+0x2b/0x40
smpboot_thread_fn+0x11a/0x170
kthread+0x10a/0x140
ret_from_fork+0x22/0x30
Upon further examination, it was discovered that the css structure was
associated with hugetlb reservations.
For private hugetlb mappings the vma points to a reserve map that
contains a pointer to the css. At mmap time, reservations are set up
and a reference to the css is taken. This reference is dropped in the
vma close operation; hugetlb_vm_op_close. However, if a vma is split no
additional reference to the css is taken yet hugetlb_vm_op_close will be
called twice for the split vma resulting in an underflow.
Fix by taking another reference in hugetlb_vm_op_open. Note that the
reference is only taken for the owner of the reserve map. In the more
common fork case, the pointer to the reserve map is cleared for
non-owning vmas.
Link: https://lkml.kernel.org/r/20210830215015.155224-1-mike.kravetz@oracle.com
Fixes: e9fe92ae0c ("hugetlb_cgroup: add reservation accounting for private mappings")
Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
Reported-by: Guillaume Morin <guillaume@morinfr.org>
Suggested-by: Guillaume Morin <guillaume@morinfr.org>
Tested-by: Guillaume Morin <guillaume@morinfr.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f87060d345 upstream.
In commit 510d25c92e ("mm/hwpoison: disable pcp for
page_handle_poison()"), __page_handle_poison() was introduced, and if we
mark:
RET_A = dissolve_free_huge_page();
RET_B = take_page_off_buddy();
then __page_handle_poison was supposed to return TRUE When RET_A == 0 &&
RET_B == TRUE
But since it failed to take care the case when RET_A is -EBUSY or -ENOMEM,
and just return the ret as a bool which actually become TRUE, it break the
original logic.
The following result is a huge page in freelist but was
referenced as poisoned, and lead into the final panic:
kernel BUG at mm/internal.h:95!
invalid opcode: 0000 [#1] SMP PTI
skip...
RIP: 0010:set_page_refcounted mm/internal.h:95 [inline]
RIP: 0010:remove_hugetlb_page+0x23c/0x240 mm/hugetlb.c:1371
skip...
Call Trace:
remove_pool_huge_page+0xe4/0x110 mm/hugetlb.c:1892
return_unused_surplus_pages+0x8d/0x150 mm/hugetlb.c:2272
hugetlb_acct_memory.part.91+0x524/0x690 mm/hugetlb.c:4017
This patch replaces 'bool' with 'int' to handle RET_A correctly.
Link: https://lkml.kernel.org/r/61782ac6-1e8a-4f6f-35e6-e94fce3b37f5@linux.alibaba.com
Fixes: 510d25c92e ("mm/hwpoison: disable pcp for page_handle_poison()")
Signed-off-by: Michael Wang <yun.wang@linux.alibaba.com>
Acked-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
Reported-by: Abaci <abaci@linux.alibaba.com>
Cc: <stable@vger.kernel.org> [5.14+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a052096bdd upstream.
The cpu hotplug notifiers are called without updating the core/thread
masks when a new CPU is added. This causes problems with code setting
up data structures in a cpu hotplug notifier, and relying on that later
in normal code.
This caused a crash in the new core scheduling code (SCHED_CORE),
where rq->core was set up in a notifier depending on cpu masks.
To fix this, add a cpu_setup_mask which is used in update_cpu_masks()
instead of the cpu_online_mask to determine whether the cpu masks should
be set for a certain cpu. Also move update_cpu_masks() to update the
masks before calling notify_cpu_starting() so that the notifiers are
seeing the updated masks.
Signed-off-by: Sven Schnelle <svens@linux.ibm.com>
Cc: <stable@vger.kernel.org>
[hca@linux.ibm.com: get rid of cpu_online_mask handling]
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 93ebb68287 upstream.
Since commit 903cd0f315 ("swiotlb: Use is_swiotlb_force_bounce for
swiotlb data bouncing") if code sets swiotlb_force it needs to do so
before the swiotlb is initialised. Otherwise
io_tlb_default_mem->force_bounce will not get set to true, and devices
that use (the default) swiotlb will not bounce despite switolb_force
having the value of SWIOTLB_FORCE.
Let us restore swiotlb functionality for PV by fulfilling this new
requirement.
This change addresses what turned out to be a fragility in
commit 64e1f0c531 ("s390/mm: force swiotlb for protected
virtualization"), which ain't exactly broken in its original context,
but could give us some more headache if people backport the broken
change and forget this fix.
Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
Tested-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Fixes: 903cd0f315 ("swiotlb: Use is_swiotlb_force_bounce for swiotlb data bouncing")
Fixes: 64e1f0c531 ("s390/mm: force swiotlb for protected virtualization")
Cc: stable@vger.kernel.org #5.3+
Signed-off-by: Konrad Rzeszutek Wilk <konrad@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f34ee9cb2c upstream.
In the numa=off kernel command-line configuration init_chip_info() loops
around the number of chips and attempts to copy the cpumask of that node
which is NULL for all iterations after the first chip.
Hence, store the cpu mask for each chip instead of derving cpumask from
node while populating the "chips" struct array and copy that to the
chips[i].mask
Fixes: 053819e0bf ("cpufreq: powernv: Handle throttling due to Pmax capping at chip level")
Cc: stable@vger.kernel.org # v4.3+
Reported-by: Shirisha Ganta <shirisha.ganta1@ibm.com>
Signed-off-by: Pratik R. Sampat <psampat@linux.ibm.com>
Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
[mpe: Rename goto label to out_free_chip_cpu_mask]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210728120500.87549-2-psampat@linux.ibm.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 11e4b63abb upstream.
The standard printk() tries to flush the message to the console
immediately. It tries to take the console lock. If the lock is
already taken then the current owner is responsible for flushing
even the new message.
There is a small race window between checking whether a new message is
available and releasing the console lock. It is solved by re-checking
the state after releasing the console lock. If the check is positive
then console_unlock() tries to take the lock again and process the new
message as well.
The commit 996e966640 ("printk: remove logbuf_lock") causes that
console_seq is not longer read atomically. As a result, the re-check might
be done with an inconsistent 64-bit index.
Solve it by using the last sequence number that has been checked under
the console lock. In the worst case, it will take the lock again only
to realized that the new message has already been proceed. But it
was possible even before.
The variable next_seq is marked as __maybe_unused to call down compiler
warning when CONFIG_PRINTK is not defined.
Fixes: commit 996e966640 ("printk: remove logbuf_lock")
Reported-by: kernel test robot <lkp@intel.com> # unused next_seq warning
Cc: stable@vger.kernel.org # 5.13
Signed-off-by: Petr Mladek <pmladek@suse.com>
Acked-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Reviewed-by: John Ogness <john.ogness@linutronix.de>
Link: https://lore.kernel.org/r/20210702150657.26760-1-pmladek@suse.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5f6e0fe01b upstream.
Commit 23243c1ace ("arch: use cross_compiling to check whether it is
a cross build or not") broke 64-bit parisc builds on 32-bit parisc
systems.
Helge mentioned:
- 64-bit parisc userspace is not supported yet [1]
- hppa gcc does not support "-m64" flag [2]
That means, parisc developers working on a 32-bit parisc machine need
to use hppa64-linux-gnu-gcc (cross compiler) for building the 64-bit
parisc kernel.
After the offending commit, gcc is used in such a case because
both $(SRCARCH) and $(SUBARCH) are 'parisc', hence cross_compiling is
unset.
A correct way is to introduce ARCH=parisc64 because building the 64-bit
parisc kernel on a 32-bit parisc system is not exactly a native build,
but rather a semi-cross build.
[1]: https://lore.kernel.org/linux-parisc/5dfd81eb-c8ca-b7f5-e80e-8632767c022d@gmx.de/#t
[2]: https://lore.kernel.org/linux-parisc/89515325-fc21-31da-d238-6f7a9abbf9a0@gmx.de/
Fixes: 23243c1ace ("arch: use cross_compiling to check whether it is a cross build or not")
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reported-by: Meelis Roos <mroos@linux.ee>
Tested-by: Meelis Roos <mroos@linux.ee>
Cc: <stable@vger.kernel.org> # v5.13+
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 030f653078 upstream.
I was debugging some crashes on parisc and I found out that there is a
crash possibility if a function using alloca is interrupted by a signal.
The reason for the crash is that the gcc alloca implementation leaves
garbage in the upper 32 bits of the sp register. This normally doesn't
matter (the upper bits are ignored because the PSW W-bit is clear),
however the signal delivery routine in the kernel uses full 64 bits of sp
and it fails with -EFAULT if the upper 32 bits are not zero.
I created this program that demonstrates the problem:
#include <stdlib.h>
#include <unistd.h>
#include <signal.h>
#include <alloca.h>
static __attribute__((noinline,noclone)) void aa(int *size)
{
void * volatile p = alloca(-*size);
while (1) ;
}
static void handler(int sig)
{
write(1, "signal delivered\n", 17);
_exit(0);
}
int main(void)
{
int size = -0x100;
signal(SIGALRM, handler);
alarm(1);
aa(&size);
}
If you compile it with optimizations, it will crash.
The "aa" function has this disassembly:
000106a0 <aa>:
106a0: 08 03 02 41 copy r3,r1
106a4: 08 1e 02 43 copy sp,r3
106a8: 6f c1 00 80 stw,ma r1,40(sp)
106ac: 37 dc 3f c1 ldo -20(sp),ret0
106b0: 0c 7c 12 90 stw ret0,8(r3)
106b4: 0f 40 10 9c ldw 0(r26),ret0 ; ret0 = 0x00000000FFFFFF00
106b8: 97 9c 00 7e subi 3f,ret0,ret0 ; ret0 = 0xFFFFFFFF0000013F
106bc: d7 80 1c 1a depwi 0,31,6,ret0 ; ret0 = 0xFFFFFFFF00000100
106c0: 0b 9e 0a 1e add,l sp,ret0,sp ; sp = 0xFFFFFFFFxxxxxxxx
106c4: e8 1f 1f f7 b,l,n 106c4 <aa+0x24>,r0
This patch fixes the bug by truncating the "usp" variable to 32 bits.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit e79c0e324b ]
abs() returns signed long, which could not convert the type
as unsigned, and it may cause a mismatch type warning from
static tools. To fix it, this patch uses an variable to save
the abs()'s result and does a explicit conversion.
Signed-off-by: Guojia Liao <liaoguojia@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit a39ff4a47f ]
It will cause null-ptr-deref if platform_get_resource() returns NULL,
we need check the return value.
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit e842cb60e8 ]
In netlbl_cipsov4_add_std() when 'doi_def->map.std' alloc
failed, we sometime observe panic:
BUG: kernel NULL pointer dereference, address:
...
RIP: 0010:cipso_v4_doi_free+0x3a/0x80
...
Call Trace:
netlbl_cipsov4_add_std+0xf4/0x8c0
netlbl_cipsov4_add+0x13f/0x1b0
genl_family_rcv_msg_doit.isra.15+0x132/0x170
genl_rcv_msg+0x125/0x240
This is because in cipso_v4_doi_free() there is no check
on 'doi_def->map.std' when doi_def->type got value 1, which
is possibe, since netlbl_cipsov4_add_std() haven't initialize
it before alloc 'doi_def->map.std'.
This patch just add the check to prevent panic happen in similar
cases.
Reported-by: Abaci <abaci@linux.alibaba.com>
Signed-off-by: Michael Wang <yun.wang@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 7c48662b9d ]
The problem is that gpio_free() can sleep and the cfg_soc() can be
called with spinlocks held. One problematic call tree is:
--> ath_reset_internal() takes &sc->sc_pcu_lock spin lock
--> ath9k_hw_reset()
--> ath9k_hw_gpio_request_in()
--> ath9k_hw_gpio_request()
--> ath9k_hw_gpio_cfg_soc()
Remove gpio_free(), use error message instead, so we should make sure
there is no GPIO conflict.
Also remove ath9k_hw_gpio_free() from ath9k_hw_apply_gpio_override(),
as gpio_mask will never be set for SOC chips.
Signed-off-by: Miaoqing Pan <miaoqing@codeaurora.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/1628481916-15030-1-git-send-email-miaoqing@codeaurora.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 23151b9ae7 ]
Bad header can have large length field which can cause OOB.
cptr is the last bytes for read, and the eeprom is parsed
from high to low address. The OOB, triggered by the condition
length > cptr could cause memory error with a read on
negative index.
There are some sanity check around length, but it is not
compared with cptr (the remaining bytes). Here, the
corrupted/bad EEPROM can cause panic.
I was able to reproduce the crash, but I cannot find the
log and the reproducer now. After I applied the patch, the
bug is no longer reproducible.
Signed-off-by: Zekun Shen <bruceshenzk@gmail.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/YM3xKsQJ0Hw2hjrc@Zekuns-MBP-16.fios-router.home
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 0be883a0d7 ]
The check for count appears to be incorrect since a non-zero count
check occurs a couple of statements earlier. Currently the check is
always false and the dev->port->irq != PARPORT_IRQ_NONE part of the
check is never tested and the if statement is dead-code. Fix this
by removing the check on count.
Note that this code is pre-git history, so I can't find a sha for
it.
Acked-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com>
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Addresses-Coverity: ("Logically dead code")
Link: https://lore.kernel.org/r/20210730100710.27405-1-colin.king@canonical.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit ec449ed823 ]
Under high stress, SW steering might get stuck on polling for completion
that never comes.
For such cases QP needs to have protocol retransmission mechanism enabled.
Currently the retransmission timeout is defined as 0 (unlimited). Fix this
by defining a real timeout.
Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Reviewed-by: Alex Vesker <valex@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 6cc64770fb ]
In line 849 (#1), "mlx5dr_htbl_put(cur_htbl);" drops the reference to
cur_htbl and may cause cur_htbl to be freed.
However, cur_htbl is subsequently used in the next line, which may result
in an use-after-free bug.
Fix this by calling mlx5dr_err() before the cur_htbl is put.
Signed-off-by: Wentao_Liang <Wentao_Liang_g@163.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit a76b57311b ]
When P2P roc is removed, the IWL_MVM_STATUS_NEED_FLUSH_P2P bit is set
to indicate to iwl_mvm_roc_done_wk() that the removed roc is a P2P
one, so it will flush the broadcast station and not the aux station.
However, since setting this bit and scheduling the worker is done
in roc ended flow as well as in case the roc is removed, there is
a race where the worker has already started running (but did not
test this bit yet) and then it is scheduled again. In this case,
the first run of the worker will clear this bit, and thus the second
run will find it already cleared and will try to flush and remove
the aux station by mistake.
Fix it by scheduling the worker only if this bit is not yet set. In
case this bit is already set, the worker is either running or
scheduled, so there is no need to re-schedule it.
Signed-off-by: Avraham Stern <avraham.stern@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Link: https://lore.kernel.org/r/iwlwifi.20210819183728.8c147659b331.If5924375e9bfd46214ab8ab81cb9d0f5c82fbcbc@changeid
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit c6ce1c74ef ]
When TVQM is enabled (iwl_mvm_has_new_tx_api() is true), then
queue numbers are just sequentially assigned 0, 1, 2, ...
Prior to TVQM, in DQA, there were some statically allocated
queue numbers:
* IWL_MVM_DQA_AUX_QUEUE == 1,
* both IWL_MVM_DQA_INJECT_MONITOR_QUEUE and
IWL_MVM_DQA_P2P_DEVICE_QUEUE == 2, and
* IWL_MVM_DQA_AP_PROBE_RESP_QUEUE == 9.
Now, these values are assigned to the members mvm->aux_queue,
mvm->snif_queue, mvm->probe_queue and mvm->p2p_dev_queue by
default. Normally, this doesn't really matter, and if TVQM is
in fact available we override them to the real values after
allocating a queue for use there.
However, this allocation doesn't always happen. For example,
for mvm->p2p_dev_queue (== 2) it only happens when the P2P
Device interface is started, if any. If it's not started, the
value in mvm->p2p_dev_queue remains 2. This wouldn't really
matter all that much if it weren't for iwl_mvm_is_static_queue()
which checks a queue number against one of those four static
numbers.
Now, if no P2P Device or monitor interface is added then queue
2 may be dynamically allocated, yet alias mvm->p2p_dev_queue or
mvm->snif_queue, and thus iwl_mvm_is_static_queue() erroneously
returns true for it. If it then gets full, all interface queues
are stopped, instead of just backpressuring against the one TXQ
that's really the only affected one.
This clearly can lead to issues, as everything is stopped even
if just a single TXQ filled its corresponding HW queue, if it
happens to have an appropriate number (2 or 9, AUX is always
reassigned.) Due to a mac80211 bug, this also led to a situation
in which the queues remained stopped across a deauthentication
and then attempts to connect to a new AP started failing, but
that's fixed separately.
Fix all of this by simply initializing the queue numbers to
the invalid value until they're used, if TVQM is enabled, and
also setting them back to that value when the queues are later
freed again.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Link: https://lore.kernel.org/r/iwlwifi.20210802172232.2e47e623f9e2.I9b0830dafbb68ef35b7b8f0f46160abec02ac7d0@changeid
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f657f8eef3 ]
NFS implements blocking locks by blocking inside its lock method. In
the reexport case, this blocks the nfs server thread, which could lead
to deadlocks since an nfs server thread might be required to unlock the
conflicting lock. It also causes a crash, since the nfs server thread
assumes it can free the lock when its lm_notify lock callback is called.
Ideal would be to make the nfs lock method return without blocking in
this case, but for now it works just not to attempt blocking locks. The
difference is just that the original client will have to poll (as it
does in the v4.0 case) instead of getting a callback when the lock's
available.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Acked-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 1ec06c2dee ]
On systems with multiple SH per SE compute_static_thread_mgmt_se#
is split into independent masks, one for each SH, in the upper and
lower 16 bits. We need to detect this and apply cu masking to each
SH. The cu mask bits are assigned first to each SE, then to
alternate SHs, then finally to higher CU id. This ensures that
the maximum number of SPIs are engaged as early as possible while
balancing CU assignment to each SH.
v2: Use max SH/SE rather than max SH in cu_per_sh.
v3: Fix comment blocks, ensure se_mask is initially zero filled,
and correctly assign se.sh.cu positions to unset bits in cu_mask.
Signed-off-by: Sean Keely <Sean.Keely@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 0c75fc7193 ]
When more than one FE is connected to a BE, e.g. in a mixing use case,
the BE can be triggered multiple times when the FE are opened/started
concurrently. This race condition is problematic in the case of
SoundWire BE dailinks, and this is not desirable in a general
case. The code carefully checks when the BE can be stopped or
hw_free'ed, but the trigger code does not use any mutual exclusion.
Fix by using the same spinlock already used to check FE states, and
set the state before the trigger. In case of errors, the initial
state will be restored.
This patch does not change how the triggers are handled, it only makes
sure the states are handled in critical sections.
Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Message-Id: <20210817164054.250028-2-pierre-louis.bossart@linux.intel.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b873120995 ]
xhci-mtk depends on xhci's internal virt_dev when it retrieves its
internal data from usb_host_endpoint both in add_endpoint and
drop_endpoint callbacks. But when setup packet was retired by
transaction errors in xhci_setup_device() path, a virt_dev for the slot
is newly created with real_port 0. This leads to xhci-mtks's NULL pointer
dereference from drop_endpoint callback as xhci-mtk assumes that virt_dev's
real_port is always started from one. The similar problems were addressed
by [1] but that can't cover the failure cases from setup_device.
This patch drops the usages of xhci's virt_dev in xhci-mtk's drop_endpoint
callback by adopting rhashtable for searching mtk's schedule entity
from a given usb_host_endpoint pointer instead of searching a linked list.
So mtk's drop_endpoint callback doesn't have to rely on virt_dev at all.
[1] https://lore.kernel.org/r/1617179142-2681-2-git-send-email-chunfeng.yun@mediatek.com
Signed-off-by: Ikjoon Jang <ikjn@chromium.org>
Link: https://lore.kernel.org/r/20210805133731.1.Icc0f080e75b1312692d4c7c7d25e7df9fe1a05c2@changeid
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 66cce9e73e ]
When a remote usb device is attached to the local Virtual USB
Host Controller Root Hub port, the bound device driver may send
a port reset command.
vhci_hcd accepts port resets only when the device doesn't have
port address assigned to it. When reset happens device is in
assigned/used state and vhci_hcd rejects it leaving the port in
a stuck state.
This problem was found when a blue-tooth or xbox wireless dongle
was passed through using usbip.
A few drivers reset the port during probe including mt76 driver
specific to this bug report. Fix the problem with a change to
honor reset requests when device is in used state (VDEV_ST_USED).
Reported-and-tested-by: Michael <msbroadf@gmail.com>
Suggested-by: Michael <msbroadf@gmail.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Link: https://lore.kernel.org/r/20210819225937.41037-1-skhan@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 7c75bde329 ]
If IRQ occurs between calling dsps_setup_optional_vbus_irq()
and dsps_create_musb_pdev(), then null pointer dereference occurs
since glue->musb wasn't initialized yet.
The patch puts initializing of neccesery data before registration
of the interrupt handler.
Found by Linux Driver Verification project (linuxtesting.org).
Signed-off-by: Nadezda Lutovinova <lutovinova@ispras.ru>
Link: https://lore.kernel.org/r/20210819163323.17714-1-lutovinova@ispras.ru
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 2847c46c61 ]
This reverts commit 5d5323a6f3.
That commit effectively disabled Intel host initiated U1/U2 lpm for devices
with periodic endpoints.
Before that commit we disabled host initiated U1/U2 lpm if the exit latency
was larger than any periodic endpoint service interval, this is according
to xhci spec xhci 1.1 specification section 4.23.5.2
After that commit we incorrectly checked that service interval was smaller
than U1/U2 inactivity timeout. This is not relevant, and can't happen for
Intel hosts as previously set U1/U2 timeout = 105% * service interval.
Patch claimed it solved cases where devices can't be enumerated because of
bandwidth issues. This might be true but it's a side effect of accidentally
turning off lpm.
exit latency calculations have been revised since then
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Link: https://lore.kernel.org/r/20210820123503.2605901-5-mathias.nyman@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit d72c74197b ]
smb_buf is allocated by small_smb_init_no_tc(), and buf type is
CIFS_SMALL_BUFFER, so we should use cifs_small_buf_release() to
release it in failed path.
Signed-off-by: Ding Hui <dinghui@sangfor.com.cn>
Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit c0e9422c4e ]
Currently, most pktgen samples print the execution result when the
program is terminated normally. However, sample03 doesn't work
appropriately.
This is results of samples:
# DEV=eth0 DEST_IP=10.1.0.1 DST_MAC=00:11:22:33:44:55 ./pktgen_sample04_many_flows.sh -n 1
Running... ctrl^C to stop
Device: eth0@0
Result: OK: 19(c5+d13) usec, 1 (60byte,0frags)
51762pps 24Mb/sec (24845760bps) errors: 0
# DEV=eth0 DEST_IP=10.1.0.1 DST_MAC=00:11:22:33:44:55 ./pktgen_sample03_burst_single_flow.sh -n 1
Running... ctrl^C to stop
The reason why it doesn't print the execution result when the program is
terminated usually is that sample03 doesn't call the function which
prints the result, unlike other samples.
So, this commit solves this issue by calling the function before
termination. Also, this commit changes control_c function to
print_result to maintain consistency with other samples.
Signed-off-by: Juhee Kang <claudiajkang@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 039190bb35 ]
Unlike OcteonTx2, the channel numbers used by CGX/RPM
and LBK on CN10K silicons aren't fixed in HW. They are
SW programmable, hence we cannot derive transmit link
from static channel numbers anymore. Get the same from
admin function via mailbox.
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit e72a55f2e5 ]
When a read/write command is sent via ioctl to the kernel,
and the command fails, the actual error response of the emmc
is not sent to the user.
IOCTL read/write tests are carried out using commands
17 (Single BLock Read), 24 (Single Block Write),
18 (Multi Block Read), 25 (Multi Block Write)
The tests are carried out on a 64Gb emmc device. All of these
tests try to access an "out of range" sector address (0x09B2FFFF).
It is seen that without the patch the response received by the user
is not OUT_OF_RANGE error (R1 response 31st bit is not set) as per
JEDEC specification. After applying the patch proper response is seen.
This is because the function returns without copying the response to
the user in case of failure. This patch fixes the issue.
Hence, this memcpy is required whether we get an error response or not.
Therefor it is moved up from the current position up to immediately
after we have called mmc_wait_for_req().
The test code and the output of only the CMD17 is included in the
commit to limit the message length.
CMD17 (Test Code Snippet):
==========================
printf("Forming CMD%d\n", opt_idx);
/* single block read */
cmd.blksz = 512;
cmd.blocks = 1;
cmd.write_flag = 0;
cmd.opcode = 17;
//cmd.arg = atoi(argv[3]);
cmd.arg = 0x09B2FFFF;
/* Expecting response R1B */
cmd.flags = MMC_RSP_SPI_R1 | MMC_RSP_R1 | MMC_CMD_ADTC;
memset(data, 0, sizeof(__u8) * 512);
mmc_ioc_cmd_set_data(cmd, data);
printf("Sending CMD%d: ARG[0x%08x]\n", opt_idx, cmd.arg);
if(ioctl(fd, MMC_IOC_CMD, &cmd))
perror("Error");
printf("\nResponse: %08x\n", cmd.response[0]);
CMD17 (Output without patch):
=============================
test@test-LIVA-Z:~$ sudo ./mmc cmd_test /dev/mmcblk0 17
Entering the do_mmc_commands:Device: /dev/mmcblk0 nargs:4
Entering the do_mmc_commands:Device: /dev/mmcblk0 options[17, 0x09B2FFF]
Forming CMD17
Sending CMD17: ARG[0x09b2ffff]
Error: Connection timed out
Response: 00000000
(Incorrect response)
CMD17 (Output with patch):
==========================
test@test-LIVA-Z:~$ sudo ./mmc cmd_test /dev/mmcblk0 17
[sudo] password for test:
Entering the do_mmc_commands:Device: /dev/mmcblk0 nargs:4
Entering the do_mmc_commands:Device: /dev/mmcblk0 options[17, 09B2FFFF]
Forming CMD17
Sending CMD17: ARG[0x09b2ffff]
Error: Connection timed out
Response: 80000900
(Correct OUT_OF_ERROR response as per JEDEC specification)
Signed-off-by: Nishad Kamdar <nishadkamdar@gmail.com>
Reviewed-by: Avri Altman <avri.altman@wdc.com>
Link: https://lore.kernel.org/r/20210824191726.8296-1-nishadkamdar@gmail.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 2d82d73da3 ]
0Day robot observed that it's easily timeout on a heavy load host.
-------------------
# selftests: bpf: test_maps
# Fork 1024 tasks to 'test_update_delete'
# Fork 1024 tasks to 'test_update_delete'
# Fork 100 tasks to 'test_hashmap'
# Fork 100 tasks to 'test_hashmap_percpu'
# Fork 100 tasks to 'test_hashmap_sizes'
# Fork 100 tasks to 'test_hashmap_walk'
# Fork 100 tasks to 'test_arraymap'
# Fork 100 tasks to 'test_arraymap_percpu'
# Failed sockmap unexpected timeout
not ok 3 selftests: bpf: test_maps # exit=1
# selftests: bpf: test_lru_map
# nr_cpus:8
-------------------
Since this test will be scheduled by 0Day to a random host that could have
only a few cpus(2-8), enlarge the timeout to avoid a false NG report.
In practice, i tried to pin it to only one cpu by 'taskset 0x01 ./test_maps',
and knew 10S is likely enough, but i still perfer to a larger value 30.
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20210820015556.23276-2-lizhijian@cn.fujitsu.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 3ac5e45291 ]
For unexplained reasons, the prescaler register for this device needs to
be cleared (set to 1) while performing a data read or else the command
will hang. This does not appear to affect the real clock rate sent out
on the bus, so I assume it's purely to work around a hardware bug.
During normal operation, the prescaler is already set to 1, so nothing
needs to be done. However, in "initial mode" (which is used for sub-MHz
clock speeds, like the core sets while enumerating cards), it's set to
128 and so we need to reset it during data reads. We currently fail to
do this for long reads.
This has no functional affect on the driver's operation currently
written, as the MMC core always sets a clock above 1MHz before
attempting any long reads. However, the core could conceivably set any
clock speed at any time and the driver should still work, so I think
this fix is worthwhile.
I personally encountered this issue while performing data recovery on an
external chip. My connections had poor signal integrity, so I modified
the core code to reduce the clock speed. Without this change, I saw the
card enumerate but was unable to actually read any data.
Writes don't seem to work in the situation described above even with
this change (and even if the workaround is extended to encompass data
write commands). I was not able to find a way to get them working.
Signed-off-by: Thomas Hebb <tommyhebb@gmail.com>
Link: https://lore.kernel.org/r/2fef280d8409ab0100c26c6ac7050227defd098d.1627818365.git.tommyhebb@gmail.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 6966e6094c ]
When mmc_blk_card_busy() calls card_busy_detect() to poll for the card's
state with CMD13, this is done without any delays in between the commands
being sent.
Rather than fixing card_busy_detect() in this regards, let's instead
convert into using the common __mmc_poll_for_busy(), which also helps us to
avoid open-coding.
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Reviewed-by: Shawn Lin <shawn.lin@rock-chips.com>
Link: https://lore.kernel.org/r/20210702134229.357717-4-ulf.hansson@linaro.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 468108155b ]
When __mmc_blk_ioctl_cmd() calls card_busy_detect() to verify that the
card's states moves back into transfer state, the polling with CMD13 is
done without any delays in between the commands being sent.
Rather than fixing card_busy_detect() in this regards, let's instead
convert into using the common mmc_poll_for_busy(), which also helps us to
avoid open-coding.
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Reviewed-by: Shawn Lin <shawn.lin@rock-chips.com>
Link: https://lore.kernel.org/r/20210702134229.357717-3-ulf.hansson@linaro.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 972d508483 ]
When mmc_blk_fix_state() sends a CMD12 to try to move the card into the
transfer state, it calls card_busy_detect() to poll for the card's state
with CMD13. This is done without any delays in between the commands being
sent.
Rather than fixing card_busy_detect() in this regards, let's instead
convert into using the common mmc_poll_for_busy(), which also helps us to
avoid open-coding.
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Reviewed-by: Shawn Lin <shawn.lin@rock-chips.com>
Link: https://lore.kernel.org/r/20210702134229.357717-2-ulf.hansson@linaro.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 6211e9cb2f ]
Trying to boot without SYSFS, but with OF_DYNAMIC quickly
results in a crash:
[ 0.088460] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000070
[...]
[ 0.103927] CPU: 1 PID: 1 Comm: swapper/0 Not tainted 5.14.0-rc3 #4179
[ 0.105810] Hardware name: linux,dummy-virt (DT)
[ 0.107147] pstate: 80000005 (Nzcv daif -PAN -UAO -TCO BTYPE=--)
[ 0.108876] pc : kernfs_find_and_get_ns+0x3c/0x7c
[ 0.110244] lr : kernfs_find_and_get_ns+0x3c/0x7c
[...]
[ 0.134087] Call trace:
[ 0.134800] kernfs_find_and_get_ns+0x3c/0x7c
[ 0.136054] safe_name+0x4c/0xd0
[ 0.136994] __of_attach_node_sysfs+0xf8/0x124
[ 0.138287] of_core_init+0x90/0xfc
[ 0.139296] driver_init+0x30/0x4c
[ 0.140283] kernel_init_freeable+0x160/0x1b8
[ 0.141543] kernel_init+0x30/0x140
[ 0.142561] ret_from_fork+0x10/0x18
While not having sysfs isn't a very common option these days,
it is still expected that such configuration would work.
Paper over it by bailing out from __of_attach_node_sysfs() if
CONFIG_SYSFS isn't enabled.
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210820144722.169226-1-maz@kernel.org
Signed-off-by: Rob Herring <robh@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 3736127a3a ]
Function btrfs_lookup_data_extent calls btrfs_search_slot to verify if
the EXTENT_ITEM exists in the extent tree. btrfs_search_slot can return
values bellow zero if an error happened.
Function replay_one_extent currently checks if the search found
something (0 returned) and increments the reference, and if not, it
seems to evaluate as 'not found'.
Fix the condition by checking if the value was bellow zero and return
early.
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Marcos Paulo de Souza <mpdesouza@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit cceaa89f02 ]
When using the NO_HOLES feature and expanding the size of an inode, we
update the inode's last_trans, last_sub_trans and last_log_commit fields
at maybe_insert_hole() so that a fsync does know that the inode needs to
be logged (by making sure that btrfs_inode_in_log() returns false). This
happens for expanding truncate operations, buffered writes, direct IO
writes and when cloning extents to an offset greater than the inode's
i_size.
However the way we do it is racy, because in between setting the inode's
last_sub_trans and last_log_commit fields, the log transaction ID that was
assigned to last_sub_trans might be committed before we read the root's
last_log_commit and assign that value to last_log_commit. If that happens
it would make a future call to btrfs_inode_in_log() return true. This is
a race that should be extremely unlikely to be hit in practice, and it is
the same that was described by commit bc0939fcfa ("btrfs: fix race
between marking inode needs to be logged and log syncing").
The fix would simply be to set last_log_commit to the value we assigned
to last_sub_trans minus 1, like it was done in that commit. However
updating these two fields plus the last_trans field is pointless here
because all the callers of btrfs_cont_expand() (which is the only
caller of maybe_insert_hole()) always call btrfs_set_inode_last_trans()
or btrfs_update_inode() after calling btrfs_cont_expand(). Calling either
btrfs_set_inode_last_trans() or btrfs_update_inode() guarantees that the
next fsync will log the inode, as it makes btrfs_inode_in_log() return
false.
So just remove the code that explicitly sets the inode's last_trans,
last_sub_trans and last_log_commit fields.
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit db87db65c1 ]
> Hi Arnd,
>
> First bad commit (maybe != root cause):
>
> tree: https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git master
> head: 2f73937c9aa561e2082839bc1a8efaac75d6e244
> commit: 47fd22f2b8 [4771/5318] cs89x0: rework driver configuration
> config: m68k-randconfig-c003-20210804 (attached as .config)
> compiler: m68k-linux-gcc (GCC) 10.3.0
> reproduce (this is a W=1 build):
> wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
> chmod +x ~/bin/make.cross
> # https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?id=47fd22f2b84765a2f7e3f150282497b902624547
> git remote add linux-next https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
> git fetch --no-tags linux-next master
> git checkout 47fd22f2b8
> # save the attached .config to linux build tree
> COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-10.3.0 make.cross ARCH=m68k
>
> If you fix the issue, kindly add following tag as appropriate
> Reported-by: kernel test robot <lkp@intel.com>
>
> All errors (new ones prefixed by >>):
>
> In file included from include/linux/kernel.h:19,
> from include/linux/list.h:9,
> from include/linux/module.h:12,
> from drivers/net/ethernet/cirrus/cs89x0.c:51:
> drivers/net/ethernet/cirrus/cs89x0.c: In function 'net_open':
> drivers/net/ethernet/cirrus/cs89x0.c:897:20: error: implicit declaration of function 'isa_virt_to_bus'; did you mean 'virt_to_bus'? [-Werror=implicit-function-declaration]
> 897 | (unsigned long)isa_virt_to_bus(lp->dma_buff));
> | ^~~~~~~~~~~~~~~
> include/linux/printk.h:141:17: note: in definition of macro 'no_printk'
> 141 | printk(fmt, ##__VA_ARGS__); \
> | ^~~~~~~~~~~
> drivers/net/ethernet/cirrus/cs89x0.c:86:3: note: in expansion of macro 'pr_debug'
> 86 | pr_##level(fmt, ##__VA_ARGS__); \
> | ^~~
> drivers/net/ethernet/cirrus/cs89x0.c:894:3: note: in expansion of macro 'cs89_dbg'
> 894 | cs89_dbg(1, debug, "%s: dma %lx %lx\n",
> | ^~~~~~~~
> >> drivers/net/ethernet/cirrus/cs89x0.c:914:3: error: implicit declaration of function 'disable_dma'; did you mean 'disable_irq'? [-Werror=implicit-function-declaration]
As far as I can tell, this is a bug with the m68kmmu architecture, not
with my driver:
The CONFIG_ISA_DMA_API option is provided for coldfire, which implements it,
but dragonball also sets the option as a side-effect, without actually
implementing
the interfaces. The patch below should fix it.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Greg Ungerer <gerg@linux-m68k.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit e8fb4df1f5 ]
'bp_ena' in Aura context is NIX block index, setting it
zero will always backpressure NIX0 block, even if NIXLF
belongs to NIX1. Hence fix this by setting it appropriately
based on NIX block address.
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: Hariprasad Kelam <hkelam@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 05e4588738 ]
The kernel test robot reports undefined reference after we report wakeup
reason to mac80211. This is because CONFIG_PM is not defined in the testing
configuration file. In fact, functions within wow.c are used if CONFIG_PM
is defined, so use CONFIG_PM to decide whether we build this file or not.
The reported messages are:
hppa-linux-ld: drivers/net/wireless/realtek/rtw88/wow.o: in function `rtw_wow_show_wakeup_reason':
>> (.text+0x6c4): undefined reference to `ieee80211_report_wowlan_wakeup'
>> hppa-linux-ld: (.text+0x6e0): undefined reference to `ieee80211_report_wowlan_wakeup'
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20210728014335.8785-4-pkshih@realtek.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 02a55c0009 ]
In current wow flow, driver calls rtw_wow_fw_start and sleep for 100ms,
to wait firmware finish preliminary work and then update the value of
WOWLAN_WAKE_REASON register to zero. But later firmware will start wow
function with power-saving mode, in which mode the value of
WOWLAN_WAKE_REASON register is 0xea. So driver may get 0xea value and
return fail. We use read_poll_timeout instead to check the value to avoid
this issue.
Signed-off-by: Chin-Yen Lee <timlee@realtek.com>
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20210728014335.8785-2-pkshih@realtek.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit c626f3864b ]
In certain randconfigs, clang warns:
drivers/gpu/drm/exynos/exynos_drm_dma.c:121:19: warning: variable
'mapping' is uninitialized when used here [-Wuninitialized]
priv->mapping = mapping;
^~~~~~~
drivers/gpu/drm/exynos/exynos_drm_dma.c:111:16: note: initialize the
variable 'mapping' to silence this warning
void *mapping;
^
= NULL
1 warning generated.
This occurs when CONFIG_EXYNOS_IOMMU is enabled and both
CONFIG_ARM_DMA_USE_IOMMU and CONFIG_IOMMU_DMA are disabled, which makes
the code look like
void *mapping;
if (0)
mapping = arm_iommu_create_mapping()
else if (0)
mapping = iommu_get_domain_for_dev()
...
priv->mapping = mapping;
Add an else branch that initializes mapping to the -ENODEV error pointer
so that there is no more warning and the driver does not change during
runtime.
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 7de875b231 ]
Locks have two sets of op arrays, fl_lmops for the lock manager (lockd
or nfsd), fl_ops for the filesystem. The server-side lockd code has
been setting its own fl_ops, which leads to confusion (and crashes) in
the reexport case, where the filesystem expects to be the only one
setting fl_ops.
And there's no reason for it that I can see-the lm_get/put_owner ops do
the same job.
Reported-by: Daire Byrne <daire@dneg.com>
Tested-by: Daire Byrne <daire@dneg.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit d1340f80f0 ]
In the gfs2 withdraw sequence, the dlm protocol is unmounted with a call
to lm_unmount. After a withdraw, users are allowed to unmount the
withdrawn file system. But at that point we may still have glocks left
over that we need to free via unmount's call to gfs2_gl_hash_clear.
These glocks may have never been completed because of whatever problem
caused the withdraw (IO errors or whatever).
Before this patch, function gdlm_put_lock would still try to call into
dlm to unlock these leftover glocks, which resulted in dlm returning
-EINVAL because the lock space was abandoned. These glocks were never
freed because there was no mechanism after that to free them.
This patch adds a check to gdlm_put_lock to see if the locking protocol
was inactive (DFL_UNMOUNT flag) and if so, free the glock and not
make the invalid call into dlm.
I could have combined this "if" with the one that follows, related to
leftover glock LVBs, but I felt the code was more readable with its own
if clause.
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit cc64c390b2 ]
This driver is assuming that all adg->clk[i] is not NULL.
Because of this prerequisites, for_each_rsnd_clk() is possible to work
for all clk without checking NULL. In other words, all adg->clk[i]
should not NULL.
Some SoC might doesn't have clk_a/b/c/i. devm_clk_get() returns error in
such case. This driver calls rsnd_adg_null_clk_get() and use null_clk
instead of NULL in such cases.
But devm_clk_get() might returns NULL even though such clocks exist, but
it doesn't mean error (user deliberately chose to disable the feature).
NULL clk itself is not error from clk point of view, but is error from
this driver point of view because it is not assuming such case.
But current code is using IS_ERR() which doesn't care NULL.
This driver uses IS_ERR_OR_NULL() instead of IS_ERR() for clk check.
And it uses ERR_CAST() to clarify null_clk error.
One concern here is that it unconditionally uses null_clk if clk_a/b/c/i
was error. It is correct if it doesn't exist, but is not correct if it
returns error even though it exist.
It needs to check "clock-names" from DT before calling devm_clk_get() to
handling such case. But let's assume it is overkill so far.
Link: https://lore.kernel.org/r/YMCmhfQUimHCSH/n@mwanda
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Link: https://lore.kernel.org/r/87v940wyf9.wl-kuninori.morimoto.gx@renesas.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 83e5dcbece ]
When skipping the tests due to a lack of system support for MTE we
currently print a message saying FAIL which makes it look like the test
failed even though the test did actually report KSFT_SKIP, creating some
confusion. Change the error message to say SKIP instead so things are
clearer.
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20210819172902.56211-1-broonie@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 74fc4f8287 ]
Currently, when creating an ingress qdisc on an indirect device before
the driver registered for callbacks, the driver will not have a chance
to register its filter configuration callbacks.
To fix that, modify the code such that it keeps track of all the ingress
qdiscs that call flow_indr_dev_setup_offload(). When a driver calls
flow_indr_dev_register(), go through the list of tracked ingress qdiscs
and call the driver callback entry point so as to give it a chance to
register its callback.
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Eli Cohen <elic@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 78a7b2a8a0 ]
nlattr could have a padding for 4 bytes alignment. So next nla's offset
should be calculated with a padding.
Signed-off-by: Bongsu Jeon <bongsu.jeon@samsung.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit cbe34165cc ]
Fix buf allocation size (it needs to be 2 bytes larger). Found when
__alloc_size() annotations were added to kmalloc() interfaces.
In file included from ./include/linux/string.h:253,
from ./include/linux/bitmap.h:10,
from ./include/linux/cpumask.h:12,
from ./arch/x86/include/asm/paravirt.h:17,
from ./arch/x86/include/asm/irqflags.h:63,
from ./include/linux/irqflags.h:16,
from ./include/linux/rcupdate.h:26,
from ./include/linux/rculist.h:11,
from ./include/linux/pid.h:5,
from ./include/linux/sched.h:14,
from ./include/linux/blkdev.h:5,
from drivers/staging/rts5208/rtsx_scsi.c:12:
In function 'get_ms_information',
inlined from 'ms_sp_cmnd' at drivers/staging/rts5208/rtsx_scsi.c:2877:12,
inlined from 'rtsx_scsi_handler' at drivers/staging/rts5208/rtsx_scsi.c:3247:12:
./include/linux/fortify-string.h:54:29: warning: '__builtin_memcpy' forming offset [106, 107] is out
of the bounds [0, 106] [-Warray-bounds]
54 | #define __underlying_memcpy __builtin_memcpy
| ^
./include/linux/fortify-string.h:417:2: note: in expansion of macro '__underlying_memcpy'
417 | __underlying_##op(p, q, __fortify_size); \
| ^~~~~~~~~~~~~
./include/linux/fortify-string.h:463:26: note: in expansion of macro '__fortify_memcpy_chk'
463 | #define memcpy(p, q, s) __fortify_memcpy_chk(p, q, s, \
| ^~~~~~~~~~~~~~~~~~~~
drivers/staging/rts5208/rtsx_scsi.c:2851:3: note: in expansion of macro 'memcpy'
2851 | memcpy(buf + i, ms_card->raw_sys_info, 96);
| ^~~~~~
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: linux-staging@lists.linux.dev
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20210818044252.1533634-1-keescook@chromium.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b16ac5bf73 ]
libbpf CI has reported send_signal test is flaky although
I am not able to reproduce it in my local environment.
But I am able to reproduce with on-demand libbpf CI ([1]).
Through code analysis, the following is possible reason.
The failed subtest runs bpf program in softirq environment.
Since bpf_send_signal() only sends to a fork of "test_progs"
process. If the underlying current task is
not "test_progs", bpf_send_signal() will not be triggered
and the subtest will fail.
To reduce the chances where the underlying process is not
the intended one, this patch boosted scheduling priority to
-20 (highest allowed by setpriority() call). And I did
10 runs with on-demand libbpf CI with this patch and I
didn't observe any failures.
[1] https://github.com/libbpf/libbpf/actions/workflows/ondemand.yml
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210817190923.3186725-1-yhs@fb.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f667d1d667 ]
In skip_account(), test->skip_cnt is set to 0 at the end, this makes next print
statement never display SKIP status for the subtest. This patch moves the
accounting logic after the print statement, fixing the issue.
This patch also added SKIP status display for normal tests.
Signed-off-by: Yucong Sun <fallentree@fb.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210817044732.3263066-3-fallentree@fb.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 5ac49f3c27 ]
As follow-up to the discussion with Jakub Kicinski about iavf locking
being insufficient [1] convert iavf to use mutexes instead of bitops.
The locking logic is kept as is, just a drop-in replacement of
enum iavf_critical_section_t with separate mutexes.
The only difference is that the mutexes will be destroyed before the
module is unloaded.
[1] https://lwn.net/ml/netdev/20210316150210.00007249%40intel.com/
Signed-off-by: Stefan Assmann <sassmann@kpanic.de>
Tested-by: Marek Szlosek <marek.szlosek@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit e3faa49bce ]
Since the original TFO server code was implemented in commit
168a8f5805 ("tcp: TCP Fast Open Server -
main code path") the TFO server code has supported the sysctl bit flag
TFO_SERVER_COOKIE_NOT_REQD. Currently, when the TFO_SERVER_ENABLE and
TFO_SERVER_COOKIE_NOT_REQD sysctl bit flags are set, a server connection
will accept a SYN with N bytes of data (N > 0) that has no TFO cookie,
create a new fast open connection, process the incoming data in the SYN,
and make the connection ready for accepting. After accepting, the
connection is ready for read()/recvmsg() to read the N bytes of data in
the SYN, ready for write()/sendmsg() calls and data transmissions to
transmit data.
This commit changes an edge case in this feature by changing this
behavior to apply to (N >= 0) bytes of data in the SYN rather than only
(N > 0) bytes of data in the SYN. Now, a server will accept a data-less
SYN without a TFO cookie if TFO_SERVER_COOKIE_NOT_REQD is set.
Caveat! While this enables a new kind of TFO (data-less empty-cookie
SYN), some firewall rules setup may not work if they assume such packets
are not legit TFOs and will filter them.
Signed-off-by: Luke Hsiao <lukehsiao@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Acked-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20210816205105.2533289-1-luke.w.hsiao@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b69eea82d3 ]
Modern-day mapping_set_error has the ability to squash the usual
negative error code into something appropriate for long-term storage in
a struct address_space -- ENOSPC becomes AS_ENOSPC, and everything else
becomes EIO. iomap squashes /everything/ to EIO, just as XFS did before
that, but this doesn't make sense.
Fix this by making it so that we can pass ENOSPC to userspace when
writeback fails due to space problems.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 87b8061bad ]
This fixes two issues that cause the sysrq sequence to be inadvertently
aborted on SCIF serial consoles:
- a NUL character remains in the RX queue after a break has been detected,
which is then passed on to uart_handle_sysrq_char()
- the break interrupt is handled twice on controllers with multiplexed ERI
and BRI interrupts
Signed-off-by: Ulrich Hecht <uli+renesas@fpond.eu>
Link: https://lore.kernel.org/r/20210816162201.28801-1-uli+renesas@fpond.eu
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 020d86fc0d ]
The 'required-opps' property is considered optional, hence remove
the pr_err() in of_parse_required_opp() when we find the property is
missing.
While at it, also fix the return value of
of_get_required_opp_performance_state() when of_parse_required_opp()
fails, return a -ENODEV instead of the -EINVAL.
Signed-off-by: Rajendra Nayak <rnayak@codeaurora.org>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit ecb71f2566 ]
For NOP command, need to cancel work scheduled on cmd_timer,
on receiving command status or commmand complete event.
Below use case might lead to race condition multiple when NOP
commands are queued sequentially:
hci_cmd_work() {
if (atomic_read(&hdev->cmd_cnt) {
.
.
.
atomic_dec(&hdev->cmd_cnt);
hci_send_frame(hdev,...);
schedule_delayed_work(&hdev->cmd_timer,...);
}
}
On receiving event for first NOP, the work scheduled on hdev->cmd_timer
is not cancelled and second NOP is dequeued and sent to controller.
While waiting for an event for second NOP command, work scheduled on
cmd_timer for the first NOP can get scheduled, resulting in sending third
NOP command (sending back to back NOP commands). This might
cause issues at controller side (like memory overrun, controller going
unresponsive) resulting in hci tx timeouts, hardware errors etc.
The fix to this issue is to cancel the delayed work scheduled on
cmd_timer on receiving command status or command complete event for
NOP command (this patch handles NOP command same as any other SIG
command).
Signed-off-by: Kiran K <kiran.k@intel.com>
Reviewed-by: Chethan T N <chethan.tumkur.narayan@intel.com>
Reviewed-by: Srivatsa Ravishankar <ravishankar.srivatsa@intel.com>
Acked-by: Manish Mandlik <mmandlik@google.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit cafae4cd62 ]
LE Enhanced Connection Complete contains the Local RPA used in the
connection which must be used when set otherwise there could problems
when pairing since the address used by the remote stack could be the
Local RPA:
BLUETOOTH CORE SPECIFICATION Version 5.2 | Vol 4, Part E
page 2396
'Resolvable Private Address being used by the local device for this
connection. This is only valid when the Own_Address_Type (from the
HCI_LE_Create_Connection, HCI_LE_Set_Advertising_Parameters,
HCI_LE_Set_Extended_Advertising_Parameters, or
HCI_LE_Extended_Create_Connection commands) is set to 0x02 or
0x03, and the Controller generated a resolvable private address for the
local device using a non-zero local IRK. For other Own_Address_Type
values, the Controller shall return all zeros.'
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit e7006de6c2 ]
We cannot detect a (perhaps buggy) controller that is sending us
a completion for a request that was already completed (for example
sending a completion twice), this phenomenon was seen in the wild
a few times.
So to protect against this, we use the upper 4 msbits of the nvme sqe
command_id to use as a 4-bit generation counter and verify it matches
the existing request generation that is incrementing on every execution.
The 16-bit command_id structure now is constructed by:
| xxxx | xxxxxxxxxxxx |
gen request tag
This means that we are giving up some possible queue depth as 12 bits
allow for a maximum queue depth of 4095 instead of 65536, however we
never create such long queues anyways so no real harm done.
Suggested-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Acked-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Daniel Wagner <dwagner@suse.de>
Tested-by: Daniel Wagner <dwagner@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 3b01a9d0ca ]
We already validate it when receiving the c2hdata pdu header
and this is not changing so this is a redundant check.
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Daniel Wagner <dwagner@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit bd306fdb4e ]
The GW71xx has a USB Type-C connector with USB 2.0 signaling. GPIO1_12
is the power-enable to the TPS25821 Source controller and power switch
responsible for monitoring the CC pins and enabling VBUS. Therefore
GPIO1_12 must always be enabled and the vbus output enable from the
IMX8MM can be ignored.
To fix USB OTG VBUS enable a pull-up on GPIO1_12 to always power the
TPS25821 and change the regulator output to GPIO1_10 which is
unconnected.
Signed-off-by: Tim Harvey <tharvey@gateworks.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f865d0292f ]
The documented compatible string for the CPUs found on Tegra132 is
"nvidia,tegra132-denver", rather than the previously used compatible
string "nvidia,denver".
Signed-off-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 70e740ad55 ]
The configuration of USB VBUS regulators was borrowed from downstream
kernel, which is incorrect because the corresponding GPIOs are connected
to PROX_EN (A501 3G model) and LED_EN pins in accordance to the board
schematics. USB works fine with both GPIOs being disabled, so remove the
bogus USB VBUS regulators. The USB VBUS of USB3 is supplied from the fixed
5v system regulator and device-mode USB1 doesn't have VBUS switches.
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 79f5962bae ]
The maximum MTU was set to 2304, which is the maximum MSDU size. While
this is valid for normal WLAN interfaces, it is too low for monitor
interfaces. A monitor interface may receive and inject MPDU frames, and
the maximum MPDU frame size is larger than 2304. The MPDU may also
contain an A-MSDU frame, in which case the size may be much larger than
the MTU limit. Since the maximum size of an A-MSDU depends on the PHY
mode of the transmitting STA, it is not possible to set an exact MTU
limit for a monitor interface. Now the maximum MTU for a monitor
interface is unrestricted.
Signed-off-by: Johan Almbladh <johan.almbladh@anyfinetworks.com>
Link: https://lore.kernel.org/r/20210628123246.2070558-1-johan.almbladh@anyfinetworks.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 554594567b ]
The variable dc->clk_mgr is checked in:
if (dc->clk_mgr && dc->clk_mgr->funcs->get_clock)
This indicates dc->clk_mgr can be NULL.
However, it is dereferenced in:
if (!dc->clk_mgr->funcs->get_clock)
To fix this null-pointer dereference, check dc->clk_mgr and the function
pointer dc->clk_mgr->funcs->get_clock earlier, and return if one of them
is NULL.
Reported-by: TOTE Robot <oslab@tsinghua.edu.cn>
Signed-off-by: Tuo Li <islituo@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit a211260c34 ]
The variable val is declared without initialization, and its address is
passed to amdgpu_i2c_get_byte(). In this function, the value of val is
accessed in:
DRM_DEBUG("i2c 0x%02x 0x%02x read failed\n",
addr, *val);
Also, when amdgpu_i2c_get_byte() returns, val may remain uninitialized,
but it is accessed in:
val &= ~amdgpu_connector->router.ddc_mux_control_pin;
To fix this possible uninitialized-variable access, initialize val to 0 in
amdgpu_i2c_router_select_ddc_port().
Reported-by: TOTE Robot <oslab@tsinghua.edu.cn>
Signed-off-by: Tuo Li <islituo@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 979aa51967 ]
Fix the following smatch warning:
wait_func_handle_exec_timeout() warn: should '1 << ent->idx' be a 64 bit type?
Use 1ULL, to have a 64 bit type variable.
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Eran Ben Elisha <eranbe@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 2e0adc765d ]
Initialize both pre-emphasis and voltage swing level to 0 before
start link training and do not end link training until video is
ready to reduce the period between end of link training and video
start to meet Link Layer CTS requirement. Some dongle main link
symbol may become unlocked again if host did not end link training
soon enough after completion of link training 2. Host have to re
train main link if loss of symbol locked detected before end link
training so that the coming video stream can be transmitted to sink
properly. This fixes Link Layer CTS cases 4.3.2.1, 4.3.2.2, 4.3.2.3
and 4.3.2.4.
Changes in v3:
-- merge retrain link if loss of symbol locked happen into this patch
-- replace dp_ctrl_loss_symbol_lock() with dp_ctrl_channel_eq_ok()
Signed-off-by: Kuogee Hsieh <khsieh@codeaurora.org>
Reviewed-by: Stephen Boyd <swboyd@chromium.org>
Link: https://lore.kernel.org/r/1628196295-7382-7-git-send-email-khsieh@codeaurora.org
Signed-off-by: Rob Clark <robdclark@chromium.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 0b324564ff ]
Aux hardware calibration sequence requires resetting the aux controller
in order for the new setting to take effect. However resetting the AUX
controller will also clear HPD interrupt status which may accidentally
cause pending unplug interrupt to get lost. Therefore reset aux
controller only when link is in connection state when dp_aux_cmd_fifo_tx()
fail. This fixes Link Layer CTS cases 4.2.1.1 and 4.2.1.2.
Signed-off-by: Kuogee Hsieh <khsieh@codeaurora.org>
Reviewed-by: Stephen Boyd <swboyd@chromium.org>
Link: https://lore.kernel.org/r/1628196295-7382-4-git-send-email-khsieh@codeaurora.org
Signed-off-by: Rob Clark <robdclark@chromium.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 4b85d405cf ]
Reduce link rate and re start link training if link training 1
failed due to loss of clock recovery done to fix Link Layer
CTS case 4.3.1.7. Also only update voltage and pre-emphasis
swing level after link training started to fix Link Layer CTS
case 4.3.1.6.
Changes in V2:
-- replaced cr_status with link_status[DP_LINK_STATUS_SIZE]
-- replaced dp_ctrl_any_lane_cr_done() with dp_ctrl_colco_recovery_any_ok()
-- replaced dp_ctrl_any_ane_cr_lose() with !drm_dp_clock_recovery_ok()
Changes in V3:
-- return failed if lane_count <= 1
Signed-off-by: Kuogee Hsieh <khsieh@codeaurora.org>
Reviewed-by: Stephen Boyd <swboyd@chromium.org>
Link: https://lore.kernel.org/r/1628196295-7382-3-git-send-email-khsieh@codeaurora.org
[remove unused cr_status variable]
Signed-off-by: Rob Clark <robdclark@chromium.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit ba316be1b6 ]
struct sock.sk_timer should be used as a sock cleanup timer. However,
SCO uses it to implement sock timeouts.
This causes issues because struct sock.sk_timer's callback is run in
an IRQ context, and the timer callback function sco_sock_timeout takes
a spin lock on the socket. However, other functions such as
sco_conn_del and sco_conn_ready take the spin lock with interrupts
enabled.
This inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} lock usage could
lead to deadlocks as reported by Syzbot [1]:
CPU0
----
lock(slock-AF_BLUETOOTH-BTPROTO_SCO);
<Interrupt>
lock(slock-AF_BLUETOOTH-BTPROTO_SCO);
To fix this, we use delayed work to implement SCO sock timouts
instead. This allows us to avoid taking the spin lock on the socket in
an IRQ context, and corrects the misuse of struct sock.sk_timer.
As a note, cancel_delayed_work is used instead of
cancel_delayed_work_sync in sco_sock_set_timer and
sco_sock_clear_timer to avoid a deadlock. In the future, the call to
bh_lock_sock inside sco_sock_timeout should be changed to lock_sock to
synchronize with other functions using lock_sock. However, since
sco_sock_set_timer and sco_sock_clear_timer are sometimes called under
the locked socket (in sco_connect and __sco_sock_close),
cancel_delayed_work_sync might cause them to sleep until an
sco_sock_timeout that has started finishes running. But
sco_sock_timeout would also sleep until it can grab the lock_sock.
Using cancel_delayed_work is fine because sco_sock_timeout does not
change from run to run, hence there is no functional difference
between:
1. waiting for a timeout to finish running before scheduling another
timeout
2. scheduling another timeout while a timeout is running.
Link: https://syzkaller.appspot.com/bug?id=9089d89de0502e120f234ca0fc8a703f7368b31e [1]
Reported-by: syzbot+2f6d7c28bb4bf7e82060@syzkaller.appspotmail.com
Tested-by: syzbot+2f6d7c28bb4bf7e82060@syzkaller.appspotmail.com
Signed-off-by: Desmond Cheong Zhi Xi <desmondcheongzx@gmail.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 42716425ad ]
In tb_switch_default_link_ports(), while linking of ports,
only odd-numbered ports (1,3,5..) are considered and even-numbered
ports are not considered.
AMD host router has lane adapters at 2 and 3 and link ports at adapter 2
is not considered due to which lane bonding gets disabled.
Hence added a fix such that all ports are considered during
linking of ports.
Signed-off-by: Basavaraj Natikar <Basavaraj.Natikar@amd.com>
Signed-off-by: Sanjay R Mehta <sanju.mehta@amd.com>
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f775d2150c ]
The PCI hosts had bad IRQ semantics, these are all active low.
Use the proper define and fix all in-tree users.
Suggested-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit a338619bd7 ]
When insmod zynqmp-dpsub.ko after rmmod it, system will hang with the
error log as below:
root@xilinx-zynqmp:~# insmod zynqmp-dpsub.ko
[ 88.391289] [drm] Initialized zynqmp-dpsub 1.0.0 20130509 for fd4a0000.display on minor 0
[ 88.529906] Console: switching to colour frame buffer device 128x48
[ 88.549402] zynqmp-dpsub fd4a0000.display: [drm] fb0: zynqmp-dpsubdrm frame buffer device
[ 88.571624] zynqmp-dpsub fd4a0000.display: ZynqMP DisplayPort Subsystem driver probed
root@xilinx-zynqmp:~# rmmod zynqmp_dpsub
[ 94.023404] Console: switching to colour dummy device 80x25
root@xilinx-zynqmp:~# insmod zynqmp-dpsub.ko
<hang here>
This is because that in zynqmp_dp_probe it tries to access some DP
registers while the DP controller is still in the reset state. When
running "rmmod zynqmp_dpsub", zynqmp_dp_reset(dp, true) in
zynqmp_dp_phy_exit is called to force the DP controller into the reset
state. Then insmod will call zynqmp_dp_probe to program the DP registers,
but at this moment the DP controller hasn't been brought out of the reset
state yet since the function zynqmp_dp_reset(dp, false) is called later and
this will result the system hang.
Releasing the reset to DP controller before any read/write operation to it
will fix this issue. And for symmetry, move zynqmp_dp_reset() call from
zynqmp_dp_phy_exit() to zynqmp_dp_remove().
Signed-off-by: Quanyang Wang <quanyang.wang@windriver.com>
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit a19effb6db ]
The Runtime PM subsystem will force the device "fd4a0000.zynqmp-display"
to enter suspend state while booting if the following conditions are met:
- the usage counter is zero (pm_runtime_get_sync hasn't been called yet)
- no 'active' children (no zynqmp-dp-snd-xx node under dpsub node)
- no other device in the same power domain (dpdma node has no
"power-domains = <&zynqmp_firmware PD_DP>" property)
So there is a scenario as below:
1) DP device enters suspend state <- call zynqmp_gpd_power_off
2) zynqmp_disp_crtc_setup_clock <- configurate register VPLL_FRAC_CFG
3) pm_runtime_get_sync <- call zynqmp_gpd_power_on and clear previous
VPLL_FRAC_CFG configuration
4) clk_prepare_enable(disp->pclk) <- enable failed since VPLL_FRAC_CFG
configuration is corrupted
From above, we can see that pm_runtime_get_sync may clear register
VPLL_FRAC_CFG configuration and result the failure of clk enabling.
Putting pm_runtime_get_sync at the very beginning of the function
zynqmp_disp_crtc_atomic_enable can resolve this issue.
Signed-off-by: Quanyang Wang <quanyang.wang@windriver.com>
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 56bd931ae5 ]
msm_atomic is doing vblank get/put's already,
currently there no need to duplicate the effort in MDP4
Fix warning:
...
WARNING: CPU: 3 PID: 79 at drivers/gpu/drm/drm_vblank.c:1194 drm_vblank_put+0x1cc/0x1d4
...
and multiple vblank time-outs:
...
msm 5100000.mdp: vblank time out, crtc=1
...
Tested on Nexus 7 2013 (deb), LTS 5.10.50.
Introduced by: 119ecb7fd3 ("drm/msm/mdp4: request vblank during modeset")
Signed-off-by: David Heidelberg <david@ixit.cz>
Link: https://lore.kernel.org/r/20210715060925.7880-1-david@ixit.cz
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Signed-off-by: Rob Clark <robdclark@chromium.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 4367355dd9 ]
When compiling with clang in certain configurations, an objtool warning
appears:
drivers/net/ethernet/stmicro/stmmac/dwmac-ipq806x.o: warning: objtool:
ipq806x_gmac_probe() falls through to next function phy_modes()
This happens because the unreachable annotation in the third switch
statement is not eliminated. The compiler should know that the first
default case would prevent the second and third from being reached as
the comment notes but sanitizer options can make it harder for the
compiler to reason this out.
Help the compiler out by eliminating the unreachable() annotation and
unifying the default case error handling so that there is no objtool
warning, the meaning of the code stays the same, and there is less
duplication.
Reported-by: Sami Tolvanen <samitolvanen@google.com>
Tested-by: Sami Tolvanen <samitolvanen@google.com>
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 84f3efbe5b ]
We have underscore (_) in node name leading to warning:
arch/arm64/boot/dts/qcom/apq8096-db820c.dt.yaml: clocks: $nodename:0: 'clocks' does not match '^([a-z][a-z0-9\\-]+-bus|bus|soc|axi|ahb|apb)(@[0-9a-f]+)?$'
arch/arm64/boot/dts/qcom/apq8096-db820c.dt.yaml: clocks: xo_board: {'type': 'object'} is not allowed for {'compatible': ['fixed-clock'], '#clock-cells': [[0]], 'clock-frequency': [[19200000]], 'clock-output-names': ['xo_board'], 'phandle': [[115]]}
Fix this by changing node name to use dash (-)
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Link: https://lore.kernel.org/r/20210308060826.3074234-10-vkoul@kernel.org
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 8c678beca7 ]
We have underscore (_) in node name leading to warning:
arch/arm64/boot/dts/qcom/msm8994-msft-lumia-octagon-cityman.dt.yaml: clocks: xo_board: {'type': 'object'} is not allowed for {'compatible': ['fixed-clock'], '#clock-cells': [[0]], 'clock-frequency': [[19200000]], 'phandle': [[26]]}
arch/arm64/boot/dts/qcom/msm8994-msft-lumia-octagon-cityman.dt.yaml: clocks: sleep_clk: {'type': 'object'} is not allowed for {'compatible': ['fixed-clock'], '#clock-cells': [[0]], 'clock-frequency': [[32768]]}
Fix this by changing node name to use dash (-)
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Link: https://lore.kernel.org/r/20210308060826.3074234-9-vkoul@kernel.org
[bjorn: Added clock-output-names to satisfy parent_names]
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 52c9887fba ]
reg property should be array of values, here it is a single array,
leading to below warning:
arch/arm64/boot/dts/qcom/ipq8074-hk01.dt.yaml: soc: pci@10000000:reg:0: [268435456, 3869, 268439328, 168, 557056, 8192, 269484032, 4096] is too long
arch/arm64/boot/dts/qcom/ipq8074-hk01.dt.yaml: soc: pci@10000000:ranges: 'oneOf' conditional failed, one must be fixed:
arch/arm64/boot/dts/qcom/ipq8074-hk01.dt.yaml: soc: pci@10000000:ranges: 'oneOf' conditional failed, one must be fixed:
[[2164260864, 0, 270532608, 270532608, 0, 1048576, 2181038080, 0, 271581184, 271581184, 0, 13631488]] is not of type 'null'
[2164260864, 0, 270532608, 270532608, 0, 1048576, 2181038080, 0, 271581184, 271581184, 0, 13631488] is too long
arch/arm64/boot/dts/qcom/ipq8074-hk01.dt.yaml: soc: pci@20000000:reg:0: [536870912, 3869, 536874784, 168, 524288, 8192, 537919488, 4096] is too long
arch/arm64/boot/dts/qcom/ipq8074-hk01.dt.yaml: soc: pci@20000000:ranges: 'oneOf' conditional failed, one must be fixed:
arch/arm64/boot/dts/qcom/ipq8074-hk01.dt.yaml: soc: pci@20000000:ranges: 'oneOf' conditional failed, one must be fixed:
[[2164260864, 0, 538968064, 538968064, 0, 1048576, 2181038080, 0, 540016640, 540016640, 0, 13631488]] is not of type 'null'
[2164260864, 0, 538968064, 538968064, 0, 1048576, 2181038080, 0, 540016640, 540016640, 0, 13631488] is too long
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Link: https://lore.kernel.org/r/20210308060826.3074234-17-vkoul@kernel.org
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit fec29bf049 ]
On Tegra186 and later, a portion of the SYSRAM may be reserved for use
by TZ. Non-TZ memory accesses to this portion, including speculative
accesses, trigger SErrors that bring down the system. This does also
happen in practice occasionally (due to speculative accesses).
To fix the issue, add a flag to the SRAM driver to only map the
device tree-specified reserved areas depending on a flag set
based on the compatibility string. This would not affect non-Tegra
systems that rely on the entire thing being memory mapped.
If 64K pages are being used, we cannot exactly map the 4K regions
that are placed in SYSRAM - ioremap code instead aligns to closest
64K pages. However, since in practice the non-accessible memory area
is 64K aligned, these mappings do not overlap with the non-accessible
memory area and things work out.
Reviewed-by: Mian Yousaf Kaukab <ykaukab@suse.de>
Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
Link: https://lore.kernel.org/r/20210715103423.1811101-1-mperttunen@nvidia.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 1fe0e1fa32 ]
Handle optional overrun-throttle-ms property as done for 8250_fsl in commit
6d7f677a2a ("serial: 8250: Rate limit serial port rx interrupts during
input overruns"). This can be used to rate limit the UART interrupts on
noisy lines that end up producing messages like the following:
ttyS ttyS2: 4 input overrun(s)
At least on droid4, the multiplexed USB and UART port is left to UART mode
by the bootloader for a debug console, and if a USB charger is connected
on boot, we get noise on the UART until the PMIC related drivers for PHY
and charger are loaded.
With this patch and overrun-throttle-ms = <500> we avoid the extra rx
interrupts.
Cc: Carl Philipp Klemm <philipp@uvos.xyz>
Cc: Merlijn Wajer <merlijn@wizzup.org>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Sebastian Reichel <sre@kernel.org>
Cc: Vignesh Raghavendra <vigneshr@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Link: https://lore.kernel.org/r/20210727103533.51547-2-tony@atomide.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 0fd75f5760 ]
Three interconnects are defined for IPA version 4.9, but there
should only be two. They should also use names that match what's
used for other platforms (and specified in the Device Tree binding).
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 9d9b16054b ]
We must not call gfs2_consist (which does a file system withdraw) from
the freeze glock's freeze_go_xmote_bh function because the withdraw
will try to use the freeze glock, thus causing a glock recursion error.
This patch changes freeze_go_xmote_bh to call function
gfs2_assert_withdraw_delayed instead of gfs2_consist to avoid recursion.
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 4108b3e6db ]
These for-loops should test against v4l2_dv_timings_presets[i].bt.width,
not if i < v4l2_dv_timings_presets[i].bt.width. Luckily nothing ever broke,
since the smallest width is still a lot higher than the total number of
presets, but it is wrong.
The last item in the presets array is all 0, so the for-loop must stop
when it reaches that sentinel.
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Reported-by: Krzysztof Hałasa <khalasa@piap.pl>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 0ada1697ed ]
When the stream fails to start, the first two buffers in the queue have
been moved to the active_vb2_buf array and are returned to vb2 by
imx7_csi_dma_unsetup_vb2_buf(). The function is called with the buffer
state set to VB2_BUF_STATE_ERROR unconditionally, which is correct when
stopping the stream, but not when the start operation fails. In that
case, the state should be set to VB2_BUF_STATE_QUEUED. Fix it.
Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Tested-by: Martin Kepplinger <martin.kepplinger@puri.sm>
Reviewed-by: Rui Miguel Silva <rmfrfs@gmail.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f809665ee7 ]
The range for analog gain mentioned in the datasheet is [0, 480].
The real gain formula mentioned in the datasheet is:
Gain = 512 / (512 – X)
Hence, values larger than 511 clearly makes no sense. The gain
register field is also documented to be of 9-bits in the datasheet.
Certainly, it is enough to infer that, the kernel driver currently
advertises an arbitrary analog gain max. Fix it by rectifying the
value as per the data sheet i.e. 480.
Signed-off-by: Umang Jain <umang.jain@ideasonboard.com>
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Reviewed-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 51f93add36 ]
The frame_length_lines (0x0340) registers are hard-coded as follows:
- 4208x3118
frame_length_lines = 0x0c50
- 2104x1560
frame_length_lines = 0x0638
- 1048x780
frame_length_lines = 0x034c
The driver exposes the V4L2_CID_VBLANK control in read-only mode and
sets its value to vts_def - height, where vts_def is a mode-dependent
value coming from the supported_modes array. It is set using one of
the following macros defined in the driver:
#define IMX258_VTS_30FPS 0x0c98
#define IMX258_VTS_30FPS_2K 0x0638
#define IMX258_VTS_30FPS_VGA 0x034c
There's a clear mismatch in the value for the full resolution mode i.e.
IMX258_VTS_30FPS. Fix it by rectifying the macro with the value set for
the frame_length_lines register as stated above.
Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Umang Jain <umang.jain@ideasonboard.com>
Reviewed-by: Bingbu Cao <bingbu.cao@intel.com>
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 042ad90ca7 ]
We should not enable the switch interfaces at probe time since this is
trigged by the open callback. Remove the call dpsw_enable() which does
exactly this.
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit c45074d68a ]
Code was checking if random_addr and hdev->rpa match without first
checking if the RPA has not been set (BDADDR_ANY), furthermore it was
clearing HCI_RPA_EXPIRED before the command completes and the RPA is
actually programmed which in case of failure would leave the expired
RPA still set.
Since advertising instance have a similar problem the clearing of
HCI_RPA_EXPIRED has been moved to hci_event.c after checking the random
address is in fact the hdev->rap and then proceed to set the expire
timeout.
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit bf2942a8b7 ]
The initialization sequence performed by the generic platform driver
pcie-designware-plat.c for a DWC based implementation doesn't work for
Tegra194. Tegra194 has a different initialization sequence requirement
which can only be satisfied by the Tegra194 specific platform driver
pcie-tegra194.c. So, remove the generic compatible string "snps,dw-pcie-ep"
from Tegra194's endpoint controller nodes.
Signed-off-by: Vidya Sagar <vidyas@nvidia.com>
Reviewed-by: Jon Hunter <jonathanh@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 220ade7745 ]
Some time ago, I reported a calltrace issue
"did not find a suitable aggregator", please see[1].
After a period of analysis and reproduction, I find
that this problem is caused by concurrency.
Before the problem occurs, the bond structure is like follows:
bond0 - slaver0(eth0) - agg0.lag_ports -> port0 - port1
\
port0
\
slaver1(eth1) - agg1.lag_ports -> NULL
\
port1
If we run 'ifenslave bond0 -d eth1', the process is like below:
excuting __bond_release_one()
|
bond_upper_dev_unlink()[step1]
| | |
| | bond_3ad_lacpdu_recv()
| | ->bond_3ad_rx_indication()
| | spin_lock_bh()
| | ->ad_rx_machine()
| | ->__record_pdu()[step2]
| | spin_unlock_bh()
| | |
| bond_3ad_state_machine_handler()
| spin_lock_bh()
| ->ad_port_selection_logic()
| ->try to find free aggregator[step3]
| ->try to find suitable aggregator[step4]
| ->did not find a suitable aggregator[step5]
| spin_unlock_bh()
| |
| |
bond_3ad_unbind_slave() |
spin_lock_bh()
spin_unlock_bh()
step1: already removed slaver1(eth1) from list, but port1 remains
step2: receive a lacpdu and update port0
step3: port0 will be removed from agg0.lag_ports. The struct is
"agg0.lag_ports -> port1" now, and agg0 is not free. At the
same time, slaver1/agg1 has been removed from the list by step1.
So we can't find a free aggregator now.
step4: can't find suitable aggregator because of step2
step5: cause a calltrace since port->aggregator is NULL
To solve this concurrency problem, put bond_upper_dev_unlink()
after bond_3ad_unbind_slave(). In this way, we can invalid the port
first and skip this port in bond_3ad_state_machine_handler(). This
eliminates the situation that the slaver has been removed from the
list but the port is still valid.
[1]https://lore.kernel.org/netdev/10374.1611947473@famine/
Signed-off-by: Yufeng Mo <moyufeng@huawei.com>
Acked-by: Jay Vosburgh <jay.vosburgh@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 241d1af4c1 ]
Use nfnetlink_unicast() which already translates EAGAIN to ENOBUFS,
since EAGAIN is reserved to report missing module dependencies to the
nfnetlink core.
e0241ae6ac ("netfilter: use nfnetlink_unicast() forgot to update
this spot.
Reported-by: Yajun Deng <yajun.deng@linux.dev>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 1e6bc5987a ]
Swap reg and reg-names order and drop adi,input-justification
and adi,input-style to fix the following dtbs_check warnings:
arch/arm/boot/dts/stm32mp157a-dhcor-avenger96.dt.yaml: hdmi-transmitter@3d: adi,input-justification: False schema does not allow ['evenly']
arch/arm/boot/dts/stm32mp157a-dhcor-avenger96.dt.yaml: hdmi-transmitter@3d: adi,input-style: False schema does not allow [[1]]
arch/arm/boot/dts/stm32mp157a-dhcor-avenger96.dt.yaml: hdmi-transmitter@3d: reg-names:1: 'edid' was expected
arch/arm/boot/dts/stm32mp157a-dhcor-avenger96.dt.yaml: hdmi-transmitter@3d: reg-names:2: 'cec' was expected
Signed-off-by: Marek Vasut <marex@denx.de>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Patrice Chotard <patrice.chotard@foss.st.com>
Cc: Patrick Delaunay <patrick.delaunay@foss.st.com>
Cc: linux-stm32@st-md-mailman.stormreply.com
To: linux-arm-kernel@lists.infradead.org
Signed-off-by: Alexandre Torgue <alexandre.torgue@foss.st.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f728c4a9e8 ]
In error handling branch "if (WARN_ON(node == NUMA_NO_NODE))", the
previously allocated memories are not released. Doing this before
allocating memory eliminates memory leaks.
tj: Note that the condition only occurs when the arch code is pretty broken
and the WARN_ON might as well be BUG_ON().
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Reviewed-by: Lai Jiangshan <jiangshanlai@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 015f2ebb93 ]
When the system shuts down or warm reboots, the display may be active,
with the hardware accessing system memory. Upon reboot, the DDR will not
be accessible, which may cause issues.
Implement the platform_driver .shutdown() operation and shut down the
display to fix this.
Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
Reviewed-by: Kieran Bingham <kieran.bingham+renesas@ideasonboard.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 043c5bb3c4 ]
When loading in parallel multiple programs which use the same to-be
pinned map, it is possible that two instances of the loader will call
bpf_object__create_maps() at the same time. If the map doesn't exist
when both instances call bpf_object__reuse_map(), then one of the
instances will fail with EEXIST when calling bpf_map__pin().
Fix the race by retrying reusing a map if bpf_map__pin() returns
EEXIST. The fix is similar to the one in iproute2: e4c4685fd6e4 ("bpf:
Fix race condition with map pinning").
Before retrying the pinning, we don't do any special cleaning of an
internal map state. The closer code inspection revealed that it's not
required:
- bpf_object__create_map(): map->inner_map is destroyed after a
successful call, map->fd is closed if pinning fails.
- bpf_object__populate_internal_map(): created map elements is
destroyed upon close(map->fd).
- init_map_slots(): slots are freed after their initialization.
Signed-off-by: Martynas Pumputis <m@lambda.lt>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210726152001.34845-1-m@lambda.lt
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 7d07006f05 ]
The current behavior of 'tracex7' doesn't consist with other bpf samples
tracex{1..6}. Other samples do not require any argument to run with, but
tracex7 should be run with btrfs device argument. (it should be executed
with test_override_return.sh)
Currently, tracex7 doesn't have any description about how to run this
program and raises an unexpected error. And this result might be
confusing since users might not have a hunch about how to run this
program.
// Current behavior
# ./tracex7
sh: 1: Syntax error: word unexpected (expecting ")")
// Fixed behavior
# ./tracex7
ERROR: Run with the btrfs device argument!
In order to fix this error, this commit adds logic to report a message
and exit when running this program with a missing argument.
Additionally in test_override_return.sh, there is a problem with
multiple directory(tmpmnt) creation. So in this commit adds a line with
removing the directory with every execution.
Signed-off-by: Juhee Kang <claudiajkang@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20210727041056.23455-1-claudiajkang@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit af1f2b19fd ]
[why]
For dual eDP when setting the new settings we need to set
command version to DMUB_CMD_PSR_CONTROL_VERSION_1, otherwise
DMUB will not read panel_inst parameter.
[how]
Instead of PSR_VERSION_1 pass DMUB_CMD_PSR_CONTROL_VERSION_1
Reviewed-by: Wood Wyatt <Wyatt.Wood@amd.com>
Acked-by: Solomon Chiu <solomon.chiu@amd.com>
Signed-off-by: Mikita Lipski <mikita.lipski@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 7ccbdcc4d0 ]
The alloc_tty_driver failure is handled gracefully in hvsi_init. But
tty_register_driver is not. panic is called if that one fails.
So handle the failure of tty_register_driver gracefully too. This will
keep at least the console functional as it was enabled earlier by
console_initcall in hvsi_console_init. Instead of shooting down the
whole system.
This means, we disable interrupts and restore hvsi_wait back to
poll_for_state().
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Link: https://lore.kernel.org/r/20210723074317.32690-3-jslaby@suse.cz
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 23411c7200 ]
While alloc_tty_driver failure in rs_init would mean we have much bigger
problem, there is no reason to panic when tty_register_driver fails
there. It can fail for various reasons.
So handle the failure gracefully. Actually handle them both while at it.
This will make at least the console functional as it was enabled earlier
by console_initcall in iss_console_init. Instead of shooting down the
whole system.
We move tty_port_init() after alloc_tty_driver(), so that we don't need
to destroy the port in case the latter function fails.
Cc: Chris Zankel <chris@zankel.net>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: linux-xtensa@linux-xtensa.org
Acked-by: Max Filippov <jcmvbkbc@gmail.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Link: https://lore.kernel.org/r/20210723074317.32690-2-jslaby@suse.cz
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit d7aff291d0 ]
Oxford Semiconductor 950 serial port devices have a 128-byte FIFO and in
the enhanced (650) mode, which we select in `autoconfig_has_efr' with
the ECB bit set in the EFR register, they support the receive interrupt
trigger level selectable with FCR bits 7:6 from the set of 16, 32, 112,
120. This applies to the original OX16C950 discrete UART[1] as well as
950 cores embedded into more complex devices.
For these devices we set the default to 112, which sets an excessively
high level of 112 or 7/8 of the FIFO capacity, unlike with other port
types where we choose at most 1/2 of their respective FIFO capacities.
Additionally we don't make the trigger level configurable. Consequently
frequent input overruns happen with high bit rates where hardware flow
control cannot be used (e.g. terminal applications) even with otherwise
highly-performant systems.
Lower the default receive interrupt trigger level to 32 then, and make
it configurable. Document the trigger levels along with other port
types, including the set of 16, 32, 64, 112 for the transmit interrupt
as well[2].
References:
[1] "OX16C950 rev B High Performance UART with 128 byte FIFOs", Oxford
Semiconductor, Inc., DS-0031, Sep 05, Table 10: "Receiver Trigger
Levels", p. 22
[2] same, Table 9: "Transmit Interrupt Trigger Levels", p. 22
Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk>
Link: https://lore.kernel.org/r/alpine.DEB.2.21.2106260608480.37803@angie.orcam.me.uk
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 3322ba0d7b ]
Kernel support for the newer PCI mio instructions can be toggled off
with the pci=nomio command line option which needs to integrate with
common code PCI option parsing. However this option then toggles static
branches which can't be toggled yet in an early_param() call.
Thus commit 9964f396f1 ("s390: fix setting of mio addressing control")
moved toggling the static branches to the PCI init routine.
With this setup however we can't check for mio support outside the PCI
code during early boot, i.e. before switching the static branches, which
we need to be able to export this as an ELF HWCAP.
Improve on this by turning mio availability into a machine flag that
gets initially set based on CONFIG_PCI and the facility bit and gets
toggled off if pci=nomio is found during PCI option parsing allowing
simple access to this machine flag after early init.
Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 5492886c14 ]
In case of a jump label print the real address of the piece of code
where a mismatch was detected. This is right before the system panics,
so there is nothing revealed.
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 323e0cb473 ]
Fix the following out-of-bounds warnings:
net/core/flow_dissector.c: In function '__skb_flow_dissect':
>> net/core/flow_dissector.c:1104:4: warning: 'memcpy' offset [24, 39] from the object at '<unknown>' is out of the bounds of referenced subobject 'saddr' with type 'struct in6_addr' at offset 8 [-Warray-bounds]
1104 | memcpy(&key_addrs->v6addrs, &iph->saddr,
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1105 | sizeof(key_addrs->v6addrs));
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from include/linux/ipv6.h:5,
from net/core/flow_dissector.c:6:
include/uapi/linux/ipv6.h:133:18: note: subobject 'saddr' declared here
133 | struct in6_addr saddr;
| ^~~~~
>> net/core/flow_dissector.c:1059:4: warning: 'memcpy' offset [16, 19] from the object at '<unknown>' is out of the bounds of referenced subobject 'saddr' with type 'unsigned int' at offset 12 [-Warray-bounds]
1059 | memcpy(&key_addrs->v4addrs, &iph->saddr,
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1060 | sizeof(key_addrs->v4addrs));
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from include/linux/ip.h:17,
from net/core/flow_dissector.c:5:
include/uapi/linux/ip.h:103:9: note: subobject 'saddr' declared here
103 | __be32 saddr;
| ^~~~~
The problem is that the original code is trying to copy data into a
couple of struct members adjacent to each other in a single call to
memcpy(). So, the compiler legitimately complains about it. As these
are just a couple of members, fix this by copying each one of them in
separate calls to memcpy().
This helps with the ongoing efforts to globally enable -Warray-bounds
and get us closer to being able to tighten the FORTIFY_SOURCE routines
on memcpy().
Link: https://github.com/KSPP/linux/issues/109
Reported-by: kernel test robot <lkp@intel.com>
Link: https://lore.kernel.org/lkml/d5ae2e65-1f18-2577-246f-bada7eee6ccd@intel.com/
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 6321c7acb8 ]
Fix the following out-of-bounds warning:
In function 'ip_copy_addrs',
inlined from '__ip_queue_xmit' at net/ipv4/ip_output.c:517:2:
net/ipv4/ip_output.c:449:2: warning: 'memcpy' offset [40, 43] from the object at 'fl' is out of the bounds of referenced subobject 'saddr' with type 'unsigned int' at offset 36 [-Warray-bounds]
449 | memcpy(&iph->saddr, &fl4->saddr,
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
450 | sizeof(fl4->saddr) + sizeof(fl4->daddr));
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The problem is that the original code is trying to copy data into a
couple of struct members adjacent to each other in a single call to
memcpy(). This causes a legitimate compiler warning because memcpy()
overruns the length of &iph->saddr and &fl4->saddr. As these are just
a couple of struct members, fix this by using direct assignments,
instead of memcpy().
This helps with the ongoing efforts to globally enable -Warray-bounds
and get us closer to being able to tighten the FORTIFY_SOURCE routines
on memcpy().
Link: https://github.com/KSPP/linux/issues/109
Reported-by: kernel test robot <lkp@intel.com>
Link: https://lore.kernel.org/lkml/d5ae2e65-1f18-2577-246f-bada7eee6ccd@intel.com/
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 546948bf36 ]
All checks in ipa_table_validate_build() are computed at build time,
so build that unconditionally.
In ipa_table_valid() calls to ipa_table_valid_one() are missing the
IPA pointer parameter is missing in (a bug that shows up only when
IPA_VALIDATE is defined). Don't bother checking whether hashed
table memory regions are valid if hashed tables are not supported.
With those things fixed, have these table validation functions built
unconditionally (not dependent on IPA_VALIDATE).
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f2c1dac0ab ]
Stop supporting different sizes for hashed and non-hashed filter or
route tables. Add BUILD_BUG_ON() calls to verify the sizes of the
fields in the filter/route table initialization immediate command
are the same.
Add a check to ipa_cmd_table_valid() to ensure the size of the
memory region being checked fits within the immediate command field
that must hold it.
Remove two Boolean parameters used only for error reporting. This
actually fixes a bug that would only show up if IPA_VALIDATE were
defined. Define ipa_cmd_table_valid() unconditionally (no longer
dependent on IPA_VALIDATE).
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 2b7e9f25e5 ]
Each test case can have a set of sub-tests, where each sub-test can
run the cBPF/eBPF test snippet with its own data_size and expected
result. Before, the end of the sub-test array was indicated by both
data_size and result being zero. However, most or all of the internal
eBPF tests has a data_size of zero already. When such a test also had
an expected value of zero, the test was never run but reported as
PASS anyway.
Now the test runner always runs the first sub-test, regardless of the
data_size and result values. The sub-test array zero-termination only
applies for any additional sub-tests.
There are other ways fix it of course, but this solution at least
removes the surprise of eBPF tests with a zero result always succeeding.
Signed-off-by: Johan Almbladh <johan.almbladh@anyfinetworks.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210721103822.3755111-1-johan.almbladh@anyfinetworks.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 95f71f12aa ]
The printing message "PSP loading VCN firmware" is mis-leading because
people might think driver is loading VCN firmware. Actually when this
message is printed, driver is just preparing some VCN ucode, not loading
VCN firmware yet. The actual VCN firmware loading will be in the PSP block
hw_init. Fix the printing message
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Christian Konig <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 3addbde269 ]
[Why]
During headless boot, DIG may be on which causes HW/SW discrepancies.
To avoid this we power down hardware on boot if DIG is turned on. With
introduction of multiple eDP, hardware power down is being bypassed
under certain conditions.
[How]
Fixed hardware power down bypass, and ensured hardware will power down
if DIG is on and seamless boot is not enabled.
Reviewed-by: Nicholas Kazlauskas <Nicholas.Kazlauskas@amd.com>
Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Jake Wang <haonan.wang2@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit dd98d2895d ]
The ethtool compat ioctl handling is hidden away in net/socket.c,
which introduces a couple of minor oddities:
- The implementation may end up diverging, as seen in the RXNFC
extension in commit 84a1d9c482 ("net: ethtool: extend RXNFC
API to support RSS spreading of filter matches") that does not work
in compat mode.
- Most architectures do not need the compat handling at all
because u64 and compat_u64 have the same alignment.
- On x86, the conversion is done for both x32 and i386 user space,
but it's actually wrong to do it for x32 and cannot work there.
- On 32-bit Arm, it never worked for compat oabi user space, since
that needs to do the same conversion but does not.
- It would be nice to get rid of both compat_alloc_user_space()
and copy_in_user() throughout the kernel.
None of these actually seems to be a serious problem that real
users are likely to encounter, but fixing all of them actually
leads to code that is both shorter and more readable.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 23e55639b8 ]
[why]
The units of the time_per_pixel variable were incorrect, this had to be
changed for the code to properly function.
[how]
The change was very straightforward, only required one line of code to
be changed where the calculation was done.
Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Oliver Logush <oliver.logush@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 28b6a003bc ]
The virtual machine monitor (QEMU) exposes the pvpanic-pci
device to the guest. On guest side the module exists but
currently isn't loaded automatically. So the driver fails
to be probed and does not its job of handling guest panic
events.
Instead of requiring manual modprobe, let's include a device
database using the MODULE_DEVICE_TABLE macro and let the
module auto-load when the guest gets exposed with such a
pvpanic-pci device.
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Link: https://lore.kernel.org/r/20210629072214.901004-1-eric.auger@redhat.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 8990f96a01 ]
Some versions of the MC firmware wrongly report 0 for register base
address of the DPMCP associated with child DPRC objects thus rendering
them unusable. This is particularly troublesome in ACPI boot scenarios
where the legacy way of extracting this base address from the device
tree does not apply.
Given that DPMCPs share the same base address, workaround this by using
the base address extracted from the root DPRC container.
Signed-off-by: Laurentiu Tudor <laurentiu.tudor@nxp.com>
Link: https://lore.kernel.org/r/20210715140718.8513-8-laurentiu.tudor@nxp.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit df00609821 ]
On Armadillo-800-EVA with CONFIG_DEBUG_SPINLOCK=y:
BUG: spinlock bad magic on CPU#0, swapper/1
lock: lcdc0_device+0x10c/0x308, .magic: 00000000, .owner: <none>/-1, .owner_cpu: 0
CPU: 0 PID: 1 Comm: swapper Not tainted 5.11.0-rc5-armadillo-00036-gbbca04be7a80-dirty #287
Hardware name: Generic R8A7740 (Flattened Device Tree)
[<c010c3c8>] (unwind_backtrace) from [<c010a49c>] (show_stack+0x10/0x14)
[<c010a49c>] (show_stack) from [<c0159534>] (do_raw_spin_lock+0x20/0x94)
[<c0159534>] (do_raw_spin_lock) from [<c040858c>] (dev_pm_get_subsys_data+0x8c/0x11c)
[<c040858c>] (dev_pm_get_subsys_data) from [<c05fbcac>] (genpd_add_device+0x78/0x2b8)
[<c05fbcac>] (genpd_add_device) from [<c0412db4>] (of_genpd_add_device+0x34/0x4c)
[<c0412db4>] (of_genpd_add_device) from [<c0a1ea74>] (board_staging_register_device+0x11c/0x148)
[<c0a1ea74>] (board_staging_register_device) from [<c0a1eac4>] (board_staging_register_devices+0x24/0x28)
of_genpd_add_device() is called before platform_device_register(), as it
needs to attach the genpd before the device is probed. But the spinlock
is only initialized when the device is registered.
Fix this by open-coding the spinlock initialization, cfr.
device_pm_init_common() in the internal drivers/base code, and in the
SuperH early platform code.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://lore.kernel.org/r/57783ece7ddae55f2bda2f59f452180bff744ea0.1626257398.git.geert+renesas@glider.be
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit bcacbf06c8 ]
Currently the composite driver encodes the MaxPower field of
the configuration descriptor by reading the c->MaxPower of the
usb_configuration only if it is non-zero, otherwise it falls back
to using the value hard-coded in CONFIG_USB_GADGET_VBUS_DRAW.
However, there are cases when a configuration must explicitly set
bMaxPower to 0, particularly if its bmAttributes also has the
Self-Powered bit set, which is a valid combination.
This is specifically called out in the USB PD specification section
9.1, in which a PDUSB device "shall report zero in the bMaxPower
field after negotiating a mutually agreeable Contract", and also
verified by the USB Type-C Functional Test TD.4.10.2 Sink Power
Precedence Test.
The fix allows the c->MaxPower to be used for encoding the bMaxPower
even if it is 0, if the self-powered bit is also set. An example
usage of this would be for a ConfigFS gadget to be dynamically
updated by userspace when the Type-C connection is determined to be
operating in Power Delivery mode.
Co-developed-by: Ronak Vijay Raheja <rraheja@codeaurora.org>
Acked-by: Felipe Balbi <balbi@kernel.org>
Signed-off-by: Ronak Vijay Raheja <rraheja@codeaurora.org>
Signed-off-by: Jack Pham <jackp@codeaurora.org>
Link: https://lore.kernel.org/r/20210720080907.30292-1-jackp@codeaurora.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 61136a12cb ]
mv_ehci_enable() did not disable and unprepare clocks in case of
failures of phy_init(). Besides, it did not take into account failures
of ehci_clock_enable() (in effect, failures of clk_prepare_enable()).
The patch fixes both issues and gets rid of redundant wrappers around
clk_prepare_enable() and clk_disable_unprepare() to simplify this a bit.
Found by Linux Driver Verification project (linuxtesting.org).
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Evgeny Novikov <novikov@ispras.ru>
Link: https://lore.kernel.org/r/20210708083056.21543-1-novikov@ispras.ru
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 8ae0123960 ]
f_ncm tx timeout can call us with null skb to flush
a pending frame. In this case skb is NULL to begin
with but ceases to be null after dev->wrap() completes.
In such a case in->maxpacket will be read, even though
we've failed to check that 'in' is not NULL.
Though I've never observed this fail in practice,
however the 'flush operation' simply does not make sense with
a null usb IN endpoint - there's nowhere to flush to...
(note that we're the gadget/device, and IN is from the point
of view of the host, so here IN actually means outbound...)
Cc: Brooke Basile <brookebasile@gmail.com>
Cc: "Bryan O'Donoghue" <bryan.odonoghue@linaro.org>
Cc: Felipe Balbi <balbi@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Lorenzo Colitti <lorenzo@google.com>
Signed-off-by: Maciej Żenczykowski <maze@google.com>
Link: https://lore.kernel.org/r/20210701114834.884597-6-zenczykowski@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 373e2829e7 ]
Ensure that the adapter->q_vector[MAX_Q_VECTORS] array isn't accessed
beyond its size. It was fixed by using a local variable num_q_vectors
as a limit for loop index, and ensure that num_q_vectors is not bigger
than MAX_Q_VECTORS.
Suggested-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Dvora Fuxbrumer <dvorax.fuxbrumer@linux.intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit fed31a4dd3 ]
This commit fixes several typos where CONFIG_TASKS_RCU_TRACE should
instead be CONFIG_TASKS_TRACE_RCU. Among other things, these typos
could cause CONFIG_TASKS_TRACE_RCU_READ_MB=y kernels to suffer from
memory-ordering bugs that could result in false-positive quiescent
states and too-short grace periods.
Signed-off-by: Zhouyi Zhou <zhouzhouyi@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 56f0729a51 ]
drm_file->master pointers should be protected by
drm_device.master_mutex or drm_file.master_lookup_lock when being
dereferenced.
However, in drm_lease.c, there are multiple instances where
drm_file->master is accessed and dereferenced while neither lock is
held. This makes drm_lease.c vulnerable to use-after-free bugs.
We address this issue in 2 ways:
1. Add a new drm_file_get_master() function that calls drm_master_get
on drm_file->master while holding on to
drm_file.master_lookup_lock. Since drm_master_get increments the
reference count of master, this prevents master from being freed until
we unreference it with drm_master_put.
2. In each case where drm_file->master is directly accessed and
eventually dereferenced in drm_lease.c, we wrap the access in a call
to the new drm_file_get_master function, then unreference the master
pointer once we are done using it.
Reported-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Desmond Cheong Zhi Xi <desmondcheongzx@gmail.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210712043508.11584-6-desmondcheongzx@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 0b0860a3cf ]
Currently, drm_file.master pointers should be protected by
drm_device.master_mutex when being dereferenced. This is because
drm_file.master is not invariant for the lifetime of drm_file. If
drm_file is not the creator of master, then drm_file.is_master is
false, and a call to drm_setmaster_ioctl will invoke
drm_new_set_master, which then allocates a new master for drm_file and
puts the old master.
Thus, without holding drm_device.master_mutex, the old value of
drm_file.master could be freed while it is being used by another
concurrent process.
However, it is not always possible to lock drm_device.master_mutex to
dereference drm_file.master. Through the fbdev emulation code, this
might occur in a deep nest of other locks. But drm_device.master_mutex
is also the outermost lock in the nesting hierarchy, so this leads to
potential deadlocks.
To address this, we introduce a new spin lock at the bottom of the
lock hierarchy that only serializes drm_file.master. With this change,
the value of drm_file.master changes only when both
drm_device.master_mutex and drm_file.master_lookup_lock are
held. Hence, any process holding either of those locks can ensure that
the value of drm_file.master will not change concurrently.
Since no lock depends on the new drm_file.master_lookup_lock, when
drm_file.master is dereferenced, but drm_device.master_mutex cannot be
held, we can safely protect the master pointer with
drm_file.master_lookup_lock.
Reported-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Desmond Cheong Zhi Xi <desmondcheongzx@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210712043508.11584-5-desmondcheongzx@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 5eff9585de ]
Inside drm_clients_info, the rcu_read_lock is held to lock
pid_task()->comm. However, within this protected section, a call to
drm_is_current_master is made, which involves a mutex lock in a future
patch. However, this is illegal because the mutex lock might block
while in the RCU read-side critical section.
Since drm_is_current_master isn't protected by rcu_read_lock, we avoid
this by moving it out of the RCU critical section.
The following report came from intel-gfx ci's
igt@debugfs_test@read_all_entries testcase:
=============================
[ BUG: Invalid wait context ]
5.13.0-CI-Patchwork_20515+ #1 Tainted: G W
-----------------------------
debugfs_test/1101 is trying to lock:
ffff888132d901a8 (&dev->master_mutex){+.+.}-{3:3}, at:
drm_is_current_master+0x1e/0x50
other info that might help us debug this:
context-{4:4}
3 locks held by debugfs_test/1101:
#0: ffff88810fdffc90 (&p->lock){+.+.}-{3:3}, at:
seq_read_iter+0x53/0x3b0
#1: ffff888132d90240 (&dev->filelist_mutex){+.+.}-{3:3}, at:
drm_clients_info+0x63/0x2a0
#2: ffffffff82734220 (rcu_read_lock){....}-{1:2}, at:
drm_clients_info+0x1b1/0x2a0
stack backtrace:
CPU: 8 PID: 1101 Comm: debugfs_test Tainted: G W
5.13.0-CI-Patchwork_20515+ #1
Hardware name: Intel Corporation CometLake Client Platform/CometLake S
UDIMM (ERB/CRB), BIOS CMLSFWR1.R00.1263.D00.1906260926 06/26/2019
Call Trace:
dump_stack+0x7f/0xad
__lock_acquire.cold.78+0x2af/0x2ca
lock_acquire+0xd3/0x300
? drm_is_current_master+0x1e/0x50
? __mutex_lock+0x76/0x970
? lockdep_hardirqs_on+0xbf/0x130
__mutex_lock+0xab/0x970
? drm_is_current_master+0x1e/0x50
? drm_is_current_master+0x1e/0x50
? drm_is_current_master+0x1e/0x50
drm_is_current_master+0x1e/0x50
drm_clients_info+0x107/0x2a0
seq_read_iter+0x178/0x3b0
seq_read+0x104/0x150
full_proxy_read+0x4e/0x80
vfs_read+0xa5/0x1b0
ksys_read+0x5a/0xd0
do_syscall_64+0x39/0xb0
entry_SYSCALL_64_after_hwframe+0x44/0xae
Signed-off-by: Desmond Cheong Zhi Xi <desmondcheongzx@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210712043508.11584-3-desmondcheongzx@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 6d14f5c702 ]
In the smk_access_entry() function, if no matching rule is found
in the rust_list, a negative error code will be used to perform bit
operations with the MAY_ enumeration value. This is semantically
wrong. This patch fixes this issue.
Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 0ac2627134 ]
Currently three interconnects are defined for the Qualcomm SC7280
SoC, but this was based on a misunderstanding. There should only be
two interconnects defined: one between the IPA and system memory;
and another between the AP and IPA config space. The bandwidths
defined for the memory and config interconnects do not match what I
understand to be proper values, so update these.
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 12dd4ebda4 ]
SA8155p adp board has two USB A-type receptacles called
USB-portB and USB-portC respectively.
While USB-portB is a USB High-Speed connector/interface, the
USB-portC one is a USB 3.1 Super-Speed connector/interface.
Also the USB-portB is used as the USB emergency
download port (for image download purposes).
Enable both the ports on the board in USB Host mode (since all
the USB interfaces are brought out to USB Type A
connectors).
Cc: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Bhupesh Sharma <bhupesh.sharma@linaro.org>
Link: https://lore.kernel.org/r/20210627114616.717101-4-bhupesh.sharma@linaro.org
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit fef773fc81 ]
Yonghong Song report:
The bpf selftest tc_bpf failed with latest bpf-next.
The following is the command to run and the result:
$ ./test_progs -n 132
[ 40.947571] bpf_testmod: loading out-of-tree module taints kernel.
test_tc_bpf:PASS:test_tc_bpf__open_and_load 0 nsec
test_tc_bpf:PASS:bpf_tc_hook_create(BPF_TC_INGRESS) 0 nsec
test_tc_bpf:PASS:bpf_tc_hook_create invalid hook.attach_point 0 nsec
test_tc_bpf_basic:PASS:bpf_obj_get_info_by_fd 0 nsec
test_tc_bpf_basic:PASS:bpf_tc_attach 0 nsec
test_tc_bpf_basic:PASS:handle set 0 nsec
test_tc_bpf_basic:PASS:priority set 0 nsec
test_tc_bpf_basic:PASS:prog_id set 0 nsec
test_tc_bpf_basic:PASS:bpf_tc_attach replace mode 0 nsec
test_tc_bpf_basic:PASS:bpf_tc_query 0 nsec
test_tc_bpf_basic:PASS:handle set 0 nsec
test_tc_bpf_basic:PASS:priority set 0 nsec
test_tc_bpf_basic:PASS:prog_id set 0 nsec
libbpf: Kernel error message: Failed to send filter delete notification
test_tc_bpf_basic:FAIL:bpf_tc_detach unexpected error: -3 (errno 3)
test_tc_bpf:FAIL:test_tc_internal ingress unexpected error: -3 (errno 3)
The failure seems due to the commit
cfdf0d9ae7 ("rtnetlink: use nlmsg_notify() in rtnetlink_send()")
Deal with ESRCH error in nlmsg_notify() even the report variable is zero.
Reported-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Yajun Deng <yajun.deng@linux.dev>
Link: https://lore.kernel.org/r/20210719051816.11762-1-yajun.deng@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f34bf652d6 ]
[Bug][AST2500]
V1:
When AST2500 acts as stand-alone VGA so that DRAM and DVO initialization
have to be achieved by VGA driver with P2A (PCI to AHB) enabling.
However, HW suggests disable Fast reset mode after DRAM initializaton,
because fast reset mode is mainly designed for ARM ICE debugger.
Once Fast reset is checked as enabling, WDT (Watch Dog Timer) should be
first enabled to avoid system deadlock before disable fast reset mode.
V2:
Use to_pci_dev() to get revision of PCI configuration.
V3:
If SCU00 is not unlocked, just enter its password again.
It is unnecessary to clear AHB lock condition and restore WDT default
setting again, before Fast-reset clearing.
V4:
repatch after "error : could not build fake ancestor" resolved.
V5:
Since CVE_2019_6260 item3, Most of AST2500 have disabled P2A(PCIe to AMBA).
However, for backward compatibility, some patches about P2A, such as items
of v5.2 and v5.3, are considered to be upstreamed with comments.
1. Add define macro to improve source readability.
ast_drv.h, ast_main.c, ast_post.c
2. Add comment about "Fast restet" is enabled for ARM-ICE debugger
ast_post.c
3. Add comment about Reset USB port to patch USB unknown device issue
ast_post.c
Signed-off-by: KuoHsiang Chou <kuohsiang_chou@aspeedtech.com>
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20210709080900.4056-1-kuohsiang_chou@aspeedtech.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 98a6543917 ]
The user can pass in any value to the driver through the 'ioctl'
interface. The driver dost not check, which may cause DoS bugs.
The following log reveals it:
divide error: 0000 [#1] PREEMPT SMP KASAN PTI
RIP: 0010:SetOverlayViewPort+0x133/0x5f0 drivers/video/fbdev/kyro/STG4000OverlayDevice.c:476
Call Trace:
kyro_dev_overlay_viewport_set drivers/video/fbdev/kyro/fbdev.c:378 [inline]
kyrofb_ioctl+0x2eb/0x330 drivers/video/fbdev/kyro/fbdev.c:603
do_fb_ioctl+0x1f3/0x700 drivers/video/fbdev/core/fbmem.c:1171
fb_ioctl+0xeb/0x130 drivers/video/fbdev/core/fbmem.c:1185
vfs_ioctl fs/ioctl.c:48 [inline]
__do_sys_ioctl fs/ioctl.c:753 [inline]
__se_sys_ioctl fs/ioctl.c:739 [inline]
__x64_sys_ioctl+0x19b/0x220 fs/ioctl.c:739
do_syscall_64+0x32/0x80 arch/x86/entry/common.c:46
entry_SYSCALL_64_after_hwframe+0x44/0xae
Signed-off-by: Zheyu Ma <zheyuma97@gmail.com>
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Link: https://patchwork.freedesktop.org/patch/msgid/1626235762-2590-1-git-send-email-zheyuma97@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 0dc6c59892 ]
Since new code doesn't take old clk names in account, it does fixes
error:
msm_dsi 4700000.mdss_dsi: dev_pm_opp_set_clkname: Couldn't find clock: -2
and following kernel oops introduced by
b0530eb119 ("drm/msm/dpu: Use OPP API to set clk/perf state").
Also removes warning about deprecated clock names.
Tested against linux-5.10.y LTS on Nexus 7 2013.
Reviewed-by: Brian Masney <masneyb@onstation.org>
Signed-off-by: David Heidelberg <david@ixit.cz>
Link: https://lore.kernel.org/r/20210707131453.24041-1-david@ixit.cz
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 226d528512 ]
To avoid races between iavf_init_task(), iavf_reset_task(),
iavf_watchdog_task(), iavf_adminq_task() as well as the shutdown and
remove functions more locking is required.
The current protection by __IAVF_IN_CRITICAL_TASK is needed in
additional places.
- The reset task performs state transitions, therefore needs locking.
- The adminq task acts on replies from the PF in
iavf_virtchnl_completion() which may alter the states.
- The init task is not only run during probe but also if a VF gets stuck
to reinitialize it.
- The shutdown function performs a state transition.
- The remove function performs a state transition and also free's
resources.
iavf_lock_timeout() is introduced to avoid waiting infinitely
and cause a deadlock. Rather unlock and print a warning.
Signed-off-by: Stefan Assmann <sassmann@kpanic.de>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 22c8fd71d3 ]
The iavf watchdog task overrides adapter->state to __IAVF_RESETTING
when it detects a pending reset. Then schedules iavf_reset_task() which
takes care of the reset.
The reset task is capable of handling the reset without changing
adapter->state. In fact we lose the state information when the watchdog
task prematurely changes the adapter state. This may lead to a crash if
instead of the reset task the iavf_remove() function gets called before
the reset task.
In that case (if we were in state __IAVF_RUNNING previously) the
iavf_remove() function triggers iavf_close() which fails to close the
device because of the incorrect state information.
This may result in a crash due to pending interrupts.
kernel BUG at drivers/pci/msi.c:357!
[...]
Call Trace:
[<ffffffffbddf24dd>] pci_disable_msix+0x3d/0x50
[<ffffffffc08d2a63>] iavf_reset_interrupt_capability+0x23/0x40 [iavf]
[<ffffffffc08d312a>] iavf_remove+0x10a/0x350 [iavf]
[<ffffffffbddd3359>] pci_device_remove+0x39/0xc0
[<ffffffffbdeb492f>] __device_release_driver+0x7f/0xf0
[<ffffffffbdeb49c3>] device_release_driver+0x23/0x30
[<ffffffffbddcabb4>] pci_stop_bus_device+0x84/0xa0
[<ffffffffbddcacc2>] pci_stop_and_remove_bus_device+0x12/0x20
[<ffffffffbddf361f>] pci_iov_remove_virtfn+0xaf/0x160
[<ffffffffbddf3bcc>] sriov_disable+0x3c/0xf0
[<ffffffffbddf3ca3>] pci_disable_sriov+0x23/0x30
[<ffffffffc0667365>] i40e_free_vfs+0x265/0x2d0 [i40e]
[<ffffffffc0667624>] i40e_pci_sriov_configure+0x144/0x1f0 [i40e]
[<ffffffffbddd5307>] sriov_numvfs_store+0x177/0x1d0
Code: 00 00 e8 3c 25 e3 ff 49 c7 86 88 08 00 00 00 00 00 00 5b 41 5c 41 5d 41 5e 41 5f 5d c3 48 8b 7b 28 e8 0d 44
RIP [<ffffffffbbbf1068>] free_msi_irqs+0x188/0x190
The solution is to not touch the adapter->state in iavf_watchdog_task()
and let the reset task handle the state transition.
Signed-off-by: Stefan Assmann <sassmann@kpanic.de>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 97683c851f ]
The naming of the regulator is problematic. VCC is usually a supply
voltage whereas these devices have a separate VREF pin.
Secondly, the regulator core might have provided a stub regulator if
a real regulator wasn't provided. That would in turn have failed to
provide a voltage when queried. So reality was that there was no way
to use the internal reference.
In order to avoid breaking any dts out in the wild, make sure to fallback
to the original vcc naming if vref is not available.
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reported-by: kernel test robot <lkp@intel.com>
Acked-by: Nuno Sá <nuno.sa@analog.com>
Link: https://lore.kernel.org/r/20210627163244.1090296-9-jic23@kernel.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f4919ff59c ]
Currently, when userspace reads a datagram with a buffer that is
smaller than this datagram, the data will be truncated and only
part of it can be received by users. It doesn't seem right that
users don't know the datagram size and have to use a huge buffer
to read it to avoid the truncation.
This patch to fix it by keeping the skb in rcv queue until the
whole data is read by users. Only the last msg of the datagram
will be marked with MSG_EOR, just as TCP/SCTP does.
Note that this will work as above only when MSG_EOR is set in the
flags parameter of recvmsg(), so that it won't break any old user
applications.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Jon Maloy <jmaloy@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 14858dcc3b ]
Updating the current_state field of struct pci_dev the way it is done
in pci_enable_device_flags() before calling do_pci_enable_device() may
not work. For example, if the given PCI device depends on an ACPI
power resource whose _STA method initially returns 0 ("off"), but the
config space of the PCI device is accessible and the power state
retrieved from the PCI_PM_CTRL register is D0, the current_state
field in the struct pci_dev representing that device will get out of
sync with the power.state of its ACPI companion object and that will
lead to power management issues going forward.
To avoid such issues, make pci_enable_device_flags() call
pci_update_current_state() which takes ACPI device power management
into account, if present, to retrieve the current power state of the
device.
Link: https://lore.kernel.org/lkml/20210314000439.3138941-1-luzmaximilian@gmail.com/
Reported-by: Maximilian Luz <luzmaximilian@gmail.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Tested-by: Maximilian Luz <luzmaximilian@gmail.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit c445535c3e ]
Marking TSC as unstable has a side effect of marking sched_clock as
unstable when TSC is still being used as the sched_clock. This is not
desirable. Hyper-V ultimately uses a paravirtualized clock source that
provides a stable scheduler clock even on systems without TscInvariant
CPU capability. Hence, mark_tsc_unstable() call should be called _after_
scheduler clock has been changed to the paravirtualized clocksource. This
will prevent any unwanted manipulation of the sched_clock. Only TSC will
be correctly marked as unstable.
Signed-off-by: Ani Sinha <ani@anisinha.ca>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Tested-by: Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/20210713030522.1714803-1-ani@anisinha.ca
Signed-off-by: Wei Liu <wei.liu@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b43e2ec03b ]
Replace vkms' prepare_fb and cleanup_fb functions with the generic
code for shadow-buffered planes. No functional changes.
This change also fixes a problem where IGT kms_flip tests would
create a segmentation fault within vkms. The driver's prepare_fb
function did not report an error if a BO's vmap operation failed.
The kernel later tried to operate on the non-mapped memory areas.
The shared shadow-plane helpers handle errors correctly, so that
the driver now avoids the segmantation fault.
v2:
* include paragraph about IGT tests in commit message (Melissa)
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Melissa Wen <melissa.srw@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210705074633.9425-4-tzimmermann@suse.de
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 97eb31384a ]
When loading a BPF program with a pinned map, the loader checks whether
the pinned map can be reused, i.e. their properties match. To derive
such of the pinned map, the loader invokes BPF_OBJ_GET_INFO_BY_FD and
then does the comparison.
Unfortunately, on < 4.12 kernels the BPF_OBJ_GET_INFO_BY_FD is not
available, so loading the program fails with the following error:
libbpf: failed to get map info for map FD 5: Invalid argument
libbpf: couldn't reuse pinned map at
'/sys/fs/bpf/tc/globals/cilium_call_policy': parameter
mismatch"
libbpf: map 'cilium_call_policy': error reusing pinned map
libbpf: map 'cilium_call_policy': failed to create:
Invalid argument(-22)
libbpf: failed to load object 'bpf_overlay.o'
To fix this, fallback to derivation of the map properties via
/proc/$PID/fdinfo/$MAP_FD if BPF_OBJ_GET_INFO_BY_FD fails with EINVAL,
which can be used as an indicator that the kernel doesn't support
the latter.
Signed-off-by: Martynas Pumputis <m@lambda.lt>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20210712125552.58705-1-m@lambda.lt
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 892c37f8a3 ]
When starting streaming the driver currently programs the buffer
address to the CAL base-address register and assigns the buffer pointer
to ctx->dma.pending. This is not correct, as the buffer is not
"pending", but active, and causes the first buffer to be needlessly
written twice.
Fix this by assigning the buffer pointer to ctx->dma.active.
Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com>
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 8db11aebdb ]
The logic at dib8000_get_init_prbs() has a few issues:
1. the tables used there has an extra unused value at the beginning;
2. the dprintk() message doesn't write the right value when
transmission mode is not 8K;
3. the array overflow validation is done by the callers.
Rewrite the code to fix such issues.
This should also shut up those smatch warnings:
drivers/media/dvb-frontends/dib8000.c:2125 dib8000_get_init_prbs() error: buffer overflow 'lut_prbs_8k' 14 <= 14
drivers/media/dvb-frontends/dib8000.c:2129 dib8000_get_init_prbs() error: buffer overflow 'lut_prbs_2k' 14 <= 14
drivers/media/dvb-frontends/dib8000.c:2131 dib8000_get_init_prbs() error: buffer overflow 'lut_prbs_4k' 14 <= 14
drivers/media/dvb-frontends/dib8000.c:2134 dib8000_get_init_prbs() error: buffer overflow 'lut_prbs_8k' 14 <= 14
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit bbdd3f4dbe ]
The DIT mode support has not been tested due to lack of platform where it
can be tested.
To be able to use the McASP on OMAP4/5 (only supporting DIT mode) we need
to have DIT mode working in the McASP driver on a know platform.
After hacking around (on BBW, mcasp1.axr1 can be routed out for this) it
appeared that DIT mode is broken.
This patch fixes it up and 16/24 bit audio works along with passthrough,
but I have only tested with DTS example and test files.
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@gmail.com>
Link: https://lore.kernel.org/r/20210705194249.2385-2-peter.ujfalusi@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 0b066a6809 ]
Adjust the DVP enable/disable sequence to avoid a pixel getting stuck
in an internal, non resettable FIFO within PixelValve when changing
HDMI resolution.
The blank pixels features of the DVP can prevent signals back to
pixelvalve causing it to not clear the FIFO. Adjust the ordering
and timing of operations to ensure the clear signal makes it through to
pixelvalve.
Signed-off-by: Tim Gover <tim.gover@raspberrypi.com>
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
Link: https://patchwork.freedesktop.org/patch/msgid/20210628130533.144617-1-maxime@cerno.tech
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 1d9d2ca85b ]
Debugfs RAS EEPROM files are available when
the ASIC supports RAS, and when the debugfs is
enabled, an also when "ras_enable" module
parameter is set to 0. However in this case,
we get a kernel oops when accessing some of
the "ras_..." controls in debugfs. The reason
for this is that struct amdgpu_ras::adev is
unset. This commit sets it, thus enabling access
to those facilities. Note that this facilitates
EEPROM access and not necessarily RAS features or
functionality.
Cc: Alexander Deucher <Alexander.Deucher@amd.com>
Cc: John Clements <john.clements@amd.com>
Cc: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Acked-by: Alexander Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 1698ecb218 ]
Symptom is random switching of speakers when using multichannel.
Repeatedly running speakertest -c8 occasionally starts with
channels jumbled. This is fixed with HD_CTL_WHOLSMP.
The other bit looks beneficial and apears harmless in testing so
I'd suggest adding it too.
Documentation says: HD_CTL_WHILSMP_SET
Wait for whole sample. When this bit is set MAI transmit will start
only when there is at least one whole sample available in the fifo.
Documentation says: HD_CTL_CHALIGN_SET
Channel Align When Overflow. This bit is used to realign the audio
channels in case of an overflow.
If this bit is set, after the detection of an overflow, equal
amount of dummy words to the missing words will be written to fifo,
filling up the broken sample and maintaining alignment.
Signed-off-by: Dom Cobley <popcornmix@gmail.com>
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
Reviewed-by: Nicolas Saenz Julienne <nsaenz@kernel.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20210525132354.297468-7-maxime@cerno.tech
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 22e5fe2a2a ]
userfaultfd assumes that the enabled features are set once and never
changed after UFFDIO_API ioctl succeeded.
However, currently, UFFDIO_API can be called concurrently from two
different threads, succeed on both threads and leave userfaultfd's
features in non-deterministic state. Theoretically, other uffd operations
(ioctl's and page-faults) can be dispatched while adversely affected by
such changes of features.
Moreover, the writes to ctx->state and ctx->features are not ordered,
which can - theoretically, again - let userfaultfd_ioctl() think that
userfaultfd API completed, while the features are still not initialized.
To avoid races, it is arguably best to get rid of ctx->state. Since there
are only 2 states, record the API initialization in ctx->features as the
uppermost bit and remove ctx->state.
Link: https://lkml.kernel.org/r/20210808020724.1022515-3-namit@vmware.com
Fixes: 9cd75c3cd4 ("userfaultfd: non-cooperative: add ability to report non-PF events from uffd descriptor")
Signed-off-by: Nadav Amit <namit@vmware.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: Peter Xu <peterx@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 52d83df682 ]
When CONFIG_TRIM_UNUSED_KSYMS is enabled, I see some warnings like this:
nm: arch/x86/entry/vdso/vdso32/note.o: no symbols
$NM (both GNU nm and llvm-nm) warns when no symbol is found in the
object. Suppress the stderr.
Fangrui Song mentioned binutils>=2.37 `nm -q` can be used to suppress
"no symbols" [1], and llvm-nm>=13.0.0 supports -q as well.
We cannot use it for now, but note it as a TODO.
[1]: https://sourceware.org/bugzilla/show_bug.cgi?id=27408
Fixes: bbda5ec671 ("kbuild: simplify dependency generation for CONFIG_TRIM_UNUSED_KSYMS")
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit bea6a94a27 ]
Starting with following patch MIPS Malta is not able to boot:
| commit 79edff1206
| Author: Rob Herring <robh@kernel.org>
| scripts/dtc: Update to upstream version v1.6.0-51-g183df9e9c2b9
The reason is the alignment test added to the fdt_ro_probe_(). To fix
this issue, we need to make sure that fdt_buf is aligned.
Since the dtc patch was designed to uncover potential issue, I handle
initial MIPS Malta patch as initial bug.
Fixes: e81a8c7dab ("MIPS: Malta: Setup RAM regions via DT")
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 9605f75cf3 ]
The prepare_compress_overwrite() gets/locks a page to prepare a read, and calls
f2fs_read_multi_pages() which checks EOF first. If there's any page beyond EOF,
we unlock the page and set cc->rpages[i] = NULL, which we can't put the page
anymore. This makes page leak, so let's fix by putting that page.
Fixes: a949dc5f2c ("f2fs: compress: fix race condition of overwrite vs truncate")
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 827f02842e ]
In f2fs_write_multi_pages(), f2fs_compress_pages() allocates pages for
compression work in cc->cpages[]. Then, f2fs_write_compressed_pages() initiates
bio submission. But, if there's any error before submitting the IOs like early
f2fs_cp_error(), previously it didn't free cpages by f2fs_compress_free_page().
Let's fix memory leak by putting that just before deallocating cc->cpages.
Fixes: 4c8ff7095b ("f2fs: support data compression")
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit c8dc3047c4 ]
We need to unmap pages from userspace process before removing pagecache
in punch_hole() like we did in f2fs_setattr().
Similar change:
commit 5e44f8c374 ("ext4: hole-punch use truncate_pagecache_range")
Fixes: fbfa2cc58d ("f2fs: add file operations")
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit adf9ea89c7 ]
In below path, it will return ENOENT if filesystem is shutdown:
- f2fs_map_blocks
- f2fs_get_dnode_of_data
- f2fs_get_node_page
- __get_node_page
- read_node_page
- is_sbi_flag_set(sbi, SBI_IS_SHUTDOWN)
return -ENOENT
- force return value from ENOENT to 0
It should be fine for read case, since it indicates a hole condition,
and caller could use .m_next_pgofs to skip the hole and continue the
lookup.
However it may cause confusing for write case, since leaving a hole
there, and said nothing was wrong doesn't help.
There is at least one case from dax_iomap_actor() will complain that,
so fix this in prior to supporting dax in f2fs.
xfstest generic/388 reports below warning:
ubuntu godown: xfstests-induced forced shutdown of /mnt/scratch_f2fs:
------------[ cut here ]------------
WARNING: CPU: 0 PID: 485833 at fs/dax.c:1127 dax_iomap_actor+0x339/0x370
Call Trace:
iomap_apply+0x1c4/0x7b0
? dax_iomap_rw+0x1c0/0x1c0
dax_iomap_rw+0xad/0x1c0
? dax_iomap_rw+0x1c0/0x1c0
f2fs_file_write_iter+0x5ab/0x970 [f2fs]
do_iter_readv_writev+0x273/0x2e0
do_iter_write+0xab/0x1f0
vfs_iter_write+0x21/0x40
iter_file_splice_write+0x287/0x540
do_splice+0x37c/0xa60
__x64_sys_splice+0x15f/0x3a0
do_syscall_64+0x3b/0x90
entry_SYSCALL_64_after_hwframe+0x44/0xae
ubuntu godown: xfstests-induced forced shutdown of /mnt/scratch_f2fs:
------------[ cut here ]------------
RIP: 0010:dax_iomap_pte_fault.isra.0+0x72e/0x14a0
Call Trace:
dax_iomap_fault+0x44/0x70
f2fs_dax_huge_fault+0x155/0x400 [f2fs]
f2fs_dax_fault+0x18/0x30 [f2fs]
__do_fault+0x4e/0x120
do_fault+0x3cf/0x7a0
__handle_mm_fault+0xa8c/0xf20
? find_held_lock+0x39/0xd0
handle_mm_fault+0x1b6/0x480
do_user_addr_fault+0x320/0xcd0
? rcu_read_lock_sched_held+0x67/0xc0
exc_page_fault+0x77/0x3f0
? asm_exc_page_fault+0x8/0x30
asm_exc_page_fault+0x1e/0x30
Fixes: 83a3bfdb5a ("f2fs: indicate shutdown f2fs to allow unmount successfully")
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit ad126ebdde ]
There is a missing place we forgot to account .skipped_gc_rwsem, fix it.
Fixes: 6f8d445506 ("f2fs: avoid fi->i_gc_rwsem[WRITE] lock in f2fs_gc")
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 1d78dfde33 ]
Since commit e1a1ef84cd ("KVM: PPC: Book3S: Allocate guest TCEs on
demand too"), pages for TCE tables for KVM guests are allocated only
when needed. This allows skipping any update when clearing TCEs. This
works mostly fine as TCE updates are handled when the MMU is enabled.
The realmode handlers fail with H_TOO_HARD when pages are not yet
allocated, except when clearing a TCE in which case KVM prints a warning
and proceeds to dereference a NULL pointer, which crashes the host OS.
This has not been caught so far as the change in commit e1a1ef84cd is
reasonably new, and POWER9 runs mostly radix which does not use realmode
handlers. With hash, the default TCE table is memset() by QEMU when the
machine is reset which triggers page faults and the KVM TCE device's
kvm_spapr_tce_fault() handles those with MMU on. And the huge DMA
windows are not cleared by VMs which instead successfully create a DMA
window big enough to map the VM memory 1:1 and then VMs just map
everything without clearing.
This started crashing now as commit 381ceda88c ("powerpc/pseries/iommu:
Make use of DDW for indirect mapping") added a mode when a dymanic DMA
window not big enough to map the VM memory 1:1 but it is used anyway,
and the VM now is the first (i.e. not QEMU) to clear a just created
table. Note that upstream QEMU needs to be modified to trigger the VM to
trigger the host OS crash.
This replaces WARN_ON_ONCE_RM() with a check and return, and adds
another warning if TCE is not being cleared.
Fixes: e1a1ef84cd ("KVM: PPC: Book3S: Allocate guest TCEs on demand too")
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210827040706.517652-1-aik@ozlabs.ru
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit c16edf5ff8 ]
'clk_init_data' for gates is setting up 'CLK_IS_CRITICAL'
flag for all of them. This was being doing because some
drivers of this SoC might not be ready to use the clock
and we don't wanted the kernel to disable them since default
behaviour without clock driver was to set all gate bits to
enabled state. After a bit more testing and checking driver
code it is safe to remove this flag and just let the kernel
to disable those gates that are not in use. No regressions
seems to appear.
Fixes: 48df7a26f4 ("clk: ralink: add clock driver for mt7621 SoC")
Signed-off-by: Sergio Paracuellos <sergio.paracuellos@gmail.com>
Link: https://lore.kernel.org/r/20210727055537.11785-1-sergio.paracuellos@gmail.com
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 35b72573e9 ]
The current hash algorithm used for hashing cookie keys is really bad,
producing almost no dispersion (after a test kernel build, ~30000 files
were split over just 18 out of the 32768 hash buckets).
Borrow the full_name_hash() hash function into fscache to do the hashing
for cookie keys and, in the future, volume keys.
I don't want to use full_name_hash() as-is because I want the hash value to
be consistent across arches and over time as the hash value produced may
get used on disk.
I can also optimise parts of it away as the key will always be a padded
array of aligned 32-bit words.
Fixes: ec0328e46d ("fscache: Maintain a catalogue of allocated cookies")
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
cc: linux-cachefs@redhat.com
Link: https://lore.kernel.org/r/162431201844.2908479.8293647220901514696.stgit@warthog.procyon.org.uk/
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b8b9280303 ]
lscpu() uses core_siblings to list the number of sockets in the
system. core_siblings is set using topology_core_cpumask.
While optimizing the powerpc bootup path, Commit 4ca234a9cb
("powerpc/smp: Stop updating cpu_core_mask"). it was found that
updating cpu_core_mask() ended up taking a lot of time. It was thought
that on Powerpc, cpu_core_mask() would always be same as
cpu_cpu_mask() i.e number of sockets will always be equal to number of
nodes. As an optimization, cpu_core_mask() was made a snapshot of
cpu_cpu_mask().
However that was found to be false with PowerPc KVM guests, where each
node could have more than one socket. So with Commit c47f892d7a
("powerpc/smp: Reintroduce cpu_core_mask"), cpu_core_mask was updated
based on chip_id but in an optimized way using some mask manipulations
and chip_id caching.
However on non-PowerNV and non-pseries KVM guests (i.e not
implementing cpu_to_chip_id(), continued to use a copy of
cpu_cpu_mask().
There are two issues that were noticed on such systems
1. lscpu would report one extra socket.
On a IBM,9009-42A (aka zz system) which has only 2 chips/ sockets/
nodes, lscpu would report
Architecture: ppc64le
Byte Order: Little Endian
CPU(s): 160
On-line CPU(s) list: 0-159
Thread(s) per core: 8
Core(s) per socket: 6
Socket(s): 3 <--------------
NUMA node(s): 2
Model: 2.2 (pvr 004e 0202)
Model name: POWER9 (architected), altivec supported
Hypervisor vendor: pHyp
Virtualization type: para
L1d cache: 32K
L1i cache: 32K
L2 cache: 512K
L3 cache: 10240K
NUMA node0 CPU(s): 0-79
NUMA node1 CPU(s): 80-159
2. Currently cpu_cpu_mask is updated when a core is
added/removed. However its not updated when smt mode switching or on
CPUs are explicitly offlined. However all other percpu masks are
updated to ensure only active/online CPUs are in the masks.
This results in build_sched_domain traces since there will be CPUs in
cpu_cpu_mask() but those CPUs are not present in SMT / CACHE / MC /
NUMA domains. A loop of threads running smt mode switching and core
add/remove will soon show this trace.
Hence cpu_cpu_mask has to be update at smt mode switch.
This will have impact on cpu_core_mask(). cpu_core_mask() is a
snapshot of cpu_cpu_mask. Different CPUs within the same socket will
end up having different cpu_core_masks since they are snapshots at
different points of time. This means when lscpu will start reporting
many more sockets than the actual number of sockets/ nodes / chips.
Different ways to handle this problem:
A. Update the snapshot aka cpu_core_mask for all CPUs whenever
cpu_cpu_mask is updated. This would a non-optimal solution.
B. Instead of a cpumask_var_t, make cpu_core_map a cpumask pointer
pointing to cpu_cpu_mask. However percpu cpumask pointer is frowned
upon and we need a clean way to handle PowerPc KVM guest which is
not a snapshot.
C. Update cpu_core_masks all PowerPc systems like in PowerPc KVM
guests using mask manipulations. This approach is relatively simple
and unifies with the existing code.
D. On top of 3, we could also resurrect get_physical_package_id which
could return a nid for the said CPU. However this is not needed at this
time.
Option C is the preferred approach for now.
While this is somewhat a revert of Commit 4ca234a9cb ("powerpc/smp:
Stop updating cpu_core_mask").
1. Plain revert has some conflicts
2. For chip_id == -1, the cpu_core_mask is made identical to
cpu_cpu_mask, unlike previously where cpu_core_mask was set to a core
if chip_id doesn't exist.
This goes by the principle that if chip_id is not exposed, then
sockets / chip / node share the same set of CPUs.
With the fix, lscpu o/p would be
Architecture: ppc64le
Byte Order: Little Endian
CPU(s): 160
On-line CPU(s) list: 0-159
Thread(s) per core: 8
Core(s) per socket: 6
Socket(s): 2 <--------------
NUMA node(s): 2
Model: 2.2 (pvr 004e 0202)
Model name: POWER9 (architected), altivec supported
Hypervisor vendor: pHyp
Virtualization type: para
L1d cache: 32K
L1i cache: 32K
L2 cache: 512K
L3 cache: 10240K
NUMA node0 CPU(s): 0-79
NUMA node1 CPU(s): 80-159
Fixes: 4ca234a9cb ("powerpc/smp: Stop updating cpu_core_mask")
Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210826100401.412519-3-srikar@linux.vnet.ibm.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 8efd249bab ]
Aneesh reported a crash with a fairly recent upstream kernel when
booting kernel whose commandline was appended with nr_cpus=2
1:mon> e
cpu 0x1: Vector: 300 (Data Access) at [c000000008a67bd0]
pc: c00000000002557c: cpu_to_chip_id+0x3c/0x100
lr: c000000000058380: start_secondary+0x460/0xb00
sp: c000000008a67e70
msr: 8000000000001033
dar: 10
dsisr: 80000
current = 0xc00000000891bb00
paca = 0xc0000018ff981f80 irqmask: 0x03 irq_happened: 0x01
pid = 0, comm = swapper/1
Linux version 5.13.0-rc3-15704-ga050a6d2b7e8 (kvaneesh@ltc-boston8) (gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0, GNU ld (GNU Binutils for Ubuntu) 2.34) #433 SMP Tue May 25 02:38:49 CDT 2021
1:mon> t
[link register ] c000000000058380 start_secondary+0x460/0xb00
[c000000008a67e70] c000000008a67eb0 (unreliable)
[c000000008a67eb0] c0000000000589d4 start_secondary+0xab4/0xb00
[c000000008a67f90] c00000000000c654 start_secondary_prolog+0x10/0x14
Current code assumes that num_possible_cpus() is always greater than
threads_per_core. However this may not be true when using nr_cpus=2 or
similar options. Handle the case where num_possible_cpus() is not an
exact multiple of threads_per_core.
Fixes: c1e53367da ("powerpc/smp: Cache CPU to chip lookup")
Reported-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Debugged-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210826100401.412519-2-srikar@linux.vnet.ibm.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit eb653eda1e ]
dip_idx and dgid should be a one-to-one mapping relationship, but when
qp_num loops back to the start number, it may happen that two different
dgid are assiociated to the same dip_idx incorrectly.
One solution is to store the qp_num that is not assigned to dip_idx in an
array. When a dip_idx needs to be allocated to a new dgid, an spare qp_num
is extracted and assigned to dip_idx.
Fixes: f91696f2f0 ("RDMA/hns: Support congestion control type selection according to the FW")
Link: https://lore.kernel.org/r/1629884592-23424-4-git-send-email-liangwenpeng@huawei.com
Signed-off-by: Junxian Huang <huangjunxian4@hisilicon.com>
Signed-off-by: Yangyang Li <liyangyang20@huawei.com>
Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 3c69a5f222 ]
Incase of random sampling, there can be scenarios where
Sample Instruction Address Register(SIAR) may not latch
to the sampled instruction and could result in
the value of 0. In these scenarios it is preferred to
return regs->nip. These corner cases are seen in the
previous generation (p9) also.
Patch adds the check for SIAR value along with regs_use_siar
and siar_valid checks so that the function will return
regs->nip incase SIAR is zero.
Patch drops the code under PPMU_P10_DD1 flag check
which handles SIAR 0 case only for Power10 DD1.
Fixes: 2ca13a4cc5 ("powerpc/perf: Use regs->nip when SIAR is zero")
Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210818171556.36912-3-kjain@linux.ibm.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 1782663897 ]
After the L1 saves its PMU SPRs but before loading the L2's PMU SPRs,
switch the pmcregs_in_use field in the L1 lppaca to the value advertised
by the L2 in its VPA. On the way out of the L2, set it back after saving
the L2 PMU registers (if they were in-use).
This transfers the PMU liveness indication between the L1 and L2 at the
points where the registers are not live.
This fixes the nested HV bug for which a workaround was added to the L0
HV by commit 63279eeb7f ("KVM: PPC: Book3S HV: Always save guest pmu
for guest capable of nesting"), which explains the problem in detail.
That workaround is no longer required for guests that include this bug
fix.
Fixes: 360cae3137 ("KVM: PPC: Book3S HV: Nested guest entry via hypercall")
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com>
Link: https://lore.kernel.org/r/20210811160134.904987-10-npiggin@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 313bf281f2 ]
clk_get_rate() returns unsigned long and currently this driver stores the
return value in u32 type, resulting the below warning:
Fixed smatch warnings:
drivers/scsi/ufs/ufs-exynos.c:286 exynos_ufs_get_clk_info()
warn: wrong type for 'ufs->mclk_rate' (should be 'ulong')
drivers/scsi/ufs/ufs-exynos.c:287 exynos_ufs_get_clk_info()
warn: wrong type for 'pclk_rate' (should be 'ulong')
Link: https://lore.kernel.org/r/20210819171131.55912-1-alim.akhtar@samsung.com
Fixes: 55f4b1f736 ("scsi: ufs: ufs-exynos: Add UFS host support for Exynos SoCs")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Alim Akhtar <alim.akhtar@samsung.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 5d7d6dac8f ]
The __kvmhv_copy_tofrom_guest_radix function was introduced along with
nested HV guest support. It uses the platform's Radix MMU quadrants to
provide a nested hypervisor with fast access to its nested guests
memory (H_COPY_TOFROM_GUEST hypercall). It has also since been added
as a fast path for the kvmppc_ld/st routines which are used during
instruction emulation.
The commit def0bfdbd6 ("powerpc: use probe_user_read() and
probe_user_write()") changed the low level copy function from
raw_copy_from_user to probe_user_read, which adds a check to
access_ok. In powerpc that is:
static inline bool __access_ok(unsigned long addr, unsigned long size)
{
return addr < TASK_SIZE_MAX && size <= TASK_SIZE_MAX - addr;
}
and TASK_SIZE_MAX is 0x0010000000000000UL for 64-bit, which means that
setting the two MSBs of the effective address (which correspond to the
quadrant) now cause access_ok to reject the access.
This was not caught earlier because the most common code path via
kvmppc_ld/st contains a fallback (kvm_read_guest) that is likely to
succeed for L1 guests. For nested guests there is no fallback.
Another issue is that probe_user_read (now __copy_from_user_nofault)
does not return the number of bytes not copied in case of failure, so
the destination memory is not being cleared anymore in
kvmhv_copy_from_guest_radix:
ret = kvmhv_copy_tofrom_guest_radix(vcpu, eaddr, to, NULL, n);
if (ret > 0) <-- always false!
memset(to + (n - ret), 0, ret);
This patch fixes both issues by skipping access_ok and open-coding the
low level __copy_to/from_user_inatomic.
Fixes: def0bfdbd6 ("powerpc: use probe_user_read() and probe_user_write()")
Signed-off-by: Fabiano Rosas <farosas@linux.ibm.com>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210805212616.2641017-2-farosas@linux.ibm.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit d36207b848 ]
On the i.MX8M*, the TF-A exposes a SiP (Silicon Provider) service
for DDR frequency scaling. The imx8m-ddrc-devfreq driver calls the
SiP and then does clk_set_parent on the DDR muxes to synchronize
the clock tree.
Since 936c383673 ("clk: imx: fix composite peripheral flags"),
these TF-A managed muxes have SET_PARENT_GATE set, which results
in imx8m-ddrc-devfreq's clk_set_parent after SiP failing with -EBUSY:
echo 25000000 > userspace/set_freq
imx8m-ddrc-devfreq 3d400000.memory-controller: failed to set
dram_apb parent: -16
Fix this by adding a new i.MX composite flag for firmware managed
clocks, which clears SET_PARENT_GATE.
This is safe to do, because updating the Linux clock tree to reflect
reality will always be glitch-free.
Fixes: 936c383673 ("clk: imx: fix composite peripheral flags")
Signed-off-by: Ahmad Fatoum <a.fatoum@pengutronix.de>
Reviewed-by: Abel Vesa <abel.vesa@nxp.com>
Link: https://lore.kernel.org/r/20210810151432.9228-1-a.fatoum@pengutronix.de
Signed-off-by: Abel Vesa <abel.vesa@nxp.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 462ba66198 ]
Commit c49c336378 ("HID: support for initialization of some Thrustmaster
wheels") messed up the Makefile and quirks during the refactoring of this
commit.
Luckily, ./scripts/checkkconfigsymbols.py warns on non-existing configs:
HID_TMINIT
Referencing files: drivers/hid/Makefile, drivers/hid/hid-quirks.c
Following the discussion (see Link), CONFIG_HID_THRUSTMASTER is the
intended config for CONFIG_HID_TMINIT and the file hid-tminit.c was
actually added as hid-thrustmaster.c.
So, clean up Makefile and adapt quirks to that refactoring.
Fixes: c49c336378 ("HID: support for initialization of some Thrustmaster wheels")
Link: https://lore.kernel.org/linux-input/CAKXUXMx6dByO03f3dX0X5zjvQp0j2AhJBg0vQFDmhZUhtKxRxw@mail.gmail.com/
Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 786537063b ]
A quirk was recently added for Elan devices that has same device match
as an entry earlier in the list. The i2c_hid_lookup_quirk function will
always return the last match in the list, so the new entry shadows the
old entry. The quirk in the previous entry, I2C_HID_QUIRK_BOGUS_IRQ,
silenced a flood of messages which have reappeared in the 5.13 kernel.
This change moves the two quirk flags into the same entry.
Fixes: ca66a6770b (HID: i2c-hid: Skip ELAN power-on command after reset)
Signed-off-by: Jim Broadus <jbroadus@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 3978f54817 ]
Existing amd-sfh driver is programming the MP2 firmware period field in
units of jiffies, but the MP2 firmware expects in milliseconds unit.
Changing it to milliseconds.
Fixes: 4b2c53d93a ("SFH:Transport Driver to add support of AMD Sensor Fusion Hub (SFH)")
Reviewed-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com>
Signed-off-by: Basavaraj Natikar <Basavaraj.Natikar@amd.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b96d9b3b09 ]
The value of FAULT_* macros and its description in f2fs.rst became
inconsistent, fix this to keep compatibility of fault injection
interface.
Fixes: 67883ade7a ("f2fs: remove FAULT_ALLOC_BIO")
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit d0e28a6145 ]
CONFIG_MTD_PHYSMAP_OF is not longer enabled as it depends on
MTD_PHYSMAP which is not enabled.
This is a regression from commit 642b1e8dbe ("mtd: maps: Merge
physmap_of.c into physmap-core.c"), which added the extra dependency.
Add CONFIG_MTD_PHYSMAP=y so this stays in the config, as Christophe said
it is useful for build coverage.
Fixes: 642b1e8dbe ("mtd: maps: Merge physmap_of.c into physmap-core.c")
Signed-off-by: Joel Stanley <joel@jms.id.au>
Acked-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210817045407.2445664-3-joel@jms.id.au
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit ccc89737aa ]
This driver has some left over "return 1" on failure style code mixed with
"return negative error codes" style code. The caller doesn't care so we
should just convert everything to return negative error codes.
Then there was a problem that there were two variables used to store error
codes which just resulted in confusion. If qedf_alloc_bdq() returned a
negative error code, we accidentally returned success instead of
propagating the error code. So get rid of the "rc" variable and use
"status" every where.
Also remove the "status = 0" initialization so that these sorts of bugs
will be detected by the compiler in the future.
Link: https://lore.kernel.org/r/20210810085023.GA23998@kili
Fixes: 61d8658b4a ("scsi: qedf: Add QLogic FastLinQ offload FCoE driver framework.")
Acked-by: Manish Rangankar <mrangankar@marvell.com>
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 4dbe57d46d ]
This function had some left over code that returned 1 on error instead
negative error codes. Convert everything to use negative error codes. The
caller treats all non-zero returns the same so this does not affect run
time.
A couple places set "rc" instead of "status" so those error paths ended up
returning success by mistake. Get rid of the "rc" variable and use
"status" everywhere.
Remove the bogus "status = 0" initialization, as a future proofing measure
so the compiler will warn about uninitialized error codes.
Link: https://lore.kernel.org/r/20210810084753.GD23810@kili
Fixes: ace7f46ba5 ("scsi: qedi: Add QLogic FastLinQ offload iSCSI driver framework.")
Acked-by: Manish Rangankar <mrangankar@marvell.com>
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit d36d4a1d75 ]
When numa is used to map CPU to PCI device, the optimized path to read
from cached data is not working and still calls _isst_if_get_pci_dev().
The reason is that when caching the mapping, numa information is not
available as it is read later. So move the assignment of
isst_cpu_info[cpu].numa_node before calling _isst_if_get_pci_dev().
Fixes: aa2ddd2425 ("platform/x86: ISST: Use numa node id for cpu pci dev mapping")
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Link: https://lore.kernel.org/r/20210727165052.427238-1-srinivas.pandruvada@linux.intel.com
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 9c7248bb8d ]
When a LPAR is migratable, we should consider the maximum possible NUMA
node instead of the number of NUMA nodes from the actual system.
The DT property 'ibm,current-associativity-domains' defines the maximum
number of nodes the LPAR can see when running on that box. But if the
LPAR is being migrated on another box, it may see up to the nodes
defined by 'ibm,max-associativity-domains'. So if a LPAR is migratable,
that value should be used.
Unfortunately, there is no easy way to know if an LPAR is migratable or
not. The hypervisor exports the property 'ibm,migratable-partition' in
the case it set to migrate partition, but that would not mean that the
current partition is migratable.
Without this patch, when a LPAR is started on a 2 node box and then
migrated to a 3 node box, the hypervisor may spread the LPAR's CPUs on
the 3rd node. In that case if a CPU from that 3rd node is added to the
LPAR, it will be wrongly assigned to the node because the kernel has
been set to use up to 2 nodes (the configuration of the departure node).
With this patch applies, the CPU is correctly added to the 3rd node.
Fixes: f9f130ff2e ("powerpc/numa: Detect support for coregroup")
Signed-off-by: Laurent Dufour <ldufour@linux.ibm.com>
Reviewed-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210511073136.17795-1-ldufour@linux.ibm.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit bf25967ac5 ]
Managed device links are deleted by device_del(). However it is possible to
add a device link to a consumer before device_add(), and then discovering
an error prevents the device from being used. In that case normally
references to the device would be dropped and the device would be deleted.
However the device link holds a reference to the device, so the device link
and device remain indefinitely (unless the supplier is deleted).
For UFSHCD, if a LUN fails to probe (e.g. absent BOOT WLUN), the device
will not have been registered but can still have a device link holding a
reference to the device. The unwanted device link will prevent runtime
suspend indefinitely.
Amend device link removal to accept removal of a link with an unregistered
consumer device (suggested by Rafael), and fix UFSHCD by explicitly
deleting the device link when SCSI destroys the SCSI device.
Link: https://lore.kernel.org/r/a1c9bac8-b560-b662-f0aa-58c7e000cbbd@intel.com
Fixes: b294ff3e34 ("scsi: ufs: core: Enable power management for wlun")
Reviewed-by: Rafael J. Wysocki <rafael@kernel.org>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 5d46dd04cb ]
Since bc1c56e9bb transport->srcport may by unset, causing
get_srcport() to return 0 when called. Fix this by querying the port
from the underlying socket instead of the transport.
Fixes: bc1c56e9bb (SUNRPC: prevent port reuse on transports which don't request it)
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f99fa50880 ]
The xprtrdma client code currently relies on the task that initiated the
connect to hold the XPRT_LOCK for the duration of the connection
attempt. If the task is woken early, due to some other event, then that
lock could get released early.
Avoid races by using the same mechanism that the socket code uses of
transferring lock ownership to the RDMA connect worker itself. That
frees us to call rpcrdma_xprt_disconnect() directly since we're now
guaranteed exclusion w.r.t. other callers.
Fixes: 4cf44be6f1 ("xprtrdma: Fix recursion into rpcrdma_xprt_disconnect()")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit c2dc3e5fad ]
We really should not call rpc_wake_up_queued_task_set_status() with
xprt->snd_task as an argument unless we are certain that is actually an
rpc_task.
Fixes: 0445f92c5d ("SUNRPC: Fix disconnection races")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit d6236a98b3 ]
The intention of the layout barrier is to ensure that we do not update
the layout to match an older value than the current expectation. Fix the
test in pnfs_layout_stateid_blocked() to reflect that it is legal for
the seqid of the stateid to match that of the barrier.
Fixes: aa95edf309 ("NFSv4/pnfs: Fix the layout barrier update")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 45baadaad7 ]
A zero value for the layout barrier indicates that it has been cleared
(since seqid '0' is an illegal value), so we should always allow it to
be updated.
Fixes: d29b468da4 ("pNFS/NFSv4: Improve rejection of out-of-order layouts")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit e20772cbdf ]
If NFS_LAYOUT_RETURN_REQUESTED is set, but there is no value set for
the layout plh_return_seq, we can end up in a livelock loop in which
every layout segment retrieved by a new call to layoutget is immediately
invalidated by pnfs_layout_need_return().
To get around this, we should just set plh_return_seq to the current
value of the layout stateid's seqid.
Fixes: d474f96104 ("NFS: Don't return layout segments that are in use")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 97480cae13 ]
Ensure the tear-down completion is awoken only /after/ we've stopped
fiddling with rpcrdma_rep objects in rpcrdma_post_recvs().
Fixes: 15788d1d10 ("xprtrdma: Do not refresh Receive Queue while it is draining")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 173735c346 ]
Due to link order, dma_debug_init is called before debugfs has a chance
to initialize (via debugfs_init which also happens in the core initcall
stage), so the directories for dma-debug are never created.
Decouple dma_debug_fs_init from dma_debug_init and defer its init until
core_initcall_sync (after debugfs has been initialized) while letting
dma-debug initialization occur as soon as possible to catch any early
mappings, as suggested in [1].
[1] https://lore.kernel.org/linux-iommu/YIgGa6yF%2Fadg8OSN@kroah.com/
Fixes: 15b28bbcd5 ("dma-debug: move initialization to common code")
Signed-off-by: Anthony Iliopoulos <ailiop@suse.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 946e1052cd ]
Don't call printk() when CONFIG_PRINTK is not set.
Fixes the following build errors:
or1k-linux-ld: arch/openrisc/kernel/entry.o: in function `_external_irq_handler':
(.text+0x804): undefined reference to `printk'
(.text+0x804): relocation truncated to fit: R_OR1K_INSN_REL_26 against undefined symbol `printk'
Fixes: 9d02a4283e ("OpenRISC: Boot code")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reported-by: kernel test robot <lkp@intel.com>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi>
Cc: Stafford Horne <shorne@gmail.com>
Cc: openrisc@lists.librecores.org
Signed-off-by: Stafford Horne <shorne@gmail.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit d4bf15a7ce ]
I recently found a case where de->name_len is 0 in f2fs_fill_dentries()
easily reproduced, and finally set the fsck flag.
Thread A Thread B
- f2fs_readdir
- f2fs_read_inline_dir
- ctx->pos = d.max
- f2fs_add_dentry
- f2fs_add_inline_entry
- do_convert_inline_dir
- f2fs_add_regular_entry
- f2fs_readdir
- f2fs_fill_dentries
- set_sbi_flag(sbi, SBI_NEED_FSCK)
Process A opens the folder, and has been reading without closing it.
During this period, Process B created a file under the folder (occupying
multiple f2fs_dir_entry, exceeding the d.max of the inline dir). After
creation, process A uses the d.max of inline dir to read it again, and
it will read that de->name_len is 0.
And Chao pointed out that w/o inline conversion, the race condition still
can happen as below:
dir_entry1: A
dir_entry2: B
dir_entry3: C
free slot: _
ctx->pos: ^
Thread A is traversing directory,
ctx-pos moves to below position after readdir() by thread A:
AAAABBBB___
^
Then thread B delete dir_entry2, and create dir_entry3.
Thread A calls readdir() to lookup dirents starting from middle
of new dirent slots as below:
AAAACCCCCC_
^
In these scenarios, the file system is not damaged, and it's hard to
avoid it. But we can bypass tagging FSCK flag if:
a) bit_pos (:= ctx->pos % d->max) is non-zero and
b) before bit_pos moves to first valid dir_entry.
Fixes: ddf06b753a ("f2fs: fix to trigger fsck if dirent.name_len is zero")
Signed-off-by: Yangtao Li <frank.li@vivo.com>
[Chao: clean up description]
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit d04691d373 ]
After commit 7cbd631d4dec ("cpuidle: pseries: Fixup CEDE0 latency only
for POWER10 onwards"), pseries_idle_probe() is no longer inlined when
compiling with clang, which causes a modpost warning:
WARNING: modpost: vmlinux.o(.text+0xc86a54): Section mismatch in
reference from the function pseries_idle_probe() to the function
.init.text:fixup_cede0_latency()
The function pseries_idle_probe() references
the function __init fixup_cede0_latency().
This is often because pseries_idle_probe lacks a __init
annotation or the annotation of fixup_cede0_latency is wrong.
pseries_idle_probe() is a non-init function, which calls
fixup_cede0_latency(), which is an init function, explaining the
mismatch. pseries_idle_probe() is only called from
pseries_processor_idle_init(), which is an init function, so mark
pseries_idle_probe() as __init so there is no more warning.
Fixes: 054e44ba99 ("cpuidle: pseries: Add function to parse extended CEDE records")
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210803211547.1093820-1-nathan@kernel.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit a6cae77f1b ]
commit 7c6986ade6 ("powerpc/stacktrace: Fix spurious "stale" traces in raise_backtrace_ipi()")
introduces udelay() call without including the linux/delay.h header.
This may happen to work on master but the header that declares the
functionshould be included nonetheless.
Fixes: 7c6986ade6 ("powerpc/stacktrace: Fix spurious "stale" traces in raise_backtrace_ipi()")
Signed-off-by: Michal Suchanek <msuchanek@suse.de>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210729180103.15578-1-msuchanek@suse.de
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 50741b70b0 ]
Commit d947fb4c96 ("cpuidle: pseries: Fixup exit latency for
CEDE(0)") sets the exit latency of CEDE(0) based on the latency values
of the Extended CEDE states advertised by the platform
On POWER9 LPARs, the firmwares advertise a very low value of 2us for
CEDE1 exit latency on a Dedicated LPAR. The latency advertized by the
PHYP hypervisor corresponds to the latency required to wakeup from the
underlying hardware idle state. However the wakeup latency from the
LPAR perspective should include
1. The time taken to transition the CPU from the Hypervisor into the
LPAR post wakeup from platform idle state
2. Time taken to send the IPI from the source CPU (waker) to the idle
target CPU (wakee).
1. can be measured via timer idle test, where we queue a timer, say
for 1ms, and enter the CEDE state. When the timer fires, in the timer
handler we compute how much extra timer over the expected 1ms have we
consumed. On a a POWER9 LPAR the numbers are
CEDE latency measured using a timer (numbers in ns)
N Min Median Avg 90%ile 99%ile Max Stddev
400 2601 5677 5668.74 5917 6413 9299 455.01
1. and 2. combined can be determined by an IPI latency test where we
send an IPI to an idle CPU and in the handler compute the time
difference between when the IPI was sent and when the handler ran. We
see the following numbers on POWER9 LPAR.
CEDE latency measured using an IPI (numbers in ns)
N Min Median Avg 90%ile 99%ile Max Stddev
400 711 7564 7369.43 8559 9514 9698 1200.01
Suppose, we consider the 99th percentile latency value measured using
the IPI to be the wakeup latency, the value would be 9.5us This is in
the ballpark of the default value of 10us.
Hence, use the exit latency of CEDE(0) based on the latency values
advertized by platform only from POWER10 onwards. The values
advertized on POWER10 platforms is more realistic and informed by the
latency measurements. For earlier platforms stick to the default value
of 10us. The fix was suggested by Michael Ellerman.
Fixes: d947fb4c96 ("cpuidle: pseries: Fixup exit latency for CEDE(0)")
Reported-by: Enrico Joedecke <joedecke@de.ibm.com>
Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/1626676399-15975-2-git-send-email-ego@linux.vnet.ibm.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 6418074260 ]
Make the following changes in ufshcd_abort():
- Return FAILED instead of SUCCESS if the abort handler notices that a
SCSI command has already been completed. Returning SUCCESS in this case
triggers a use-after-free and may trigger a kernel crash.
- Fix the code for aborting SCSI commands submitted to a WLUN.
The current approach for aborting SCSI commands that have been submitted to
a WLUN and that timed out is as follows:
- Report to the SCSI core that the command has completed successfully.
Let the block layer free any data buffers associated with the command.
- Mark the command as outstanding in 'outstanding_reqs'.
- If the block layer tries to reuse the tag associated with the aborted
command, busy-wait until the tag is freed.
This approach can result in:
- Memory corruption if the controller accesses the data buffer after the
block layer has freed the associated data buffers.
- A race condition if ufshcd_queuecommand() or ufshcd_exec_dev_cmd()
checks the bit that corresponds to an aborted command in
'outstanding_reqs' after it has been cleared and before it is reset.
- High energy consumption if ufshcd_queuecommand() repeatedly returns
SCSI_MLQUEUE_HOST_BUSY.
Fix this by reporting to the SCSI error handler that aborting a SCSI
command failed if the SCSI command was submitted to a WLUN.
Link: https://lore.kernel.org/r/20210722033439.26550-15-bvanassche@acm.org
Fixes: 7a7e66c65d ("scsi: ufs: Fix a race condition between ufshcd_abort() and eh_work()")
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Stanley Chu <stanley.chu@mediatek.com>
Cc: Can Guo <cang@codeaurora.org>
Cc: Asutosh Das <asutoshd@codeaurora.org>
Cc: Avri Altman <avri.altman@wdc.com>
Reviewed-by: Bean Huo <beanhuo@micron.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 277afbde6c ]
In f2fs_remount(), return value of test_opt() is an unsigned int type
variable, however when we compare it to a bool type variable, it cause
wrong result, fix it.
Fixes: 4354994f09 ("f2fs: checkpoint disabling")
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b7ec206173 ]
After the below patch, give cp is errored, we drop dirty node pages. This
can give NEW_ADDR to read node pages. Don't do WARN_ON() which gives
generic/475 failure.
Fixes: 28607bf3aa ("f2fs: drop dirty node pages when cp is in error status")
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 62004871e1 ]
It is possible for the primary IPoIB network device associated with any
RDMA device to fail to join certain multicast groups preventing IPv6
neighbor discovery and possibly other network ULPs from working
correctly. The IPv4 broadcast group is not affected as the IPoIB network
device handles joining that multicast group directly.
This is because the primary IPoIB network device uses the pkey at ndex 0
in the associated RDMA device's pkey table. Anytime the pkey value of
index 0 changes, the primary IPoIB network device automatically modifies
it's broadcast address (i.e. /sys/class/net/[ib0]/broadcast), since the
broadcast address includes the pkey value, and then bounces carrier. This
includes initial pkey assignment, such as when the pkey at index 0
transitions from the opa default of invalid (0x0000) to some value such as
the OPA default pkey for Virtual Fabric 0: 0x8001 or when the fabric
manager is restarted with a configuration change causing the pkey at index
0 to change. Many network ULPs are not sensitive to the carrier bounce and
are not expecting the broadcast address to change including the linux IPv6
stack. This problem does not affect IPoIB child network devices as their
pkey value is constant for all time.
To mitigate this issue, change the default pkey in at index 0 to 0x8001 to
cover the predominant case and avoid issues as ipoib comes up and the FM
sweeps.
At some point, ipoib multicast support should automatically fix
non-broadcast addresses as it does with the primary broadcast address.
Fixes: 7724105686 ("IB/hfi1: add driver files")
Link: https://lore.kernel.org/r/20210715160445.142451.47651.stgit@awfm-01.cornelisnetworks.com
Suggested-by: Josh Collier <josh.d.collier@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 6fffe52fb3 ]
The rk3036/rk3328 pll types were converted to checking the lock status
via the internal register in january 2020, so don't need the grf
reference since then.
But it was forgotten to remove grf check when deciding between the
pll rate ops (read-only vs. read-write), so a clock driver without
the needed grf reference might've been put into the read-only mode
just because the grf reference was missing.
This affected the rk356x that needs to reclock certain plls at boot.
Fix this by removing the check for the grf for selecting the utilized
operations.
Suggested-by: Heiko Stuebner <heiko@sntech.de>
Fixes: 7f6ffbb885 ("clk: rockchip: convert rk3036 pll type to use internal lock status")
Signed-off-by: Peter Geis <pgwipeout@gmail.com>
[adjusted the commit message, adjusted the fixes tag]
Link: https://lore.kernel.org/r/20210728180034.717953-3-pgwipeout@gmail.com
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit baf8d6899b ]
The PWM pins on North Bridge on Armada 37xx can be configured into PWM
or GPIO functions. When in PWM function, each pin can also be configured
to drive low on 0 and tri-state on 1 (LED mode).
The current definitions handle this by declaring two pin groups for each
pin:
- group "pwmN" with functions "pwm" and "gpio"
- group "ledN_od" ("od" for open drain) with functions "led" and "gpio"
This is semantically incorrect. The correct definition for each pin
should be one group with three functions: "pwm", "led" and "gpio".
Change the "pwmN" groups to support "led" function.
Remove "ledN_od" groups. This cannot break backwards compatibility with
older device trees: no device tree uses it since there is no PWM driver
for this SOC yet. Also "ledN_od" groups are not even documented.
Fixes: b835d69530 ("pinctrl: armada-37xx: swap polarity on LED group")
Signed-off-by: Marek Behún <kabel@kernel.org>
Acked-by: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20210719112938.27594-1-kabel@kernel.org
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f4abaa9eeb ]
The power supply states of discharging, charging, full, etc, represent
state of charging, not the capacity level of the battery (for which
we have a separate property). Current HID usage tables to not allow
for expressing charging state of the batteries found in generic
styli, so we should simply assume that the battery is discharging
even if current capacity is at 100% when battery strength reporting
is done via HID interface. In fact, we were doing just that before
commit 581c448476.
This change helps UIs to not mis-represent fully charged batteries in
styli as being charging/topping-off.
Fixes: 581c448476 ("HID: input: map digitizer battery usage")
Reported-by: Kenneth Albanowski <kenalba@google.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit e2d98504c6 ]
On idle session, because we do not do signal for heartbeat, it will
overflow the send queue after sometime.
To avoid that, we need to enable the signal for heartbeat. To do that, add
a new member signal_interval in rtrs_path, which will set min of
queue_depth and SERVICE_CON_QUEUE_DEPTH, and track it for both heartbeat
and IO, so the sq queue full accounting is correct.
Fixes: b38041d50a ("RDMA/rtrs: Do not signal for heatbeat")
Link: https://lore.kernel.org/r/20210712060750.16494-4-jinpu.wang@ionos.com
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Reviewed-by: Aleksei Marov <aleksei.marov@ionos.com>
Reviewed-by: Gioh Kim <gi-oh.kim@ionos.com>
Reviewed-by: Md Haris Iqbal <haris.iqbal@ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit 02bcec3ea5 upstream.
Measurements in different conditions showed that aardvark hardware PIO
response can take up to 1.44s. Increase wait timeout from 1ms to 1.5s to
ensure that we do not miss responses from hardware. After 1.44s hardware
returns errors (e.g. Completer abort).
The previous two patches fixed checking for PIO status, so now we can use
it to also catch errors which are reported by hardware after 1.44s.
After applying this patch, kernel can detect and print PIO errors to dmesg:
[ 6.879999] advk-pcie d0070000.pcie: Non-posted PIO Response Status: CA, 0xe00 @ 0x100004
[ 6.896436] advk-pcie d0070000.pcie: Posted PIO Response Status: COMP_ERR, 0x804 @ 0x100004
[ 6.913049] advk-pcie d0070000.pcie: Posted PIO Response Status: COMP_ERR, 0x804 @ 0x100010
[ 6.929663] advk-pcie d0070000.pcie: Non-posted PIO Response Status: CA, 0xe00 @ 0x100010
[ 6.953558] advk-pcie d0070000.pcie: Posted PIO Response Status: COMP_ERR, 0x804 @ 0x100014
[ 6.970170] advk-pcie d0070000.pcie: Non-posted PIO Response Status: CA, 0xe00 @ 0x100014
[ 6.994328] advk-pcie d0070000.pcie: Posted PIO Response Status: COMP_ERR, 0x804 @ 0x100004
Without this patch kernel prints only a generic error to dmesg:
[ 5.246847] advk-pcie d0070000.pcie: config read/write timed out
Link: https://lore.kernel.org/r/20210722144041.12661-3-pali@kernel.org
Signed-off-by: Pali Rohár <pali@kernel.org>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Reviewed-by: Marek Behún <kabel@kernel.org>
Cc: stable@vger.kernel.org # 7fbcb5da81 ("PCI: aardvark: Don't rely on jiffies while holding spinlock")
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit fcb461e2bc upstream.
There is an issue that when PCIe switch is connected to an Armada 3700
board, there will be lots of warnings about PIO errors when reading the
config space. According to Aardvark PIO read and write sequence in HW
specification, the current way to check PIO status has the following
issues:
1) For PIO read operation, it reports the error message, which should be
avoided according to HW specification.
2) For PIO read and write operations, it only checks PIO operation complete
status, which is not enough, and error status should also be checked.
This patch aligns the code with Aardvark PIO read and write sequence in HW
specification on PIO status check and fix the warnings when reading config
space.
[pali: Fix CRS handling when CRSSVE is not enabled]
Link: https://lore.kernel.org/r/20210722144041.12661-2-pali@kernel.org
Tested-by: Victor Gu <xigu@marvell.com>
Signed-off-by: Evan Wang <xswang@marvell.com>
Signed-off-by: Pali Rohár <pali@kernel.org>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Reviewed-by: Victor Gu <xigu@marvell.com>
Reviewed-by: Marek Behún <kabel@kernel.org>
Cc: stable@vger.kernel.org # b1bd571447 ("PCI: aardvark: Indicate error in 'val' when config read fails")
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 64f160e19e upstream.
In commit 6df6ba974a ("PCI: aardvark: Remove PCIe outbound window
configuration") was removed aardvark PCIe outbound window configuration and
commit description said that was recommended solution by HW designers.
But that commit completely removed support for configuring PCIe IO
resources without removing PCIe IO 'ranges' from DTS files. After that
commit PCIe IO space started to be treated as PCIe MEM space and accessing
it just caused kernel crash.
Moreover implementation of PCIe outbound windows prior that commit was
incorrect. It completely ignored offset between CPU address and PCIe bus
address and expected that in DTS is CPU address always same as PCIe bus
address without doing any checks. Also it completely ignored size of every
PCIe resource specified in 'ranges' DTS property and expected that every
PCIe resource has size 128 MB (also for PCIe IO range). Again without any
check. Apparently none of PCIe resource has in DTS specified size of 128
MB. So it was completely broken and thanks to how aardvark mask works,
configuration was completely ignored.
This patch reverts back support for PCIe outbound window configuration but
implementation is a new without issues mentioned above. PCIe outbound
window is required when DTS specify in 'ranges' property non-zero offset
between CPU and PCIe address space. To address recommendation by HW
designers as specified in commit description of 6df6ba974a, set default
outbound parameters as PCIe MEM access without translation and therefore
for this PCIe 'ranges' it is not needed to configure PCIe outbound window.
For PCIe IO space is needed to configure aardvark PCIe outbound window.
This patch fixes kernel crash when trying to access PCIe IO space.
Link: https://lore.kernel.org/r/20210624215546.4015-2-pali@kernel.org
Signed-off-by: Pali Rohár <pali@kernel.org>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: stable@vger.kernel.org # 6df6ba974a ("PCI: aardvark: Remove PCIe outbound window configuration")
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a8bd29bd49 upstream.
The pciconfig_read() syscall reads PCI configuration space using
hardware-dependent config accessors.
If the read fails on PCI, most accessors don't return an error; they
pretend the read was successful and got ~0 data from the device, so the
syscall returns success with ~0 data in the buffer.
When the accessor does return an error, pciconfig_read() normally fills the
user's buffer with ~0 and returns an error in errno. But after
e4585da22a ("pci syscall.c: Switch to refcounting API"), we don't fill
the buffer with ~0 for the EPERM "user lacks CAP_SYS_ADMIN" error.
Userspace may rely on the ~0 data to detect errors, but after e4585da22a,
that would not detect CAP_SYS_ADMIN errors.
Restore the original behaviour of filling the buffer with ~0 when the
CAP_SYS_ADMIN check fails.
[bhelgaas: commit log, fold in Nathan's fix
https://lore.kernel.org/r/20210803200836.500658-1-nathan@kernel.org]
Fixes: e4585da22a ("pci syscall.c: Switch to refcounting API")
Link: https://lore.kernel.org/r/20210729233755.1509616-1-kw@linux.com
Signed-off-by: Krzysztof Wilczyński <kw@linux.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 00823dcbdd upstream.
Previously we assumed that all Root Ports and Switch Downstream Ports
supported Link Bandwidth Notification. Per spec, this is only required
for Ports supporting Links wider than x1 and/or multiple Link speeds
(PCIe r5.0, sec 7.5.3.6).
Because we assumed all Ports supported it, we tried to set up a Bandwidth
Notification IRQ, which failed for devices that don't support IRQs at all,
which meant pcieport didn't attach to the Port at all.
Check the Link Bandwidth Notification Capability bit and enable the service
only when the Port supports it.
[bhelgaas: commit log]
Fixes: e8303bb7a7 ("PCI/LINK: Report degraded links via link bandwidth notification")
Link: https://lore.kernel.org/r/20210512213314.7778-1-stuart.w.hayes@gmail.com
Signed-off-by: Stuart Hayes <stuart.w.hayes@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Lukas Wunner <lukas@wunner.de>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 65ddf65648 upstream.
This patch fixes below problems of sb/cp sanity check:
- in sanity_check_raw_superi(), it missed to consider log header
blocks while cp_payload check.
- in f2fs_sanity_check_ckpt(), it missed to check nat_bits_blocks.
Cc: <stable@kernel.org>
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1ffc8f5f77 upstream.
SBI_NEED_FSCK is an indicator that fsck.f2fs needs to be triggered, so it
is not fully critical to stop any IO writes. So, let's allow to write data
instead of reporting EIO forever given SBI_NEED_FSCK, but do keep OPU.
Fixes: 9557727876 ("f2fs: drop inplace IO if fs status is abnormal")
Cc: <stable@kernel.org> # v5.13+
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 868ad33bfa upstream.
sched_setscheduler() and rt_mutex_setprio() invoke the run-queue balance
callback after changing priorities or the scheduling class of a task. The
run-queue for which the callback is invoked can be local or remote.
That's not a problem for the regular rq::push_work which is serialized with
a busy flag in the run-queue struct, but for the balance_push() work which
is only valid to be invoked on the outgoing CPU that's wrong. It not only
triggers the debug warning, but also leaves the per CPU variable push_work
unprotected, which can result in double enqueues on the stop machine list.
Remove the warning and validate that the function is invoked on the
outgoing CPU.
Fixes: ae79270232 ("sched: Optimize finish_lock_switch()")
Reported-by: Sebastian Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/87zgt1hdw7.ffs@tglx
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b30d0289de upstream.
The merge_fdt_bootargs() function by definition consumes more than 1024
bytes of stack because it has a 1024 byte command line on the stack,
meaning that we always get a warning when building this file:
arch/arm/boot/compressed/atags_to_fdt.c: In function 'merge_fdt_bootargs':
arch/arm/boot/compressed/atags_to_fdt.c:98:1: warning: the frame size of 1032 bytes is larger than 1024 bytes [-Wframe-larger-than=]
However, as this is the decompressor and we know that it has a very shallow
call chain, and we do not actually risk overflowing the kernel stack
at runtime here.
This just shuts up the warning by disabling the warning flag for this
file.
Tested on Nexus 7 2012 builds.
Acked-by: Nicolas Pitre <nico@fluxnic.net>
Signed-off-by: David Heidelberg <david@ixit.cz>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: <stable@vger.kernel.org>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8a6430ab9c upstream.
Commit ca6bfcb2f6 ("libata: Enable queued TRIM for Samsung SSD 860")
limited the existing ATA_HORKAGE_NO_NCQ_TRIM quirk from "Samsung SSD 8*",
covering all Samsung 800 series SSDs, to only apply to "Samsung SSD 840*"
and "Samsung SSD 850*" series based on information from Samsung.
But there is a large number of users which is still reporting issues
with the Samsung 860 and 870 SSDs combined with Intel, ASmedia or
Marvell SATA controllers and all reporters also report these problems
going away when disabling queued trims.
Note that with AMD SATA controllers users are reporting even worse
issues and only completely disabling NCQ helps there, this will be
addressed in a separate patch.
Fixes: ca6bfcb2f6 ("libata: Enable queued TRIM for Samsung SSD 860")
BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=203475
Cc: stable@vger.kernel.org
Cc: Kate Hsuan <hpa@redhat.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Link: https://lore.kernel.org/r/20210823095220.30157-1-hdegoede@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1c1dc8bda3 upstream.
When the ESTABLISH ccw does not complete within the specified timeout,
try our best to cancel the ccw program that is still active on the
device. Otherwise the IO subsystem might be accessing it even after
the driver eg. called qdio_free().
Fixes: 779e6e1c72 ("[S390] qdio: new qdio driver.")
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Reviewed-by: Benjamin Block <bblock@linux.ibm.com>
Cc: <stable@vger.kernel.org> # 2.6.27
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2c197870e4 upstream.
When qdio_establish() times out while waiting for the ESTABLISH ccw to
complete, it calls qdio_shutdown() to roll back all of its previous
actions. But at this point the qdio_irq's state is still
QDIO_IRQ_STATE_INACTIVE, so qdio_shutdown() will exit immediately
without doing any actual work.
Which means that eg. the qdio_irq's thinint-indicator stays registered,
and cdev->handler isn't restored to its old value. And since
commit 954d6235be ("s390/qdio: make thinint registration symmetric")
the qdio_irq also stays on the tiq_list, so on the next qdio_establish()
we might get a helpful BUG from the list-debugging code:
...
[ 4633.512591] list_add double add: new=00000000005a4110, prev=00000001b357db78, next=00000000005a4110.
[ 4633.512621] ------------[ cut here ]------------
[ 4633.512623] kernel BUG at lib/list_debug.c:29!
...
[ 4633.512796] [<00000001b2c6ee9a>] __list_add_valid+0x82/0xa0
[ 4633.512798] ([<00000001b2c6ee96>] __list_add_valid+0x7e/0xa0)
[ 4633.512800] [<00000001b2fcecce>] qdio_establish_thinint+0x116/0x190
[ 4633.512805] [<00000001b2fcbe58>] qdio_establish+0x128/0x498
...
Fix this by extracting a goto-chain from the existing error exits in
qdio_establish(), and check the return value of the wait_event_...()
to detect the timeout condition.
Fixes: 779e6e1c72 ("[S390] qdio: new qdio driver.")
Root-caused-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Reviewed-by: Benjamin Block <bblock@linux.ibm.com>
Cc: <stable@vger.kernel.org> # 2.6.27
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1a10d7fdb6 upstream.
As warned by smatch:
drivers/media/usb/uvc/uvc_v4l2.c:911 uvc_ioctl_g_input() error: doing dma on the stack (&i)
drivers/media/usb/uvc/uvc_v4l2.c:943 uvc_ioctl_s_input() error: doing dma on the stack (&i)
those two functions call uvc_query_ctrl passing a pointer to
a data at the DMA stack. those are used to send URBs via
usb_control_msg(). Using DMA stack is not supported and should
not work anymore on modern Linux versions.
So, use a kmalloc'ed buffer.
Cc: stable@vger.kernel.org # Kernel 4.9 and upper
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a30dc6cf0d upstream.
I got a NULL pointer dereference report when doing fuzz test:
Call Trace:
qp_release_pages+0xae/0x130
qp_host_unregister_user_memory.isra.25+0x2d/0x80
vmci_qp_broker_unmap+0x191/0x320
? vmci_host_do_alloc_queuepair.isra.9+0x1c0/0x1c0
vmci_host_unlocked_ioctl+0x59f/0xd50
? do_vfs_ioctl+0x14b/0xa10
? tomoyo_file_ioctl+0x28/0x30
? vmci_host_do_alloc_queuepair.isra.9+0x1c0/0x1c0
__x64_sys_ioctl+0xea/0x120
do_syscall_64+0x34/0xb0
entry_SYSCALL_64_after_hwframe+0x44/0xae
When a queue pair is created by the following call, it will not
register the user memory if the page_store is NULL, and the
entry->state will be set to VMCIQPB_CREATED_NO_MEM.
vmci_host_unlocked_ioctl
vmci_host_do_alloc_queuepair
vmci_qp_broker_alloc
qp_broker_alloc
qp_broker_create // set entry->state = VMCIQPB_CREATED_NO_MEM;
When unmapping this queue pair, qp_host_unregister_user_memory() will
be called to unregister the non-existent user memory, which will
result in a null pointer reference. It will also change
VMCIQPB_CREATED_NO_MEM to VMCIQPB_CREATED_MEM, which should not be
present in this operation.
Only when the qp broker has mem, it can unregister the user
memory when unmapping the qp broker.
Only when the qp broker has no mem, it can register the user
memory when mapping the qp broker.
Fixes: 06164d2b72 ("VMCI: queue pairs implementation.")
Cc: stable <stable@vger.kernel.org>
Reported-by: Hulk Robot <hulkci@huawei.com>
Reviewed-by: Jorgen Hansen <jhansen@vmware.com>
Signed-off-by: Wang Hai <wanghai38@huawei.com>
Link: https://lore.kernel.org/r/20210818124845.488312-1-wanghai38@huawei.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5441a07a12 upstream.
The commit 97f9ac3db6 ("crypto: ccp - Add support for SEV-ES to the
PSP driver") added support to allocate Trusted Memory Region (TMR)
used during the SEV-ES firmware initialization. The TMR gets locked
during the firmware initialization and unlocked during the shutdown.
While the TMR is locked, access to it is disallowed.
Currently, the CCP driver does not shutdown the firmware during the
kexec reboot, leaving the TMR memory locked.
Register a callback to shutdown the SEV firmware on the kexec boot.
Fixes: 97f9ac3db6 ("crypto: ccp - Add support for SEV-ES to the PSP driver")
Reported-by: Lucas Nussbaum <lucas.nussbaum@inria.fr>
Tested-by: Lucas Nussbaum <lucas.nussbaum@inria.fr>
Cc: <stable@kernel.org>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Joerg Roedel <jroedel@suse.de>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Acked-by: Tom Lendacky <thomas.lendacky@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 528b16bfc3 upstream.
On systems with many cores using dm-crypt, heavy spinlock contention in
percpu_counter_compare() can be observed when the page allocation limit
for a given device is reached or close to be reached. This is due
to percpu_counter_compare() taking a spinlock to compute an exact
result on potentially many CPUs at the same time.
Switch to non-exact comparison of allocated and allowed pages by using
the value returned by percpu_counter_read_positive() to avoid taking
the percpu_counter spinlock.
This may over/under estimate the actual number of allocated pages by at
most (batch-1) * num_online_cpus().
Currently, batch is bounded by 32. The system on which this issue was
first observed has 256 CPUs and 512GB of RAM. With a 4k page size, this
change may over/under estimate by 31MB. With ~10G (2%) allowed dm-crypt
allocations, this seems an acceptable error. Certainly preferred over
running into the spinlock contention.
This behavior was reproduced on an EC2 c5.24xlarge instance with 96 CPUs
and 192GB RAM as follows, but can be provoked on systems with less CPUs
as well.
* Disable swap
* Tune vm settings to promote regular writeback
$ echo 50 > /proc/sys/vm/dirty_expire_centisecs
$ echo 25 > /proc/sys/vm/dirty_writeback_centisecs
$ echo $((128 * 1024 * 1024)) > /proc/sys/vm/dirty_background_bytes
* Create 8 dmcrypt devices based on files on a tmpfs
* Create and mount an ext4 filesystem on each crypt devices
* Run stress-ng --hdd 8 within one of above filesystems
Total %system usage collected from sysstat goes to ~35%. Write throughput
on the underlying loop device is ~2GB/s. perf profiling an individual
kworker kcryptd thread shows the following profile, indicating spinlock
contention in percpu_counter_compare():
99.98% 0.00% kworker/u193:46 [kernel.kallsyms] [k] ret_from_fork
|
--ret_from_fork
kthread
worker_thread
|
--99.92%--process_one_work
|
|--80.52%--kcryptd_crypt
| |
| |--62.58%--mempool_alloc
| | |
| | --62.24%--crypt_page_alloc
| | |
| | --61.51%--__percpu_counter_compare
| | |
| | --61.34%--__percpu_counter_sum
| | |
| | |--58.68%--_raw_spin_lock_irqsave
| | | |
| | | --58.30%--native_queued_spin_lock_slowpath
| | |
| | --0.69%--cpumask_next
| | |
| | --0.51%--_find_next_bit
| |
| |--10.61%--crypt_convert
| | |
| | |--6.05%--xts_crypt
...
After applying this patch and running the same test, %system usage is
lowered to ~7% and write throughput on the loop device increases
to ~2.7GB/s. perf report shows mempool_alloc() as ~8% rather than ~62%
in the profile and not hitting the percpu_counter() spinlock anymore.
|--8.15%--mempool_alloc
| |
| |--3.93%--crypt_page_alloc
| | |
| | --3.75%--__alloc_pages
| | |
| | --3.62%--get_page_from_freelist
| | |
| | --3.22%--rmqueue_bulk
| | |
| | --2.59%--_raw_spin_lock
| | |
| | --2.57%--native_queued_spin_lock_slowpath
| |
| --3.05%--_raw_spin_lock_irqsave
| |
| --2.49%--native_queued_spin_lock_slowpath
Suggested-by: DJ Gregor <dj@corelight.com>
Reviewed-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Arne Welzel <arne.welzel@corelight.com>
Fixes: 5059353df8 ("dm crypt: limit the number of allocated pages")
Cc: stable@vger.kernel.org
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 54784ffa5b upstream.
Reading status register can fail in the interrupt handler. In such
case, the regmap_read() will not store anything useful under passed
'val' variable and random stack value will be used to determine type of
interrupt.
Handle the regmap_read() failure to avoid handling interrupt type and
triggering changed power supply event based on random stack value.
Fixes: 39e7213edc ("max17042_battery: Support regmap to access device's registers")
Cc: <stable@vger.kernel.org>
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a680dd72ec upstream.
For a request that has a priority level equal to or larger than
IOPRIO_BE_NR, bfq_set_next_ioprio_data() prints a critical warning but
defaults to setting the request new_ioprio field to IOPRIO_BE_NR. This
is not consistent with the warning and the allowed values for priority
levels. Fix this by setting the request new_ioprio field to
IOPRIO_BE_NR - 1, the lowest priority level allowed.
Cc: <stable@vger.kernel.org>
Fixes: aee69d78de ("block, bfq: introduce the BFQ-v0 I/O scheduler as an extra scheduler")
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Link: https://lore.kernel.org/r/20210811033702.368488-2-damien.lemoal@wdc.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b76d26d69e upstream.
There is no reason to assume that the IRQ rising edge (indicating that
the device start up phase is done) will happen after we request the IRQ.
If the device is already up by the time we request it, the call to
'wait_for_completion_timeout()' will timeout and we will fail the device
probe even though there's nothing wrong.
Fix it by just polling the status register until we get the indication that
the device is up and running. As a side effect of this fix, requesting the
IRQ is also moved to after the setup function.
Fixes: f110f3188e ("iio: temperature: Add support for LTC2983")
Reported-and-tested-by: Drew Fustini <drew@pdp7.com>
Reviewed-by: Drew Fustini <drew@pdp7.com>
Signed-off-by: Nuno Sá <nuno.sa@analog.com>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Cc: <Stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20210811133220.190264-2-nuno.sa@analog.com
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 90268574a3 upstream.
The `compute_indices` and `populate_entries` macros operate on inclusive
bounds, and thus the `map_memory` macro which uses them also operates
on inclusive bounds.
We pass `_end` and `_idmap_text_end` to `map_memory`, but these are
exclusive bounds, and if one of these is sufficiently aligned (as a
result of kernel configuration, physical placement, and KASLR), then:
* In `compute_indices`, the computed `iend` will be in the page/block *after*
the final byte of the intended mapping.
* In `populate_entries`, an unnecessary entry will be created at the end
of each level of table. At the leaf level, this entry will map up to
SWAPPER_BLOCK_SIZE bytes of physical addresses that we did not intend
to map.
As we may map up to SWAPPER_BLOCK_SIZE bytes more than intended, we may
violate the boot protocol and map physical address past the 2MiB-aligned
end address we are permitted to map. As we map these with Normal memory
attributes, this may result in further problems depending on what these
physical addresses correspond to.
The final entry at each level may require an additional table at that
level. As EARLY_ENTRIES() calculates an inclusive bound, we allocate
enough memory for this.
Avoid the extraneous mapping by having map_memory convert the exclusive
end address to an inclusive end address by subtracting one, and do
likewise in EARLY_ENTRIES() when calculating the number of required
tables. For clarity, comments are updated to more clearly document which
boundaries the macros operate on. For consistency with the other
macros, the comments in map_memory are also updated to describe `vstart`
and `vend` as virtual addresses.
Fixes: 0370b31e48 ("arm64: Extend early page table code to allow for larger kernels")
Cc: <stable@vger.kernel.org> # 4.16.x
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Steve Capper <steve.capper@arm.com>
Cc: Will Deacon <will@kernel.org>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20210823101253.55567-1-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5e10f9887e upstream.
When switching to an 'mm_struct' for the first time following an ASID
rollover, a new ASID may be allocated and assigned to 'mm->context.id'.
This reassignment can happen concurrently with other operations on the
mm, such as unmapping pages and subsequently issuing TLB invalidation.
Consequently, we need to ensure that (a) accesses to 'mm->context.id'
are atomic and (b) all page-table updates made prior to a TLBI using the
old ASID are guaranteed to be visible to CPUs running with the new ASID.
This was found by inspection after reviewing the VMID changes from
Shameer but it looks like a real (yet hard to hit) bug.
Cc: <stable@vger.kernel.org>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Jade Alglave <jade.alglave@arm.com>
Cc: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Signed-off-by: Will Deacon <will@kernel.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20210806113109.2475-2-will@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8b07e990fb upstream.
The check mixes pages (vm_pgoff) with bytes (vm_start, vm_end) on one
side of the comparison, and uses resource address (rather than just the
resource size) on the other side of the comparison.
This can allow malicious userspace to easily bypass the boundary check and
map pages that are located outside memory-region reserved by the driver.
Fixes: 01c60dcea9 ("drivers/misc: Add Aspeed P2A control driver")
Cc: stable@vger.kernel.org
Signed-off-by: Iwona Winiarska <iwona.winiarska@intel.com>
Reviewed-by: Andrew Jeffery <andrew@aj.id.au>
Tested-by: Andrew Jeffery <andrew@aj.id.au>
Reviewed-by: Joel Stanley <joel@aj.id.au>
Signed-off-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b49a0e69a7 upstream.
The check mixes pages (vm_pgoff) with bytes (vm_start, vm_end) on one
side of the comparison, and uses resource address (rather than just the
resource size) on the other side of the comparison.
This can allow malicious userspace to easily bypass the boundary check and
map pages that are located outside memory-region reserved by the driver.
Fixes: 6c4e976785 ("drivers/misc: Add Aspeed LPC control driver")
Cc: stable@vger.kernel.org
Signed-off-by: Iwona Winiarska <iwona.winiarska@intel.com>
Reviewed-by: Andrew Jeffery <andrew@aj.id.au>
Tested-by: Andrew Jeffery <andrew@aj.id.au>
Reviewed-by: Joel Stanley <joel@aj.id.au>
Signed-off-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a89f355e46 upstream.
In "qmp_cooling_devices_register", the count value is initially
QMP_NUM_COOLING_RESOURCES, which is 2. Based on the initial count value,
the memory for cooling_devs is allocated. Then while calling the
"qmp_cooling_device_add" function, count value is post-incremented for
each child node.
This makes the out of bound access to the cooling_dev array. Fix it by
passing the QMP_NUM_COOLING_RESOURCES definition to devm_kzalloc() and
initializing the count to 0.
While at it, let's also free the memory allocated to cooling_dev if no
cooling device is found in DT and during unroll phase.
Cc: stable@vger.kernel.org # 5.4
Fixes: 05589b30b2 ("soc: qcom: Extend AOSS QMP driver to support resources that are used to wake up the SoC.")
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Link: https://lore.kernel.org/r/20210629153249.73428-1-manivannan.sadhasivam@linaro.org
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2542373195 upstream.
The UFOE (data compression engine) component needs to be enabled to have
the imgtec gpu driver working. If we don't enable it we see a black screen.
Looks like when we switched to use and array for setting the routing
registers in commit 440147639a ("soc: mediatek: mmsys: Use an array for
setting the routing registers") we missed to add this component in the new
routing table, it was present before that commit, so fix it by adding
this component in the mt8173 routing table.
Fixes: 440147639a ("soc: mediatek: mmsys: Use an array for setting the routing registers")
Signed-off-by: Enric Balletbo i Serra <enric.balletbo@collabora.com>
Tested-by: Eizan Miyamoto <eizan@chromium.org>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20210625062448.3462177-1-enric.balletbo@collabora.com
[mb: taking into account mask value]
Signed-off-by: Matthias Brugger <matthias.bgg@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 05a444d3f9 upstream.
Currently in the case where kmem_cache_alloc fails the null pointer
cf is dereferenced when assigning cf->is_capsnap = false. Fix this
by adding a null pointer check and return path.
Cc: stable@vger.kernel.org
Addresses-Coverity: ("Dereference null return")
Fixes: b2f9fa1f3b ("ceph: correctly handle releasing an embedded cap flush")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4b511d5bfa upstream.
Xen PV guests are specifying the highest used PFN via the max_pfn
field in shared_info. This value is used by the Xen tools when saving
or migrating the guest.
Unfortunately this field is misnamed, as in reality it is specifying
the number of pages (including any memory holes) of the guest, so it
is the highest used PFN + 1. Renaming isn't possible, as this is a
public Xen hypervisor interface which needs to be kept stable.
The kernel will set the value correctly initially at boot time, but
when adding more pages (e.g. due to memory hotplug or ballooning) a
real PFN number is stored in max_pfn. This is done when expanding the
p2m array, and the PFN stored there is even possibly wrong, as it
should be the last possible PFN of the just added P2M frame, and not
one which led to the P2M expansion.
Fix that by setting shared_info->max_pfn to the last possible PFN + 1.
Fixes: 98dd166ea3 ("x86/xen/p2m: hint at the last populated P2M entry")
Cc: stable@vger.kernel.org
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Link: https://lore.kernel.org/r/20210730092622.9973-2-jgross@suse.com
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f9addd85fb upstream.
H_GetPerformanceCounterInfo (0xF080) hcall returns the counter data in
the result buffer. Result buffer has specific format defined in the PAPR
specification. One of the fields is counter offset and width of the
counter data returned.
Counter data are returned in a unsigned char array in big endian byte
order. To get the final counter data, the values must be left shifted
byte at a time. But commit 220a0c609a ("powerpc/perf: Add support for
the hv gpci (get performance counter info) interface") made the shifting
bitwise and also assumed little endian order. Because of that, hcall
counters values are reported incorrectly.
In particular this can lead to counters go backwards which messes up the
counter prev vs now calculation and leads to huge counter value
reporting:
#: perf stat -e hv_gpci/system_tlbie_count_and_time_tlbie_instructions_issued/
-C 0 -I 1000
time counts unit events
1.000078854 18,446,744,073,709,535,232 hv_gpci/system_tlbie_count_and_time_tlbie_instructions_issued/
2.000213293 0 hv_gpci/system_tlbie_count_and_time_tlbie_instructions_issued/
3.000320107 0 hv_gpci/system_tlbie_count_and_time_tlbie_instructions_issued/
4.000428392 0 hv_gpci/system_tlbie_count_and_time_tlbie_instructions_issued/
5.000537864 0 hv_gpci/system_tlbie_count_and_time_tlbie_instructions_issued/
6.000649087 0 hv_gpci/system_tlbie_count_and_time_tlbie_instructions_issued/
7.000760312 0 hv_gpci/system_tlbie_count_and_time_tlbie_instructions_issued/
8.000865218 16,448 hv_gpci/system_tlbie_count_and_time_tlbie_instructions_issued/
9.000978985 18,446,744,073,709,535,232 hv_gpci/system_tlbie_count_and_time_tlbie_instructions_issued/
10.001088891 16,384 hv_gpci/system_tlbie_count_and_time_tlbie_instructions_issued/
11.001201435 0 hv_gpci/system_tlbie_count_and_time_tlbie_instructions_issued/
12.001307937 18,446,744,073,709,535,232 hv_gpci/system_tlbie_count_and_time_tlbie_instructions_issued/
Fix the shifting logic to correct match the format, ie. read bytes in
big endian order.
Fixes: e4f226b158 ("powerpc/perf/hv-gpci: Increase request buffer size")
Cc: stable@vger.kernel.org # v4.6+
Reported-by: Nageswara R Sastry<rnsastry@linux.ibm.com>
Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
Tested-by: Nageswara R Sastry<rnsastry@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210813082158.429023-1-kjain@linux.ibm.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ead3b768bb upstream.
Zone management send operations (BLKRESETZONE, BLKOPENZONE, BLKCLOSEZONE
and BLKFINISHZONE) should be allowed under the same permissions as write().
(write() does not require CAP_SYS_ADMIN).
Additionally, other ioctls like BLKSECDISCARD and BLKZEROOUT only check if
the fd was successfully opened with FMODE_WRITE.
(They do not require CAP_SYS_ADMIN).
Currently, zone management send operations require both CAP_SYS_ADMIN
and that the fd was successfully opened with FMODE_WRITE.
Remove the CAP_SYS_ADMIN requirement, so that zone management send
operations match the access control requirement of write(), BLKSECDISCARD
and BLKZEROOUT.
Fixes: 3ed05a987e ("blk-zoned: implement ioctls")
Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Aravind Ramesh <aravind.ramesh@wdc.com>
Reviewed-by: Adam Manzanares <a.manzanares@samsung.com>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Cc: stable@vger.kernel.org # v4.10+
Link: https://lore.kernel.org/r/20210811110505.29649-2-Niklas.Cassel@wdc.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f79645df80 upstream.
btrfs_add_ordered_extent_*() add num_bytes to fs_info->ordered_bytes.
Then, splitting an ordered extent will call btrfs_add_ordered_extent_*()
again for split extents, leading to double counting of the region of
a split extent. These leaked bytes are finally reported at unmount time
as follow:
BTRFS info (device dm-1): at unmount dio bytes count 364544
Fix the double counting by subtracting split extent's size from
fs_info->ordered_bytes.
Fixes: d22002fd37 ("btrfs: zoned: split ordered extent when bio is sent")
CC: stable@vger.kernel.org # 5.12+
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0d977e0eba upstream.
This crash was observed with a failed assertion on device close:
BTRFS: Transaction aborted (error -28)
WARNING: CPU: 1 PID: 3902 at fs/btrfs/extent-tree.c:2150 btrfs_run_delayed_refs+0x1d2/0x1e0 [btrfs]
Modules linked in: btrfs blake2b_generic libcrc32c crc32c_intel xor zstd_decompress zstd_compress xxhash lzo_compress lzo_decompress raid6_pq loop
CPU: 1 PID: 3902 Comm: kworker/u8:4 Not tainted 5.14.0-rc5-default+ #1532
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-59-gc9ba527-rebuilt.opensuse.org 04/01/2014
Workqueue: events_unbound btrfs_async_reclaim_metadata_space [btrfs]
RIP: 0010:btrfs_run_delayed_refs+0x1d2/0x1e0 [btrfs]
RSP: 0018:ffffb7a5452d7d80 EFLAGS: 00010282
RAX: 0000000000000000 RBX: 0000000000000003 RCX: 0000000000000000
RDX: 0000000000000001 RSI: ffffffffabee13c4 RDI: 00000000ffffffff
RBP: ffff97834176a378 R08: 0000000000000001 R09: 0000000000000001
R10: 0000000000000000 R11: 0000000000000001 R12: ffff97835195d388
R13: 0000000005b08000 R14: ffff978385484000 R15: 000000000000016c
FS: 0000000000000000(0000) GS:ffff9783bd800000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000056190d003fe8 CR3: 000000002a81e005 CR4: 0000000000170ea0
Call Trace:
flush_space+0x197/0x2f0 [btrfs]
btrfs_async_reclaim_metadata_space+0x139/0x300 [btrfs]
process_one_work+0x262/0x5e0
worker_thread+0x4c/0x320
? process_one_work+0x5e0/0x5e0
kthread+0x144/0x170
? set_kthread_struct+0x40/0x40
ret_from_fork+0x1f/0x30
irq event stamp: 19334989
hardirqs last enabled at (19334997): [<ffffffffab0e0c87>] console_unlock+0x2b7/0x400
hardirqs last disabled at (19335006): [<ffffffffab0e0d0d>] console_unlock+0x33d/0x400
softirqs last enabled at (19334900): [<ffffffffaba0030d>] __do_softirq+0x30d/0x574
softirqs last disabled at (19334893): [<ffffffffab0721ec>] irq_exit_rcu+0x12c/0x140
---[ end trace 45939e308e0dd3c7 ]---
BTRFS: error (device vdd) in btrfs_run_delayed_refs:2150: errno=-28 No space left
BTRFS info (device vdd): forced readonly
BTRFS warning (device vdd): failed setting block group ro: -30
BTRFS info (device vdd): suspending dev_replace for unmount
assertion failed: !test_bit(BTRFS_DEV_STATE_REPLACE_TGT, &device->dev_state), in fs/btrfs/volumes.c:1150
------------[ cut here ]------------
kernel BUG at fs/btrfs/ctree.h:3431!
invalid opcode: 0000 [#1] PREEMPT SMP
CPU: 1 PID: 3982 Comm: umount Tainted: G W 5.14.0-rc5-default+ #1532
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-59-gc9ba527-rebuilt.opensuse.org 04/01/2014
RIP: 0010:assertfail.constprop.0+0x18/0x1a [btrfs]
RSP: 0018:ffffb7a5454c7db8 EFLAGS: 00010246
RAX: 0000000000000068 RBX: ffff978364b91c00 RCX: 0000000000000000
RDX: 0000000000000000 RSI: ffffffffabee13c4 RDI: 00000000ffffffff
RBP: ffff9783523a4c00 R08: 0000000000000001 R09: 0000000000000001
R10: 0000000000000000 R11: 0000000000000001 R12: ffff9783523a4d18
R13: 0000000000000000 R14: 0000000000000004 R15: 0000000000000003
FS: 00007f61c8f42800(0000) GS:ffff9783bd800000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000056190cffa810 CR3: 0000000030b96002 CR4: 0000000000170ea0
Call Trace:
btrfs_close_one_device.cold+0x11/0x55 [btrfs]
close_fs_devices+0x44/0xb0 [btrfs]
btrfs_close_devices+0x48/0x160 [btrfs]
generic_shutdown_super+0x69/0x100
kill_anon_super+0x14/0x30
btrfs_kill_super+0x12/0x20 [btrfs]
deactivate_locked_super+0x2c/0xa0
cleanup_mnt+0x144/0x1b0
task_work_run+0x59/0xa0
exit_to_user_mode_loop+0xe7/0xf0
exit_to_user_mode_prepare+0xaf/0xf0
syscall_exit_to_user_mode+0x19/0x50
do_syscall_64+0x4a/0x90
entry_SYSCALL_64_after_hwframe+0x44/0xae
This happens when close_ctree is called while a dev_replace hasn't
completed. In close_ctree, we suspend the dev_replace, but keep the
replace target around so that we can resume the dev_replace procedure
when we mount the root again. This is the call trace:
close_ctree():
btrfs_dev_replace_suspend_for_unmount();
btrfs_close_devices():
btrfs_close_fs_devices():
btrfs_close_one_device():
ASSERT(!test_bit(BTRFS_DEV_STATE_REPLACE_TGT,
&device->dev_state));
However, since the replace target sticks around, there is a device
with BTRFS_DEV_STATE_REPLACE_TGT set on close, and we fail the
assertion in btrfs_close_one_device.
To fix this, if we come across the replace target device when
closing, we should properly reset it back to allocation state. This
fix also ensures that if a non-target device has a corrupted state and
has the BTRFS_DEV_STATE_REPLACE_TGT bit set, the assertion will still
catch the error.
Reported-by: David Sterba <dsterba@suse.com>
Fixes: b2a6166768 ("btrfs: fix rw device counting in __btrfs_free_extra_devids")
CC: stable@vger.kernel.org # 4.19+
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Desmond Cheong Zhi Xi <desmondcheongzx@gmail.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6f93e834fa upstream.
The mount option max_inline ranges from 0 to the sectorsize (which is
now equal to page size). But we parse the mount options too early and
before the actual sectorsize is read from the superblock. So the upper
limit of max_inline is unaware of the actual sectorsize and is limited
by the temporary sectorsize 4096, even on a system where the default
sectorsize is 64K.
Fix this by reading the superblock sectorsize before the mount option
parse.
Reported-by: Alexander Tsvetkov <alexander.tsvetkov@oracle.com>
CC: stable@vger.kernel.org # 5.4+
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ba86dd9fe6 upstream.
btrfs_relocate_chunk() can fail with -EAGAIN when e.g. send operations are
running. The message can fail btrfs/187 and it's unnecessary because we
anyway add it back to the reclaim list.
btrfs_reclaim_bgs_work()
`-> btrfs_relocate_chunk()
`-> btrfs_relocate_block_group()
`-> reloc_chunk_start()
`-> if (fs_info->send_in_progress)
`-> return -EAGAIN
CC: stable@vger.kernel.org # 5.13+
Fixes: 18bb8bbf13 ("btrfs: zoned: automatically reclaim zones")
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0ae79c6fe7 upstream.
alloc_offset is offset from the start of a block group and @offset is
actually an address in logical space. Thus, we need to consider
block_group->start when calculating them.
Fixes: 011b41bffa ("btrfs: zoned: advance allocation pointer after tree log node")
CC: stable@vger.kernel.org # 5.12+
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1146239794 upstream.
A common characteristic of the bug report where preemptive flushing was
going full tilt was the fact that the vast majority of the free metadata
space was used up by the global reserve. The hard 90% threshold would
cover the majority of these cases, but to be even smarter we should take
into account how much of the outstanding reservations are covered by the
global block reserve. If the global block reserve accounts for the vast
majority of outstanding reservations, skip preemptive flushing, as it
will likely just cause churn and pain.
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=212185
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 93c60b17f2 upstream.
The preemptive flushing code was added in order to avoid needing to
synchronously wait for ENOSPC flushing to recover space. Once we're
almost full however we can essentially flush constantly. We were using
98% as a threshold to determine if we were simply full, however in
practice this is a really high bar to hit. For example reports of
systems running into this problem had around 94% usage and thus
continued to flush. Fix this by lowering the threshold to 90%, which is
a more sane value, especially for smaller file systems.
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=212185
CC: stable@vger.kernel.org # 5.12+
Fixes: 576fa34830 ("btrfs: improve preemptive background space flushing")
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e16460707e upstream.
I've been debugging an early ENOSPC problem in production and finally
root caused it to this problem. When we switched to the per-inode in
38d715f494 ("btrfs: use btrfs_start_delalloc_roots in
shrink_delalloc") I pulled out the async extent handling, because we
were doing the correct thing by calling filemap_flush() if we had async
extents set. This would properly wait on any async extents by locking
the page in the second flush, thus making sure our ordered extents were
properly set up.
However when I switched us back to page based flushing, I used
sync_inode(), which allows us to pass in our own wbc. The problem here
is that sync_inode() is smarter than the filemap_* helpers, it tries to
avoid calling writepages at all. This means that our second call could
skip calling do_writepages altogether, and thus not wait on the pagelock
for the async helpers. This means we could come back before any ordered
extents were created and then simply continue on in our flushing
mechanisms and ENOSPC out when we have plenty of space to use.
Fix this by putting back the async pages logic in shrink_delalloc. This
allows us to bulk write out everything that we need to, and then we can
wait in one place for the async helpers to catch up, and then wait on
any ordered extents that are created.
Fixes: e076ab2a2c ("btrfs: shrink delalloc pages instead of full inodes")
CC: stable@vger.kernel.org # 5.10+
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ac98141d14 upstream.
We use the async_delalloc_pages mechanism to make sure that we've
completed our async work before trying to continue our delalloc
flushing. The reason for this is we need to see any ordered extents
that were created by our delalloc flushing. However we're waking up
before we do the submit work, which is before we create the ordered
extents. This is a pretty wide race window where we could potentially
think there are no ordered extents and thus exit shrink_delalloc
prematurely. Fix this by waking us up after we've done the work to
create ordered extents.
CC: stable@vger.kernel.org # 5.4+
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 03fe78cc29 upstream.
We have been hitting some early ENOSPC issues in production with more
recent kernels, and I tracked it down to us simply not flushing delalloc
as aggressively as we should be. With tracing I was seeing us failing
all tickets with all of the block rsvs at or around 0, with very little
pinned space, but still around 120MiB of outstanding bytes_may_used.
Upon further investigation I saw that we were flushing around 14 pages
per shrink call for delalloc, despite having around 2GiB of delalloc
outstanding.
Consider the example of a 8 way machine, all CPUs trying to create a
file in parallel, which at the time of this commit requires 5 items to
do. Assuming a 16k leaf size, we have 10MiB of total metadata reclaim
size waiting on reservations. Now assume we have 128MiB of delalloc
outstanding. With our current math we would set items to 20, and then
set to_reclaim to 20 * 256k, or 5MiB.
Assuming that we went through this loop all 3 times, for both
FLUSH_DELALLOC and FLUSH_DELALLOC_WAIT, and then did the full loop
twice, we'd only flush 60MiB of the 128MiB delalloc space. This could
leave a fair bit of delalloc reservations still hanging around by the
time we go to ENOSPC out all the remaining tickets.
Fix this two ways. First, change the calculations to be a fraction of
the total delalloc bytes on the system. Prior to this change we were
calculating based on dirty inodes so our math made more sense, now it's
just completely unrelated to what we're actually doing.
Second add a FLUSH_DELALLOC_FULL state, that we hold off until we've
gone through the flush states at least once. This will empty the system
of all delalloc so we're sure to be truly out of space when we start
failing tickets.
I'm tagging stable 5.10 and forward, because this is where we started
using the page stuff heavily again. This affects earlier kernel
versions as well, but would be a pain to backport to them as the
flushing mechanisms aren't the same.
CC: stable@vger.kernel.org # 5.10+
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 94ffb0a282 upstream.
The attempt to find and activate a free worker for new work is currently
combined with creating a new one if we don't find one, but that opens
io-wq up to a race where the worker that is found and activated can
put itself to sleep without knowing that it has been selected to perform
this new work.
Fix this by moving the activation into where we add the new work item,
then we can retain it within the wqe->lock scope and elimiate the race
with the worker itself checking inside the lock, but sleeping outside of
it.
Cc: stable@vger.kernel.org
Reported-by: Andres Freund <andres@anarazel.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 87df7fb922 upstream.
When new work is added, io_wqe_enqueue() checks if we need to wake or
create a new worker. But that check is done outside the lock that
otherwise synchronizes us with a worker going to sleep, so we can end
up in the following situation:
CPU0 CPU1
lock
insert work
unlock
atomic_read(nr_running) != 0
lock
atomic_dec(nr_running)
no wakeup needed
Hold the wqe lock around the "need to wakeup" check. Then we can also get
rid of the temporary work_flags variable, as we know the work will remain
valid as long as we hold the lock.
Cc: stable@vger.kernel.org
Reported-by: Andres Freund <andres@anarazel.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit dadebc350d upstream.
WARNING: CPU: 1 PID: 5870 at fs/io_uring.c:5975 io_try_cancel_userdata+0x30f/0x540 fs/io_uring.c:5975
CPU: 0 PID: 5870 Comm: iou-wrk-5860 Not tainted 5.14.0-rc6-next-20210820-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:io_try_cancel_userdata+0x30f/0x540 fs/io_uring.c:5975
Call Trace:
io_async_cancel fs/io_uring.c:6014 [inline]
io_issue_sqe+0x22d5/0x65a0 fs/io_uring.c:6407
io_wq_submit_work+0x1dc/0x300 fs/io_uring.c:6511
io_worker_handle_work+0xa45/0x1840 fs/io-wq.c:533
io_wqe_worker+0x2cc/0xbb0 fs/io-wq.c:582
ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:295
io_try_cancel_userdata() can be called from io_async_cancel() executing
in the io-wq context, so the warning fires, which is there to alert
anyone accessing task->io_uring->io_wq in a racy way. However,
io_wq_put_and_exit() always first waits for all threads to complete,
so the only detail left is to zero tctx->io_wq after the context is
removed.
note: one little assumption is that when IO_WQ_WORK_CANCEL, the executor
won't touch ->io_wq, because io_wq_destroy() might cancel left pending
requests in such a way.
Cc: stable@vger.kernel.org
Reported-by: syzbot+b0c9d1588ae92866515f@syzkaller.appspotmail.com
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/dfdd37a80cfa9ffd3e59538929c99cdd55d8699e.1629721757.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit aaedb9e00e upstream.
Since a few kernel releases the Pogoplug 4 has crashed like this
during boot:
Unable to handle kernel NULL pointer dereference at virtual address 00000002
(...)
[<c04116ec>] (strlen) from [<c00ead80>] (kstrdup+0x1c/0x4c)
[<c00ead80>] (kstrdup) from [<c04591d8>] (__clk_register+0x44/0x37c)
[<c04591d8>] (__clk_register) from [<c04595ec>] (clk_hw_register+0x20/0x44)
[<c04595ec>] (clk_hw_register) from [<c045bfa8>] (__clk_hw_register_mux+0x198/0x1e4)
[<c045bfa8>] (__clk_hw_register_mux) from [<c045c050>] (clk_register_mux_table+0x5c/0x6c)
[<c045c050>] (clk_register_mux_table) from [<c0acf3e0>] (kirkwood_clk_muxing_setup.constprop.0+0x13c/0x1ac)
[<c0acf3e0>] (kirkwood_clk_muxing_setup.constprop.0) from [<c0aceae0>] (of_clk_init+0x12c/0x214)
[<c0aceae0>] (of_clk_init) from [<c0ab576c>] (time_init+0x20/0x2c)
[<c0ab576c>] (time_init) from [<c0ab3d18>] (start_kernel+0x3dc/0x56c)
[<c0ab3d18>] (start_kernel) from [<00000000>] (0x0)
Code: e3130020 1afffffb e12fff1e c08a1078 (e5d03000)
This is because the "powersave" mux clock 0 was provided in an unterminated
array, which is required by the loop in the driver:
/* Count, allocate, and register clock muxes */
for (n = 0; desc[n].name;)
n++;
Here n will go out of bounds and then call clk_register_mux() on random
memory contents after the mux clock.
Fix this by terminating the array with a blank entry.
Fixes: 105299381d ("cpufreq: kirkwood: use the powersave multiplexer")
Cc: stable@vger.kernel.org
Cc: Andrew Lunn <andrew@lunn.ch>
Cc: Chris Packham <chris.packham@alliedtelesis.co.nz>
Cc: Gregory CLEMENT <gregory.clement@bootlin.com>
Cc: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Link: https://lore.kernel.org/r/20210814235514.403426-1-linus.walleij@linaro.org
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c42813b71a upstream.
Kernel v5.14 has various changes to optimize unaligned memory accesses,
e.g. commit 0652035a57 ("asm-generic: unaligned: remove byteshift helpers").
Those changes triggered an unalignment-exception and thus crashed the
bootloader on parisc because the unaligned "output_len" variable now suddenly
was read word-wise while it was read byte-wise in the past.
Fix this issue by declaring the external output_len variable as char which then
forces the compiler to generate byte-accesses.
Signed-off-by: Helge Deller <deller@gmx.de>
Cc: Arnd Bergmann <arnd@kernel.org>
Cc: John David Anglin <dave.anglin@bell.net>
Bug: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102162
Fixes: 8c031ba63f ("parisc: Unbreak bootloader due to gcc-7 optimizations")
Fixes: 0652035a57 ("asm-generic: unaligned: remove byteshift helpers")
Cc: <stable@vger.kernel.org> # v5.14+
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 79fad92f2e upstream.
Currently there are (at least) two problems in the way pwm_bl starts
managing the enable_gpio pin. Both occur when the backlight is initially
off and the driver finds the pin not already in output mode and, as a
result, unconditionally switches it to output-mode and asserts the signal.
Problem 1: This could cause the backlight to flicker since, at this stage
in driver initialisation, we have no idea what the PWM and regulator are
doing (an unconfigured PWM could easily "rest" at 100% duty cycle).
Problem 2: This will cause us not to correctly honour the
post_pwm_on_delay (which also risks flickers).
Fix this by moving the code to configure the GPIO output mode until after
we have examines the handover state. That allows us to initialize
enable_gpio to off if the backlight is currently off and on if the
backlight is on.
Cc: stable@vger.kernel.org
Reported-by: Marek Vasut <marex@denx.de>
Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
Acked-by: Marek Vasut <marex@denx.de>
Tested-by: Marek Vasut <marex@denx.de>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9660dcbe0d upstream.
In commit 8010d74b99 ("RDMA/mlx5: Split the WR setup out of
mlx5_ib_update_xlt()") the allocation logic was split out of
mlx5_ib_update_xlt() and the logic was changed to enable better OOM
handling. Sadly this change introduced a miscalculation of the number of
entries that were actually allocated when under memory pressure where it
can actually become 0 which on s390 lets dma_map_single() fail.
It can also lead to corruption of the free pages list when the wrong
number of entries is used in the calculation of sg->length which is used
as argument for free_pages().
Fix this by using the allocation size instead of misusing get_order(size).
Cc: stable@vger.kernel.org
Fixes: 8010d74b99 ("RDMA/mlx5: Split the WR setup out of mlx5_ib_update_xlt()")
Link: https://lore.kernel.org/r/20210908081849.7948-1-schnelle@linux.ibm.com
Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3265cc3ec5 upstream.
Find and verify PRMT before parsing it, which eliminates a
warning on machines without PRMT:
[ 7.197173] ACPI: PRMT not present
Fixes: cefc7ca462 ("ACPI: PRM: implement OperationRegion handler for the PlatformRtMechanism subtype")
Signed-off-by: Aubrey Li <aubrey.li@linux.intel.com>
Tested-by: Paul Menzel <pmenzel@molgen.mpg.de>
Cc: 5.14+ <stable@vger.kernel.org> # 5.14+
[ rjw: Subject and changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8510505d55 upstream.
MD5 is a weak digest algorithm that shouldn't be used for cryptographic
operation. It hinders the efficiency of a patch set that aims to limit
the digests allowed for the extended file attribute namely security.ima.
MD5 is no longer a requirement for IMA, nor should it be used there.
The sole place where we still use the MD5 algorithm inside IMA is setting
the ima_hash algorithm to MD5, if the user supplies 'ima_hash=md5'
parameter on the command line. With commit ab60368ab6 ("ima: Fallback
to the builtin hash algorithm"), setting "ima_hash=md5" fails gracefully
when CRYPTO_MD5 is not set:
ima: Can not allocate md5 (reason: -2)
ima: Allocating md5 failed, going to use default hash algorithm sha256
Remove the CRYPTO_MD5 dependency for IMA.
Signed-off-by: THOBY Simon <Simon.THOBY@viveris.fr>
Reviewed-by: Lakshmi Ramasubramanian <nramas@linux.microsoft.com>
[zohar@linux.ibm.com: include commit number in patch description for
stable.]
Cc: stable@vger.kernel.org # 4.17
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 660585b56e upstream.
In case of fuse the MM subsystem doesn't guarantee that page writeback
completes by the time ->sync_fs() is called. This is because fuse
completes page writeback immediately to prevent DoS of memory reclaim by
the userspace file server.
This means that fuse itself must ensure that writes are synced before
sending the SYNCFS request to the server.
Introduce sync buckets, that hold a counter for the number of outstanding
write requests. On syncfs replace the current bucket with a new one and
wait until the old bucket's counter goes down to zero.
It is possible to have multiple syncfs calls in parallel, in which case
there could be more than one waited-on buckets. Descendant buckets must
not complete until the parent completes. Add a count to the child (new)
bucket until the (parent) old bucket completes.
Use RCU protection to dereference the current bucket and to wake up an
emptied bucket. Use fc->lock to protect against parallel assignments to
the current bucket.
This leaves just the counter to be a possible scalability issue. The
fc->num_waiting counter has a similar issue, so both should be addressed at
the same time.
Reported-by: Amir Goldstein <amir73il@gmail.com>
Fixes: 2d82ab251e ("virtiofs: propagate sync() to file server")
Cc: <stable@vger.kernel.org> # v5.14
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 59bda8ecee upstream.
Callers of fuse_writeback_range() assume that the file is ready for
modification by the server in the supplied byte range after the call
returns.
If there's a write that extends the file beyond the end of the supplied
range, then the file needs to be extended to at least the end of the range,
but currently that's not done.
There are at least two cases where this can cause problems:
- copy_file_range() will return short count if the file is not extended
up to end of the source range.
- FALLOC_FL_ZERO_RANGE | FALLOC_FL_KEEP_SIZE will not extend the file,
hence the region may not be fully allocated.
Fix by flushing writes from the start of the range up to the end of the
file. This could be optimized if the writes are non-extending, etc, but
it's probably not worth the trouble.
Fixes: a2bc923629 ("fuse: fix copy_file_range() in the writeback case")
Fixes: 6b1bdb56b1 ("fuse: allow fallocate(FALLOC_FL_ZERO_RANGE)")
Cc: <stable@vger.kernel.org> # v5.2
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 76224355db upstream.
fuse_finish_open() will be called with FUSE_NOWRITE in case of atomic
O_TRUNC. This can deadlock with fuse_wait_on_page_writeback() in
fuse_launder_page() triggered by invalidate_inode_pages2().
Fix by replacing invalidate_inode_pages2() in fuse_finish_open() with a
truncate_pagecache() call. This makes sense regardless of FOPEN_KEEP_CACHE
or fc->writeback cache, so do it unconditionally.
Reported-by: Xie Yongji <xieyongji@bytedance.com>
Reported-and-tested-by: syzbot+bea44a5189836d956894@syzkaller.appspotmail.com
Fixes: e4648309b8 ("fuse: truncate pending writes on O_TRUNC")
Cc: <stable@vger.kernel.org>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 46d4703b1d upstream.
We are seeing the following warning in raid10_handle_discard.
[ 695.110751] =============================
[ 695.131439] WARNING: suspicious RCU usage
[ 695.151389] 4.18.0-319.el8.x86_64+debug #1 Not tainted
[ 695.174413] -----------------------------
[ 695.192603] drivers/md/raid10.c:1776 suspicious
rcu_dereference_check() usage!
[ 695.225107] other info that might help us debug this:
[ 695.260940] rcu_scheduler_active = 2, debug_locks = 1
[ 695.290157] no locks held by mkfs.xfs/10186.
In the first loop of function raid10_handle_discard. It already
determines which disk need to handle discard request and add the
rdev reference count rdev->nr_pending. So the conf->mirrors will
not change until all bios come back from underlayer disks. It
doesn't need to use rcu_dereference to get rdev.
Cc: stable@vger.kernel.org
Fixes: d30588b273 ('md/raid10: improve raid10 discard request')
Signed-off-by: Xiao Ni <xni@redhat.com>
Acked-by: Guoqing Jiang <guoqing.jiang@linux.dev>
Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ecc53c48c1 upstream.
For the two places where new workers are created, we diligently check if
we are allowed to create a new worker. If we're currently at the limit
of how many workers of a given type we can have, then we don't create
any new ones.
If you have a mixed workload with various types of bound and unbounded
work, then it can happen that a worker finishes one type of work and
is then transitioned to the other type. For this case, we don't check
if we are actually allowed to do so. This can cause io-wq to temporarily
exceed the allowed number of workers for a given type.
When retrieving work, check that the types match. If they don't, check
if we are allowed to transition to the other type. If not, then don't
handle the new work.
Cc: stable@vger.kernel.org
Reported-by: Johannes Lundberg <johalun0@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3134cc8beb upstream.
When a mapped level interrupt (a timer, for example) is deactivated
by the guest, the corresponding host interrupt is equally deactivated.
However, the fate of the pending state still needs to be dealt
with in SW.
This is specially true when the interrupt was in the active+pending
state in the virtual distributor at the point where the guest
was entered. On exit, the pending state is potentially stale
(the guest may have put the interrupt in a non-pending state).
If we don't do anything, the interrupt will be spuriously injected
in the guest. Although this shouldn't have any ill effect (spurious
interrupts are always possible), we can improve the emulation by
detecting the deactivation-while-pending case and resample the
interrupt.
While we're at it, move the logic into a common helper that can
be shared between the two GIC implementations.
Fixes: e40cc57bac ("KVM: arm/arm64: vgic: Support level-triggered mapped interrupts")
Reported-by: Raghavendra Rao Ananta <rananta@google.com>
Tested-by: Raghavendra Rao Ananta <rananta@google.com>
Reviewed-by: Oliver Upton <oupton@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20210819180305.1670525-1-maz@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 47e6223c84 upstream.
Booting a KVM host in protected mode with kmemleak quickly results
in a pretty bad crash, as kmemleak doesn't know that the HYP sections
have been taken away. This is specially true for the BSS section,
which is part of the kernel BSS section and registered at boot time
by kmemleak itself.
Unregister the HYP part of the BSS before making that section
HYP-private. The rest of the HYP-specific data is obtained via
the page allocator or lives in other sections, none of which is
subjected to kmemleak.
Fixes: 90134ac9ca ("KVM: arm64: Protect the .hyp sections from the host")
Reviewed-by: Quentin Perret <qperret@google.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Cc: stable@vger.kernel.org # 5.13
Link: https://lore.kernel.org/r/20210802123830.2195174-3-maz@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f7782bb8d8 upstream.
Clear nested.pi_pending on nested VM-Enter even if L2 will run without
posted interrupts enabled. If nested.pi_pending is left set from a
previous L2, vmx_complete_nested_posted_interrupt() will pick up the
stale flag and exit to userspace with an "internal emulation error" due
the new L2 not having a valid nested.pi_desc.
Arguably, vmx_complete_nested_posted_interrupt() should first check for
posted interrupts being enabled, but it's also completely reasonable that
KVM wouldn't screw up a fundamental flag. Not to mention that the mere
existence of nested.pi_pending is a long-standing bug as KVM shouldn't
move the posted interrupt out of the IRR until it's actually processed,
e.g. KVM effectively drops an interrupt when it performs a nested VM-Exit
with a "pending" posted interrupt. Fixing the mess is a future problem.
Prior to vmx_complete_nested_posted_interrupt() interpreting a null PI
descriptor as an error, this was a benign bug as the null PI descriptor
effectively served as a check on PI not being enabled. Even then, the
new flow did not become problematic until KVM started checking the result
of kvm_check_nested_events().
Fixes: 705699a139 ("KVM: nVMX: Enable nested posted interrupt processing")
Fixes: 966eefb896 ("KVM: nVMX: Disable vmcs02 posted interrupts if vmcs12 PID isn't mappable")
Fixes: 47d3530f86c0 ("KVM: x86: Exit to userspace when kvm_check_nested_events fails")
Cc: stable@vger.kernel.org
Cc: Jim Mattson <jmattson@google.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210810144526.2662272-1-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 088acd2352 upstream.
Factor in whether or not the old/new SPTEs are shadow-present when
adjusting the large page stats in the TDP MMU. A modified MMIO SPTE can
toggle the page size bit, as bit 7 is used to store the MMIO generation,
i.e. is_large_pte() can get a false positive when called on a MMIO SPTE.
Ditto for nuking SPTEs with REMOVED_SPTE, which sets bit 7 in its magic
value.
Opportunistically move the logic below the check to verify at least one
of the old/new SPTEs is shadow present.
Use is/was_leaf even though is/was_present would suffice. The code
generation is roughly equivalent since all flags need to be computed
prior to the code in question, and using the *_leaf flags will minimize
the diff in a future enhancement to account all pages, i.e. will change
the check to "is_leaf != was_leaf".
Reviewed-by: David Matlack <dmatlack@google.com>
Reviewed-by: Ben Gardon <bgardon@google.com>
Fixes: 1699f65c8b ("kvm/x86: Fix 'lpages' kvm stat for TDM MMU")
Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Mingwei Zhang <mizhang@google.com>
Message-Id: <20210803044607.599629-3-mizhang@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ec607a564f upstream.
This change started as a way to make kvm_mmu_hugepage_adjust a bit simpler,
but it does fix two bugs as well.
One bug is in zapping collapsible PTEs. If a large page size is
disallowed but not all of them, kvm_mmu_max_mapping_level will return the
host mapping level and the small PTEs will be zapped up to that level.
However, if e.g. 1GB are prohibited, we can still zap 4KB mapping and
preserve the 2MB ones. This can happen for example when NX huge pages
are in use.
The second would happen when userspace backs guest memory
with a 1gb hugepage but only assign a subset of the page to
the guest. 1gb pages would be disallowed by the memslot, but
not 2mb. kvm_mmu_max_mapping_level() would fall through to the
host_pfn_mapping_level() logic, see the 1gb hugepage, and map the whole
thing into the guest.
Fixes: 2f57b7051f ("KVM: x86/mmu: Persist gfn_lpage_is_disallowed() to max_level")
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d9130a2dfd upstream.
When MSR_IA32_TSC_ADJUST is written by guest due to TSC ADJUST feature
especially there's a big tsc warp (like a new vCPU is hot-added into VM
which has been up for a long time), tsc_offset is added by a large value
then go back to guest. This causes system time jump as tsc_timestamp is
not adjusted in the meantime and pvclock monotonic character.
To fix this, just notify kvm to update vCPU's guest time before back to
guest.
Cc: stable@vger.kernel.org
Signed-off-by: Zelin Deng <zelin.deng@linux.alibaba.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <1619576521-81399-2-git-send-email-zelin.deng@linux.alibaba.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a3e03bc136 upstream.
While in practice vcpu->vcpu_idx == vcpu->vcp_id is often true, it may
not always be, and we must not rely on this. Reason is that KVM decides
the vcpu_idx, userspace decides the vcpu_id, thus the two might not
match.
Currently kvm->arch.idle_mask is indexed by vcpu_id, which implies
that code like
for_each_set_bit(vcpu_id, kvm->arch.idle_mask, online_vcpus) {
vcpu = kvm_get_vcpu(kvm, vcpu_id);
do_stuff(vcpu);
}
is not legit. Reason is that kvm_get_vcpu expects an vcpu_idx, not an
vcpu_id. The trouble is, we do actually use kvm->arch.idle_mask like
this. To fix this problem we have two options. Either use
kvm_get_vcpu_by_id(vcpu_id), which would loop to find the right vcpu_id,
or switch to indexing via vcpu_idx. The latter is preferable for obvious
reasons.
Let us make switch from indexing kvm->arch.idle_mask by vcpu_id to
indexing it by vcpu_idx. To keep gisa_int.kicked_mask indexed by the
same index as idle_mask lets make the same change for it as well.
Fixes: 1ee0bc559d ("KVM: s390: get rid of local_int array")
Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
Reviewed-by: Christian Bornträger <borntraeger@de.ibm.com>
Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Cc: <stable@vger.kernel.org> # 3.15+
Link: https://lore.kernel.org/r/20210827125429.1912577-1-pasic@linux.ibm.com
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e7177339d7 upstream.
Revert a misguided illegal GPA check when "translating" a non-nested GPA.
The check is woefully incomplete as it does not fill in @exception as
expected by all callers, which leads to KVM attempting to inject a bogus
exception, potentially exposing kernel stack information in the process.
WARNING: CPU: 0 PID: 8469 at arch/x86/kvm/x86.c:525 exception_type+0x98/0xb0 arch/x86/kvm/x86.c:525
CPU: 1 PID: 8469 Comm: syz-executor531 Not tainted 5.14.0-rc7-syzkaller #0
RIP: 0010:exception_type+0x98/0xb0 arch/x86/kvm/x86.c:525
Call Trace:
x86_emulate_instruction+0xef6/0x1460 arch/x86/kvm/x86.c:7853
kvm_mmu_page_fault+0x2f0/0x1810 arch/x86/kvm/mmu/mmu.c:5199
handle_ept_misconfig+0xdf/0x3e0 arch/x86/kvm/vmx/vmx.c:5336
__vmx_handle_exit arch/x86/kvm/vmx/vmx.c:6021 [inline]
vmx_handle_exit+0x336/0x1800 arch/x86/kvm/vmx/vmx.c:6038
vcpu_enter_guest+0x2a1c/0x4430 arch/x86/kvm/x86.c:9712
vcpu_run arch/x86/kvm/x86.c:9779 [inline]
kvm_arch_vcpu_ioctl_run+0x47d/0x1b20 arch/x86/kvm/x86.c:10010
kvm_vcpu_ioctl+0x49e/0xe50 arch/x86/kvm/../../../virt/kvm/kvm_main.c:3652
The bug has escaped notice because practically speaking the GPA check is
useless. The GPA check in question only comes into play when KVM is
walking guest page tables (or "translating" CR3), and KVM already handles
illegal GPA checks by setting reserved bits in rsvd_bits_mask for each
PxE, or in the case of CR3 for loading PTDPTRs, manually checks for an
illegal CR3. This particular failure doesn't hit the existing reserved
bits checks because syzbot sets guest.MAXPHYADDR=1, and IA32 architecture
simply doesn't allow for such an absurd MAXPHYADDR, e.g. 32-bit paging
doesn't define any reserved PA bits checks, which KVM emulates by only
incorporating the reserved PA bits into the "high" bits, i.e. bits 63:32.
Simply remove the bogus check. There is zero meaningful value and no
architectural justification for supporting guest.MAXPHYADDR < 32, and
properly filling the exception would introduce non-trivial complexity.
This reverts commit ec7771ab47.
Fixes: ec7771ab47 ("KVM: x86: mmu: Add guest physical address check in translate_gpa()")
Cc: stable@vger.kernel.org
Reported-by: syzbot+200c08e88ae818f849ce@syzkaller.appspotmail.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210831164224.1119728-2-seanjc@google.com>
Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit bb2853a6a4 upstream.
The ops->receive_buf() may be accessed concurrently from these two
functions. If the driver flushes data to the line discipline
receive_buf() method while tiocsti() is waiting for the
ops->receive_buf() to finish its work, the data race will happen.
For example:
tty_ioctl |tty_ldisc_receive_buf
->tioctsi | ->tty_port_default_receive_buf
| ->tty_ldisc_receive_buf
->hci_uart_tty_receive | ->hci_uart_tty_receive
->h4_recv | ->h4_recv
In this case, the h4 receive buffer will be overwritten by the
latecomer, and we will lost the data.
Hence, change tioctsi() function to use the exclusive lock interface
from tty_buffer to avoid the data race.
Reported-by: syzbot+97388eb9d31b997fe1d0@syzkaller.appspotmail.com
Reviewed-by: Jiri Slaby <jirislaby@kernel.org>
Signed-off-by: Nguyen Dinh Phi <phind.uet@gmail.com>
Link: https://lore.kernel.org/r/20210823000641.2082292-1-phind.uet@gmail.com
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3998f0b8bc upstream.
RHBZ: 1994393
If we hit a STATUS_USER_SESSION_DELETED for the Create part in the
Create/QueryDirectory compound that starts a directory scan
we will leak EDEADLK back to userspace and surprise glibc and the application.
Pick this up initiate_cifs_search() and retry a small number of tries before we
return an error to userspace.
Cc: stable@vger.kernel.org
Reported-by: Xiaoli Feng <xifeng@redhat.com>
Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2fc2a7a62e upstream.
It currently takes a long, and while that's normally OK, the io_uring
limit is an int. Internally in io_uring it's an int, but sometimes it's
passed as a long. That can yield confusing results where a completions
seems to generate a huge result:
ou-sqp-1297-1298 [001] ...1 788.056371: io_uring_complete: ring 000000000e98e046, user_data 0x0, result 4294967171, cflags 0
which is due to -ECANCELED being stored in an unsigned, and then passed
in as a long. Using the right int type, the trace looks correct:
iou-sqp-338-339 [002] ...1 15.633098: io_uring_complete: ring 00000000e0ac60cf, user_data 0x0, result -125, cflags 0
Cc: stable@vger.kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7b3188e7ed upstream.
During some testing, it became evident that using IORING_OP_WRITE doesn't
hash buffered writes like the other writes commands do. That's simply
an oversight, and can cause performance regressions when doing buffered
writes with this command.
Correct that and add the flag, so that buffered writes are correctly
hashed when using the non-iovec based write command.
Cc: stable@vger.kernel.org
Fixes: 3a6820f2bb ("io_uring: add non-vectored read/write commands")
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 39ff83f2f6 upstream.
timespec64_ns() prevents multiplication overflows by comparing the seconds
value of the timespec to KTIME_SEC_MAX. If the value is greater or equal it
returns KTIME_MAX.
But that check casts the signed seconds value to unsigned which makes the
comparision true for all negative values and therefore return wrongly
KTIME_MAX.
Negative second values are perfectly valid and required in some places,
e.g. ptp_clock_adjtime().
Remove the cast and add a check for the negative boundary which is required
to prevent undefined behaviour due to multiplication underflow.
Fixes: cb47755725 ("time: Prevent undefined behaviour in timespec64_to_ns()")'
Signed-off-by: Lukas Hannen <lukas.hannen@opensource.tttech-industrial.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/AM6PR01MB541637BD6F336B8FFB72AF80EEC69@AM6PR01MB5416.eurprd01.prod.exchangelabs.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit dddd3d6529 upstream.
We must flush all the dirty data when enabling checkpoint back. Let's guarantee
that first by adding a retry logic on sync_inodes_sb(). In addition to that,
this patch adds to flush data in fsync when checkpoint is disabled, which can
mitigate the sync_inodes_sb() failures in advance.
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 92548b0ee2 ]
The UDP length field should be in network order.
This removes the following sparse error:
net/ipv4/route.c:3173:27: warning: incorrect type in assignment (different base types)
net/ipv4/route.c:3173:27: expected restricted __be16 [usertype] len
net/ipv4/route.c:3173:27: got unsigned long
Fixes: 404eb77ea7 ("ipv4: support sport, dport and ip_proto in RTM_GETROUTE")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Roopa Prabhu <roopa@nvidia.com>
Cc: David Ahern <dsahern@kernel.org>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 1e4428b6db ]
With current config, for packets with IPv4 checksum errors,
errorcode is being set to UNKNOWN. Hence added a separate
errorcodes for outer and inner IPv4 checksum and changed
NPC configuration accordingly.
Also turn on L2 multicast address check in NPC protocol check block.
Fixes: 6b3321bacc ("octeontx2-af: Enable packet length and csum validation")
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 698a82ebfb ]
This patch fixes the static code analyzer reported issues
in rvu_npc.c. The reported errors are different sizes of
operands in bitops and returning uninitialized values.
Fixes: 651cd26523 ("octeontx2-af: MCAM entry installation support")
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f2e4568ec9 ]
In npc_update_vf_flow_entry function the loop cursor
'index' is being changed inside the loop causing
the loop to spin forever. This in turn hogs the kworker
thread forever and no other mbox message is processed
by AF driver after that. Fix this by using
another variable in the loop.
Fixes: 55307fcb92 ("octeontx2-af: Add mbox messages to install and delete MCAM rules")
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 6537e96d74 ]
When the given counter does not belong to the entry
then code ends up in infinite loop because the loop
cursor, entry is not getting updated further. This
patch fixes that by updating entry for every iteration.
Fixes: a958dd59f9 ("octeontx2-af: Map or unmap NPC MCAM entry and counter")
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 429205da6c ]
Based on tests the QCA7000 doesn't support checksum offloading. So assume
ip_summed is CHECKSUM_NONE and let the kernel take care of the checksum
handling. This fixes data transfer issues in noisy environments.
Reported-by: Michael Heimpold <michael.heimpold@in-tech.com>
Fixes: 291ab06ecf ("net: qualcomm: new Ethernet over SPI driver for QCA7000")
Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit ca49bfd90a ]
In HTB offload mode, qdiscs of leaf classes are grafted to netdev
queues. sch_htb expects the dev_queue field of these qdiscs to point to
the corresponding queues. However, qdisc creation may fail, and in that
case noop_qdisc is used instead. Its dev_queue doesn't point to the
right queue, so sch_htb can lose track of used netdev queues, which will
cause internal inconsistencies.
This commit fixes this bug by keeping track of the netdev queue inside
struct htb_class. All reads of cl->leaf.q->dev_queue are replaced by the
new field, the two values are synced on writes, and WARNs are added to
assert equality of the two values.
The driver API has changed: when TC_HTB_LEAF_DEL needs to move a queue,
the driver used to pass the old and new queue IDs to sch_htb. Now that
there is a new field (offload_queue) in struct htb_class that needs to
be updated on this operation, the driver will pass the old class ID to
sch_htb instead (it already knows the new class ID).
Fixes: d03b195b5a ("sch_htb: Hierarchical QoS hardware offload")
Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://lore.kernel.org/r/20210826115425.1744053-1-maximmi@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit aaa8e4922c ]
These checks are still not strict enough. The main problem is that if
"cb->type == QRTR_TYPE_NEW_SERVER" is true then "len - hdrlen" is
guaranteed to be 4 but we need to be at least 16 bytes. In fact, we
can reject everything smaller than sizeof(*pkt) which is 20 bytes.
Also I don't like the ALIGN(size, 4). It's better to just insist that
data is needs to be aligned at the start.
Fixes: 0baa99ee35 ("net: qrtr: Allow non-immediate node routing")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 67d6d681e1 ]
Even after commit 6457378fe7 ("ipv4: use siphash instead of Jenkins in
fnhe_hashfun()"), an attacker can still use brute force to learn
some secrets from a victim linux host.
One way to defeat these attacks is to make the max depth of the hash
table bucket a random value.
Before this patch, each bucket of the hash table used to store exceptions
could contain 6 items under attack.
After the patch, each bucket would contains a random number of items,
between 6 and 10. The attacker can no longer infer secrets.
This is slightly increasing memory size used by the hash table,
by 50% in average, we do not expect this to be a problem.
This patch is more complex than the prior one (IPv6 equivalent),
because IPv4 was reusing the oldest entry.
Since we need to be able to evict more than one entry per
update_or_create_fnhe() call, I had to replace
fnhe_oldest() with fnhe_remove_oldest().
Also note that we will queue extra kfree_rcu() calls under stress,
which hopefully wont be a too big issue.
Fixes: 4895c771c7 ("ipv4: Add FIB nexthop exceptions.")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Keyu Man <kman001@ucr.edu>
Cc: Willy Tarreau <w@1wt.eu>
Signed-off-by: David S. Miller <davem@davemloft.net>
Reviewed-by: David Ahern <dsahern@kernel.org>
Tested-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit a00df2caff ]
Even after commit 4785305c05 ("ipv6: use siphash in rt6_exception_hash()"),
an attacker can still use brute force to learn some secrets from a victim
linux host.
One way to defeat these attacks is to make the max depth of the hash
table bucket a random value.
Before this patch, each bucket of the hash table used to store exceptions
could contain 6 items under attack.
After the patch, each bucket would contains a random number of items,
between 6 and 10. The attacker can no longer infer secrets.
This is slightly increasing memory size used by the hash table,
we do not expect this to be a problem.
Following patch is dealing with the same issue in IPv4.
Fixes: 35732d01fe ("ipv6: introduce a hash table to store dst cache")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Keyu Man <kman001@ucr.edu>
Cc: Wei Wang <weiwan@google.com>
Cc: Martin KaFai Lau <kafai@fb.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit d745ca4f2c ]
When resuming from suspend, brcmf_pcie_pm_leave_D3 will first attempt a
hot resume and then fall back to removing the PCI device and then
reprobing. If this probe fails, the kernel will oops, because brcmf_err,
which is called to report the failure will dereference the stale bus
pointer. Open code and use the default bus-less brcmf_err to avoid this.
Fixes: 8602e62441 ("brcmfmac: pass bus to the __brcmf_err() in pcie.c")
Signed-off-by: Ahmad Fatoum <a.fatoum@pengutronix.de>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20210817063521.22450-1-a.fatoum@pengutronix.de
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b63aed3ff1 ]
kmemleak reported that dev_name() of internally-handled cores were leaked
on driver unbinding. Let's use device_initialize() to take refcounts for
them and put_device() to properly free the related stuff.
While looking at it, there's another potential issue for those which should
be *registered* into driver core. If device_register() failed, we put
device once and freed bcma_device structures. In bcma_unregister_cores(),
they're treated as unregistered and we hit both UAF and double-free. That
smells not good and has also been fixed now.
Fixes: ab54bc8460 ("bcma: fill core details for every device")
Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20210727025232.663-2-yuzenghui@huawei.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 57f780f1c4 ]
Driver crashes when restoring from the Hibernate. In the resume flow,
driver need to clean up the older nic/vec objects and re-initialize them.
Fixes: 8aaa112a57 ("net: atlantic: refactoring pm logic")
Signed-off-by: Sudarsana Reddy Kalluru <skalluru@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 4801bee7d5 ]
For making user to switch back to the old playback mode, this patch
adds a new module option 'lowlatency' to snd-usb-audio driver.
When user face a regression due to the recent low-latency playback
support, they can test easily by passing lowlatency=0 option without
rebuilding the kernel.
Fixes: 307cc9baac ("ALSA: usb-audio: Reduce latency at playback start, take#2")
Link: https://lore.kernel.org/r/20210829073830.22686-1-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 0d55649d2a ]
Enabling interrupts via device tree for the internal PHYs on the
mv88e6390 DSA switch does not work. The driver insists to use poll mode.
Stage one debugging shows that the fwnode_mdiobus_phy_device_register
function calls fwnode_irq_get properly, and phy->irq is set to a valid
interrupt line initially.
But it is then cleared.
Stage two debugging shows that it is cleared here:
phy_probe:
/* Disable the interrupt if the PHY doesn't support it
* but the interrupt is still a valid one
*/
if (!phy_drv_supports_irq(phydrv) && phy_interrupt_is_valid(phydev))
phydev->irq = PHY_POLL;
Okay, so does the "Marvell 88E6390 Family" PHY driver not have the
.config_intr and .handle_interrupt function pointers? Yes it does.
Stage three debugging shows that the PHY device does not attempt a probe
against the "Marvell 88E6390 Family" driver, but against the "mv88x3310"
driver.
Okay, so why does the "mv88x3310" driver match on a mv88x6390 internal
PHY? The PHY IDs (MARVELL_PHY_ID_88E6390_FAMILY vs
MARVELL_PHY_ID_88X3310) are way different.
Stage four debugging has us looking through:
phy_device_register
-> device_add
-> bus_probe_device
-> device_initial_probe
-> __device_attach
-> bus_for_each_drv
-> driver_match_device
-> drv->bus->match
-> phy_bus_match
Okay, so as we said, the MII_PHYSID1 of mv88e6390 does not match the
mv88x3310 driver's PHY mask & ID, so why would phy_bus_match return...
Ahh, phy_bus_match calls a shortcircuit method,
phydrv->match_phy_device, and does not even bother to compare the PHY ID
if that is implemented.
So of course, we go inside the marvell10g.c driver and sure enough, it
implements .match_phy_device and does not bother to check the PHY ID.
What's interesting though is that at the end of the device_add() from
phy_device_register(), the driver for the internal PHYs _is_ the proper
"Marvell 88E6390 Family". This is because "mv88x3310" ends up failing to
probe after all, and __device_attach_driver(), to quote:
/*
* Ignore errors returned by ->probe so that the next driver can try
* its luck.
*/
The next (and only other) driver that matches is the 6390 driver. For
this one, phy_probe doesn't fail, and everything expects to work as
normal, EXCEPT phydev->irq has already been cleared by the previous
unsuccessful probe of a driver which did not implement PHY interrupts,
and therefore cleared that IRQ.
Okay, so it is not just Marvell 6390 that has PHY interrupts broken.
Stuff like Atheros, Aquantia, Broadcom, Qualcomm work because they are
lexicographically before Marvell, and stuff like NXP, Realtek, Vitesse
are broken.
This goes to show how fragile it is to reset phydev->irq = PHY_POLL from
the actual beginning of phy_probe itself. That seems like an actual bug
of its own too, since phy_probe has side effects which are not restored
on probe failure, but the line of thought probably was, the same driver
will attempt probe again, so it doesn't matter. Well, looks like it
does.
Maybe it would make more sense to move the phydev->irq clearing after
the actual device_add() in phy_device_register() completes, and the
bound driver is the actual final one.
(also, a bit frightening that drivers are permitted to bypass the MDIO
bus matching in such a trivial way and perform PHY reads and writes from
the .match_phy_device method, on devices that do not even belong to
them. In the general case it might not be guaranteed that the MDIO
accesses one driver needs to make to figure out whether to match on a
device is safe for all other PHY devices)
Fixes: a5de4be0aa ("net: phy: marvell10g: fix differentiation of 88X3310 from 88X3340")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Tested-by: Marek Behún <kabel@kernel.org>
Signed-off-by: Marek Behún <kabel@kernel.org>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20210827132541.28953-1-kabel@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 9ee313433c ]
When we enabled auxiliary input/output support for the E810 device, we
forgot to add logic to restart the output when we change time. This is
important as the periodic output will be incorrect after a time change
otherwise.
This unfortunately includes the adjust time function, even though it
uses an atomic hardware interface. The atomic adjustment can still cause
the pin output to stall permanently, so we need to stop and restart it.
Introduce wrapper functions to temporarily disable and then re-enable
the clock outputs.
Fixes: 172db5f91d ("ice: add support for auxiliary input/output pins")
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Sunitha D Mekala <sunithax.d.mekala@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 4dd0d5c33c ]
The driver didn't take the lock while flushing the Tx tracker, which
could cause a race where one thread is trying to read timestamps out
while another thread is trying to read the tracker to check the
timestamps.
Avoid this by ensuring that flushing is locked against read accesses.
Fixes: ea9b847cda ("ice: enable transmit timestamps for E810 devices")
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Gurucharan G <gurucharanx.g@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 84c5fb8c42 ]
The driver accidentally copied the ice_for_each_rxq iterator when
implementing enablement of the ptp_tx bit for the Tx rings. We still
load the Tx rings and set the ptp_tx field, but we iterate over the
count of the num_rxq.
If the number of Tx and Rx queues differ, this could either cause
a buffer overrun when accessing the tx_rings list if num_txq is greater
than num_rxq, or it could cause us to fail to enable Tx timestamps for
some rings.
This was not noticed originally as we generally have the same number of
Tx and Rx queues.
Fixes: ea9b847cda ("ice: enable transmit timestamps for E810 devices")
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Gurucharan G <gurucharanx.g@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f9d196bd63 ]
If link aggregation is used within stack devices driver rejects encap
rules if PF of the VF tunnel device is down. This happens because route
resolved for other PF and its eswitch instance is used to determine
correct vport.
To fix that use devcom feature to retrieve other eswitch instance if
failed to find vport for the 1st eswitch and LAG is active.
Fixes: 10742efc20 ("net/mlx5e: VF tunnel TX traffic offloading")
Signed-off-by: Dmytro Linkin <dlinkin@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit ca6891f9b2 ]
When indirect forward group is created, flow is added with vhca id but
without setting vhca id valid flag which violates the PRM.
Fix by setting the missing flag, vhca id valid.
Fixes: 34ca65352d ("net/mlx5: E-Switch, Indirect table infrastructure")
Signed-off-by: Maor Dickman <maord@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 9a5f9cc794 ]
After neigh-update-add failure we are still with a slow path rule but
the driver always assume the rule is an fdb rule.
Fix neigh-update-del by checking slow path tc flag on the flow.
Also fix neigh-update-add for when neigh-update-del fails the same.
Fixes: 5dbe906ff1 ("net/mlx5e: Use a slow path rule instead if vxlan neighbour isn't available")
Signed-off-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Paul Blakey <paulb@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 8e7e2e8ed0 ]
The call to mlx5_unregister_device() means that mlx5_core driver is
removed. In such scenario, we need to disregard all other flags like
attach/detach and forcibly remove all auxiliary devices.
Fixes: a5ae8fc905 ("net/mlx5e: Don't create devices during unload flow")
Tested-and-Reported-by: Yicong Yang <yangyicong@hisilicon.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 2f8b6161cc ]
When handling FIB_EVENT_ENTRY_REPLACE event for a new multipath route,
lag activation can be missed if a stale (struct lag_mp)->mfi pointer
exists, which was associated with an older multipath route that had been
removed.
Normally, when a route is removed, it triggers mlx5_lag_fib_event(),
which handles FIB_EVENT_ENTRY_DEL and clears mfi pointer. But, if
mlx5_lag_check_prereq() condition isn't met, for example when eswitch is
in legacy mode, the fib event is skipped and mfi pointer becomes stale.
Fix by resetting mfi pointer to NULL in mlx5_deactivate_lag().
Fixes: 8a66e45859 ("net/mlx5: Change ownership model for lag")
Signed-off-by: Dima Chumak <dchumak@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 7a6a723e98 ]
There is no point in calling 'free_irq()' explicitly for
'WCD9335_IRQ_SLIMBUS' in the remove function.
The irqs are requested in 'wcd9335_setup_irqs()' using a resource managed
function (i.e. 'devm_request_threaded_irq()').
'wcd9335_setup_irqs()' requests all what is defined in the 'wcd9335_irqs'
structure.
This structure has only one entry for 'WCD9335_IRQ_SLIMBUS'.
So 'devm_request...irq()' + explicit 'free_irq()' would lead to a double
free.
Remove the unneeded 'free_irq()' from the remove function.
Fixes: 20aedafdf4 ("ASoC: wcd9335: add support to wcd9335 codec")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Message-Id: <0614d63bc00edd7e81dd367504128f3d84f72efa.1629091028.git.christophe.jaillet@wanadoo.fr>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 6f15a2a09c ]
If an error occurs after a successful 'clk_prepare_enable()' call, it must
be undone by a corresponding 'clk_disable_unprepare()' call.
This call is already present in the remove function.
Add this call in the error handling path and reorder the code so that the
'clk_prepare_enable()' call happens later in the function.
The goal is to have as much managed resources functions as possible
before the 'clk_prepare_enable()' call in order to keep the error handling
path simple.
While at it, remove the now unneeded 'clk' variable.
Fixes: c87dca0478 ("usb: bdc: Add clock enable for new chips with a separate BDC clock")
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Link: https://lore.kernel.org/r/f8a4a6897deb0c8cb2e576580790303550f15fcd.1629314734.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 4720f1bf4e ]
ehci_orion_drv_probe() did not account for possible errors of
clk_prepare_enable() that in particular could cause invocation of
clk_disable_unprepare() on clocks that were not prepared/enabled yet,
e.g. in remove or on handling errors of usb_add_hcd() in probe. Though,
there were several patches fixing different issues with clocks in this
driver, they did not solve this problem.
Add handling of errors of clk_prepare_enable() in ehci_orion_drv_probe()
to avoid calls of clk_disable_unprepare() without previous successful
invocation of clk_prepare_enable().
Found by Linux Driver Verification project (linuxtesting.org).
Fixes: 8c869edaee ("ARM: Orion: EHCI: Add support for enabling clocks")
Co-developed-by: Kirill Shilimanov <kirill.shilimanov@huawei.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Evgeny Novikov <novikov@ispras.ru>
Signed-off-by: Kirill Shilimanov <kirill.shilimanov@huawei.com>
Link: https://lore.kernel.org/r/20210825170902.11234-1-novikov@ispras.ru
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 5e8243e66b ]
If otx2_mbox_get_rsp() fails, otx2_set_flowkey_cfg() need return an
error code.
Fixes: e793836545 ("octeontx2-pf: Fix algorithm index in MCAM rules with RSS action")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 661e8a88e8 ]
Iff platform_get_irq() returns 0 for the main IRQ, the driver's probe()
method will return 0 early (as if the method's call was successful).
Let's consider IRQ0 valid for simplicity -- devm_request_irq() can always
override that decision...
Fixes: 2bbd681ba2 ("i2c: xlp9xx: Driver for Netlogic XLP9XX/5XX I2C controller")
Signed-off-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Reviewed-by: George Cherian <george.cherian@marvell.com>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f980d055a0 ]
strlcpy() reads the entire source buffer first. This read may exceed the
destination size limit. This is both inefficient and can lead to linear
read overflows if a source string is not NUL-terminated.
Also, the strnlen() call does not avoid the read overflow in the strlcpy
function when a not NUL-terminated string is passed.
So, replace this block by a call to kstrndup() that avoids this type of
overflow and does the same.
Fixes: 066ce68994 ("cifs: rename cifs_strlcpy_to_host and make it use new functions")
Signed-off-by: Len Baker <len.baker@gmx.com>
Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 9d68cd9120 ]
Commit adae1e931a ("Drivers: hv: vmbus: Copy packets sent by Hyper-V out
of the ring buffer") introduced a notion of maximum packet size and for
KVM and FCOPY drivers set it to the length of the receive buffer. VSS
driver wasn't updated, this means that the maximum packet size is now
VMBUS_DEFAULT_MAX_PKT_SIZE (4k). Apparently, this is not enough. I'm
observing a packet of 6304 bytes which is being truncated to 4096. When
VSS driver tries to read next packet from ring buffer it starts from the
wrong offset and receives garbage.
Set the maximum packet size to 'HV_HYP_PAGE_SIZE * 2' in VSS driver. This
matches the length of the receive buffer and is in line with other utils
drivers.
Fixes: adae1e931a ("Drivers: hv: vmbus: Copy packets sent by Hyper-V out of the ring buffer")
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/20210825133857.847866-1-vkuznets@redhat.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit d7af7e497f ]
Fix a verifier bug found by smatch static checker in [0].
This problem has never been seen in prod to my best knowledge. Fixing it
still seems to be a good idea since it's hard to say for sure whether
it's possible or not to have a scenario where a combination of
convert_ctx_access() and a narrow load would lead to an out of bound
write.
When narrow load is handled, one or two new instructions are added to
insn_buf array, but before it was only checked that
cnt >= ARRAY_SIZE(insn_buf)
And it's safe to add a new instruction to insn_buf[cnt++] only once. The
second try will lead to out of bound write. And this is what can happen
if `shift` is set.
Fix it by making sure that if the BPF_RSH instruction has to be added in
addition to BPF_AND then there is enough space for two more instructions
in insn_buf.
The full report [0] is below:
kernel/bpf/verifier.c:12304 convert_ctx_accesses() warn: offset 'cnt' incremented past end of array
kernel/bpf/verifier.c:12311 convert_ctx_accesses() warn: offset 'cnt' incremented past end of array
kernel/bpf/verifier.c
12282
12283 insn->off = off & ~(size_default - 1);
12284 insn->code = BPF_LDX | BPF_MEM | size_code;
12285 }
12286
12287 target_size = 0;
12288 cnt = convert_ctx_access(type, insn, insn_buf, env->prog,
12289 &target_size);
12290 if (cnt == 0 || cnt >= ARRAY_SIZE(insn_buf) ||
^^^^^^^^^^^^^^^^^^^^^^^^^^^
Bounds check.
12291 (ctx_field_size && !target_size)) {
12292 verbose(env, "bpf verifier is misconfigured\n");
12293 return -EINVAL;
12294 }
12295
12296 if (is_narrower_load && size < target_size) {
12297 u8 shift = bpf_ctx_narrow_access_offset(
12298 off, size, size_default) * 8;
12299 if (ctx_field_size <= 4) {
12300 if (shift)
12301 insn_buf[cnt++] = BPF_ALU32_IMM(BPF_RSH,
^^^^^
increment beyond end of array
12302 insn->dst_reg,
12303 shift);
--> 12304 insn_buf[cnt++] = BPF_ALU32_IMM(BPF_AND, insn->dst_reg,
^^^^^
out of bounds write
12305 (1 << size * 8) - 1);
12306 } else {
12307 if (shift)
12308 insn_buf[cnt++] = BPF_ALU64_IMM(BPF_RSH,
12309 insn->dst_reg,
12310 shift);
12311 insn_buf[cnt++] = BPF_ALU64_IMM(BPF_AND, insn->dst_reg,
^^^^^^^^^^^^^^^
Same.
12312 (1ULL << size * 8) - 1);
12313 }
12314 }
12315
12316 new_prog = bpf_patch_insn_data(env, i + delta, insn_buf, cnt);
12317 if (!new_prog)
12318 return -ENOMEM;
12319
12320 delta += cnt - 1;
12321
12322 /* keep walking new program and skip insns we just inserted */
12323 env->prog = new_prog;
12324 insn = new_prog->insnsi + i + delta;
12325 }
12326
12327 return 0;
12328 }
[0] https://lore.kernel.org/bpf/20210817050843.GA21456@kili/
v1->v2:
- clarify that problem was only seen by static checker but not in prod;
Fixes: 46f53a65d2 ("bpf: Allow narrow loads with offset > 0")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Andrey Ignatov <rdna@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210820163935.1902398-1-rdna@fb.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit e6d0b92ac0 ]
This patch reverts commit acbf58e530 ("ASoC: wm_adsp: Let
soc_cleanup_component_debugfs remove debugfs"), and adds an
alternate solution to the issue. That patch removes the call to
debugfs_remove_recursive, which cleans up the DSPs debugfs. The
intention was to avoid an unbinding issue on an out of tree
driver/platform.
The issue with the patch is it means the driver no longer cleans up
its own debugfs, instead relying on ASoC to remove recurive on the
parent debugfs node. This is conceptually rather unclean, but also it
would prevent DSPs being added/removed independently of ASoC and soon
we are going to be upstreaming some non-audio parts with these DSPs,
which will require this.
Finally, it seems the issue on the platform is a result of the
wm_adsp2_cleanup_debugfs getting called twice. This is very likely a
problem on the platform side and will be resolved there. But in the mean
time make the code a little more robust to such issues, and again
conceptually a bit nicer, but clearing the debugfs_root variable in the
DSP structure when the debugfs is removed.
Fixes: acbf58e530 ("ASoC: wm_adsp: Let soc_cleanup_component_debugfs remove debugfs"
Signed-off-by: Charles Keepax <ckeepax@opensource.cirrus.com>
Link: https://lore.kernel.org/r/20210824101552.1119-1-ckeepax@opensource.cirrus.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit e8b374b649 ]
Module configuration may differ between its instances depending on
resources required and input and output audio format. Available
parameters to select from are stored in module resource and interface
(format) lists. These come from topology, together with description of
each of pipe's modules.
Ignoring index value provided by topology and relying always on 0th
entry leads to unexpected module behavior due to under/overbudged
resources assigned or impropper format selection. Fix by taking entry at
index specified by topology.
Fixes: f6fa56e225 ("ASoC: Intel: Skylake: Parse and update module config structure")
Signed-off-by: Cezary Rojewski <cezary.rojewski@intel.com>
Tested-by: Lukasz Majczak <lma@semihalf.com>
Link: https://lore.kernel.org/r/20210818075742.1515155-5-cezary.rojewski@intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f6a4f0b424 ]
The clk_enable is supposed work when CONFIG_HAVE_CLK is false, but it
returns -EINVAL. That means some drivers fail during probe.
[ 1.680000] flexcan: probe of flexcan.0 failed with error -22
Fixes: c1fb1bf64b ("m68k: let clk_enable() return immediately if clk is NULL")
Fixes: bea8bcb12d ("m68knommu: Add support for the Coldfire m5441x.")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Greg Ungerer <gerg@linux-m68k.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 623da5ca70 ]
RVU SMMU widget stores the final translated PA at
RVU_AF_SMMU_TLN_FLIT0<57:18> instead of FLIT1 register. This patch
fixes the address translation logic to use the correct register.
Fixes: 893ae97214 ("octeontx2-af: cn10k: Support configurable LMTST regions")
Signed-off-by: Geetha sowjanya <gakula@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit e793836545 ]
Otherthan setting action as RSS in NPC MCAM entry, RSS flowkey
algorithm index also needs to be set. Otherwise whatever algorithm
is defined at flowkey index '0' will be considered by HW and pkt
flows will be distributed as such.
Fix this by saving the flowkey index sent by admin function while
initializing RSS and then use it when framing MCAM rules.
Fixes: 81a4362016 ("octeontx2-pf: Add RSS multi group support")
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 05209e3570 ]
Whenever user changes interface MAC address both default DMAC based
MCAM rule and VLAN offload (for strip) rules are updated with new
MAC address. To update or install VLAN offload rule PF driver needs
interface's receive channel info, which is retrieved from admin
function at the time of NIXLF initialization.
If user changes MAC address before interface is UP, VLAN offload rule
installation will fail and throw error as receive channel is not valid.
To avoid this, skip VLAN offload rule installation if netdev is not UP.
This rule will anyway be reinslatted as part of open() call.
Fixes: fd9d7859db ("octeontx2-pf: Implement ingress/egress VLAN offload")
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 07cccffdbd ]
Bandwidth profiles (ipolicer structure)is implemented only on CN10K
platform. But current code try to free the ipolicer memory without
checking the capibility flag leading to driver crash on OCTEONTX2
platform. This patch fixes the issue by add capability flag check.
Fixes: e8e095b3b3 ("octeontx2-af: cn10k: Bandwidth profiles config support")
Signed-off-by: Geetha sowjanya <gakula@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 10df5a13ac ]
This patch corrects the erroneous vlan priority mask field that was
send to npc_install_flow_req.
Fixes: 1d4d9e42c2 ("octeontx2-pf: Add tc flower hardware offload on ingress traffic")
Signed-off-by: Naveen Mamindlapalli <naveenm@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 477b53f3f9 ]
As per hardware the base channel number configured
for programmable channels of a block must be multiple
of number of channels of that block. This condition
is not met for SDP base channel currently. Hence this
patch ensures all the base channel numbers of all
blocks are multiple of number of channels present in
the blocks. Also instead of hardcoding SDP number
of channels the same is read from the NIX_AF_CONST1
register.
Fixes: 242da43921 ("octeontx2-af: cn10k: Add support for programmable")
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b74a29fac6 ]
Add the missing unlock before return from function g2d_runqueue_worker()
in the error handling case.
Fixes: 445d3bed75 ("drm/exynos: use pm_runtime_resume_and_get()")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit a28dc123fa ]
Patch 96b1454f2e ("gfs2: move freeze glock outside the make_fs_rw and _ro
functions") changed the gfs2 mount sequence so that it holds the freeze
lock before calling gfs2_make_fs_rw. Before this patch, gfs2_make_fs_rw
called init_threads to initialize the quotad and logd threads. That is a
problem if the system needs to withdraw due to IO errors early in the
mount sequence, for example, while initializing the system statfs inode:
1. An IO error causes the statfs glock to not sync properly after
recovery, and leaves items on the ail list.
2. The leftover items on the ail list causes its do_xmote call to fail,
which makes it want to withdraw. But since the glock code cannot
withdraw (because the withdraw sequence uses glocks) it relies upon
the logd daemon to initiate the withdraw.
3. The withdraw can never be performed by the logd daemon because all
this takes place before the logd daemon is started.
This patch moves function init_threads from super.c to ops_fstype.c
and it changes gfs2_fill_super to start its threads before holding the
freeze lock, and if there's an error, stop its threads after releasing
it. This allows the logd to run unblocked by the freeze lock. Thus,
the logd daemon can perform its withdraw sequence properly.
Fixes: 96b1454f2e ("gfs2: move freeze glock outside the make_fs_rw and _ro functions")
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit d6840a5e37 ]
Iff platform_get_irq() returns 0, the driver's probe() method will return 0
early (as if the method's call was successful). Let's consider IRQ0 valid
for simplicity -- devm_request_irq() can always override that decision...
Fixes: e0d1ec9785 ("i2c-s3c2410: Change IRQ to be plain integer.")
Signed-off-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit a129950516 ]
When adding the code to handle platform_get_irq*() errors in the commit
489447380a ("handle errors returned by platform_get_irq*()"), the
actual error code was enforced to be -ENXIO in the driver for some
strange reason. This didn't matter much until the deferred probing was
introduced -- which requires an actual error code to be propagated
upstream from the failure site.
While fixing this, also stop overriding the errors from request_irq() to
-EIO (done since the pre-git era).
Fixes: 489447380a ("[PATCH] handle errors returned by platform_get_irq*()")
Signed-off-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f41a4b2b5e ]
Syzbot hit "task hung" bug in hci_req_sync(). The problem was in
unreasonable huge inquiry timeout passed from userspace.
Fix it by adding sanity check for timeout value to hci_inquiry().
Since hci_inquiry() is the only user of hci_req_sync() with user
controlled timeout value, it makes sense to check timeout value in
hci_inquiry() and don't touch hci_req_sync().
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Reported-and-tested-by: syzbot+be2baed593ea56c6a84c@syzkaller.appspotmail.com
Signed-off-by: Pavel Skripkin <paskripkin@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit d1f278da6b ]
When scsi_dispatch_cmd was moved to scsi_lib.c and made static, some
compilers (i.e., at least gcc 8.4.0) decided to compile this
inline. This is a problem for lkdtm.ko, which inserted a kprobe
on this function for the SCSI_DISPATCH_CMD crashpoint.
Move this crashpoint one function up the call chain to
scsi_queue_rq. Though this is also a static function, it should never be
inlined because it is assigned as a structure entry. Therefore,
kprobe_register should always be able to find it.
Fixes: 82042a2cdb ("scsi: move scsi_dispatch_cmd to scsi_lib.c")
Acked-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Kevin Mitchell <kevmitch@arista.com>
Link: https://lore.kernel.org/r/20210819022940.561875-2-kevmitch@arista.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 36ca7943ac ]
When the max pages (last_page in the swap header + 1) is smaller than
the total pages (inode size) of the swapfile, iomap_swapfile_activate
overwrites sis->max with total pages.
However, frontswap_map is a swap page state bitmap allocated using the
initial sis->max page count read from the swap header. If swapfile
activation increases sis->max, it's possible for the frontswap code to
walk off the end of the bitmap, thereby corrupting kernel memory.
[djwong: modify the description a bit; the original paragraph reads:
"However, frontswap_map is allocated using max pages. When test and clear
the sis offset, which is larger than max pages, of frontswap_map in
__frontswap_invalidate_page(), neighbors of frontswap_map may be
overwritten, i.e., slab is polluted."
Note also that this bug resulted in a behavioral change: activating a
swap file that was formatted and later extended results in all pages
being activated, not the number of pages recorded in the swap header.]
This fixes the issue by considering the limitation of max pages of swap
info in iomap_swapfile_add_extent().
To reproduce the case, compile kernel with slub RED ZONE, then run test:
$ sudo stress-ng -a 1 -x softlockup,resources -t 72h --metrics --times \
--verify -v -Y /root/tmpdir/stress-ng/stress-statistic-12.yaml \
--log-file /root/tmpdir/stress-ng/stress-logfile-12.txt \
--temp-path /root/tmpdir/stress-ng/
We'll get the error log as below:
[ 1151.015141] =============================================================================
[ 1151.016489] BUG kmalloc-16 (Not tainted): Right Redzone overwritten
[ 1151.017486] -----------------------------------------------------------------------------
[ 1151.017486]
[ 1151.018997] Disabling lock debugging due to kernel taint
[ 1151.019873] INFO: 0x0000000084e43932-0x0000000098d17cae @offset=7392. First byte 0x0 instead of 0xcc
[ 1151.021303] INFO: Allocated in __do_sys_swapon+0xcf6/0x1170 age=43417 cpu=9 pid=3816
[ 1151.022538] __slab_alloc+0xe/0x20
[ 1151.023069] __kmalloc_node+0xfd/0x4b0
[ 1151.023704] __do_sys_swapon+0xcf6/0x1170
[ 1151.024346] do_syscall_64+0x33/0x40
[ 1151.024925] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 1151.025749] INFO: Freed in put_cred_rcu+0xa1/0xc0 age=43424 cpu=3 pid=2041
[ 1151.026889] kfree+0x276/0x2b0
[ 1151.027405] put_cred_rcu+0xa1/0xc0
[ 1151.027949] rcu_do_batch+0x17d/0x410
[ 1151.028566] rcu_core+0x14e/0x2b0
[ 1151.029084] __do_softirq+0x101/0x29e
[ 1151.029645] asm_call_irq_on_stack+0x12/0x20
[ 1151.030381] do_softirq_own_stack+0x37/0x40
[ 1151.031037] do_softirq.part.15+0x2b/0x30
[ 1151.031710] __local_bh_enable_ip+0x4b/0x50
[ 1151.032412] copy_fpstate_to_sigframe+0x111/0x360
[ 1151.033197] __setup_rt_frame+0xce/0x480
[ 1151.033809] arch_do_signal+0x1a3/0x250
[ 1151.034463] exit_to_user_mode_prepare+0xcf/0x110
[ 1151.035242] syscall_exit_to_user_mode+0x27/0x190
[ 1151.035970] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 1151.036795] INFO: Slab 0x000000003b9de4dc objects=44 used=9 fp=0x00000000539e349e flags=0xfffffc0010201
[ 1151.038323] INFO: Object 0x000000004855ba01 @offset=7376 fp=0x0000000000000000
[ 1151.038323]
[ 1151.039683] Redzone 000000008d0afd3d: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc ................
[ 1151.041180] Object 000000004855ba01: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
[ 1151.042714] Redzone 0000000084e43932: 00 00 00 c0 cc cc cc cc ........
[ 1151.044120] Padding 000000000864c042: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZZZZZZZZZ
[ 1151.045615] CPU: 5 PID: 3816 Comm: stress-ng Tainted: G B 5.10.50+ #7
[ 1151.046846] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
[ 1151.048633] Call Trace:
[ 1151.049072] dump_stack+0x57/0x6a
[ 1151.049585] check_bytes_and_report+0xed/0x110
[ 1151.050320] check_object+0x1eb/0x290
[ 1151.050924] ? __x64_sys_swapoff+0x39a/0x540
[ 1151.051646] free_debug_processing+0x151/0x350
[ 1151.052333] __slab_free+0x21a/0x3a0
[ 1151.052938] ? _cond_resched+0x2d/0x40
[ 1151.053529] ? __vunmap+0x1de/0x220
[ 1151.054139] ? __x64_sys_swapoff+0x39a/0x540
[ 1151.054796] ? kfree+0x276/0x2b0
[ 1151.055307] kfree+0x276/0x2b0
[ 1151.055832] __x64_sys_swapoff+0x39a/0x540
[ 1151.056466] do_syscall_64+0x33/0x40
[ 1151.057084] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 1151.057866] RIP: 0033:0x150340b0ffb7
[ 1151.058481] Code: Unable to access opcode bytes at RIP 0x150340b0ff8d.
[ 1151.059537] RSP: 002b:00007fff7f4ee238 EFLAGS: 00000246 ORIG_RAX: 00000000000000a8
[ 1151.060768] RAX: ffffffffffffffda RBX: 00007fff7f4ee66c RCX: 0000150340b0ffb7
[ 1151.061904] RDX: 000000000000000a RSI: 0000000000018094 RDI: 00007fff7f4ee860
[ 1151.063033] RBP: 00007fff7f4ef980 R08: 0000000000000000 R09: 0000150340a672bd
[ 1151.064135] R10: 00007fff7f4edca0 R11: 0000000000000246 R12: 0000000000018094
[ 1151.065253] R13: 0000000000000005 R14: 000000000160d930 R15: 00007fff7f4ee66c
[ 1151.066413] FIX kmalloc-16: Restoring 0x0000000084e43932-0x0000000098d17cae=0xcc
[ 1151.066413]
[ 1151.067890] FIX kmalloc-16: Object at 0x000000004855ba01 not freed
Fixes: 67482129cd ("iomap: add a swapfile activation function")
Fixes: a45c0eccc5 ("iomap: move the swapfile code into a separate file")
Signed-off-by: Gang Deng <gavin.dg@linux.alibaba.com>
Signed-off-by: Xu Yu <xuyu@linux.alibaba.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 2af0c5ffad ]
If IRQ occurs between calling request_irq() and mv_u3d_eps_init(),
then null pointer dereference occurs since u3d->eps[] wasn't
initialized yet but used in mv_u3d_nuke().
The patch puts registration of the interrupt handler after
initializing of neccesery data.
Found by Linux Driver Verification project (linuxtesting.org).
Fixes: 90fccb529d ("usb: gadget: Gadget directory cleanup - group UDC drivers")
Acked-by: Felipe Balbi <balbi@kernel.org>
Signed-off-by: Nadezda Lutovinova <lutovinova@ispras.ru>
Link: https://lore.kernel.org/r/20210818141247.4794-1-lutovinova@ispras.ru
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 7a8c68c57f ]
In the initial implementation a number of PMBUS_x_WARN_LIMITs were
mapped to MFR fields. This was incorrect as these MFR limits reflect the
rated limit as opposed to a limit which will generate warning. Instead
return -ENXIO like we were already doing for other WARN_LIMITs.
Subsequently these rated limits have been exposed generically as new
fields in the sysfs ABI so the values are still available.
Fixes: 15b2703e5e ("hwmon: (pmbus) Add driver for BluTek BPA-RS600")
Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz>
Link: https://lore.kernel.org/r/20210812014000.26293-2-chris.packham@alliedtelesis.co.nz
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 8d744da241 ]
The driver overrides the error codes returned by platform_get_irq() to
-ENODEV, so if it returns -EPROBE_DEFER, the driver will fail the probe
permanently instead of the deferred probing. Switch to propagating the
error codes upstream.
Fixes: 0d676a6c43 ("i2c: add support for Socionext SynQuacer I2C controller")
Signed-off-by: Sergey Shtylyov <s.shtylyov@omprussia.ru>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit cbfa6f33e3 ]
Commit 0a0a66c984 ("clk: staging: Specify IOMEM dependency for Xilinx
Clocking Wizard driver") introduces a dependency on the non-existing config
IOMEM, which basically makes it impossible to include this driver into any
build. Fortunately, ./scripts/checkkconfigsymbols.py warns:
IOMEM
Referencing files: drivers/staging/clocking-wizard/Kconfig
The config for IOMEM support is called HAS_IOMEM. Correct this reference to
the intended config.
Fixes: 0a0a66c984 ("clk: staging: Specify IOMEM dependency for Xilinx Clocking Wizard driver")
Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Link: https://lore.kernel.org/r/20210817105404.13146-1-lukas.bulwahn@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 514ef1e62d ]
Current PCIe MEM space of size 16 MB is not enough for some combination
of PCIe cards (e.g. NVMe disk together with ath11k wifi card). ARM Trusted
Firmware for Armada 3700 platform already assigns 128 MB for PCIe window,
so extend PCIe MEM space to the end of 128 MB PCIe window which allows to
allocate more PCIe BARs for more PCIe cards.
Without this change some combination of PCIe cards cannot be used and
kernel show error messages in dmesg during initialization:
pci 0000:00:00.0: BAR 8: no space for [mem size 0x01800000]
pci 0000:00:00.0: BAR 8: failed to assign [mem size 0x01800000]
pci 0000:00:00.0: BAR 6: assigned [mem 0xe8000000-0xe80007ff pref]
pci 0000:01:00.0: BAR 8: no space for [mem size 0x01800000]
pci 0000:01:00.0: BAR 8: failed to assign [mem size 0x01800000]
pci 0000:02:03.0: BAR 8: no space for [mem size 0x01000000]
pci 0000:02:03.0: BAR 8: failed to assign [mem size 0x01000000]
pci 0000:02:07.0: BAR 8: no space for [mem size 0x00100000]
pci 0000:02:07.0: BAR 8: failed to assign [mem size 0x00100000]
pci 0000:03:00.0: BAR 0: no space for [mem size 0x01000000 64bit]
pci 0000:03:00.0: BAR 0: failed to assign [mem size 0x01000000 64bit]
Due to bugs in U-Boot port for Turris Mox, the second range in Turris Mox
kernel DTS file for PCIe must start at 16 MB offset. Otherwise U-Boot
crashes during loading of kernel DTB file. This bug is present only in
U-Boot code for Turris Mox and therefore other Armada 3700 devices are not
affected by this bug. Bug is fixed in U-Boot version 2021.07.
To not break booting new kernels on existing versions of U-Boot on Turris
Mox, use first 16 MB range for IO and second range with rest of PCIe window
for MEM.
Signed-off-by: Pali Rohár <pali@kernel.org>
Fixes: 76f6386b25 ("arm64: dts: marvell: Add Aardvark PCIe support for Armada 3700")
Signed-off-by: Gregory CLEMENT <gregory.clement@bootlin.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f7104cc1a9 ]
This should use the network-namespace-wide client_lock, not the
per-client cl_lock.
You shouldn't see any bugs unless you're actually using the
forced-expiry interface introduced by 89c905becc.
Fixes: 89c905becc "nfsd: allow forced expiration of NFSv4 clients"
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 5c11720767 ]
Some paths through svc_process() leave rqst->rq_procinfo set to
NULL, which triggers a crash if tracing happens to be enabled.
Fixes: 89ff87494c ("SUNRPC: Display RPC procedure names instead of proc numbers")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit cd2d644ddb ]
After calling vfs_test_lock() the pointer to a conflicting lock can be
returned, and that lock is not guarunteed to be owned by nlm. In that
case, we cannot cast it to struct nlm_lockowner. Instead return the pid
of that conflicting lock.
Fixes: 646d73e91b ("lockd: Show pid of lockd for remote locks")
Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit d8bbd97ad0 ]
If CONFIG_DEBUG_LOCK_ALLOC=y is enabled then local_lock_t has an 'owner'
member which is checked for consistency, but nothing initialized it to
zero explicitly.
The static initializer does so implicit, and the run time allocated per CPU
storage is usually zero initialized as well, but relying on that is not
really good practice.
Fixes: 91710728d1 ("locking: Introduce local_lock()")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20210815211301.969975279@linutronix.de
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 0d45a1373e ]
The driver neglects to check the result of platform_get_irq()'s call and
blithely passes the negative error codes to request_threaded_irq() (which
takes *unsigned* IRQ #), causing it to fail with -EINVAL, overriding an
original error code. Stop calling request_threaded_irq() with the invalid
IRQ #s.
Fixes: 9ba96ae507 ("usb: omap1: Tahvo USB transceiver driver")
Acked-by: Felipe Balbi <balbi@kernel.org>
Signed-off-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Link: https://lore.kernel.org/r/8280d6a4-8e9a-7cfe-1aa9-db586dc9afdf@omp.ru
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b2f6662ac0 ]
Invoking atomic_notifier_chain_notify() requires acquiring a spinlock_t,
which can block under CONFIG_PREEMPT_RT. Notifications for members of the
cpu_pm notification chain will be issued by the idle task, which can never
block.
Making *all* atomic_notifiers use a raw_spinlock is too big of a hammer, as
only notifications issued by the idle task are problematic.
Special-case cpu_pm_notifier_chain by kludging a raw_notifier and
raw_spinlock_t together, matching the atomic_notifier behavior with a
raw_spinlock_t.
Fixes: 70d9329857 ("notifier: Fix broken error handling pattern")
Signed-off-by: Valentin Schneider <valentin.schneider@arm.com>
Acked-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 0ea53674d0 ]
Commit 0ea9fd001a ("Bluetooth: Shutdown controller after workqueues
are flushed or cancelled") introduced a regression that makes mtkbtsdio
driver stops working:
[ 36.593956] Bluetooth: hci0: Firmware already downloaded
[ 46.814613] Bluetooth: hci0: Execution of wmt command timed out
[ 46.814619] Bluetooth: hci0: Failed to send wmt func ctrl (-110)
The shutdown callback depends on the result of hdev->rx_work, so we
should call it before flushing rx_work:
-> btmtksdio_shutdown()
-> mtk_hci_wmt_sync()
-> __hci_cmd_send()
-> wait for BTMTKSDIO_TX_WAIT_VND_EVT gets cleared
-> btmtksdio_recv_event()
-> hci_recv_frame()
-> queue_work(hdev->workqueue, &hdev->rx_work)
-> clears BTMTKSDIO_TX_WAIT_VND_EVT
So move the shutdown callback before flushing TX/RX queue to resolve the
issue.
Reported-and-tested-by: Mattijs Korpershoek <mkorpershoek@baylibre.com>
Tested-by: Hsin-Yi Wang <hsinyi@chromium.org>
Cc: Guenter Roeck <linux@roeck-us.net>
Fixes: 0ea9fd001a ("Bluetooth: Shutdown controller after workqueues are flushed or cancelled")
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 1975df880b ]
DMA channel status "Transmit buffer unavailable(TBU)" bit is not
considered as a successful dma tx. Hence, it should not affect
all the irq count statistic.
Fixes: 1103d3a553 ("net: stmmac: dwmac4: Also use TBU interrupt to clean TX path")
Signed-off-by: Voon Weifeng <weifeng.voon@intel.com>
Signed-off-by: Vijayakannan Ayyathurai <vijayakannan.ayyathurai@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 0f0c4f1b72 ]
Currently, "sample04" and "sample05" are not working properly when
running with an IPv6 option("-6"). The commit 0f06a6787e ("samples:
Add an IPv6 "-6" option to the pktgen scripts") has omitted the addition
of this option at "sample04" and "sample05".
In order to support IPv6 option, this commit adds logic related to IPv6
option.
Fixes: 0f06a6787e ("samples: Add an IPv6 "-6" option to the pktgen scripts")
Signed-off-by: Juhee Kang <claudiajkang@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit ed43fbac71 ]
The { 0 } doesn't clear all fields in the struct, but tells to the
compiler to set all fields to zero and doesn't touch any sub-fields
if they exists.
The {} is an empty initialiser that instructs to fully initialize whole
struct including sub-fields, which is error-prone for future
devlink_flash_notify extensions.
Fixes: 6700acc5f1 ("devlink: collect flash notify params into a struct")
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit d164dd9a5c ]
The "probed" part of test_core_autosize copies an integer using
bpf_core_read() into an integer of a potentially different size.
On big-endian machines a destination offset is required for this to
produce a sensible result.
Fixes: 888d83b961 ("selftests/bpf: Validate libbpf's auto-sizing of LD/ST/STX instructions")
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210812224814.187460-1-iii@linux.ibm.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit cea45a3bd2 ]
soc_device_match() is intended as a last resort, to handle e.g. quirks
that cannot be handled by matching based on a compatible value.
As the device nodes for the Renesas USB 3.0 Peripheral Controller on
R-Car E3 and RZ/G2E do have SoC-specific compatible values, the latter
can and should be used to match against these devices.
This also fixes support for the USB 3.0 Peripheral Controller on the
R-Car E3e (R8A779M6) SoC, which is a different grading of the R-Car E3
(R8A77990) SoC, using the same SoC-specific compatible value.
Fixes: 30025efa8b ("usb: gadget: udc: renesas_usb3: add support for r8a77990")
Fixes: 546970fdab ("usb: gadget: udc: renesas_usb3: add support for r8a774c0")
Reviewed-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://lore.kernel.org/r/760981fb4cd110d7cbfc9dcffa365e7c8b25c6e5.1628696960.git.geert+renesas@glider.be
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 0881e22c06 ]
The driver neglects to check the result of platform_get_irq()'s calls and
blithely passes the negative error codes to request_threaded_irq() (which
takes *unsigned* IRQ #), causing them both to fail with -EINVAL, overriding
an original error code. Stop calling request_threaded_irq() with the
invalid IRQ #s.
Fixes: c33fad0c37 ("usb: otg: Adding twl6030-usb transceiver driver for OMAP4430")
Acked-by: Felipe Balbi <balbi@kernel.org>
Signed-off-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Link: https://lore.kernel.org/r/9507f50b-50f1-6dc4-f57c-3ed4e53a1c25@omp.ru
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit ecc2f30dbb ]
The driver neglects to check the result of platform_get_irq()'s call and
blithely passes the negative error codes to request_irq() (which takes
*unsigned* IRQ #), causing it to fail with -EINVAL, overriding an original
error code. Stop calling request_irq() with the invalid IRQ #s.
Fixes: 0807c500a1 ("USB: add Freescale USB OTG Transceiver driver")
Acked-by: Felipe Balbi <balbi@kernel.org>
Signed-off-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Link: https://lore.kernel.org/r/b0a86089-8b8b-122e-fd6d-73e8c2304964@omp.ru
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 711087f342 ]
The driver neglects to check the result of platform_get_irq()'s call and
blithely passes the negative error codes to devm_request_irq() (which takes
*unsigned* IRQ #), causing it to fail with -EINVAL, overriding an original
error code. Stop calling devm_request_irq() with the invalid IRQ #s.
Fixes: 517c4c44b3 ("usb: Add driver to allow any GPIO to be used for 7211 USB signals")
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Link: https://lore.kernel.org/r/806d0b1a-365b-93d9-3fc1-922105ca5e61@omp.ru
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 50855c3157 ]
The driver neglects to check the result of platform_get_irq()'s call and
blithely passes the negative error codes to devm_request_irq() (which takes
*unsigned* IRQ #), causing it to fail with -EINVAL, overriding an original
error code. Stop calling devm_request_irq() with the invalid IRQ #s.
Fixes: 8b2e76687b ("USB: AT91 UDC updates, mostly power management")
Signed-off-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Acked-by: Felipe Balbi <balbi@kernel.org>
Link: https://lore.kernel.org/r/6654a224-739a-1a80-12f0-76d920f87b6c@omp.ru
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 1750069567 ]
In dwc3_qcom_acpi_register_core(), the driver neglects to check the result
of platform_get_irq()'s call and blithely assigns the negative error codes
to the allocated child device's IRQ resource and then passing this resource
to platform_device_add_resources() and later causing dwc3_otg_get_irq() to
fail anyway. Stop calling platform_device_add_resources() with the invalid
IRQ #s, so that there's less complexity in the IRQ error checking.
Fixes: 2bc02355f8 ("usb: dwc3: qcom: Add support for booting with ACPI")
Acked-by: Felipe Balbi <balbi@kernel.org>
Signed-off-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Link: https://lore.kernel.org/r/45fec3da-1679-5bfe-5d74-219ca3fb28e7@omp.ru
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 772d44526e ]
When I booted up on a board that had a slightly different codec
stuffed on it, I got this message at bootup:
rt5682 9-001a: Device with ID register 6749 is not rt5682
That's normal/expected, but what wasn't normal was the splat that I
got after:
WARNING: CPU: 7 PID: 176 at drivers/regulator/core.c:2151 _regulator_put+0x150/0x158
pc : _regulator_put+0x150/0x158
...
Call trace:
_regulator_put+0x150/0x158
regulator_bulk_free+0x48/0x70
devm_regulator_bulk_release+0x20/0x2c
release_nodes+0x1cc/0x244
devres_release_all+0x44/0x60
really_probe+0x17c/0x378
...
This is because the error paths don't turn off the regulator. Let's
fix that.
Fixes: 0ddce71c21 ("ASoC: rt5682: add rt5682 codec driver")
Fixes: 87b42abae9 ("ASoC: rt5682: Implement remove callback")
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Reviewed-by: Stephen Boyd <swboyd@chromium.org>
Link: https://lore.kernel.org/r/20210811081751.v2.1.I4a1d9aa5d99e05aeee15c2768db600158d76cab8@changeid
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 6977cc89c8 ]
'of_find_device_by_node()' takes a reference that must be released when
not needed anymore.
This is expected to be done in 'dsi_destroy()'.
However, there are 2 issues in 'dsi_get_phy()'.
First, if 'of_find_device_by_node()' succeeds but 'platform_get_drvdata()'
returns NULL, 'msm_dsi->phy_dev' will still be NULL, and the reference
won't be released in 'dsi_destroy()'.
Secondly, as 'of_find_device_by_node()' already takes a reference, there is
no need for an additional 'get_device()'.
Move the assignment to 'msm_dsi->phy_dev' a few lines above and remove the
unneeded 'get_device()' to solve both issues.
Fixes: ec31abf668 ("drm/msm/dsi: Separate PHY to another platform device")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Link: https://lore.kernel.org/r/f15bc57648a00e7c99f943903468a04639d50596.1628241097.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Rob Clark <robdclark@chromium.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit e1dee2c1de ]
In commit 4e1a720d03 ("Bluetooth: avoid killing an already killed
socket"), a check was added to sco_sock_kill to skip killing a socket
if the SOCK_DEAD flag was set.
This was done after a trace for a use-after-free bug showed that the
same sock pointer was being killed twice.
Unfortunately, this check prevents sco_sock_kill from running on any
socket. sco_sock_kill kills a socket only if it's zapped and orphaned,
however sock_orphan announces that the socket is dead before detaching
it. i.e., orphaned sockets have the SOCK_DEAD flag set.
To fix this, we remove the check for SOCK_DEAD, and avoid repeated
calls to sco_sock_kill by removing incorrect calls in:
1. sco_sock_timeout. The socket should not be killed on timeout as
further processing is expected to be done. For example,
sco_sock_connect sets the timer then waits for the socket to be
connected or for an error to be returned.
2. sco_conn_del. This function should clean up resources for the
connection, but the socket itself should be cleaned up in
sco_sock_release.
3. sco_sock_close. Calls to sco_sock_close in sco_sock_cleanup_listen
and sco_sock_release are followed by sco_sock_kill. Hence the
duplicated call should be removed.
Fixes: 4e1a720d03 ("Bluetooth: avoid killing an already killed socket")
Signed-off-by: Desmond Cheong Zhi Xi <desmondcheongzx@gmail.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f4eeaed04e ]
Sparse warnings triggered truncating the IDs of some platform device
tables. Unfortunately some of the IDs in the match tables were missed
which breaks audio. The KBL change has been verified to fix audio, the
CML change was not tested as it was found through grepping the broken
changes and found to match the same situation in anticipation that it
should also be fixed.
Fixes: 94efd726b9 ("ASoC: Intel: kbl_da7219_max98357a: shrink platform_id below 20 characters")
Fixes: 24e46fb811 ("ASoC: Intel: bxt_da7219_max98357a: shrink platform_id below 20 characters")
Signed-off-by: Curtis Malainey <cujomalainey@chromium.org>
Tested-by: Matt Davis <mattedavis@google.com>
Reviewed-by: Cezary Rojewski <cezary.rojewski@intel.com>
Acked-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Link: https://lore.kernel.org/r/20210809213544.1682444-1-cujomalainey@chromium.org
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 6ba34d3c73 ]
The cpuset fields that manage partition root state do not strictly
follow the cpuset locking rule that update to cpuset has to be done
with both the callback_lock and cpuset_mutex held. This is now fixed
by making sure that the locking rule is upheld.
Fixes: 3881b86128 ("cpuset: Add an error state to cpuset.sched.partition")
Fixes: 4b842da276 ("cpuset: Make CPU hotplug work with partition")
Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 0f3adb8a1e ]
Use more descriptive variable names for update_prstate(), remove
unnecessary code and fix some typos. There is no functional change.
Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit e8a767e04d ]
Currently at dp_pm_resume() is_connected state is decided base on hpd connection
status only. This will put is_connected in wrongly "true" state at the scenario
that dongle attached to DUT but without hmdi cable connecting to it. Fix this
problem by adding read sink count from dongle and decided is_connected state base
on both sink count and hpd connection status.
Changes in v2:
-- remove dp_get_sink_count() cand call drm_dp_read_sink_count()
Changes in v3:
-- delete status local variable from dp_pm_resume()
Changes in v4:
-- delete un necessary comment at dp_pm_resume()
Fixes: d9aa6571b2 ("drm/msm/dp: check sink_count before update is_connected status")
Signed-off-by: Kuogee Hsieh <khsieh@codeaurora.org>
Link: https://lore.kernel.org/r/1628092261-32346-1-git-send-email-khsieh@codeaurora.org
Tested-by: Stephen Boyd <swboyd@chromium.org>
Reviewed-by: Stephen Boyd <swboyd@chromium.org>
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Signed-off-by: Rob Clark <robdclark@chromium.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 9cbc861095 ]
The one of the latest change to the driver reveals the problem that
the error codes from callee aren't propagated to the caller of
__sso_led_dt_parse(). Fix this accordingly.
Fixes: 9999908ca1 ("leds: lgm-sso: Put fwnode in any case during ->probe()")
Fixes: c3987cd2bc ("leds: lgm: Add LED controller driver for LGM SoC")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 7fcc17d0cb ]
The Energy Model (EM) provides useful information about device power in
each performance state to other subsystems like: Energy Aware Scheduler
(EAS). The energy calculation in EAS does arithmetic operation based on
the EM em_cpu_energy(). Current implementation of that function uses
em_perf_state::cost as a pre-computed cost coefficient equal to:
cost = power * max_frequency / frequency.
The 'power' is expressed in milli-Watts (or in abstract scale).
There are corner cases when the EAS energy calculation for two Performance
Domains (PDs) return the same value. The EAS compares these values to
choose smaller one. It might happen that this values are equal due to
rounding error. In such scenario, we need better resolution, e.g. 1000
times better. To provide this possibility increase the resolution in the
em_perf_state::cost for 64-bit architectures. The cost of increasing
resolution on 32-bit is pretty high (64-bit division) and is not justified
since there are no new 32bit big.LITTLE EAS systems expected which would
benefit from this higher resolution.
This patch allows to avoid the rounding to milli-Watt errors, which might
occur in EAS energy estimation for each PD. The rounding error is common
for small tasks which have small utilization value.
There are two places in the code where it makes a difference:
1. In the find_energy_efficient_cpu() where we are searching for
best_delta. We might suffer there when two PDs return the same result,
like in the example below.
Scenario:
Low utilized system e.g. ~200 sum_util for PD0 and ~220 for PD1. There
are quite a few small tasks ~10-15 util. These tasks would suffer for
the rounding error. These utilization values are typical when running games
on Android. One of our partners has reported 5..10mA less battery drain
when running with increased resolution.
Some details:
We have two PDs: PD0 (big) and PD1 (little)
Let's compare w/o patch set ('old') and w/ patch set ('new')
We are comparing energy w/ task and w/o task placed in the PDs
a) 'old' w/o patch set, PD0
task_util = 13
cost = 480
sum_util_w/o_task = 215
sum_util_w_task = 228
scale_cpu = 1024
energy_w/o_task = 480 * 215 / 1024 = 100.78 => 100
energy_w_task = 480 * 228 / 1024 = 106.87 => 106
energy_diff = 106 - 100 = 6
(this is equal to 'old' PD1's energy_diff in 'c)')
b) 'new' w/ patch set, PD0
task_util = 13
cost = 480 * 1000 = 480000
sum_util_w/o_task = 215
sum_util_w_task = 228
energy_w/o_task = 480000 * 215 / 1024 = 100781
energy_w_task = 480000 * 228 / 1024 = 106875
energy_diff = 106875 - 100781 = 6094
(this is not equal to 'new' PD1's energy_diff in 'd)')
c) 'old' w/o patch set, PD1
task_util = 13
cost = 160
sum_util_w/o_task = 283
sum_util_w_task = 293
scale_cpu = 355
energy_w/o_task = 160 * 283 / 355 = 127.55 => 127
energy_w_task = 160 * 296 / 355 = 133.41 => 133
energy_diff = 133 - 127 = 6
(this is equal to 'old' PD0's energy_diff in 'a)')
d) 'new' w/ patch set, PD1
task_util = 13
cost = 160 * 1000 = 160000
sum_util_w/o_task = 283
sum_util_w_task = 293
scale_cpu = 355
energy_w/o_task = 160000 * 283 / 355 = 127549
energy_w_task = 160000 * 296 / 355 = 133408
energy_diff = 133408 - 127549 = 5859
(this is not equal to 'new' PD0's energy_diff in 'b)')
2. Difference in the 6% energy margin filter at the end of
find_energy_efficient_cpu(). With this patch the margin comparison also
has better resolution, so it's possible to have better task placement
thanks to that.
Fixes: 27871f7a8a ("PM: Introduce an Energy Model management framework")
Reported-by: CCJ Yeh <CCj.Yeh@mediatek.com>
Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Signed-off-by: Lukasz Luba <lukasz.luba@arm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit c73c57081b ]
Commit 08cc83cc7f ("net: dsa: add support for BRIDGE_MROUTER
attribute") added an option for users to turn off multicast flooding
towards the CPU if they turn off the IGMP querier on a bridge which
already has enslaved ports (echo 0 > /sys/class/net/br0/bridge/multicast_router).
And commit a8b659e7ff ("net: dsa: act as passthrough for bridge port flags")
simply papered over that issue, because it moved the decision to flood
the CPU with multicast (or not) from the DSA core down to individual drivers,
instead of taking a more radical position then.
The truth is that disabling multicast flooding to the CPU is simply
something we are not prepared to do now, if at all. Some reasons:
- ICMP6 neighbor solicitation messages are unregistered multicast
packets as far as the bridge is concerned. So if we stop flooding
multicast, the outside world cannot ping the bridge device's IPv6
link-local address.
- There might be foreign interfaces bridged with our DSA switch ports
(sending a packet towards the host does not necessarily equal
termination, but maybe software forwarding). So if there is no one
interested in that multicast traffic in the local network stack, that
doesn't mean nobody is.
- PTP over L4 (IPv4, IPv6) is multicast, but is unregistered as far as
the bridge is concerned. This should reach the CPU port.
- The switch driver might not do FDB partitioning. And since we don't
even bother to do more fine-grained flood disabling (such as "disable
flooding _from_port_N_ towards the CPU port" as opposed to "disable
flooding _from_any_port_ towards the CPU port"), this breaks standalone
ports, or even multiple bridges where one has an IGMP querier and one
doesn't.
Reverting the logic makes all of the above work.
Fixes: a8b659e7ff ("net: dsa: act as passthrough for bridge port flags")
Fixes: 08cc83cc7f ("net: dsa: add support for BRIDGE_MROUTER attribute")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit cbbf09b577 ]
DSA's idea of optimizing out multicast flooding to the CPU port leaves
quite a few holes open, so it should be reverted.
The mt7530 driver is the only new driver which added a .port_set_mrouter
implementation after the reorg from commit a8b659e7ff ("net: dsa: act
as passthrough for bridge port flags"), so it needs to be reverted
separately so that the other revert commit can go a bit further down the
git history.
Fixes: 5a30833b9a ("net: dsa: mt7530: support MDB and bridge flag operations")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 7df4e74494 ]
Qingfang points out that when a bridge with the default settings is
created and a port joins it:
ip link add br0 type bridge
ip link set swp0 master br0
DSA calls br_multicast_router() on the bridge to see if the br0 device
is a multicast router port, and if it is, it enables multicast flooding
to the CPU port, otherwise it disables it.
If we look through the multicast_router_show() sysfs or at the
IFLA_BR_MCAST_ROUTER netlink attribute, we see that the default mrouter
attribute for the bridge device is "1" (MDB_RTR_TYPE_TEMP_QUERY).
However, br_multicast_router() will return "0" (MDB_RTR_TYPE_DISABLED),
because an mrouter port in the MDB_RTR_TYPE_TEMP_QUERY state may not be
actually _active_ until it receives an actual IGMP query. So, the
br_multicast_router() function should really have been called
br_multicast_router_active() perhaps.
When/if an IGMP query is received, the bridge device will transition via
br_multicast_mark_router() into the active state until the
ip4_mc_router_timer expires after an multicast_querier_interval.
Of course, this does not happen if the bridge is created with an
mcast_router attribute of "2" (MDB_RTR_TYPE_PERM).
The point is that in lack of any IGMP query messages, and in the default
bridge configuration, unregistered multicast packets will not be able to
reach the CPU port through flooding, and this breaks many use cases
(most obviously, IPv6 ND, with its ICMP6 neighbor solicitation multicast
messages).
Leave the multicast flooding setting towards the CPU port down to a driver
level decision.
Fixes: 010e269f91 ("net: dsa: sync up switchdev objects and port attributes when joining the bridge")
Reported-by: DENG Qingfang <dqfext@gmail.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 47bfc4d128 ]
On TI K3 am64x platform the issue with RX IRQ is observed - it's become
disabled forever after .ndo_stop(). The K3 CPSW driver manipulates RX IRQ
by using standard Linux enable_irq()/disable_irq_nosync() API as there is
no IRQ enable/disable options in CPSW HW itself, as result during
.ndo_stop() following sequence happens
phy_stop()
teardown TX/RX channels
wait for TX tdown complete
napi_disable(TX)
clean up TX channels
(a)
napi_disable(RX)
At point (a) it's not possible to predict if RX IRQ was triggered or not.
if RX IRQ was triggered then it also not possible to definitely say if RX
NAPI was run or only scheduled and immediately canceled by
napi_disable(RX). Actually the last case causes RX IRQ to be permanently
disabled.
Another observed issue is that RX IRQ enable counter become unbalanced if
(gro_flush_timeout =! 0) while (napi_defer_hard_irqs == 0):
Unbalanced enable for IRQ 44
WARNING: CPU: 0 PID: 10 at ../kernel/irq/manage.c:776 __enable_irq+0x38/0x80
__enable_irq+0x38/0x80
enable_irq+0x54/0xb0
am65_cpsw_nuss_rx_poll+0x2f4/0x368
__napi_poll+0x34/0x1b8
net_rx_action+0xe4/0x220
_stext+0x11c/0x284
run_ksoftirqd+0x4c/0x60
To avoid above issues introduce flag indicating if RX was actually disabled
before enabling it in am65_cpsw_nuss_rx_poll() and restore RX IRQ state in
.ndo_open()
Fixes: 4f7cce2724 ("net: ethernet: ti: am65-cpsw: add support for am64x cpsw3g")
Signed-off-by: Vignesh Raghavendra <vigneshr@ti.com>
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 713baf3dae ]
An earlier commit replaced using batostr to using %pMR sprintf for the
construction of session->name. Static analysis detected that this new
method can use a total of 21 characters (including the trailing '\0')
so we need to increase the BTNAMSIZ from 18 to 21 to fix potential
buffer overflows.
Addresses-Coverity: ("Out-of-bounds write")
Fixes: fcb73338ed ("Bluetooth: Use %pMR in sprintf/seq_printf instead of batostr")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 112cedc8e6 ]
If a kernel module gets unloaded then it printed report about a leak before
commit 275678e7a9 ("debugfs: Check module state before warning in
{full/open}_proxy_open()"). An additional check was added in this commit to
avoid this printing. But it was forgotten that the function must return an
error in this case because it was not actually opened.
As result, the systems started to crash or to hang when a module was
unloaded while something was trying to open a file.
Fixes: 275678e7a9 ("debugfs: Check module state before warning in {full/open}_proxy_open()")
Cc: Taehee Yoo <ap420073@gmail.com>
Reported-by: Mário Lopes <ml@simonwunderlich.de>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Link: https://lore.kernel.org/r/20210802162444.7848-1-sven@narfation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f8b17a0bd9 ]
TX timestamps are sent by SJA1110 as Ethernet packets containing
metadata, so they are received by the tagging driver but must be
processed by the switch driver - the one that is stateful since it
keeps the TX timestamp queue.
This means that there is an sja1110_process_meta_tstamp() symbol
exported by the switch driver which is called by the tagging driver.
There is a shim definition for that function when the switch driver is
not compiled, which does nothing, but that shim is not effective when
the tagging protocol driver is built-in and the switch driver is a
module, because built-in code cannot call symbols exported by modules.
So add an optional dependency between the tagger and the switch driver,
if PTP support is enabled in the switch driver. If PTP is not enabled,
sja1110_process_meta_tstamp() will translate into the shim "do nothing
with these meta frames" function.
Fixes: 566b18c8b7 ("net: dsa: sja1105: implement TX timestamping for SJA1110")
Reported-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 8b6e638b4b ]
Upcoming patches will add tag_8021q related logic to switch.c and
port.c, in order to allow it to make use of cross-chip notifiers.
In addition, a struct dsa_8021q_context *ctx pointer will be added to
struct dsa_switch.
It seems fairly low-reward to #ifdef the *ctx from struct dsa_switch and
to provide shim implementations of the entire tag_8021q.c calling
surface (not even clear what to do about the tag_8021q cross-chip
notifiers to avoid compiling them). The runtime overhead for switches
which don't use tag_8021q is fairly small because all helpers will check
for ds->tag_8021q_ctx being a NULL pointer and stop there.
So let's make it part of dsa_core.o.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit e3d4571955 ]
The SMSM driver detects interrupt edges by tracking the last state
it has seen (and has triggered the interrupt handler for). This works
fine, but only if the interrupt does not change state while masked.
For example, if an interrupt is unmasked while the state is HIGH,
the stored last_value for that interrupt might still be LOW. Then,
when the remote processor triggers smsm_intr() we assume that nothing
has changed, even though the state might have changed from HIGH to LOW.
Attempt to fix this by checking the current remote state before
unmasking an IRQ. Use atomic operations to avoid the interrupt handler
from interfering with the unmask function.
This fixes modem crashes in some edge cases with the BAM-DMUX driver.
Specifically, the BAM-DMUX interrupt handler is not called for the
HIGH -> LOW smsm state transition if the BAM-DMUX driver is loaded
(and therefore unmasks the interrupt) after the modem was already started:
qcom-q6v5-mss 4080000.remoteproc: fatal error received: a2_task.c:3188:
Assert FALSE failed: A2 DL PER deadlock timer expired waiting for Apps ACK
Fixes: c97c4090ff ("soc: qcom: smsm: Add driver for Qualcomm SMSM")
Signed-off-by: Stephan Gerhold <stephan@gerhold.net>
Link: https://lore.kernel.org/r/20210712135703.324748-2-stephan@gerhold.net
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 0e00392a89 ]
PME signaling is only enabled by __pci_enable_wake() if the target
device can signal PME from the given target power state (to avoid
pointless reconfiguration of the device), but if the hierarchy above
the device goes into D3cold, the device itself will end up in D3cold
too, so if it can signal PME from D3cold, it should be enabled to
do so in __pci_enable_wake().
[Note that if the device does not end up in D3cold and it cannot
signal PME from the original target power state, it will not signal
PME, so in that case the behavior does not change.]
Link: https://lore.kernel.org/linux-pm/3149540.aeNJFYEL58@kreacher/
Fixes: 5bcc2fb4e8 ("PCI PM: Simplify PCI wake-up code")
Reported-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Reported-by: Utkarsh H Patel <utkarsh.h.patel@intel.com>
Reported-by: Koba Ko <koba.ko@canonical.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Tested-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit da9f215068 ]
It is inconsistent to return PCI_D0 from pci_target_state() instead
of the original target state if 'wakeup' is true and the device
cannot signal PME from D0.
This only happens when the device cannot signal PME from the original
target state and any shallower power states (including D0) and that
case is effectively equivalent to the one in which PME singaling is
not supported at all. Since the original target state is returned in
the latter case, make the function do that in the former one too.
Link: https://lore.kernel.org/linux-pm/3149540.aeNJFYEL58@kreacher/
Fixes: 666ff6f83e ("PCI/PM: Avoid using device_may_wakeup() for runtime PM")
Reported-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Reported-by: Utkarsh H Patel <utkarsh.h.patel@intel.com>
Reported-by: Koba Ko <koba.ko@canonical.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Tested-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 1ac61faf6e ]
Plane constraints firmware interface is to override the default
alignment for a given color format. By default venus hardware has
alignments as 128x32, but NV12 was defined differently to meet
various usecases. Compressed NV12 has always been aligned as 128x32,
hence not needed to override the default alignment.
Fixes: bc28936bbb ("media: venus: helpers, hfi, vdec: Set actual plane constraints to FW")
Signed-off-by: Mansur Alisha Shaik <mansur@codeaurora.org>
Reviewed-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org>
Signed-off-by: Stanimir Varbanov <stanimir.varbanov@linaro.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 09ea9719a4 ]
Currently the call to find_format can potentially return a NULL to
fmt and the nullpointer is later dereferenced on the assignment of
pixmp->num_planes = fmt->num_planes. Fix this by adding a NULL pointer
check and returning NULL for the failure case.
Addresses-Coverity: ("Dereference null return")
Fixes: aaaa93eda6 ("[media] media: venus: venc: add video encoder files")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Stanimir Varbanov <stanimir.varbanov@linaro.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 331e06bbde ]
In case of error, the function qcom_smem_get() returns ERR_PTR()
and never returns NULL. The NULL test in the return value check
should be replaced with IS_ERR().
Fixes: d566e78dd6 ("media: venus : hfi: add venus image info into smem")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Signed-off-by: Stanimir Varbanov <stanimir.varbanov@linaro.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 6fa54bc713 ]
If em28xx_ir_init fails, it would decrease the refcount of dev. However,
in the em28xx_ir_fini, when ir is NULL, it goes to ref_put and decrease
the refcount of dev. This will lead to a refcount bug.
Fix this bug by removing the kref_put in the error handling code
of em28xx_ir_init.
refcount_t: underflow; use-after-free.
WARNING: CPU: 0 PID: 7 at lib/refcount.c:28 refcount_warn_saturate+0x18e/0x1a0 lib/refcount.c:28
Modules linked in:
CPU: 0 PID: 7 Comm: kworker/0:1 Not tainted 5.13.0 #3
Workqueue: usb_hub_wq hub_event
RIP: 0010:refcount_warn_saturate+0x18e/0x1a0 lib/refcount.c:28
Call Trace:
kref_put.constprop.0+0x60/0x85 include/linux/kref.h:69
em28xx_usb_disconnect.cold+0xd7/0xdc drivers/media/usb/em28xx/em28xx-cards.c:4150
usb_unbind_interface+0xbf/0x3a0 drivers/usb/core/driver.c:458
__device_release_driver drivers/base/dd.c:1201 [inline]
device_release_driver_internal+0x22a/0x230 drivers/base/dd.c:1232
bus_remove_device+0x108/0x160 drivers/base/bus.c:529
device_del+0x1fe/0x510 drivers/base/core.c:3540
usb_disable_device+0xd1/0x1d0 drivers/usb/core/message.c:1419
usb_disconnect+0x109/0x330 drivers/usb/core/hub.c:2221
hub_port_connect drivers/usb/core/hub.c:5151 [inline]
hub_port_connect_change drivers/usb/core/hub.c:5440 [inline]
port_event drivers/usb/core/hub.c:5586 [inline]
hub_event+0xf81/0x1d40 drivers/usb/core/hub.c:5668
process_one_work+0x2c9/0x610 kernel/workqueue.c:2276
process_scheduled_works kernel/workqueue.c:2338 [inline]
worker_thread+0x333/0x5b0 kernel/workqueue.c:2424
kthread+0x188/0x1d0 kernel/kthread.c:319
ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:295
Reported-by: Dongliang Mu <mudongliangabcd@gmail.com>
Fixes: ac56886371 ("media: em28xx: Fix possible memory leak of em28xx struct")
Signed-off-by: Dongliang Mu <mudongliangabcd@gmail.com>
Signed-off-by: Sean Young <sean@mess.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 64f67b5240 ]
Some 2-in-1s with a detachable (USB) keyboard(dock) have mute-LEDs in
the speaker- and/or mic-mute keys on the keyboard.
Examples of this are the Lenovo Thinkpad10 tablet (with its USB kbd-dock)
and the HP x2 10 series.
The detachable nature of these keyboards means that the keyboard and
thus the mute LEDs may show up after the user (or userspace restoring
old mixer settings) has muted the speaker and/or mic.
Current LED-class devices with a default_trigger of "audio-mute" or
"audio-micmute" initialize the brightness member of led_classdev with
ledtrig_audio_get() before registering the LED.
This makes the software state after attaching the keyboard match the
actual audio mute state, e.g. cat /sys/class/leds/foo/brightness will
show the right value.
But before this commit nothing was actually calling the led_classdev's
brightness_set[_blocking] callback so the value returned by
ledtrig_audio_get() was never actually being sent to the hw, leading
to the mute LEDs staying in their default power-on state, after
attaching the keyboard, even if ledtrig_audio_get() returned a different
state.
This could be fixed by having the individual LED drivers call
brightness_set[_blocking] themselves after registering the LED,
but this really is something which should be done by a led-trigger
activate callback.
Add an activate callback for this, fixing the issue of the
mute LEDs being out of sync after (re)attaching the keyboard.
Cc: Takashi Iwai <tiwai@suse.de>
Fixes: faa2541f5b ("leds: trigger: Introduce audio mute LED trigger")
Reviewed-by: Marek Behún <kabel@kernel.org>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 8aa41952ef ]
fwnode_get_next_available_child_node() bumps a reference counting of
a returned variable. We have to balance it whenever we return to
the caller.
Fixes: e1c6edcbea ("leds: rt8515: Add Richtek RT8515 LED driver")
Signed-off-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: Pavel Machek <pavel@ucw.cz>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 7e1baaaa24 ]
device_get_next_child_node() bumps a reference counting of a returned variable.
We have to balance it whenever we return to the caller.
Fixes: 8cd7d6daba ("leds: lt3593: Add device tree probing glue")
Cc: Daniel Mack <daniel@zonque.org>
Signed-off-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 1ed4d05e0a ]
When requesting GPIO line the probe can be deferred.
In such case don't spam logs with an error message.
This can be achieved by switching to dev_err_probe().
Fixes: c3987cd2bc ("leds: lgm: Add LED controller driver for LGM SoC")
Cc: Amireddy Mallikarjuna reddy <mallikarjunax.reddy@linux.intel.com>
Signed-off-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 9999908ca1 ]
fwnode_get_next_child_node() bumps a reference counting of a returned variable.
We have to balance it whenever we return to the caller.
All the same in fwnode_for_each_child_node() case.
Fixes: c3987cd2bc ("leds: lgm: Add LED controller driver for LGM SoC")
Cc: Amireddy Mallikarjuna reddy <mallikarjunax.reddy@linux.intel.com>
Signed-off-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f16a3bb69a ]
The driver is written as if platform_get_irq() returns 0 on errors (while
actually it returns a negative error code), blithely passing these error
codes to request_irq() (which takes *unsigned* IRQ #) -- which fails with
-EINVAL. Add the necessary error check to the pre-existing *if* statement
forcing the driver into the polling mode...
Fixes: 4ad48e6ab1 ("i2c: Renesas Highlander FPGA SMBus support")
Signed-off-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit c96ca5604a ]
Hihope boards use Realtek PHY. From the very beginning it use only
tx delays. However the phy driver commit bbc4d71d63
("net: phy: realtek: fix rtl8211e rx/tx delay config") introduced
NFS mount failure. Now it needs rx delay inaddition to tx delay
for NFS mount to work. This patch fixes NFS mount failure issue
by adding MAC internal rx delay.
Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>
Fixes: bbc4d71d63 ("net: phy: realtek: fix rtl8211e rx/tx delay config")
Link: https://lore.kernel.org/r/20210721180632.15080-1-biju.das.jz@bp.renesas.com
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 369e955b3d ]
Make sure to call btf__free() (and not simply free(), which does not
free all pointers stored in the struct) on pointers to struct btf
objects retrieved at various locations.
These were found while updating the calls to btf__get_from_id().
Fixes: 999d82cbc0 ("tools/bpf: enhance test_btf file testing to test func info")
Fixes: 254471e57a ("tools/bpf: bpftool: add support for func types")
Fixes: 7b612e291a ("perf tools: Synthesize PERF_RECORD_* for loaded BPF programs")
Fixes: d56354dc49 ("perf tools: Save bpf_prog_info and BTF of new BPF programs")
Fixes: 47c09d6a9f ("bpftool: Introduce "prog profile" command")
Fixes: fa853c4b83 ("perf stat: Enable counting events for BPF programs")
Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210729162028.29512-5-quentin@isovalent.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 6d2d73cdd6 ]
Variable "err" is initialised to -EINVAL so that this error code is
returned when something goes wrong in libbpf_find_prog_btf_id().
However, a recent change in the function made use of the variable in
such a way that it is set to 0 if retrieving linear information on the
program is successful, and this 0 value remains if we error out on
failures at later stages.
Let's fix this by setting err to -EINVAL later in the function.
Fixes: e9fc3ce99b ("libbpf: Streamline error reporting for high-level APIs")
Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210729162028.29512-2-quentin@isovalent.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit a25fca4d3c ]
This patch fixes the MGMT add_advertising command repsones with the
wrong opcode when it is trying to return the not supported error.
Fixes: cbbdfa6f33 ("Bluetooth: Enable controller RPA resolution using Experimental feature")
Signed-off-by: Tedd Ho-Jeong An <tedd.an@intel.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit c29b6b0b12 ]
The reference to the drm_device that was acquired by
devm_drm_dev_alloc() is released automatically by the devres
infrastructure. It must not be released manually, as that causes a
reference underflow..
Fixes: ea6aae1518 ("drm: rcar-du: Embed drm_device in rcar_du_device")
Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
Reviewed-by: Kieran Bingham <kieran.bingham+renesas@ideasonboard.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit acf34954ef ]
The commit that introduced devlink support released devlink resources in
wrong order, that made an unwind flow to be asymmetrical. In addition,
the am65-cpsw-nuss used internal to devlink core field - registered.
In order to fix the unwind flow and remove such access to the
registered field, rewrite the code to call devlink_port_unregister only
on registered ports.
Fixes: 58356eb31d ("net: ti: am65-cpsw-nuss: Add devlink support")
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 8ca34a13f7 ]
Syzbot reported warning in netlbl_cipsov4_add(). The
problem was in too big doi_def->map.std->lvl.local_size
passed to kcalloc(). Since this value comes from userpace there is
no need to warn if value is not correct.
The same problem may occur with other kcalloc() calls in
this function, so, I've added __GFP_NOWARN flag to all
kcalloc() calls there.
Reported-and-tested-by: syzbot+cdd51ee2e6b0b2e18c0d@syzkaller.appspotmail.com
Fixes: 96cb8e3313 ("[NetLabel]: CIPSOv4 and Unlabeled packet integration")
Acked-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: Pavel Skripkin <paskripkin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 0c9856e4ed ]
There is some sort of corner case behavior of the controller,
which could rarely be triggered at least on i.MX6SX connected
to 800x480 DPI panel and i.MX8MM connected to DPI->DSI->LVDS
bridged 1920x1080 panel (and likely on other setups too), where
the image on the panel shifts to the right and wraps around.
This happens either when the controller is enabled on boot or
even later during run time. The condition does not correct
itself automatically, i.e. the display image remains shifted.
It seems this problem is known and is due to sporadic underflows
of the LCDIF FIFO. While the LCDIF IP does have underflow/overflow
IRQs, neither of the IRQs trigger and neither IRQ status bit is
asserted when this condition occurs.
All known revisions of the LCDIF IP have CTRL1 RECOVER_ON_UNDERFLOW
bit, which is described in the reference manual since i.MX23 as
"
Set this bit to enable the LCDIF block to recover in the next
field/frame if there was an underflow in the current field/frame.
"
Enable this bit to mitigate the sporadic underflows.
Fixes: 45d59d7040 ("drm: Add new driver for MXSFB controller")
Signed-off-by: Marek Vasut <marex@denx.de>
Cc: Daniel Abrecht <public@danielabrecht.ch>
Cc: Emil Velikov <emil.l.velikov@gmail.com>
Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Cc: Lucas Stach <l.stach@pengutronix.de>
Cc: Stefan Agner <stefan@agner.ch>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Reviewed-by: Jagan Teki <jagan@amarulasolutions.com>
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20210620224701.189289-1-marex@denx.de
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 15d428e6fe ]
In cpuset_hotplug_workfn(), the detection of whether the cpu list
has been changed is done by comparing the effective cpus of the top
cpuset with the cpu_active_mask. However, in the rare case that just
all the CPUs in the subparts_cpus are offlined, the detection fails
and the partition states are not updated correctly. Fix it by forcing
the cpus_updated flag to true in this particular case.
Fixes: 4b842da276 ("cpuset: Make CPU hotplug work with partition")
Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 26ab7b3845 ]
This commit does a cleanup in LRO configuration.
LRO is a parameter of an RQ, but its state is changed by modifying a TIR
related to the RQ.
The current status: LRO for tunneled packets is not supported in the
driver, inner TIRs may enable LRO on creation, but LRO status of inner
TIRs isn't changed in mlx5e_modify_tirs_lro(). This is inconsistent, but
as long as the firmware doesn't declare support for tunneled LRO, it
works, because the same RQs are shared between the inner and outer TIRs.
This commit does two fixes:
1. If the firmware has the tunneled LRO capability, LRO is blocked
altogether, because it's not possible to block it for inner TIRs only,
when the same RQs are shared between inner and outer TIRs, and the
driver won't be able to handle tunneled LRO traffic.
2. mlx5e_modify_tirs_lro() is patched to modify LRO state for all TIRs,
including inner ones, because all TIRs related to an RQ should agree on
their LRO state.
Fixes: 7b3722fa9e ("net/mlx5e: Support RSS for GRE tunneled packets")
Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 9c43f3865c ]
TIR's rx_hash_field_selector_inner can be enabled only when
tunneled_offload_en = 1. tunneled_offload_en is filled according to the
tunneled_offload_en field in struct mlx5e_params, which is false in the
IPoIB profile. On the other hand, the IPoIB profile passes inner_ttc =
true to mlx5e_create_indirect_tirs, which potentially allows the latter
function to attempt to create inner indirect TIRs without having
tunneled_offload_en set.
This commit prohibits this behavior by passing inner_ttc = false to
mlx5e_create_indirect_tirs. The latter function won't attempt to create
inner indirect TIRs.
As inner indirect TIRs are not created in the IPoIB profile (this commit
blocks it explicitly, and even before they would have failed to be
created), the call to mlx5e_create_inner_ttc_table in
mlx5i_create_flow_steering is a no-op and can be removed.
Fixes: 46dc933cee ("net/mlx5e: Provide explicit directive if to create inner indirect tirs")
Fixes: 458821c72b ("net/mlx5e: IPoIB, Add inner TTC table to IPoIB flow steering")
Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 72ccc373b0 ]
After enabling CONFIG_REGULATOR_DEBUG=y we observer below debug logs.
Changes help link VCCK and VDDEE pwm regulator to 5V regulator supply
instead of dummy regulator.
[ 7.117140] pwm-regulator regulator-vcck: Looking up pwm-supply from device tree
[ 7.117153] pwm-regulator regulator-vcck: Looking up pwm-supply property in node /regulator-vcck failed
[ 7.117184] VCCK: supplied by regulator-dummy
[ 7.117194] regulator-dummy: could not add device link regulator.8: -ENOENT
[ 7.117266] VCCK: 860 <--> 1140 mV at 986 mV, enabled
[ 7.118498] VDDEE: will resolve supply early: pwm
[ 7.118515] pwm-regulator regulator-vddee: Looking up pwm-supply from device tree
[ 7.118526] pwm-regulator regulator-vddee: Looking up pwm-supply property in node /regulator-vddee failed
[ 7.118553] VDDEE: supplied by regulator-dummy
[ 7.118563] regulator-dummy: could not add device link regulator.9: -ENOENT
Fixes: 087a1d8b4e ("ARM: dts: meson8b: ec100: add the VDDEE regulator")
Fixes: 3e7db1c1b7 ("ARM: dts: meson8b: ec100: improve the description of the regulators")
Cc: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Signed-off-by: Anand Moon <linux.amoon@gmail.com>
Reviewed-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Signed-off-by: Neil Armstrong <narmstrong@baylibre.com>
Link: https://lore.kernel.org/r/20210705112358.3554-4-linux.amoon@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 876228e9f9 ]
After enabling CONFIG_REGULATOR_DEBUG=y we observe below debug logs.
Changes help link VCCK and VDDEE pwm regulator to 5V regulator supply
instead of dummy regulator.
[ 7.117140] pwm-regulator regulator-vcck: Looking up pwm-supply from device tree
[ 7.117153] pwm-regulator regulator-vcck: Looking up pwm-supply property in node /regulator-vcck failed
[ 7.117184] VCCK: supplied by regulator-dummy
[ 7.117194] regulator-dummy: could not add device link regulator.8: -ENOENT
[ 7.117266] VCCK: 860 <--> 1140 mV at 986 mV, enabled
[ 7.118498] VDDEE: will resolve supply early: pwm
[ 7.118515] pwm-regulator regulator-vddee: Looking up pwm-supply from device tree
[ 7.118526] pwm-regulator regulator-vddee: Looking up pwm-supply property in node /regulator-vddee failed
[ 7.118553] VDDEE: supplied by regulator-dummy
[ 7.118563] regulator-dummy: could not add device link regulator.9: -ENOENT
Fixes: 524d96083b ("ARM: dts: meson8b: odroidc1: add the CPU voltage regulator")
Fixes: 8bdf38be71 ("ARM: dts: meson8b: odroidc1: add the VDDEE regulator")
Tested-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Cc: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Signed-off-by: Anand Moon <linux.amoon@gmail.com>
Reviewed-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
[narmstrong: fixed typo in commit s/observer/observe/]
Signed-off-by: Neil Armstrong <narmstrong@baylibre.com>
Link: https://lore.kernel.org/r/20210705112358.3554-2-linux.amoon@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 44cf630bcb ]
We are seeing "imprecise external abort (0x1406)" errors during boot
(which then cause the whole board to hang) on Meson8 (but not Meson8m2).
These are observed while trying to access the GPU's registers when the
MALI clock is running at it's default setting of 24MHz. The 3.10 vendor
kernel uses 318.75MHz as "default" GPU frequency. Using that makes the
"imprecise external aborts" go away.
Add the assigned-clocks and assigned-clock-rates properties to also bump
the MALI clock to 318.75MHz before accessing any of it's registers.
Fixes: 7d3f6b536e ("ARM: dts: meson8: add the Mali-450 MP6 GPU")
Reported-by: Demetris Ierokipides <ierokipides.dem@gmail.com>
Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Reviewed-by: Neil Armstrong <narmstrong@baylibre.com>
Signed-off-by: Neil Armstrong <narmstrong@baylibre.com>
Link: https://lore.kernel.org/r/20210711214023.2163565-1-martin.blumenstingl@googlemail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 525e2f9fd0 ]
st->bucket stores the current bucket number.
st->offset stores the offset within this bucket that is the sk to be
seq_show(). Thus, st->offset only makes sense within the same
st->bucket.
These two variables are an optimization for the common no-lseek case.
When resuming the seq_file iteration (i.e. seq_start()),
tcp_seek_last_pos() tries to continue from the st->offset
at bucket st->bucket.
However, it is possible that the bucket pointed by st->bucket
has changed and st->offset may end up skipping the whole st->bucket
without finding a sk. In this case, tcp_seek_last_pos() currently
continues to satisfy the offset condition in the next (and incorrect)
bucket. Instead, regardless of the offset value, the first sk of the
next bucket should be returned. Thus, "bucket == st->bucket" check is
added to tcp_seek_last_pos().
The chance of hitting this is small and the issue is a decade old,
so targeting for the next tree.
Fixes: a8b690f98b ("tcp: Fix slowness in read /proc/net/tcp")
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Acked-by: Kuniyuki Iwashima <kuniyu@amazon.co.jp>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20210701200541.1033917-1-kafai@fb.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 5810323ba6 ]
This fixes a bug which if we probe a non-existing
I2C device, and the SMU returns 0xFF, from then on
we can never communicate with the SMU, because the
code before this patch reads and interprets 0xFF
as a terminal error, and thus we never write 0
into register 90 to clear the status (and
subsequently send a new command to the SMU.)
It is not an error that the SMU returns status
0xFF. This means that the SMU executed the last
command successfully (execution status), but the
command result is an error of some sort (execution
result), depending on what the command was.
When doing a status check of the SMU, before we
send a new command, the only status which
precludes us from sending a new command is 0--the
SMU hasn't finished executing a previous command,
and 0xFC--the SMU is busy.
This bug was seen as the following line in the
kernel log,
amdgpu: Msg issuing pre-check failed(0xff) and SMU may be not in the right state!
when subsequent SMU commands, not necessarily
related to I2C, were sent to the SMU.
This patch fixes this bug.
v2: Add a comment to the description of
__smu_cmn_poll_stat() to explain why we're NOT
defining the SMU FW return codes as macros, but
are instead hard-coding them. Such a change, can
be followed up by a subsequent patch.
v3: The changes are,
a) Add comments to break labels in
__smu_cmn_reg2errno().
b) When an unknown/unspecified/undefined result is
returned back from the SMU, map that to
-EREMOTEIO, to distinguish failure at the SMU
FW.
c) Add kernel-doc to
smu_cmn_send_msg_without_waiting(),
smu_cmn_wait_for_response(),
smu_cmn_send_smc_msg_with_param().
d) In smu_cmn_send_smc_msg_with_param(), since we
wait for completion of the command, if the
result of the completion is
undefined/unknown/unspecified, we print that to
the kernel log.
v4: a) Add macros as requested, though redundant, to
be removed when SMU consolidates for all
ASICs--see comment in code.
b) Get out if the SMU code is unknown.
v5: Rename the macro names.
Cc: Alex Deucher <Alexander.Deucher@amd.com>
Cc: Evan Quan <evan.quan@amd.com>
Cc: Lijo Lazar <Lijo.Lazar@amd.com>
Fixes: fcb1fe9c9e ("drm/amd/powerplay: pre-check the SMU state before issuing message")
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Reviewed-by: Alex Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 9af417610b ]
The bounds check of id is off-by-one and the comparison should
be >= rather >. Currently the WARN_ON_ONCE check does not stop
the out of range indexing of &ldev->ctx.table[id] so also add
a return path if the bounds are out of range.
Addresses-Coverity: ("Illegal address computation").
Fixes: 5609c185f2 ("6lowpan: iphc: add support for stateful compression")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 20a831f04f ]
When reading the support debug features failed, there are not available
features init. Continue to set the debug features is illogical, we should
skip btintel_set_debug_features(), even if check it by "if (!features)".
Fixes: c453b10c2b ("Bluetooth: btusb: Configure Intel debug feature based on available support")
Signed-off-by: Jun Miao <jun.miao@windriver.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 59da0b38bc ]
Smatch complains that some of these struct members are not initialized
leading to a stack information disclosure:
net/bluetooth/sco.c:778 sco_conn_defer_accept() warn:
check that 'cp.retrans_effort' doesn't leak information
This seems like a valid warning. I've added a default case to fix
this issue.
Fixes: 2f69a82acf ("Bluetooth: Use voice setting in deferred SCO connection request")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit c275e5d349 ]
Inside function mt9m114_detect(), variable "retvalue" could
be uninitialized if mt9m114_read_reg() returns error, however, it
is used in the later if statement, which is potentially unsafe.
The local variable "retvalue" is renamed to "model" to avoid
confusion.
Link: https://lore.kernel.org/linux-media/20210625053858.3862-1-yzhai003@ucr.edu
Fixes: ad85094 (media / atomisp: fix the uninitialized use of model ID)
Signed-off-by: Yizhuo <yzhai003@ucr.edu>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 44693d74f5 ]
The frame memory control register value is currently determined
before userspace selects the final capture format and never corrected.
Update ctx->frame_mem_ctrl in __coda_start_decoding() to fix decoding
into YUV420 or YVU420 capture buffers.
Reported-by: Andrej Picej <andrej.picej@norik.com>
Fixes: 497e6b8559 ("media: coda: add sequence initialization work")
Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit e58430e1d4 ]
There are a few bugs in this code. 1) No checks for whether
dma_alloc_attrs() or __get_free_pages() failed. 2) If
video_register_device() fails it doesn't clean up the dma attrs or the
free pages. 3) The video_device_release() function frees "vfd" which
leads to a use after free on the next line. The call to
video_unregister_device() is not required so I have just removed that.
Fixes: f7e7b48e6d ("[media] rockchip/rga: v4l2 m2m support")
Reported-by: Dongliang Mu <mudongliangabcd@gmail.com>
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 6f5885a775 ]
In go7007_alloc() kzalloc() is used for struct go7007
allocation. It means that there is no need in zeroing
any members, because kzalloc will take care of it.
Removing these reduntant initialization steps increases
execution speed a lot:
Before:
+ 86.802 us | go7007_alloc();
After:
+ 29.595 us | go7007_alloc();
Fixes: 866b8695d6 ("Staging: add the go7007 video driver")
Signed-off-by: Pavel Skripkin <paskripkin@gmail.com>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 131ae388b8 ]
In dvb_usb_i2c_init, if i2c_add_adapter fails, it only prints an error
message, and then continues to set DVB_USB_STATE_I2C. This affects the
logic of dvb_usb_i2c_exit, which leads to that, the deletion of i2c_adap
even if the i2c_add_adapter fails.
Fix this by returning at the failure of i2c_add_adapter and then move
dvb_usb_i2c_exit out of the error handling code of dvb_usb_i2c_init.
Fixes: 13a79f14ab ("media: dvb-usb: Fix memory leak at error in dvb_usb_device_init()")
Signed-off-by: Dongliang Mu <mudongliangabcd@gmail.com>
Signed-off-by: Sean Young <sean@mess.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 797c061ad7 ]
If vp702x_usb_in_op fails, the mac address is not initialized.
And vp702x_read_mac_addr does not handle this failure, which leads to
the uninit-value in dvb_usb_adapter_dvb_init.
Fix this by handling the failure of vp702x_usb_in_op.
Fixes: 786baecfe7 ("[media] dvb-usb: move it to drivers/media/usb/dvb-usb")
Signed-off-by: Dongliang Mu <mudongliangabcd@gmail.com>
Signed-off-by: Sean Young <sean@mess.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit c5453769f7 ]
If dibusb_read_eeprom_byte fails, the mac address is not initialized.
And nova_t_read_mac_address does not handle this failure, which leads to
the uninit-value in dvb_usb_adapter_dvb_init.
Fix this by handling the failure of dibusb_read_eeprom_byte.
Reported-by: syzbot+e27b4fd589762b0b9329@syzkaller.appspotmail.com
Fixes: 786baecfe7 ("[media] dvb-usb: move it to drivers/media/usb/dvb-usb")
Signed-off-by: Dongliang Mu <mudongliangabcd@gmail.com>
Signed-off-by: Sean Young <sean@mess.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit c2255ff477 ]
The failure to register devlink will leave the system with dangled
devlink resource, which is not cleaned if devlink_port_register() fails.
In order to remove access to ".registered" field of struct devlink_port,
require both devlink_register and devlink_port_register to success and
check it through device pointer.
Fixes: fbfb803153 ("ionic: Add hardware init and device commands")
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Acked-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 3ecc8cb7c0 ]
This race was discovered when I carefully analyzed the code to locate
another firmware-related UAF issue. It can be triggered only when the
firmware load operation is executed during suspend. This possibility is
almost impossible because there are few firmware load and suspend actions
in the actual environment.
CPU0 CPU1
__device_uncache_fw_images(): assign_fw():
fw_cache_piggyback_on_request()
<----- P0
spin_lock(&fwc->name_lock);
...
list_del(&fce->list);
spin_unlock(&fwc->name_lock);
uncache_firmware(fce->name);
<----- P1
kref_get(&fw_priv->ref);
If CPU1 is interrupted at position P0, the new 'fce' has been added to the
list fwc->fw_names by the fw_cache_piggyback_on_request(). In this case,
CPU0 executes __device_uncache_fw_images() and will be able to see it when
it traverses list fwc->fw_names. Before CPU1 executes kref_get() at P1, if
CPU0 further executes uncache_firmware(), the count of fw_priv->ref may
decrease to 0, causing fw_priv to be released in advance.
Move kref_get() to the lock protection range of fwc->name_lock to fix it.
Fixes: ac39b3ea73 ("firmware loader: let caching firmware piggyback on loading firmware")
Acked-by: Luis Chamberlain <mcgrof@kernel.org>
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Link: https://lore.kernel.org/r/20210719064531.3733-2-thunder.leizhen@huawei.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit a6579cbfd7 ]
In the case where IS_ERR(lsi->si_sc_inode) is true the error exit path
to free_local does not kfree the allocated object lsi leading to a memory
leak. Fix this by kfree'ing lst before taking the error exit path.
Addresses-Coverity: ("Resource leak")
Fixes: 97fd734ba1 ("gfs2: lookup local statfs inodes prior to journal recovery")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit d43b3a989b ]
rpmhpd_aggregate_corner() takes a corner as parameter, but in
rpmhpd_power_off() the code requests the level of the first corner
instead.
In all (known) current cases the first corner has level 0, so this
change should be a nop, but in case that there's a power domain with a
non-zero lowest level this makes sure that rpmhpd_power_off() actually
requests the lowest level - which is the closest to "power off" we can
get.
While touching the code, also skip the unnecessary zero-initialization
of "ret".
Fixes: 279b7e8a62 ("soc: qcom: rpmhpd: Add RPMh power domain driver")
Reviewed-by: Rajendra Nayak <rnayak@codeaurora.org>
Reviewed-by: Stephen Boyd <swboyd@chromium.org>
Reviewed-by: Sibi Sankar <sibis@codeaurora.org>
Tested-by: Sibi Sankar <sibis@codeaurora.org>
Link: https://lore.kernel.org/r/20210703005416.2668319-2-bjorn.andersson@linaro.org
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 8b4b06919f ]
i40e_config_vf_promiscuous_mode() calls
i40e_getnum_vf_vsi_vlan_filters() without acquiring the
mac_filter_hash_lock spinlock.
This is unsafe because mac_filter_hash may get altered in another thread
while i40e_getnum_vf_vsi_vlan_filters() traverses the hashes.
Simply adding the spinlock in i40e_getnum_vf_vsi_vlan_filters() is not
possible as it already gets called in i40e_get_vlan_list_sync() with the
spinlock held. Therefore adding a wrapper that acquires the spinlock and
call the correct function where appropriate.
Fixes: 37d318d780 ("i40e: Remove scheduling while atomic possibility")
Fix-suggested-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Stefan Assmann <sassmann@kpanic.de>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 63a9192b8f ]
The 'tail' pointer is also free-running count, so it needs to be masked
as 'adminq_prod_cnt' does, to become an index value of AdminQ buffer.
Fixes: 5cdad90de6 ("gve: Batch AQ commands for creating and destroying queues.")
Signed-off-by: Haiyue Wang <haiyue.wang@intel.com>
Reviewed-by: Catherine Sullivan <csully@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 75f0fc7b48 ]
In bpf_patch_insn_data(), we first use the bpf_patch_insn_single() to
insert new instructions, then use adjust_insn_aux_data() to adjust
insn_aux_data. If the old env->prog have no enough room for new inserted
instructions, we use bpf_prog_realloc to construct new_prog and free the
old env->prog.
There have two errors here. First, if adjust_insn_aux_data() return
ENOMEM, we should free the new_prog. Second, if adjust_insn_aux_data()
return ENOMEM, bpf_patch_insn_data() will return NULL, and env->prog has
been freed in bpf_prog_realloc, but we will use it in bpf_check().
So in this patch, we make the adjust_insn_aux_data() never fails. In
bpf_patch_insn_data(), we first pre-malloc memory for the new
insn_aux_data, then call bpf_patch_insn_single() to insert new
instructions, at last call adjust_insn_aux_data() to adjust
insn_aux_data.
Fixes: 8041902dae ("bpf: adjust insn_aux_data when patching insns")
Signed-off-by: He Fengqing <hefengqing@huawei.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20210714101815.164322-1-hefengqing@huawei.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit dcb0145821 ]
If an error occurs after a successful 'regulator_enable()' call,
'regulator_disable()' must be called.
Fix the error handling path of the probe accordingly.
Fixes: cb496cd472 ("media: cxd2880-spi: Add optional vcc regulator")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Sean Young <sean@mess.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit e642197562 ]
The error code is missing in this code scenario, add the error code
'-EINVAL' to the return value 'ret'.
Eliminate the follow smatch warning:
drivers/leds/leds-is31fl32xx.c:388 is31fl32xx_parse_dt() warn: missing
error code 'ret'.
Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Fixes: 9d7cffaf99 ("leds: Add driver for the ISSI IS31FL32xx family of LED controllers")
Acked-by: David Rivshin <drivshin@allworx.com>
Signed-off-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit ea3e1c36e3 ]
Without this patch, the TDA19971 chip's EDID is inactive.
EDID never worked with this driver, it was all tested with HDMI signal
sources which don't need EDID support.
Signed-off-by: Krzysztof Halasa <khalasa@piap.pl>
Fixes: 9ac0038db9 ("media: i2c: Add TDA1997x HDMI receiver driver")
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 123aaf816b ]
SAMA5D2 does not have the YCYC field for the RLP (rounding, limiting,
packaging) module.
The YCYC field is supposed to work with interleaved YUV formats like YUYV.
In SAMA5D2, we have to use YYCC field, which is used for both planar
formats like YUV420 and interleaved formats like YUYV.
Fix the according rlp callback to replace the generic YCYC field (which
makes more sense from a logical point of view) with the required YYCC
field.
Fixes: debfa49687 ("media: atmel: atmel-isc-base: add support for more formats and additional pipeline modules")
Signed-off-by: Eugen Hristev <eugen.hristev@microchip.com>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 3694f996be ]
The TAS2505/TAS2521 does support up to two channels, LEFT and RIGHT,
which are being alternated on the audio data bus by Word Clock, WCLK.
This is documented in TI slau472 2.7.1 Digital Audio Interface. Note
that both the LEFT and RIGHT channels are only used for audio INPUT,
while only the LEFT channel is used for audio OUTPUT.
Fixes: b4525b6196 ("ASoC: tlv320aic32x4: add support for TAS2505")
Signed-off-by: Marek Vasut <marex@denx.de>
Cc: Claudius Heine <ch@denx.de>
Cc: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20210708091229.56443-1-marex@denx.de
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 26cfc0dbe4 ]
The function wait_for_completion_interruptible_timeout will return
-ERESTARTSYS immediately when receiving SIGKILL signal which is sent
by "jffs2_gcd_mtd" during umounting jffs2. This will break the SPI memory
operation because the data transmitting may begin before the command or
address transmitting completes. Use wait_for_completion_timeout to prevent
the process from being interruptible.
Fixes: 67dca5e580 ("spi: spi-mem: Add support for Zynq QSPI controller")
Signed-off-by: Quanyang Wang <quanyang.wang@windriver.com>
Link: https://lore.kernel.org/r/20210826005930.20572-1-quanyang.wang@windriver.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 98e47570ba ]
In commit e915331149 ("regulator: vctrl-regulator: Avoid deadlock getting
and setting the voltage"), all calls to get/set the voltage of the
control regulator were switched to unlocked versions to avoid deadlocks.
However, the call in the probe path is done without regulator locks
held. In this case the locked version should be used.
Switch back to the locked regulator_get_voltage() in the probe path to
avoid any mishaps.
Fixes: e915331149 ("regulator: vctrl-regulator: Avoid deadlock getting and setting the voltage")
Signed-off-by: Chen-Yu Tsai <wenst@chromium.org>
Link: https://lore.kernel.org/r/20210825033704.3307263-2-wenst@chromium.org
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit cc40b72251 ]
dun_bytes needs to be less than or equal to the IV size of the
encryption mode, not just less than or equal to BLK_CRYPTO_MAX_IV_SIZE.
Currently this doesn't matter since blk_crypto_init_key() is never
actually passed invalid values, but we might as well fix this.
Fixes: a892c8d52c ("block: Inline encryption support for blk-mq")
Signed-off-by: Eric Biggers <ebiggers@google.com>
Link: https://lore.kernel.org/r/20210825055918.51975-1-ebiggers@kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 3bff147b18 ]
When a fatal machine check results in a system reset, Linux does not
clear the error(s) from machine check bank(s) - hardware preserves the
machine check banks across a warm reset.
During initialization of the kernel after the reboot, Linux reads, logs,
and clears all machine check banks.
But there is a problem. In:
5de97c9f6d ("x86/mce: Factor out and deprecate the /dev/mcelog driver")
the call to mce_register_decode_chain() moved later in the boot
sequence. This means that /dev/mcelog doesn't see those early error
logs.
This was partially fixed by:
cd9c57cad3 ("x86/MCE: Dump MCE to dmesg if no consumers")
which made sure that the logs were not lost completely by printing
to the console. But parsing console logs is error prone. Users of
/dev/mcelog should expect to find any early errors logged to standard
places.
Add a new flag MCP_QUEUE_LOG to machine_check_poll() to be used in early
machine check initialization to indicate that any errors found should
just be queued to genpool. When mcheck_late_init() is called it will
call mce_schedule_work() to actually log and flush any errors queued in
the genpool.
[ Based on an original patch, commit message by and completely
productized by Tony Luck. ]
Fixes: 5de97c9f6d ("x86/mce: Factor out and deprecate the /dev/mcelog driver")
Reported-by: Sumanth Kamatala <skamatala@juniper.net>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20210824003129.GA1642753@agluck-desk2.amr.corp.intel.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 2294a7299f ]
MCDDRCFG is a per-channel register and uses bit{0,1} to indicate
the NVDIMM presence on DIMM slot{0,1}. Current i10nm_edac driver
wrongly uses MCDDRCFG as per-DIMM register and fails to detect
the NVDIMM.
Fix it by reading MCDDRCFG as per-channel register and using its
bit{0,1} to check whether the NVDIMM is populated on DIMM slot{0,1}.
Fixes: d4dc89d069 ("EDAC, i10nm: Add a driver for Intel 10nm server processors")
Reported-by: Fan Du <fan.du@intel.com>
Tested-by: Wen Jin <wen.jin@intel.com>
Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Link: https://lore.kernel.org/r/20210818175701.1611513-2-tony.luck@intel.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 047d4226b0 ]
When rngd is run as root then lots of these types of message will appear
in the kernel log if the TPM has been configured to provide random bytes:
[ 7406.275163] tpm tpm0: tpm_transmit: tpm_recv: error -4
The issue is caused by the following call that is interrupted while
waiting for the TPM's response.
sig = wait_event_interruptible(ibmvtpm->wq, !ibmvtpm->tpm_processing_cmd);
Rather than waiting for the response in the low level driver, have it use
the polling loop in tpm_try_transmit() that uses a command's duration to
poll until a result has been returned by the TPM, thus ending when the
timeout has occurred but not responding to signals and ctrl-c anymore. To
stay in this polling loop extend tpm_ibmvtpm_status() to return
'true' for as long as the vTPM is indicated as being busy in
tpm_processing_cmd. Since the loop requires the TPM's timeouts, get them
now using tpm_get_timeouts() after setting the TPM2 version flag on the
chip.
To recreat the resolved issue start rngd like this:
sudo rngd -r /dev/hwrng -t
sudo rngd -r /dev/tpm0 -t
Link: https://bugzilla.redhat.com/show_bug.cgi?id=1981473
Fixes: 6674ff145e ("tpm_ibmvtpm: properly handle interrupted packet receptions")
Cc: Nayna Jain <nayna@linux.ibm.com>
Cc: George Wilson <gcwilson@linux.ibm.com>
Reported-by: Nageswara R Sastry <rnsastry@linux.ibm.com>
Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
Tested-by: Nageswara R Sastry <rnsastry@linux.ibm.com>
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit ea35e0d5df ]
Address a kbuild issue where a developer created an ECDSA key for signing
kernel modules and then builds an older version of the kernel, when bi-
secting the kernel for example, that does not support ECDSA keys.
If openssl is installed, trigger the creation of an RSA module signing
key if it is not an RSA key.
Fixes: cfc411e7ff ("Move certificate handling to its own directory")
Cc: David Howells <dhowells@redhat.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
Tested-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 462354d986 ]
Replace vf_mask type with unsigned long to avoid a stack-out-of-bound.
This is to fix the following warning reported by KASAN the first time
adf_msix_isr_ae() gets called.
[ 692.091987] BUG: KASAN: stack-out-of-bounds in find_first_bit+0x28/0x50
[ 692.092017] Read of size 8 at addr ffff88afdf789e60 by task swapper/32/0
[ 692.092076] Call Trace:
[ 692.092089] <IRQ>
[ 692.092101] dump_stack+0x9c/0xcf
[ 692.092132] print_address_description.constprop.0+0x18/0x130
[ 692.092164] ? find_first_bit+0x28/0x50
[ 692.092185] kasan_report.cold+0x7f/0x111
[ 692.092213] ? static_obj+0x10/0x80
[ 692.092234] ? find_first_bit+0x28/0x50
[ 692.092262] find_first_bit+0x28/0x50
[ 692.092288] adf_msix_isr_ae+0x16e/0x230 [intel_qat]
Fixes: ed8ccaef52 ("crypto: qat - Add support for SRIOV")
Signed-off-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com>
Reviewed-by: Marco Chiappero <marco.chiappero@intel.com>
Reviewed-by: Fiona Trahe <fiona.trahe@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 60a1cd10b2 ]
When disable_irq_nosync for an interrupt is called from within its
interrupt handler, this interrupt is only marked as disabled with the
intention to mask it when it triggers again.
The AIC hardware however automatically masks the interrupt when it is read.
aic_irq_eoi then unmasks it again if it's not disabled *and* not masked.
This results in a state mismatch between the hardware state and the
state kept in irq_data: The hardware interrupt is masked but
IRQD_IRQ_MASKED is not set. Any further calls to unmask_irq will directly
return and the interrupt can never be enabled again.
Fix this by keeping the hardware and irq_data state in sync by unmasking in
aic_irq_eoi if and only if the irq_data state also assumes the interrupt to
be unmasked.
Fixes: 76cde26394 ("irqchip/apple-aic: Add support for the Apple Interrupt Controller")
Signed-off-by: Sven Peter <sven@svenpeter.dev>
Acked-by: Hector Martin <marcan@marcan.st>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210812100942.17206-1-sven@svenpeter.dev
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b1a811633f ]
Syzbot hit WARNING in internal_create_group(). The problem was in
too big disk->first_minor.
disk->first_minor is initialized by value, which comes from userspace
and there wasn't any sanity checks about value correctness. It can cause
duplicate creation of sysfs files/links, because disk->first_minor will
be passed to MKDEV() which causes truncation to byte. Since maximum
minor value is 0xff, let's check if first_minor is correct minor number.
NOTE: the root case of the reported warning was in wrong error handling
in register_disk(), but we can avoid passing knowingly wrong values to
sysfs API, because sysfs error messages can confuse users. For example:
user passed 1048576 as index, but sysfs complains about duplicate
creation of /dev/block/43:0. It's not obvious how 1048576 becomes 0.
Log and reproducer for above example can be found on syzkaller bug
report page.
Link: https://syzkaller.appspot.com/bug?id=03c2ae9146416edf811958d5fd7acfab75b143d1
Fixes: b0d9111a2d ("nbd: use an idr to keep track of nbd devices")
Reported-by: syzbot+9937dc42271cd87d4b98@syzkaller.appspotmail.com
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Pavel Skripkin <paskripkin@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 68c9417b19 ]
Now open_mutex is used to synchronize partition operations (e.g,
blk_drop_partitions() and blkdev_reread_part()), however it makes
nbd driver broken, because nbd may call del_gendisk() in nbd_release()
or nbd_genl_disconnect() if NBD_CFLAG_DESTROY_ON_DISCONNECT is enabled,
and deadlock occurs, as shown below:
// AB-BA dead-lock
nbd_genl_disconnect blkdev_open
nbd_disconnect_and_put
lock bd_mutex
// last ref
nbd_put
lock nbd_index_mutex
del_gendisk
nbd_open
try lock nbd_index_mutex
try lock bd_mutex
or
// AA dead-lock
nbd_release
lock bd_mutex
nbd_put
try lock bd_mutex
Instead of fixing block layer (e.g, introduce another lock), fixing
the nbd driver to call del_gendisk() in a kworker when
NBD_DESTROY_ON_DISCONNECT is enabled. When NBD_DESTROY_ON_DISCONNECT
is disabled, nbd device will always be destroy through module removal,
and there is no risky of deadlock.
To ensure the reuse of nbd index succeeds, moving the calling of
idr_remove() after del_gendisk(), so if the reused index is not found
in nbd_index_idr, the old disk must have been deleted. And reusing
the existing destroy_complete mechanism to ensure nbd_genl_connect()
will wait for the completion of del_gendisk().
Also adding a new workqueue for nbd removal, so nbd_cleanup()
can ensure all removals complete before exits.
Reported-by: syzbot+0fe7752e52337864d29b@syzkaller.appspotmail.com
Fixes: c76f48eb5c ("block: take bd_mutex around delete_partitions in del_gendisk")
Signed-off-by: Hou Tao <houtao1@huawei.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20210811124428.2368491-2-hch@lst.de
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit be83c3b6e7 ]
If CMT instance has at least two channels, one channel will be used
as a clock source and another one used as a clock event device.
In that case, IRQ is not requested for clock source channel so
sh_cmt_clock_event_program_verify() might work incorrectly.
Besides, when a channel is only used for clock source, don't need to
re-set the next match_value since it should be maximum timeout as
it still is.
On the other hand, due to no IRQ, total_cycles is not counted up
when reaches compare match time (timer counter resets to zero),
so sh_cmt_clocksource_read() returns unexpected value.
Therefore, use 64-bit clocksoure's mask for 32-bit or 16-bit variants
will also lead to wrong delta calculation. Hence, this mask should
correspond to timer counter width, and above function just returns
the raw value of timer counter register.
Fixes: bfa76bb12f ("clocksource: sh_cmt: Request IRQ for clock event device only")
Fixes: 37e7742c55 ("clocksource/drivers/sh_cmt: Fix clocksource width for 32-bit machines")
Signed-off-by: Phong Hoang <phong.hoang.wz@renesas.com>
Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20210422123443.73334-1-niklas.soderlund+renesas@ragnatech.se
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b6f756726e ]
We should set the additional space to 0 in mpi_resize().
So use kcalloc() instead of kmalloc_array().
In lib/mpi/ec.c:
/****************
* Resize the array of A to NLIMBS. the additional space is cleared
* (set to 0) [done by m_realloc()]
*/
int mpi_resize(MPI a, unsigned nlimbs)
Like the comment of kernel's mpi_resize() said, the additional space
need to be set to 0, but when a->d is not NULL, it does not set.
The kernel's mpi lib is from libgcrypt, the mpi resize in libgcrypt
is _gcry_mpi_resize() which set the additional space to 0.
This bug may cause mpi api which use mpi_resize() get wrong result
under the condition of using the additional space without initiation.
If this condition is not met, the bug would not be triggered.
Currently in kernel, rsa, sm2 and dh use mpi lib, and they works well,
so the bug is not triggered in these cases.
add_points_edwards() use the additional space directly, so it will
get a wrong result.
Fixes: cdec9cb516 ("crypto: GnuPG based MPI lib - source files (part 1)")
Signed-off-by: Hongbo Li <herberthbli@tencent.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 976c1de1de ]
Depending on the DMA driver being used, the struct dma_slave_config may
need to be initialized to zero for the unused data.
For example, we have three DMA drivers using src_port_window_size and
dst_port_window_size. If these are left uninitialized, it can cause DMA
failures.
For spi-pic32, this is probably not currently an issue but is still good to
fix though.
Fixes: 1bcb9f8ceb ("spi: spi-pic32: Add PIC32 SPI master driver")
Cc: Purna Chandra Mandal <purna.mandal@microchip.com>
Cc: Peter Ujfalusi <peter.ujfalusi@gmail.com>
Cc: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Link: https://lore.kernel.org/r/20210810081727.19491-2-tony@atomide.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 866663b7b5 ]
When merging one bio to request, if they are discard IO and the queue
supports multi-range discard, we need to return ELEVATOR_DISCARD_MERGE
because both block core and related drivers(nvme, virtio-blk) doesn't
handle mixed discard io merge(traditional IO merge together with
discard merge) well.
Fix the issue by returning ELEVATOR_DISCARD_MERGE in this situation,
so both blk-mq and drivers just need to handle multi-range discard.
Reported-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Fixes: 2705dfb209 ("block: fix discard request merge")
Link: https://lore.kernel.org/r/20210729034226.1591070-1-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 2189e928b6 ]
When enabling CONFIG_RMW_INSNS in e.g. a Coldfire build:
{standard input}:3068: Error: invalid instruction for this architecture; needs 68020 or higher (68020 [68k, 68ec020], 68030 [68ec030], 68040 [68ec040], 68060 [68ec060]) -- statement `casl %d4,%d0,(%a6)' ignored
Fix this by (a) adding a new config symbol to track if support for any
CPU that lacks the CAS instruction is enabled, and (b) making
CONFIG_RMW_INSNS depend on the new symbol not being set.
Fixes: 0e152d8050 ("m68k: reorganize Kconfig options to improve mmu/non-mmu selections")
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20210725104413.318932-1-geert@linux-m68k.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit dc87740c8a ]
If rcu_print_task_stall() is invoked on an rcu_node structure that does
not contain any tasks blocking the current grace period, it takes an
early exit that fails to release that rcu_node structure's lock. This
results in a self-deadlock, which is detected by lockdep.
To reproduce this bug:
tools/testing/selftests/rcutorture/bin/kvm.sh --allcpus --duration 3 --trust-make --configs "TREE03" --kconfig "CONFIG_PROVE_LOCKING=y" --bootargs "rcutorture.stall_cpu=30 rcutorture.stall_cpu_block=1 rcutorture.fwd_progress=0 rcutorture.test_boost=0"
This will also result in other complaints, including RCU's scheduler
hook complaining about blocking rather than preemption and an rcutorture
writer stall.
Only a partial RCU CPU stall warning message will be printed because of
the self-deadlock.
This commit therefore releases the lock on the rcu_print_task_stall()
function's early exit path.
Fixes: c583bcb8f5 ("rcu: Don't invoke try_invoke_on_locked_down_task() with irqs disabled")
Tested-by: Qais Yousef <qais.yousef@arm.com>
Signed-off-by: Yanfei Xu <yanfei.xu@windriver.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit e6a901a44f ]
The for loop in rcu_print_task_stall() always omits ts[0], which points
to the first task blocking the stalled grace period. This in turn fails
to count this first task, which means that ndetected will be equal to
zero when all CPUs have passed through their quiescent states and only
one task is blocking the stalled grace period. This zero value for
ndetected will in turn result in an incorrect "All QSes seen" message:
rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
rcu: Tasks blocked on level-1 rcu_node (CPUs 12-23):
(detected by 15, t=6504 jiffies, g=164777, q=9011209)
rcu: All QSes seen, last rcu_preempt kthread activity 1 (4295252379-4295252378), jiffies_till_next_fqs=1, root ->qsmask 0x2
BUG: sleeping function called from invalid context at include/linux/uaccess.h:156
in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 70613, name: msgstress04
INFO: lockdep is turned off.
Preemption disabled at:
[<ffff8000104031a4>] create_object.isra.0+0x204/0x4b0
CPU: 15 PID: 70613 Comm: msgstress04 Kdump: loaded Not tainted
5.12.2-yoctodev-standard #1
Hardware name: Marvell OcteonTX CN96XX board (DT)
Call trace:
dump_backtrace+0x0/0x2cc
show_stack+0x24/0x30
dump_stack+0x110/0x188
___might_sleep+0x214/0x2d0
__might_sleep+0x7c/0xe0
This commit therefore fixes the loop to include ts[0].
Fixes: c583bcb8f5 ("rcu: Don't invoke try_invoke_on_locked_down_task() with irqs disabled")
Tested-by: Qais Yousef <qais.yousef@arm.com>
Signed-off-by: Yanfei Xu <yanfei.xu@windriver.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit ca4984a7dd ]
The UCLAMP_FLAG_IDLE flag is set on a runqueue when dequeueing the last
uclamp active task (that is, when buckets.tasks reaches 0 for all
buckets) to maintain the last uclamp.max and prevent blocked util from
suddenly becoming visible.
However, there is an asymmetry in how the flag is set and cleared which
can lead to having the flag set whilst there are active tasks on the rq.
Specifically, the flag is cleared in the uclamp_rq_inc() path, which is
called at enqueue time, but set in uclamp_rq_dec_id() which is called
both when dequeueing a task _and_ in the update_uclamp_active() path. As
a result, when both uclamp_rq_{dec,ind}_id() are called from
update_uclamp_active(), the flag ends up being set but not cleared,
hence leaving the runqueue in a broken state.
Fix this by clearing the flag in update_uclamp_active() as well.
Fixes: e496187da7 ("sched/uclamp: Enforce last task's UCLAMP_MAX")
Reported-by: Rick Yiu <rickyiu@google.com>
Signed-off-by: Quentin Perret <qperret@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Qais Yousef <qais.yousef@arm.com>
Tested-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Link: https://lore.kernel.org/r/20210805102154.590709-2-qperret@google.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 0469dede0e ]
ecdsa_set_pub_key() makes an u64 pointer at 1 byte offset of the key.
This results in an unaligned u64 pointer. This pointer is passed to
ecc_swap_digits() which assumes natural alignment.
This causes a kernel crash on an armv7 platform:
[ 0.409022] Unhandled fault: alignment exception (0x001) at 0xc2a0a6a9
...
[ 0.416982] PC is at ecdsa_set_pub_key+0xdc/0x120
...
[ 0.491492] Backtrace:
[ 0.492059] [<c07c266c>] (ecdsa_set_pub_key) from [<c07c75d4>] (test_akcipher_one+0xf4/0x6c0)
Handle unaligned input buffer in ecc_swap_digits() by replacing
be64_to_cpu() to get_unaligned_be64(). Change type of in pointer to
void to reflect it doesn’t necessarily need to be aligned.
Fixes: 4e6602916b ("crypto: ecdsa - Add support for ECDSA signature verification")
Reported-by: Guillaume Gardet <guillaume.gardet@arm.com>
Suggested-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Mian Yousaf Kaukab <ykaukab@suse.de>
Tested-by: Stefan Berger <stefanb@linux.ibm.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 821720b9f3 ]
The updated XTS code fails to check the return code of skcipher_walk_virt,
which may lead to skcipher_walk_abort() or skcipher_walk_done() being called
while the walk argument is in an inconsistent state.
So check the return value after each such call, and bail on errors.
Fixes: 2481104fe9 ("crypto: x86/aes-ni-xts - rewrite and drop indirections via glue helper")
Reported-by: Dave Hansen <dave.hansen@intel.com>
Reported-by: syzbot <syzbot+5d1bad8042a8f0e8117a@syzkaller.appspotmail.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit ad1abe4769 ]
Deal with deferred probe using dev_err_probe so the error is handled
and avoid logging lots probe defer information like the following:
[ 9.125121] cw2015 4-0062: Failed to register power supply
[ 9.211131] cw2015 4-0062: Failed to register power supply
Fixes: b4c7715c10 ("power: supply: add CellWise cw2015 fuel gauge driver")
Signed-off-by: Peter Robinson <pbrobinson@gmail.com>
Reviewed-by: Javier Martinez Canillas <javierm@redhat.com>
Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 459b09b5a3 ]
Since CPU capacity asymmetry can stem purely from maximum frequency
differences (e.g. Pixel 1), a rebuild of the scheduler topology can be
issued upon loading cpufreq, see:
arch_topology.c::init_cpu_capacity_callback()
Turns out that if this rebuild happens *before* sched_debug_init() is
run (which is a late initcall), we end up messing up the sched_domain debug
directory: passing a NULL parent to debugfs_create_dir() ends up creating
the directory at the debugfs root, which in this case creates
/sys/kernel/debug/domains (instead of /sys/kernel/debug/sched/domains).
This currently doesn't happen on asymmetric systems which use cpufreq-scpi
or cpufreq-dt drivers, as those are loaded via
deferred_probe_initcall() (it is also a late initcall, but appears to be
ordered *after* sched_debug_init()).
Ionela has been working on detecting maximum frequency asymmetry via ACPI,
and that actually happens via a *device* initcall, thus before
sched_debug_init(), and causes the aforementionned debugfs mayhem.
One option would be to punt sched_debug_init() down to
fs_initcall_sync(). Preventing update_sched_domain_debugfs() from running
before sched_debug_init() appears to be the safer option.
Fixes: 3b87f136f8 ("sched,debug: Convert sysctl sched_domains to debugfs")
Signed-off-by: Valentin Schneider <valentin.schneider@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: http://lore.kernel.org/r/20210514095339.12979-1-ionela.voinescu@arm.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 915fea04f9 ]
The restart interrupt is triggered whenever a secondary CPU is
brought online, a remote function call dispatched from another
CPU or a manual PSW restart is initiated and causes the system
to kdump. The handling routine is always called with DAT turned
off. It then initializes the stack frame and invokes a callback.
The existing callbacks handle DAT as follows:
* __do_restart() and __machine_kexec() turn in on upon entry;
* __ipl_run(), __reipl_run() and __dump_run() do not turn it
right away, but all of them call diag308() - which turns DAT
on, but only if kasan is enabled;
In addition to the described complexity all callbacks (and the
functions they call) should avoid kasan instrumentation while
DAT is off.
This update enables DAT in the assembler restart handler and
relieves any callbacks (which are mostly C functions) from
dealing with DAT altogether.
There are four types of CPU restart that initialize control
registers in different ways:
1. Start of secondary CPU on boot - control registers are
inherited from the IPL CPU;
2. Restart of online CPU - control registers of the CPU being
restarted are kept;
3. Hotplug of offline CPU - control registers are inherited
from the starting CPU;
4. Start of offline CPU triggered by manual PSW restart -
the control registers are read from the absolute lowcore
and contain the boot time IPL CPU values updated with all
follow-up calls of smp_ctl_set_bit() and smp_ctl_clear_bit()
routines;
In first three cases contents of the control registers is the
most recent. In the latter case control registers are good
enough to facilitate successful completion of kdump operation.
Suggested-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit cabebb697c ]
If for any reason the interrupt enable for an ap queue fails the
state machine run for the queue returned wrong return codes to the
caller. So the caller assumed interrupt support for this queue in
enabled and thus did not re-establish the high resolution timer used
for polling. In the end this let to a hang for the user space process
waiting "forever" for the reply.
This patch reworks these return codes to return correct indications
for the caller to re-establish the timer when a queue runs without
interrupt support.
Please note that this is fixing a wrong behavior after a first
failure (enable interrupt support for the queue) failed. However,
looks like this occasionally happens on KVM systems.
Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 9372a82892 ]
Currently allocation and registration of s390dbf debug areas are tied
together. As a result, a debug area cannot be unregistered and
re-registered while any process has an associated debugfs file open.
Fix this by splitting alloc/release from register/unregister.
Signed-off-by: Peter Oberparleiter <oberpar@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 1204777867 ]
Any previously recorded s390dbf debug data is reset when a debug area
is resized using the 'pages' sysfs attribute. This can make
live-debugging unnecessarily complex.
Fix this by copying existing debug data to the newly allocated debug
area when resizing.
Signed-off-by: Peter Oberparleiter <oberpar@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f7addcdd52 ]
Currently clp_set_pci_fn() always returns 0 as long as the CLP request
itself succeeds even if the operation itself returns a response code
other than CLP_RC_OK or CLP_RC_SETPCIFN_ALRDY. This is highly misleading
because calling code assumes that a zero rc means that the operation was
successful.
Fix this by returning the response code or cc on failure with the
exception of the special handling for CLP_RC_SETPCIFN_ALRDY. Also let's
not assume that the returned function handle for CLP_RC_SETPCIFN_ALRDY
is 0, we don't need it anyway.
Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com>
Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit d3e9f732c4 ]
Daniel reports that the v5.14-rc4-rt4 kernel throws a BUG when running
stress-ng:
| [ 90.202543] BUG: sleeping function called from invalid context at kernel/locking/spinlock_rt.c:35
| [ 90.202549] in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 2047, name: iou-wrk-2041
| [ 90.202555] CPU: 5 PID: 2047 Comm: iou-wrk-2041 Tainted: G W 5.14.0-rc4-rt4+ #89
| [ 90.202559] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.14.0-2 04/01/2014
| [ 90.202561] Call Trace:
| [ 90.202577] dump_stack_lvl+0x34/0x44
| [ 90.202584] ___might_sleep.cold+0x87/0x94
| [ 90.202588] rt_spin_lock+0x19/0x70
| [ 90.202593] ___slab_alloc+0xcb/0x7d0
| [ 90.202598] ? newidle_balance.constprop.0+0xf5/0x3b0
| [ 90.202603] ? dequeue_entity+0xc3/0x290
| [ 90.202605] ? io_wqe_dec_running.isra.0+0x98/0xe0
| [ 90.202610] ? pick_next_task_fair+0xb9/0x330
| [ 90.202612] ? __schedule+0x670/0x1410
| [ 90.202615] ? io_wqe_dec_running.isra.0+0x98/0xe0
| [ 90.202618] kmem_cache_alloc_trace+0x79/0x1f0
| [ 90.202621] io_wqe_dec_running.isra.0+0x98/0xe0
| [ 90.202625] io_wq_worker_sleeping+0x37/0x50
| [ 90.202628] schedule+0x30/0xd0
| [ 90.202630] schedule_timeout+0x8f/0x1a0
| [ 90.202634] ? __bpf_trace_tick_stop+0x10/0x10
| [ 90.202637] io_wqe_worker+0xfd/0x320
| [ 90.202641] ? finish_task_switch.isra.0+0xd3/0x290
| [ 90.202644] ? io_worker_handle_work+0x670/0x670
| [ 90.202646] ? io_worker_handle_work+0x670/0x670
| [ 90.202649] ret_from_fork+0x22/0x30
which is due to the RT kernel not liking a GFP_ATOMIC allocation inside
a raw spinlock. Besides that not working on RT, doing any kind of
allocation from inside schedule() is kind of nasty and should be avoided
if at all possible.
This particular path happens when an io-wq worker goes to sleep, and we
need a new worker to handle pending work. We currently allocate a small
data item to hold the information we need to create a new worker, but we
can instead include this data in the io_worker struct itself and just
protect it with a single bit lock. We only really need one per worker
anyway, as we will have run pending work between to sleep cycles.
https://lore.kernel.org/lkml/20210804082418.fbibprcwtzyt5qax@beryllium.lan/
Reported-by: Daniel Wagner <dwagner@suse.de>
Tested-by: Daniel Wagner <dwagner@suse.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 2f488f698f ]
There is an existing lock hierarchy of
&dev->event_lock --> &fasync_struct.fa_lock --> &f->f_owner.lock
from the following call chain:
input_inject_event():
spin_lock_irqsave(&dev->event_lock,...);
input_handle_event():
input_pass_values():
input_to_handler():
evdev_events():
evdev_pass_values():
spin_lock(&client->buffer_lock);
__pass_event():
kill_fasync():
kill_fasync_rcu():
read_lock(&fa->fa_lock);
send_sigio():
read_lock_irqsave(&fown->lock,...);
&dev->event_lock is HARDIRQ-safe, so interrupts have to be disabled
while grabbing &fasync_struct.fa_lock, otherwise we invert the lock
hierarchy. However, since kill_fasync which calls kill_fasync_rcu is
an exported symbol, it may not necessarily be called with interrupts
disabled.
As kill_fasync_rcu may be called with interrupts disabled (for
example, in the call chain above), we replace calls to
read_lock/read_unlock on &fasync_struct.fa_lock in kill_fasync_rcu
with read_lock_irqsave/read_unlock_irqrestore.
Signed-off-by: Desmond Cheong Zhi Xi <desmondcheongzx@gmail.com>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f671a691e2 ]
Syzbot reports a potential deadlock in do_fcntl:
========================================================
WARNING: possible irq lock inversion dependency detected
5.12.0-syzkaller #0 Not tainted
--------------------------------------------------------
syz-executor132/8391 just changed the state of lock:
ffff888015967bf8 (&f->f_owner.lock){.+..}-{2:2}, at: f_getown_ex fs/fcntl.c:211 [inline]
ffff888015967bf8 (&f->f_owner.lock){.+..}-{2:2}, at: do_fcntl+0x8b4/0x1200 fs/fcntl.c:395
but this lock was taken by another, HARDIRQ-safe lock in the past:
(&dev->event_lock){-...}-{2:2}
and interrupts could create inverse lock ordering between them.
other info that might help us debug this:
Chain exists of:
&dev->event_lock --> &new->fa_lock --> &f->f_owner.lock
Possible interrupt unsafe locking scenario:
CPU0 CPU1
---- ----
lock(&f->f_owner.lock);
local_irq_disable();
lock(&dev->event_lock);
lock(&new->fa_lock);
<Interrupt>
lock(&dev->event_lock);
*** DEADLOCK ***
This happens because there is a lock hierarchy of
&dev->event_lock --> &new->fa_lock --> &f->f_owner.lock
from the following call chain:
input_inject_event():
spin_lock_irqsave(&dev->event_lock,...);
input_handle_event():
input_pass_values():
input_to_handler():
evdev_events():
evdev_pass_values():
spin_lock(&client->buffer_lock);
__pass_event():
kill_fasync():
kill_fasync_rcu():
read_lock(&fa->fa_lock);
send_sigio():
read_lock_irqsave(&fown->lock,...);
However, since &dev->event_lock is HARDIRQ-safe, interrupts have to be
disabled while grabbing &f->f_owner.lock, otherwise we invert the lock
hierarchy.
Hence, we replace calls to read_lock/read_unlock on &f->f_owner.lock,
with read_lock_irq/read_unlock_irq.
Reported-and-tested-by: syzbot+e6d5398a02c516ce5e70@syzkaller.appspotmail.com
Signed-off-by: Desmond Cheong Zhi Xi <desmondcheongzx@gmail.com>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 7b3d52683b ]
There are several places where the return value check of crypto_aead_setkey
and crypto_aead_setauthsize were lost. It is necessary to add these checks.
At the same time, move the crypto_aead_setauthsize() call out of the loop,
and only need to call it once after load transform.
Fixee: 53f52d7aec ("crypto: tcrypt - Added speed tests for AEAD crypto alogrithms in tcrypt test suite")
Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
Reviewed-by: Vitaly Chikunov <vt@altlinux.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit a52626106d ]
When the endian configuration of the hardware is abnormal, it will
cause the SEC engine is faulty that reports empty message. And it
will affect the normal function of the hardware. Currently the soft
configuration method can't restore the faulty device. The endian
needs to be configured according to the system properties. So fix it.
Signed-off-by: Kai Ye <yekai13@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 90367a027a ]
Because the algs registration process has added a judgment.
So need to add the judgment for the abnormal exiting process.
Signed-off-by: Kai Ye <yekai13@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 645ae0af18 ]
The function adf_iov_putmsg() is only used inside the intel_qat module
therefore should not be exported.
Remove EXPORT_SYMBOL for the function adf_iov_putmsg().
Signed-off-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com>
Reviewed-by: Fiona Trahe <fiona.trahe@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b90c1c4d3f ]
At start and shutdown, VFs notify the PF about their state. These
notifications are carried out through a message exchange using the PFVF
protocol.
Function names lead to believe they do perform init or shutdown logic.
This is to fix the naming to better reflect their purpose.
Signed-off-by: Marco Chiappero <marco.chiappero@intel.com>
Co-developed-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com>
Signed-off-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com>
Reviewed-by: Fiona Trahe <fiona.trahe@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 0a73c762e1 ]
The top half of the VF drivers handled only a source at the time.
If an interrupt for PF2VF and bundle occurred at the same time, the ISR
scheduled only the bottom half for PF2VF.
This patch fixes the VF top half so that if both sources of interrupt
trigger at the same time, both bottom halves are scheduled.
This patch is based on earlier work done by Conor McLoughlin.
Signed-off-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com>
Reviewed-by: Marco Chiappero <marco.chiappero@intel.com>
Reviewed-by: Fiona Trahe <fiona.trahe@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 5147f0906d ]
The function adf_dev_init() ignores the error code reported by
enable_vf2pf_comms(). If the latter fails, e.g. the VF is not compatible
with the pf, then the load of the VF driver progresses.
This patch changes adf_dev_init() so that the error code from
enable_vf2pf_comms() is returned to the caller.
Signed-off-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com>
Reviewed-by: Marco Chiappero <marco.chiappero@intel.com>
Reviewed-by: Fiona Trahe <fiona.trahe@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit fe4d55773b ]
lockdep complains that in omap-aes, the list_lock is taken both with
softirqs enabled at probe time, and also in softirq context, which
could lead to a deadlock:
================================
WARNING: inconsistent lock state
5.14.0-rc1-00035-gc836005b01c5-dirty #69 Not tainted
--------------------------------
inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
ksoftirqd/0/7 [HC0[0]:SC1[3]:HE1:SE0] takes:
bf00e014 (list_lock){+.?.}-{2:2}, at: omap_aes_find_dev+0x18/0x54 [omap_aes_driver]
{SOFTIRQ-ON-W} state was registered at:
_raw_spin_lock+0x40/0x50
omap_aes_probe+0x1d4/0x664 [omap_aes_driver]
platform_probe+0x58/0xb8
really_probe+0xbc/0x314
__driver_probe_device+0x80/0xe4
driver_probe_device+0x30/0xc8
__driver_attach+0x70/0xf4
bus_for_each_dev+0x70/0xb4
bus_add_driver+0xf0/0x1d4
driver_register+0x74/0x108
do_one_initcall+0x84/0x2e4
do_init_module+0x5c/0x240
load_module+0x221c/0x2584
sys_finit_module+0xb0/0xec
ret_fast_syscall+0x0/0x2c
0xbed90b30
irq event stamp: 111800
hardirqs last enabled at (111800): [<c02a21e4>] __kmalloc+0x484/0x5ec
hardirqs last disabled at (111799): [<c02a21f0>] __kmalloc+0x490/0x5ec
softirqs last enabled at (111776): [<c01015f0>] __do_softirq+0x2b8/0x4d0
softirqs last disabled at (111781): [<c0135948>] run_ksoftirqd+0x34/0x50
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0
----
lock(list_lock);
<Interrupt>
lock(list_lock);
*** DEADLOCK ***
2 locks held by ksoftirqd/0/7:
#0: c0f5e8c8 (rcu_read_lock){....}-{1:2}, at: netif_receive_skb+0x6c/0x260
#1: c0f5e8c8 (rcu_read_lock){....}-{1:2}, at: ip_local_deliver_finish+0x2c/0xdc
stack backtrace:
CPU: 0 PID: 7 Comm: ksoftirqd/0 Not tainted 5.14.0-rc1-00035-gc836005b01c5-dirty #69
Hardware name: Generic AM43 (Flattened Device Tree)
[<c010e6e0>] (unwind_backtrace) from [<c010b9d0>] (show_stack+0x10/0x14)
[<c010b9d0>] (show_stack) from [<c017c640>] (mark_lock.part.17+0x5bc/0xd04)
[<c017c640>] (mark_lock.part.17) from [<c017d9e4>] (__lock_acquire+0x960/0x2fa4)
[<c017d9e4>] (__lock_acquire) from [<c0180980>] (lock_acquire+0x10c/0x358)
[<c0180980>] (lock_acquire) from [<c093d324>] (_raw_spin_lock_bh+0x44/0x58)
[<c093d324>] (_raw_spin_lock_bh) from [<bf00b258>] (omap_aes_find_dev+0x18/0x54 [omap_aes_driver])
[<bf00b258>] (omap_aes_find_dev [omap_aes_driver]) from [<bf00b328>] (omap_aes_crypt+0x94/0xd4 [omap_aes_driver])
[<bf00b328>] (omap_aes_crypt [omap_aes_driver]) from [<c08ac6d0>] (esp_input+0x1b0/0x2c8)
[<c08ac6d0>] (esp_input) from [<c08c9e90>] (xfrm_input+0x410/0x1290)
[<c08c9e90>] (xfrm_input) from [<c08b6374>] (xfrm4_esp_rcv+0x54/0x11c)
[<c08b6374>] (xfrm4_esp_rcv) from [<c0838840>] (ip_protocol_deliver_rcu+0x48/0x3bc)
[<c0838840>] (ip_protocol_deliver_rcu) from [<c0838c50>] (ip_local_deliver_finish+0x9c/0xdc)
[<c0838c50>] (ip_local_deliver_finish) from [<c0838dd8>] (ip_local_deliver+0x148/0x1b0)
[<c0838dd8>] (ip_local_deliver) from [<c0838f5c>] (ip_rcv+0x11c/0x180)
[<c0838f5c>] (ip_rcv) from [<c077e3a4>] (__netif_receive_skb_one_core+0x54/0x74)
[<c077e3a4>] (__netif_receive_skb_one_core) from [<c077e588>] (netif_receive_skb+0xa8/0x260)
[<c077e588>] (netif_receive_skb) from [<c068d6d4>] (cpsw_rx_handler+0x224/0x2fc)
[<c068d6d4>] (cpsw_rx_handler) from [<c0688ccc>] (__cpdma_chan_process+0xf4/0x188)
[<c0688ccc>] (__cpdma_chan_process) from [<c068a0c0>] (cpdma_chan_process+0x3c/0x5c)
[<c068a0c0>] (cpdma_chan_process) from [<c0690e14>] (cpsw_rx_mq_poll+0x44/0x98)
[<c0690e14>] (cpsw_rx_mq_poll) from [<c0780810>] (__napi_poll+0x28/0x268)
[<c0780810>] (__napi_poll) from [<c0780c64>] (net_rx_action+0xcc/0x204)
[<c0780c64>] (net_rx_action) from [<c0101478>] (__do_softirq+0x140/0x4d0)
[<c0101478>] (__do_softirq) from [<c0135948>] (run_ksoftirqd+0x34/0x50)
[<c0135948>] (run_ksoftirqd) from [<c01583b8>] (smpboot_thread_fn+0xf4/0x1d8)
[<c01583b8>] (smpboot_thread_fn) from [<c01546dc>] (kthread+0x14c/0x174)
[<c01546dc>] (kthread) from [<c010013c>] (ret_from_fork+0x14/0x38)
...
The omap-des and omap-sham drivers appear to have a similar issue.
Fix this by using spin_{,un}lock_bh() around device list access in all
the probe and remove functions.
Signed-off-by: Ben Hutchings <ben.hutchings@mind.be>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 0083242c93 ]
The scheduler currently expects NUMA node distances to be stable from
init onwards, and as a consequence builds the related data structures
once-and-for-all at init (see sched_init_numa()).
Unfortunately, on some architectures node distance is unreliable for
offline nodes and may very well change upon onlining.
Skip over offline nodes during sched_init_numa(). Track nodes that have
been onlined at least once, and trigger a build of a node's NUMA masks
when it is first onlined post-init.
Reported-by: Geetika Moolchandani <Geetika.Moolchandani1@ibm.com>
Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Signed-off-by: Valentin Schneider <valentin.schneider@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20210818074333.48645-1-srikar@linux.vnet.ibm.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 8617bb7400 ]
Tests showed a mismatch between what the CCA tool reports about
the APKA master key state and what's displayed by the zcrypt dd
in sysfs. After some investigation, we found out that the
documentation which was the source for the zcrypt dd implementation
lacks the listing of 3 fields. So this patch now moves the
evaluation of the APKA master key state to the correct offset.
Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit d3683c0552 ]
Introduce dev_busid, which exports the device-id associated with the
io-subchannel (and message-subchannel). The dev_busid indicates that of
the device which may be physically installed on the corrosponding
subchannel. The dev_busid value "none" indicates that the subchannel
is not valid, there is no I/O device currently associated with the
subchannel.
The dev_busid information would be helpful to write device-specific
udev-rules associated with the subchannel. The dev_busid interface would
be available even when the sch is not bound to any driver or if there is
no operational device connected on it. Hence this attribute can be used to
write udev-rules which are specific to the device associated with the
subchannel.
Signed-off-by: Vineeth Vijayan <vneethv@linux.ibm.com>
Reviewed-by: Peter Oberparleiter <oberpar@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit efe2175478 ]
Pin control needs to be activated by setting the enable bit, otherwise
hardware rejects all pin changes. Previously this stayed unnoticed on
Nexus 7 because pin control was enabled by default after rebooting from
downstream kernel, which uses driver that enables the bit and charger
registers are non-volatile until power supply (battery) is disconnected.
Configure the pin control enable bit. This fixes the potentially
never-enabled charging on devices that use pin control.
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit e804d5abe2 ]
According to the NVMe specification, the response dword 0 value of the
Connect command is based on status code: return cntlid for successful
compeltion return IPO and IATTR for connect invalid parameters. Fix
a missing error information for a zero sized queue, and return the
cntlid also for I/O queue Connect commands.
Signed-off-by: Amit Engel <amit.engel@dell.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 85032874f8 ]
We update ctrl->queue_count and schedule another reconnect when io queue
count is zero.But we will never try to create any io queue in next reco-
nnection, because ctrl->queue_count already set to zero.We will end up
having an admin-only session in Live state, which is exactly what we try
to avoid in the original patch.
Update ctrl->queue_count after queue_count zero checking to fix it.
Signed-off-by: Ruozhu Li <liruozhu@huawei.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 664227fde6 ]
We update ctrl->queue_count and schedule another reconnect when io queue
count is zero.But we will never try to create any io queue in next reco-
nnection, because ctrl->queue_count already set to zero.We will end up
having an admin-only session in Live state, which is exactly what we try
to avoid in the original patch.
Update ctrl->queue_count after queue_count zero checking to fix it.
Signed-off-by: Ruozhu Li <liruozhu@huawei.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 4f1e9630af ]
After patch 54efd50 (block: make generic_make_request handle
arbitrarily sized bios), the IO through io-throttle may be larger,
and these IOs may be further split into more small IOs. However,
IOPS throttle does not seem to be aware of this change, which
makes the calculation of IOPS of large IOs incomplete, resulting
in disk-side IOPS that does not meet expectations. Maybe we should
fix this problem.
We can reproduce it by set max_sectors_kb of disk to 128, set
blkio.write_iops_throttle to 100, run a dd instance inside blkio
and use iostat to watch IOPS:
dd if=/dev/zero of=/dev/sdb bs=1M count=1000 oflag=direct
As a result, without this change the average IOPS is 1995, with
this change the IOPS is 98.
Signed-off-by: Chunguang Xu <brookxu@tencent.com>
Acked-by: Tejun Heo <tj@kernel.org>
Link: https://lore.kernel.org/r/65869aaad05475797d63b4c3fed4f529febe3c26.1627876014.git.brookxu@tencent.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit fad7cd3310 ]
If user specify a large enough value of NBD blocks option, it may trigger
signed integer overflow which may lead to nbd->config->bytesize becomes a
large or small value, zero in particular.
UBSAN: Undefined behaviour in drivers/block/nbd.c:325:31
signed integer overflow:
1024 * 4611686155866341414 cannot be represented in type 'long long int'
[...]
Call trace:
[...]
handle_overflow+0x188/0x1dc lib/ubsan.c:192
__ubsan_handle_mul_overflow+0x34/0x44 lib/ubsan.c:213
nbd_size_set drivers/block/nbd.c:325 [inline]
__nbd_ioctl drivers/block/nbd.c:1342 [inline]
nbd_ioctl+0x998/0xa10 drivers/block/nbd.c:1395
__blkdev_driver_ioctl block/ioctl.c:311 [inline]
[...]
Although it is not a big deal, still silence the UBSAN by limit
the input value.
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Baokun Li <libaokun1@huawei.com>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Link: https://lore.kernel.org/r/20210804021212.990223-1-libaokun1@huawei.com
[axboe: dropped unlikely()]
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 28ce50f8d9 ]
Currently iocharset=utf8 mount option is broken. To use UTF-8 as iocharset,
it is required to use utf8 mount option.
Fix iocharset=utf8 mount option to use be equivalent to the utf8 mount
option.
If UTF-8 as iocharset is used then s_nls_iocharset is set to NULL. So
simplify code around, remove s_utf8 field as to distinguish between UTF-8
and non-UTF-8 it is needed just to check if s_nls_iocharset is set to NULL
or not.
Link: https://lore.kernel.org/r/20210808162453.1653-5-pali@kernel.org
Signed-off-by: Pali Rohár <pali@kernel.org>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b645333443 ]
Currently iocharset=utf8 mount option is broken. To use UTF-8 as iocharset,
it is required to use utf8 mount option.
Fix iocharset=utf8 mount option to use be equivalent to the utf8 mount
option.
If UTF-8 as iocharset is used then s_nls_map is set to NULL. So simplify
code around, remove UDF_FLAG_NLS_MAP and UDF_FLAG_UTF8 flags as to
distinguish between UTF-8 and non-UTF-8 it is needed just to check if
s_nls_map set to NULL or not.
Link: https://lore.kernel.org/r/20210808162453.1653-4-pali@kernel.org
Signed-off-by: Pali Rohár <pali@kernel.org>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 781d2a9a2f ]
We were checking validity of LVID entries only when getting
implementation use information from LVID in udf_sb_lvidiu(). However if
the LVID is suitably corrupted, it can cause problems also to code such
as udf_count_free() which doesn't use udf_sb_lvidiu(). So check validity
of LVID already when loading it from the disk and just disable LVID
altogether when it is not valid.
Reported-by: syzbot+7fbfe5fed73ebb675748@syzkaller.appspotmail.com
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 8c3b5e6ec0 ]
If high resolution timers are disabled the timerfd notification about a
clock was set event is not happening for all cases which use
clock_was_set_delayed() because that's a NOP for HIGHRES=n, which is wrong.
Make clock_was_set_delayed() unconditially available to fix that.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20210713135158.196661266@linutronix.de
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 627ef5ae2d ]
If __hrtimer_start_range_ns() is invoked with an already armed hrtimer then
the timer has to be canceled first and then added back. If the timer is the
first expiring timer then on removal the clockevent device is reprogrammed
to the next expiring timer to avoid that the pending expiry fires needlessly.
If the new expiry time ends up to be the first expiry again then the clock
event device has to reprogrammed again.
Avoid this by checking whether the timer is the first to expire and in that
case, keep the timer on the current CPU and delay the reprogramming up to
the point where the timer has been enqueued again.
Reported-by: Lorenzo Colitti <lorenzo@google.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20210713135157.873137732@linutronix.de
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 406dd42bd1 ]
When an itimer deactivates a previously armed expiration, it simply doesn't
do anything. As a result the process wide cputime counter keeps running and
the tick dependency stays set until it reaches the old ghost expiration
value.
This can be reproduced with the following snippet:
void trigger_process_counter(void)
{
struct itimerval n = {};
n.it_value.tv_sec = 100;
setitimer(ITIMER_VIRTUAL, &n, NULL);
n.it_value.tv_sec = 0;
setitimer(ITIMER_VIRTUAL, &n, NULL);
}
Fix this with resetting the relevant base expiration. This is similar to
disarming a timer.
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210726125513.271824-4-frederic@kernel.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 767f4b620e ]
Hypervisors likely do not expose the SMCA feature to the guest and
loading this module leads to false warnings. This module should not be
loaded in guests to begin with, but people tend to do so, especially
when testing kernels in VMs. And then they complain about those false
warnings.
Do the practical thing and do not load this module when running as a
guest to avoid all that complaining.
[ bp: Rewrite commit message. ]
Suggested-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Yazen Ghannam <yazen.ghannam@amd.com>
Tested-by: Kim Phillips <kim.phillips@amd.com>
Link: https://lkml.kernel.org/r/20210628172740.245689-1-Smita.KoralahalliChannabasappa@amd.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit ccfc9dd691 ]
The soft watchdog timer function checks if a virtual machine
was suspended and hence what looks like a lockup in fact
is a false positive.
This is what kvm_check_and_clear_guest_paused() does: it
tests guest PVCLOCK_GUEST_STOPPED (which is set by the host)
and if it's set then we need to touch all watchdogs and bail
out.
Watchdog timer function runs from IRQ, so PVCLOCK_GUEST_STOPPED
check works fine.
There is, however, one more watchdog that runs from IRQ, so
watchdog timer fn races with it, and that watchdog is not aware
of PVCLOCK_GUEST_STOPPED - RCU stall detector.
apic_timer_interrupt()
smp_apic_timer_interrupt()
hrtimer_interrupt()
__hrtimer_run_queues()
tick_sched_timer()
tick_sched_handle()
update_process_times()
rcu_sched_clock_irq()
This triggers RCU stalls on our devices during VM resume.
If tick_sched_handle()->rcu_sched_clock_irq() runs on a VCPU
before watchdog_timer_fn()->kvm_check_and_clear_guest_paused()
then there is nothing on this VCPU that touches watchdogs and
RCU reads stale gp stall timestamp and new jiffies value, which
makes it think that RCU has stalled.
Make RCU stall watchdog aware of PVCLOCK_GUEST_STOPPED and
don't report RCU stalls when we resume the VM.
Signed-off-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Signed-off-by: Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b4da13aa28 ]
A missing clock update is causing the following warning:
rq->clock_update_flags < RQCF_ACT_SKIP
WARNING: CPU: 112 PID: 2041 at kernel/sched/sched.h:1453
sub_running_bw.isra.0+0x190/0x1a0
...
CPU: 112 PID: 2041 Comm: sugov:112 Tainted: G W 5.14.0-rc1 #1
Hardware name: WIWYNN Mt.Jade Server System
B81.030Z1.0007/Mt.Jade Motherboard, BIOS 1.6.20210526 (SCP:
1.06.20210526) 2021/05/26
...
Call trace:
sub_running_bw.isra.0+0x190/0x1a0
migrate_task_rq_dl+0xf8/0x1e0
set_task_cpu+0xa8/0x1f0
try_to_wake_up+0x150/0x3d4
wake_up_q+0x64/0xc0
__up_write+0xd0/0x1c0
up_write+0x4c/0x2b0
cppc_set_perf+0x120/0x2d0
cppc_cpufreq_set_target+0xe0/0x1a4 [cppc_cpufreq]
__cpufreq_driver_target+0x74/0x140
sugov_work+0x64/0x80
kthread_worker_fn+0xe0/0x230
kthread+0x138/0x140
ret_from_fork+0x10/0x18
The task causing this is the `cppc_fie` DL task introduced by
commit 1eb5dde674 ("cpufreq: CPPC: Add support for frequency
invariance").
With CONFIG_ACPI_CPPC_CPUFREQ_FIE=y and schedutil cpufreq governor on
slow-switching system (like on this Ampere Altra WIWYNN Mt. Jade Arm
Server):
DL task `curr=sugov:112` lets `p=cppc_fie` migrate and since the latter
is in `non_contending` state, migrate_task_rq_dl() calls
sub_running_bw()->__sub_running_bw()->cpufreq_update_util()->
rq_clock()->assert_clock_updated()
on p.
Fix this by updating the clock for a non_contending task in
migrate_task_rq_dl() before calling sub_running_bw().
Reported-by: Bruno Goncalves <bgoncalv@redhat.com>
Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Daniel Bristot de Oliveira <bristot@kernel.org>
Acked-by: Juri Lelli <juri.lelli@redhat.com>
Link: https://lore.kernel.org/r/20210804135925.3734605-1-dietmar.eggemann@arm.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit fe28140b33 ]
We should not clear FLAGS_DMA_ACTIVE before omap_sham_update_dma_stop() is
done calling dma_unmap_sg(). We already clear FLAGS_DMA_ACTIVE at the
end of omap_sham_update_dma_stop().
The early clearing of FLAGS_DMA_ACTIVE is not causing issues as we do not
need to defer anything based on FLAGS_DMA_ACTIVE currently. So this can be
applied as clean-up.
Cc: Lokesh Vutla <lokeshvutla@ti.com>
Cc: Tero Kristo <kristo@kernel.org>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit caa534c3ba ]
When fuel_gauge_reg_readb()/_writeb() fails, report which register we
were trying to read / write when the error happened.
Also reword the message a bit:
- Drop the axp288 prefix, dev_err() already prints this
- Switch from telegram / abbreviated style to a normal sentence, aligning
the message with those from fuel_gauge_read_*bit_word()
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f95091536f ]
It is possible for sched_getattr() to incorrectly report the state of
the reset_on_fork flag when called on a deadline task.
Indeed, if the flag was set on a deadline task using sched_setattr()
with flags (SCHED_FLAG_RESET_ON_FORK | SCHED_FLAG_KEEP_PARAMS), then
p->sched_reset_on_fork will be set, but __setscheduler() will bail out
early, which means that the dl_se->flags will not get updated by
__setscheduler_params()->__setparam_dl(). Consequently, if
sched_getattr() is then called on the task, __getparam_dl() will
override kattr.sched_flags with the now out-of-date copy in dl_se->flags
and report the stale value to userspace.
To fix this, make sure to only copy the flags that are relevant to
sched_deadline to and from the dl_se->flags field.
Signed-off-by: Quentin Perret <qperret@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210727101103.2729607-2-qperret@google.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit df6313d707 ]
After calling dma_map_single(), we must also call dma_mapping_error().
This fixes the following warning when compiling with CONFIG_DMA_API_DEBUG:
[ 311.241478] WARNING: CPU: 0 PID: 428 at kernel/dma/debug.c:1027 check_unmap+0x79c/0x96c
[ 311.249547] DMA-API: mxs-dcp 2280000.crypto: device driver failed to check map error[device address=0x00000000860cb080] [size=32 bytes] [mapped as single]
Signed-off-by: Sean Anderson <sean.anderson@seco.com>
Reviewed-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 048661a1f9 ]
Yanfei reported that setting HANDOFF should not depend on recomputing
@first, only on @first state. Which would then give:
if (ww_ctx || !first)
first = __mutex_waiter_is_first(lock, &waiter);
if (first)
__mutex_set_flag(lock, MUTEX_FLAG_HANDOFF);
But because 'ww_ctx || !first' is basically 'always' and the test for
first is relatively cheap, omit that first branch entirely.
Reported-by: Yanfei Xu <yanfei.xu@windriver.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Waiman Long <longman@redhat.com>
Reviewed-by: Yanfei Xu <yanfei.xu@windriver.com>
Link: https://lore.kernel.org/r/20210630154114.896786297@infradead.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit a7bfaad54b upstream.
During CXL ACPI probe, host bridge ports are discovered by scanning
the ACPI0017 root port for ACPI0016 host bridge devices. The scan
matches on the hardware id of "ACPI0016". An issue occurs when an
ACPI0016 device is defined in the DSDT yet disabled on the platform.
Attempts by the cxl_acpi driver to add host bridge ports using a
disabled device fails, and the entire cxl_acpi probe fails.
The DSDT table includes an _STA method that sets the status and the
ACPI subsystem has checks available to examine it. One such check is
in the acpi_pci_find_root() path. Move the call to acpi_pci_find_root()
to the matching function to prevent this issue when adding either
upstream or downstream ports.
Suggested-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Alison Schofield <alison.schofield@intel.com>
Fixes: 7d4b5ca2e2 ("cxl/acpi: Add downstream port data to cxl_port instances")
Cc: <stable@vger.kernel.org>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/163072203957.2250120.2178685721061002124.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a729691b54 upstream.
When this platform was relatively new in November 2011, with early BIOS
revisions, a reboot quirk was added in commit 6be30bb7d7 ("x86/reboot:
Blacklist Dell OptiPlex 990 known to require PCI reboot")
However, this quirk (and several others) are open-ended to all BIOS
versions and left no automatic expiry if/when the system BIOS fixed the
issue, meaning that nobody is likely to come along and re-test.
What is really problematic with using PCI reboot as this quirk does, is
that it causes this platform to do a full power down, wait one second,
and then power back on. This is less than ideal if one is using it for
boot testing and/or bisecting kernels when legacy rotating hard disks
are installed.
It was only by chance that the quirk was noticed in dmesg - and when
disabled it turned out that it wasn't required anymore (BIOS A24), and a
default reboot would work fine without the "harshness" of power cycling the
machine (and disks) down and up like the PCI reboot does.
Doing a bit more research, it seems that the "newest" BIOS for which the
issue was reported[1] was version A06, however Dell[2] seemed to suggest
only up to and including version A05, with the A06 having a large number of
fixes[3] listed.
As is typical with a new platform, the initial BIOS updates come frequently
and then taper off (and in this case, with a revival for CPU CVEs); a
search for O990-A<ver>.exe reveals the following dates:
A02 16 Mar 2011
A03 11 May 2011
A06 14 Sep 2011
A07 24 Oct 2011
A10 08 Dec 2011
A14 06 Sep 2012
A16 15 Oct 2012
A18 30 Sep 2013
A19 23 Sep 2015
A20 02 Jun 2017
A23 07 Mar 2018
A24 21 Aug 2018
While it's overkill to flash and test each of the above, it would seem
likely that the issue was contained within A0x BIOS versions, given the
dates above and the dates of issue reports[4] from distros. So rather than
just throw out the quirk entirely, limit the scope to just those early BIOS
versions, in case people are still running systems from 2011 with the
original as-shipped early A0x BIOS versions.
[1] https://lore.kernel.org/lkml/1320373471-3942-1-git-send-email-trenn@suse.de/
[2] https://www.dell.com/support/kbdoc/en-ca/000131908/linux-based-operating-systems-stall-upon-reboot-on-optiplex-390-790-990-systems
[3] https://www.dell.com/support/home/en-ca/drivers/driversdetails?driverid=85j10
[4] https://bugs.launchpad.net/ubuntu/+source/linux/+bug/768039
Fixes: 6be30bb7d7 ("x86/reboot: Blacklist Dell OptiPlex 990 known to require PCI reboot")
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20210530162447.996461-4-paul.gortmaker@windriver.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7d761b084b upstream.
When nothing is connected to pcie ports, each port is set to reset state.
When this occurs, next access result in a hang on boot as follows:
mt7621-pci 1e140000.pcie: pcie0 no card, disable it (RST & CLK)
mt7621-pci 1e140000.pcie: pcie1 no card, disable it (RST & CLK)
mt7621-pci 1e140000.pcie: pcie2 no card, disable it (RST & CLK)
[ HANGS HERE ]
Fix this just detecting 'nothing is connected state' to avoid next accesses
to pcie port related configuration registers.
Fixes: b99cc3a2b6 ("staging: mt7621-pci: avoid custom 'map_irq' function")
Cc: stable <stable@vger.kernel.org>
Reported-by: DENG Qingfang <dqfext@gmail.com>
Signed-off-by: Sergio Paracuellos <sergio.paracuellos@gmail.com>
Link: https://lore.kernel.org/r/20210823170803.2108-1-sergio.paracuellos@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 94f339147f upstream.
Only TDs with status TD_CLEARING_CACHE will be given back after
cache is cleared with a set TR deq command.
xhci_invalidate_cached_td() failed to set the TD_CLEARING_CACHE status
for some cancelled TDs as it assumed an endpoint only needs to clear the
TD it stopped on.
This isn't always true. For example with streams enabled an endpoint may
have several stream rings, each stopping on a different TDs.
Note that if an endpoint has several stream rings, the current code
will still only clear the cache of the stream pointed to by the last
cancelled TD in the cancel list.
This patch only focus on making sure all canceled TDs are given back,
avoiding hung task after device removal.
Another fix to solve clearing the caches of all stream rings with
cancelled TDs is needed, but not as urgent.
This issue was simultanously discovered and debugged by
by Tao Wang, with a slightly different fix proposal.
Fixes: 674f8438c1 ("xhci: split handling halted endpoints into two steps")
Cc: <stable@vger.kernel.org> #5.12
Reported-by: Tao Wang <wat@codeaurora.org>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Link: https://lore.kernel.org/r/20210820123503.2605901-4-mathias.nyman@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f4292e2faf upstream.
Turns out Hans de Goede completed the work I started last year trying to
improve Chinese-clone detection of CSR controller chips. Quirk after quirk
these Bluetooth dongles are more usable now.
Even after a few BlueZ regressions; these clones are so fickle that some
days they stop working altogether. Except on Windows, they work fine.
But this force-suspend initialization quirk seems to mostly do the trick,
after a lot of testing Bluetooth now seems to work *all* the time.
The only problem is that the solution ended up being masked under a very
stringent check; when there are probably hundreds of fake dongle
models out there that benefit from a good reset. Make it so.
Fixes: 81cac64ba2 ("Bluetooth: Deal with USB devices that are faking CSR vendor")
Fixes: cde1a8a992 ("Bluetooth: btusb: Fix and detect most of the Chinese Bluetooth controllers")
Fixes: d74e0ae7e0 ("Bluetooth: btusb: Fix detection of some fake CSR controllers with a bcdDevice val of 0x0134")
Fixes: 0671c06623 ("Bluetooth: btusb: Add workaround for remote-wakeup issues with Barrot 8041a02 fake CSR controllers")
Cc: stable@vger.kernel.org
Cc: Hans de Goede <hdegoede@redhat.com>
Tested-by: Ismael Ferreras Morezuelas <swyterzone@gmail.com>
Signed-off-by: Ismael Ferreras Morezuelas <swyterzone@gmail.com>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 644d0a5bcc upstream.
The pdev maybe not a platform device, e.g. c_can_pci device, in this
case, calling to_platform_device() would not make sense. Also, per the
comment in drivers/net/can/c_can/c_can_ethtool.c, @bus_info should
match dev_name() string, so I am replacing this with dev_name() to fix
this issue.
[ 1.458583] BUG: unable to handle page fault for address: 0000000100000000
[ 1.460921] RIP: 0010:strnlen+0x1a/0x30
[ 1.466336] ? c_can_get_drvinfo+0x65/0xb0 [c_can]
[ 1.466597] ethtool_get_drvinfo+0xae/0x360
[ 1.466826] dev_ethtool+0x10f8/0x2970
[ 1.467880] sock_ioctl+0xef/0x300
Fixes: 2722ac986e ("can: c_can: add ethtool support")
Link: https://lore.kernel.org/r/20210906233704.1162666-1-ztong0001@gmail.com
Cc: stable@vger.kernel.org # 5.14+
Signed-off-by: Tong Zhang <ztong0001@gmail.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f97a2103f1 upstream.
Commit e26f023e01 ("firmware/dmi: Include product_sku info to modalias")
added a new field to the modalias in the middle of the modalias, breaking
some existing udev/hwdb matches on the whole modalias without a wildcard
('*') in between the pvr and rvn fields.
All modalias matches in e.g. :
https://github.com/systemd/systemd/blob/main/hwdb.d/60-sensor.hwdb
deliberately end in ':*' so that new fields can be added at *the end* of
the modalias, but adding a new field in the middle like this breaks things.
Move the new sku field to the end of the modalias to fix some hwdb
entries no longer matching.
The new sku field has already been put to use in 2 new hwdb entries:
sensor:modalias:platform:HID-SENSOR-200073:dmi:*svnDell*:sku0A3E:*
ACCEL_LOCATION=base
sensor:modalias:platform:HID-SENSOR-200073:dmi:*svnDell*:sku0B0B:*
ACCEL_LOCATION=base
The wildcard use before and after the sku in these matches means that they
should keep working with the sku moved to the end.
Note that there is a second instance of in essence the same problem,
commit f5152f4ded ("firmware/dmi: Report DMI Bios & EC firmware release")
Added 2 new br and efr fields in the middle of the modalias. This too
breaks some hwdb modalias matches, but this has gone unnoticed for over
a year. So some newer hwdb modalias matches actually depend on these
fields being in the middle of the string. Moving these to the end now
would break 3 hwdb entries, while fixing 8 entries.
Since there is no good answer for the new br and efr fields I have chosen
to leave these as is. Instead I'll submit a hwdb update to put a wildcard
at the place where these fields may or may not be present depending on the
kernel version.
BugLink: https://github.com/systemd/systemd/issues/20550
Link: https://github.com/systemd/systemd/pull/20562
Fixes: e26f023e01 ("firmware/dmi: Include product_sku info to modalias")
Cc: stable@vger.kernel.org
Cc: Kai-Chuan Hsieh <kaichuan.hsieh@canonical.com>
Cc: Erwan Velu <e.velu@criteo.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Jean Delvare <jdelvare@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 514e976744 upstream.
My local syzbot instance hit memory leak in usb_set_configuration().
The problem was in unputted usb interface. In case of errors after
usb_get_intf() the reference should be putted to correclty free memory
allocated for this interface.
Fixes: ec16dae545 ("V4L/DVB (7019): V4L: add support for Syntek DC1125 webcams")
Cc: stable@vger.kernel.org
Signed-off-by: Pavel Skripkin <paskripkin@gmail.com>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4267c5a8f3 upstream.
The recent change for low latency playback works in most of test cases
but it turned out still to hit errors on some use cases, most notably
with JACK with small buffer sizes. This is because USB-audio driver
fills up and submits full URBs at the beginning, while the URBs would
return immediately and try to fill more -- that can easily trigger
XRUN. It was more or less expected, but in the small buffer size, the
problem became pretty obvious.
Fixing this behavior properly would require the change of the
fundamental driver design, so it's no trivial task, unfortunately.
Instead, here we work around the problem just by switching back to the
old method when the given configuration is too fragile with the low
latency stream handling. As a threshold, we calculate the total
buffer bytes in all plus one URBs, and check whether it's beyond the
PCM buffer bytes. The one extra URB is needed because XRUN happens at
the next submission after the first round.
Fixes: 307cc9baac ("ALSA: usb-audio: Reduce latency at playback start, take#2")
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20210827203311.5987-1-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 13d9c6b998 upstream.
ASUS ROG Strix G17 has the very same PCI and codec SSID (1043:103f) as
ASUS TX300, and unfortunately, the existing quirk for TX300 is broken
on ASUS ROG. Actually the device works without the quirk, so we'll
need to clear the quirk before applying for this device.
Since ASUS ROG has a different codec (ALC294 - while TX300 has
ALC282), this patch adds a workaround for the device, just clearing
the codec->fixup_id by checking the codec vendor_id.
It's a bit ugly to add such a workaround there, but it seems to be the
simplest way.
BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=214101
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20210820143214.3654-1-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7af5a14371 upstream.
We've got a regression report for USB-audio with Sony WALKMAN NW-A45
DAC device where no sound is audible on recent kernel. The bisection
resulted in the code change wrt endpoint management, and the further
debug session revealed that it was caused by the order of the USB
audio interface. In the earlier code, we always set up the USB
interface at first before other setups, but it was changed to be done
at the last for UAC2/3, which is more standard way, while keeping the
old way for UAC1. OTOH, this device seems requiring the setup of the
interface at first just like UAC1.
This patch works around the regression by applying the interface setup
specifically for the WALKMAN at the beginning of the endpoint setup
procedure. This change is written straightforwardly to be easily
backported in old kernels. A further cleanup to move the workaround
into a generic quirk section will follow in a later patch.
Fixes: bf6313a0ff ("ALSA: usb-audio: Refactor endpoint management")
Cc: <stable@vger.kernel.org>
BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=214105
Link: https://lore.kernel.org/r/20210824054700.8236-1-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0a824efdb7 upstream.
Syzbot found a warning caused by hid_submit_ctrl() submitting a
control request to transfer a 0-length input report:
usb 1-1: BOGUS control dir, pipe 80000280 doesn't match bRequestType a1
(The warning message is a little difficult to understand. It means
that the control request claims to be for an IN transfer but this
contradicts the USB spec, which requires 0-length control transfers
always to be in the OUT direction.)
Now, a zero-length report isn't good for anything and there's no
reason for a device to have one, but the fuzzer likes to pick out
these weird edge cases. In the future, perhaps we will decide to
reject 0-length reports at probe time. For now, the simplest approach
for avoiding these warnings is to pretend that the report actually has
length 1.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Reported-and-tested-by: syzbot+9b57a46bf1801ce2a2ca@syzkaller.appspotmail.com
Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Tested-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Acked-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5049307d37 upstream.
[patch description by Alan Stern]
Commit 7652dd2c5c ("USB: core: Check buffer length matches wLength
for control transfers") causes control URB submissions to fail if the
transfer_buffer_length value disagrees with the setup packet's wLength
valuel. Unfortunately, it turns out that the usbhid can trigger this
failure mode when it submits a control request for an input report: It
pads the transfer buffer size to a multiple of the maxpacket value but
does not increase wLength correspondingly.
These failures have caused problems for people using an APS UPC, in
the form of a flood of log messages resembling:
hid-generic 0003:051D:0002.0002: control queue full
This patch fixes the problem by setting the wLength value equal to the
padded transfer_buffer_length value in hid_submit_ctrl(). As a nice
bonus, the code which stores the transfer_buffer_length value is now
shared between the two branches of an "if" statement, so it can be
de-duplicated.
Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Fixes: 7652dd2c5c ("USB: core: Check buffer length matches wLength for control transfers")
Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Tested-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Acked-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ba4bbdabec upstream.
Make sure that the driver crtscts state is not updated in the unlikely
event that the flow-control request fails. Not doing so could break RTS
control.
Fixes: 5951b85088 ("USB: serial: cp210x: suppress modem-control errors")
Cc: stable@vger.kernel.org # 5.11
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2d9a007059 upstream.
In the unlikely event that setting the software flow-control characters
fails the other flow-control settings should still be updated (just like
all other terminal settings).
Move out the error message printed by the set_chars() helper to make it
more obvious that this is intentional.
Fixes: 7748feffcd ("USB: serial: cp210x: add support for software flow control")
Cc: stable@vger.kernel.org # 5.11
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b2bbb92f70 upstream.
Commit 81414b4dd4 ("ext4: remove redundant sb checksum
recomputation") removed checksum recalculation after updating
superblock free space / inode counters in ext4_fill_super() based on
the fact that we will recalculate the checksum on superblock
writeout.
That is correct assumption but until the writeout happens (which can
take a long time) the checksum is incorrect in the buffer cache and if
programs such as tune2fs or resize2fs is called shortly after a file
system is mounted can fail. So return back the checksum recalculation
and add a comment explaining why.
Fixes: 81414b4dd4 ("ext4: remove redundant sb checksum recomputation")
Cc: stable@kernel.org
Reported-by: Boyang Xue <bxue@redhat.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Link: https://lore.kernel.org/r/20210812124737.21981-1-jack@suse.cz
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a54c4613da upstream.
The location of the system.data extended attribute can change whenever
xattr_sem is not taken. So we need to recalculate the i_inline_off
field since it mgiht have changed between ext4_write_begin() and
ext4_write_end().
This means that caching i_inline_off is probably not helpful, so in
the long run we should probably get rid of it and shrink the in-memory
ext4 inode slightly, but let's fix the race the simple way for now.
Cc: stable@kernel.org
Fixes: f19d5870cb ("ext4: add normal write support for inline data")
Reported-by: syzbot+13146364637c7363a7de@syzkaller.appspotmail.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 67d69e9d1a upstream.
AUDIT_TRIM is expected to be idempotent, but multiple executions resulted
in a refcount underflow and use-after-free.
git bisect fingered commit fb041bb7c0 ("locking/refcount: Consolidate
implementations of refcount_t") but this patch with its more thorough
checking that wasn't in the x86 assembly code merely exposed a previously
existing tree refcount imbalance in the case of tree trimming code that
was refactored with prune_one() to remove a tree introduced in
commit 8432c70062 ("audit: Simplify locking around untag_chunk()")
Move the put_tree() to cover only the prune_one() case.
Passes audit-testsuite and 3 passes of "auditctl -t" with at least one
directory watch.
Cc: Jan Kara <jack@suse.cz>
Cc: Will Deacon <will@kernel.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Seiji Nishikawa <snishika@redhat.com>
Cc: stable@vger.kernel.org
Fixes: 8432c70062 ("audit: Simplify locking around untag_chunk()")
Signed-off-by: Richard Guy Briggs <rgb@redhat.com>
Reviewed-by: Jan Kara <jack@suse.cz>
[PM: reformatted/cleaned-up the commit description]
Signed-off-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d0efb16294 upstream.
A common implementation of isatty(3) involves calling a ioctl passing
a dummy struct argument and checking whether the syscall failed --
bionic and glibc use TCGETS (passing a struct termios), and musl uses
TIOCGWINSZ (passing a struct winsize). If the FD is a socket, we will
copy sizeof(struct ifreq) bytes of data from the argument and return
-EFAULT if that fails. The result is that the isatty implementations
may return a non-POSIX-compliant value in errno in the case where part
of the dummy struct argument is inaccessible, as both struct termios
and struct winsize are smaller than struct ifreq (at least on arm64).
Although there is usually enough stack space following the argument
on the stack that this did not present a practical problem up to now,
with MTE stack instrumentation it's more likely for the copy to fail,
as the memory following the struct may have a different tag.
Fix the problem by adding an early check for whether the ioctl is a
valid socket ioctl, and return -ENOTTY if it isn't.
Fixes: 44c02a2c3d ("dev_ioctl(): move copyin/copyout to callers")
Link: https://linux-review.googlesource.com/id/I869da6cf6daabc3e4b7b82ac979683ba05e27d4d
Signed-off-by: Peter Collingbourne <pcc@google.com>
Cc: <stable@vger.kernel.org> # 4.19
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 064c734986 upstream.
The stat() family of syscalls report the wrong size for encrypted
symlinks, which has caused breakage in several userspace programs.
Fix this by calling fscrypt_symlink_getattr() after ubifs_getattr() for
encrypted symlinks. This function computes the correct size by reading
and decrypting the symlink target (if it's not already cached).
For more details, see the commit which added fscrypt_symlink_getattr().
Fixes: ca7f85be8d ("ubifs: Add support for encrypted symlinks")
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20210702065350.209646-5-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 461b43a8f9 upstream.
The stat() family of syscalls report the wrong size for encrypted
symlinks, which has caused breakage in several userspace programs.
Fix this by calling fscrypt_symlink_getattr() after f2fs_getattr() for
encrypted symlinks. This function computes the correct size by reading
and decrypting the symlink target (if it's not already cached).
For more details, see the commit which added fscrypt_symlink_getattr().
Fixes: cbaf042a3c ("f2fs crypto: add symlink encryption")
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20210702065350.209646-4-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8c4bca10ce upstream.
The stat() family of syscalls report the wrong size for encrypted
symlinks, which has caused breakage in several userspace programs.
Fix this by calling fscrypt_symlink_getattr() after ext4_getattr() for
encrypted symlinks. This function computes the correct size by reading
and decrypting the symlink target (if it's not already cached).
For more details, see the commit which added fscrypt_symlink_getattr().
Fixes: f348c25232 ("ext4 crypto: add symlink encryption")
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20210702065350.209646-3-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d187605605 upstream.
Add a helper function fscrypt_symlink_getattr() which will be called
from the various filesystems' ->getattr() methods to read and decrypt
the target of encrypted symlinks in order to report the correct st_size.
Detailed explanation:
As required by POSIX and as documented in various man pages, st_size for
a symlink is supposed to be the length of the symlink target.
Unfortunately, st_size has always been wrong for encrypted symlinks
because st_size is populated from i_size from disk, which intentionally
contains the length of the encrypted symlink target. That's slightly
greater than the length of the decrypted symlink target (which is the
symlink target that userspace usually sees), and usually won't match the
length of the no-key encoded symlink target either.
This hadn't been fixed yet because reporting the correct st_size would
require reading the symlink target from disk and decrypting or encoding
it, which historically has been considered too heavyweight to do in
->getattr(). Also historically, the wrong st_size had only broken a
test (LTP lstat03) and there were no known complaints from real users.
(This is probably because the st_size of symlinks isn't used too often,
and when it is, typically it's for a hint for what buffer size to pass
to readlink() -- which a slightly-too-large size still works for.)
However, a couple things have changed now. First, there have recently
been complaints about the current behavior from real users:
- Breakage in rpmbuild:
https://github.com/rpm-software-management/rpm/issues/1682https://github.com/google/fscrypt/issues/305
- Breakage in toybox cpio:
https://www.mail-archive.com/toybox@lists.landley.net/msg07193.html
- Breakage in libgit2: https://issuetracker.google.com/issues/189629152
(on Android public issue tracker, requires login)
Second, we now cache decrypted symlink targets in ->i_link. Therefore,
taking the performance hit of reading and decrypting the symlink target
in ->getattr() wouldn't be as big a deal as it used to be, since usually
it will just save having to do the same thing later.
Also note that eCryptfs ended up having to read and decrypt symlink
targets in ->getattr() as well, to fix this same issue; see
commit 3a60a1686f ("eCryptfs: Decrypt symlink target for stat size").
So, let's just bite the bullet, and read and decrypt the symlink target
in ->getattr() in order to report the correct st_size. Add a function
fscrypt_symlink_getattr() which the filesystems will call to do this.
(Alternatively, we could store the decrypted size of symlinks on-disk.
But there isn't a great place to do so, and encryption is meant to hide
the original size to some extent; that property would be lost.)
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20210702065350.209646-2-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e4571b8c5e upstream.
[BUG]
It's easy to trigger NULL pointer dereference, just by removing a
non-existing device id:
# mkfs.btrfs -f -m single -d single /dev/test/scratch1 \
/dev/test/scratch2
# mount /dev/test/scratch1 /mnt/btrfs
# btrfs device remove 3 /mnt/btrfs
Then we have the following kernel NULL pointer dereference:
BUG: kernel NULL pointer dereference, address: 0000000000000000
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 0 P4D 0
Oops: 0000 [#1] PREEMPT SMP NOPTI
CPU: 9 PID: 649 Comm: btrfs Not tainted 5.14.0-rc3-custom+ #35
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
RIP: 0010:btrfs_rm_device+0x4de/0x6b0 [btrfs]
btrfs_ioctl+0x18bb/0x3190 [btrfs]
? lock_is_held_type+0xa5/0x120
? find_held_lock.constprop.0+0x2b/0x80
? do_user_addr_fault+0x201/0x6a0
? lock_release+0xd2/0x2d0
? __x64_sys_ioctl+0x83/0xb0
__x64_sys_ioctl+0x83/0xb0
do_syscall_64+0x3b/0x90
entry_SYSCALL_64_after_hwframe+0x44/0xae
[CAUSE]
Commit a27a94c2b0 ("btrfs: Make btrfs_find_device_by_devspec return
btrfs_device directly") moves the "missing" device path check into
btrfs_rm_device().
But btrfs_rm_device() itself can have case where it only receives
@devid, with NULL as @device_path.
In that case, calling strcmp() on NULL will trigger the NULL pointer
dereference.
Before that commit, we handle the "missing" case inside
btrfs_find_device_by_devspec(), which will not check @device_path at all
if @devid is provided, thus no way to trigger the bug.
[FIX]
Before calling strcmp(), also make sure @device_path is not NULL.
Fixes: a27a94c2b0 ("btrfs: Make btrfs_find_device_by_devspec return btrfs_device directly")
CC: stable@vger.kernel.org # 5.4+
Reported-by: butt3rflyh4ck <butterflyhuangxx@gmail.com>
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7428022b50 upstream.
When a port leaves a VLAN-aware bridge, the current code does not clear
other ports' matrix field bit. If the bridge is later set to VLAN-unaware
mode, traffic in the bridge may leak to that port.
Remove the VLAN filtering check in mt7530_port_bridge_leave.
Fixes: 474a2ddaa1 ("net: dsa: mt7530: fix VLAN traffic leaks")
Fixes: 83163f7dca ("net: dsa: mediatek: add VLAN support for MT7530")
Signed-off-by: DENG Qingfang <dqfext@gmail.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 55981d3541 upstream.
Some USB BT adapters don't satisfy the MTU requirement mentioned in
commit e848dbd364 ("Bluetooth: btusb: Add support USB ALT 3 for WBS")
and have ALT 3 setting that produces no/garbled audio. Some adapters
with larger MTU were also reported to have problems with ALT 3.
Add a flag and check it and MTU before selecting ALT 3, falling back to
ALT 1. Enable the flag for Realtek, restoring the previous behavior for
non-Realtek devices.
Tested with USB adapters (mtu<72, no/garbled sound with ALT3, ALT1
works) BCM20702A1 0b05:17cb, CSR8510A10 0a12:0001, and (mtu>=72, ALT3
works) RTL8761BU 0bda:8771, Intel AX200 8087:0029 (after disabling
ALT6). Also got reports for (mtu>=72, ALT 3 reported to produce bad
audio) Intel 8087:0a2b.
Signed-off-by: Pauli Virtanen <pav@iki.fi>
Fixes: e848dbd364 ("Bluetooth: btusb: Add support USB ALT 3 for WBS")
Tested-by: Michał Kępień <kernel@kempniu.pl>
Tested-by: Jonathan Lampérth <jon@h4n.dev>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This repository contains the Linux kernel used on the Raspberry Pi. If you believe that the issue you are seeing is kernel-related, this is the right place. If not, we have other repositories for the GPU firmware at [github.com/raspberrypi/firmware](https://github.com/raspberrypi/firmware) and Raspberry Pi userland applications at [github.com/raspberrypi/userland](https://github.com/raspberrypi/userland). If you have problems with the Raspbian distribution packages, report them in the [github.com/RPi-Distro/repo](https://github.com/RPi-Distro/repo). If you simply have a question, then [the Raspberry Pi forums](https://www.raspberrypi.org/forums) are the best place to ask it.
**Describe the bug**
Add a clear and concise description of what you think the bug is.
**To reproduce**
List the steps required to reproduce the issue.
**Expected behaviour**
Add a clear and concise description of what you expected to happen.
**Actual behaviour**
Add a clear and concise description of what actually happened.
**System**
Copy and paste the results of the raspinfo command in to this section. Alternatively, copy and paste a pastebin link, or add answers to the following questions:
* Which model of Raspberry Pi? e.g. Pi3B+, PiZeroW
* Which OS and version (`cat /etc/rpi-issue`)?
* Which firmware version (`vcgencmd version`)?
* Which kernel version (`uname -a`)?
**Logs**
If applicable, add the relevant output from `dmesg` or similar.
Some files were not shown because too many files have changed in this diff
Show More
Reference in New Issue
Block a user
Blocking a user prevents them from interacting with repositories, such as opening or commenting on pull requests or issues. Learn more about blocking a user.