ad83c7cb2f ("irqchip/irq-bcm2836: Add support for DT interrupt polarity")
changed the way that the BCM2836/7 local interrupts are mapped; instead
of being pre-mapped they are now mapped on-demand. A side effect of this
change is that the call to irq_of_parse_and_map from armctrl_of_init
creates a new mapping, forming a gap between the IRQs and the FIQs. This
gap breaks the FIQ<->IRQ mapping which up to now has been done by assuming:
1) that the value of FIQ_START is the same as the number of normal IRQs
that will be mapped (still true), and
2) that this value is also the offset between an IRQ and its equivalent
FIQ (which is no longer the case).
Remove both assumptions by measuring the interval between the last IRQ
and the last FIQ, passing it as the parameter to init_FIQ().
Fixes: https://github.com/raspberrypi/linux/issues/2432
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
Register for reboot notifications, sending RPI_FIRMWARE_NOTIFY_REBOOT
over the mailbox interface on reception.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
Upstream Linux deems using output GPIOs to generate IRQs as a bogus
use case, even though the BCM2835 GPIO controller is capable of doing
so. A number of users would like to make use of this facility, so
disable the checks.
See: https://github.com/raspberrypi/linux/issues/2527
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
With HW_VLAN_CTAG_RX enabled we don't observe the checksum
issue, so amend the workaround to only drop back to s/w
checksums if VLAN offload is disabled.
See #2458.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>
The chip supports stripping the VLAN tag and reporting it
in metadata. Implement this as it also appears to solve the
issues observed in checksum computation.
See #2458.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>
HW_VLAN_CTAG_FILTER was partially implemented, but not fully to Linux.
Complete the implementation of this.
See #2458.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>
There appears to be some issue in the LAN78xx where the checksum
computed on a VLAN tagged packet is incorrect, or at least not
in the form that the kernel is after. This is most easily shown
by pinging a device via a VLAN tagged interface and it will dump
out the error message and stack trace from netdev_rx_csum_fault.
It has also been seen with standard TCP and UDP packets.
Until this is fully understood, request that the network stack
computes the checksum on packets signalled as having a VLAN tag
applied.
See https://github.com/raspberrypi/linux/issues/2458
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>
Enable EEE mode as soon as possible after connecting to the PHY, and
before phy_start. This avoids a second link negotiation, which speeds
up booting and stops the interface failing to become ready.
See: https://github.com/raspberrypi/linux/issues/2437
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
The frame abort timeout being set by lan78xx_set_rx_max_frame_length
didn't account for any VLAN headers, resulting in very low
throughput if used with tagged VLANs.
Use VLAN_ETH_HLEN instead of ETH_HLEN to correct for this.
See https://github.com/raspberrypi/linux/issues/2458
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>
The patch to set the lan78xx MAC address from DT does so regardless of
whether or not the interface already has a valid address. As the
initialisation function is called from the reset handler when the
interface is brought up, it is impossible to change the MAC address
in a way that persists across the interface being brought up.
Fix the problem by moving the DT reading code after the check for a
valid address.
See: https://www.raspberrypi.org/forums/viewtopic.php?f=28&t=209309
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
Add support for DT property "microchip,led-modes", a vector of two
cells (u32s) in the range 0-15, each of which sets the mode for one
of the two LEDs. The possible values are:
0=link/activity 1=link1000/activity
2=link100/activity 3=link10/activity
4=link100/1000/activity 5=link10/1000/activity
6=link10/100/activity 14=off 15=on
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
By user request, add a switch to prevent the clocks being stopped when
the stream is paused, stopped or shutdown. Provide access to the switch
by adding a 'non-stop-clocks' parameter to the audioinjector-addons
overlay.
See: https://github.com/raspberrypi/linux/issues/2409
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
It seems that trying to go from unlatched to unlatched will time out
waiting for STOP, and we can just skip that.
Signed-off-by: Eric Anholt <eric@anholt.net>
We don't use the same async update path between fkms and normal kms,
and the normal kms workaround ended up making us wait. This became a
larger problem in rpi-4.14.y, as the USB HID update rate throttling
got (accidentally?) dropped.
Signed-off-by: Eric Anholt <eric@anholt.net>
The SMICS interrupt fires continuously, but since it's 1/100 the rate
of the USB interrupts, we don't really need a way to turn it off. We
do need to make sure that we don't tell DRM about it until DRM has
asked for the interrupt at least once, because otherwise it will throw
a warning at boot time.
Signed-off-by: Eric Anholt <eric@anholt.net>
The default LED modes put 1000Mb/activity on orange and 100Mb/activity
on green, but this leaves no indication for a 10Mb link. Change the
defaults to put 10Mb/100Mb/activity on green.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
Add two new DT properties:
* microchip,eee-enabled - a boolean to enable EEE
* microchip,tx-lpi-timer - time in microseconds to wait before entering
low power state
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
This warning appears with GCC 6.4.0 from toolchains.bootlin.com:
../drivers/staging/vc04_services/interface/vchiq_arm/vchiq_arm.c: In function ‘vchiq_open’:
../drivers/staging/vc04_services/interface/vchiq_arm/vchiq_arm.c:1735:7: warning: unused variable ‘ret’ [-Wunused-variable]
int ret;
^~~
This variable's usage was removed by commit 3c980263c5 ("staging:
vchiq_arm: Make debugfs failure non-fatal"), making it useless.
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
This warning appears with GCC 6.4.0 from toolchains.bootlin.com:
../sound/soc/bcm/allo-piano-dac-plus.c: In function ‘snd_allo_piano_dac_init’:
../sound/soc/bcm/allo-piano-dac-plus.c:711:30: warning: argument to ‘sizeof’ in ‘memset’ call is the same expression as the destination; did you mean to dereference it? [-Wsizeof-pointer-memaccess]
memset(glb_ptr, 0x00, sizeof(glb_ptr));
^
Suggested-by: Phil Elwell <phil@raspberrypi.org>
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Under-voltage due to inadequate power supplies is a recurring problem for
new Raspberry Pi users. There are visual indications that an
under-voltage situation is occuring like blinking power led and a
lightning icon on the desktop (not shown when using the vc4 driver), but
for new users it's not obvious that this signifies a critical situation.
This patch provides a twofold improvement to the situation:
Firstly it logs under-voltage events to the kernel log. This provides
information also for headless installations.
Secondly it provides a sysfs file to read the value. This improves on
'vcgencmd' by providing change notification. Userspace can poll on the
file and be notified of changes to the value.
A script can poll the file and use dbus notification to put a windows on
the desktop with information about the severity with a recommendation to
change the power supply. A link to more information can also be provided.
Only changes to the sticky bits are reported (cleared between readings).
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
Prevent voltage low warnings from filling log
Although the correct fix for low voltage warnings is to
improve the power supply, the current implementation
of the detection can fill the log if the warning
happens freqently. This replaces the logging with
slightly custom ratelimited logging.
Signed-off-by: James Hughes <james.hughes@raspberrypi.org>
Reduce log spam when mailbox call not implemented
This changes the logging message when a mailbox call
fails to the dev_dbg level. In addition, it fixes the
low voltage detection logging code so that if the
mailbox call doies fails, it logs at error level
and flags so the call is no longer attempted.
Signed-off-by: James Hughes <james.hughes@raspberrypi.org>
The SC16IS752 is a dual-channel device. The two channels are largely
independent, but the IRQ signals are wired together as an open-drain,
active low signal which will be driven low while either of the
channels requires attention, which can be for significant periods of
time until operations complete and the interrupt can be acknowledged.
In that respect it is should be treated as a true level-sensitive IRQ.
The kernel, however, needs to be able to exit interrupt context in
order to use I2C or SPI to access the device registers (which may
involve sleeping). Therefore the interrupt needs to be masked out or
paused in some way.
The usual way to manage sleeping from within an interrupt handler
is to use a threaded interrupt handler - a regular interrupt routine
does the minimum amount of work needed to triage the interrupt before
waking the interrupt service thread. If the threaded IRQ is marked as
IRQF_ONESHOT the kernel will automatically mask out the interrupt
until the thread runs to completion. The sc16is7xx driver used to
use a threaded IRQ, but a patch switched to using a kthread_worker
in order to set realtime priorities on the handler thread and for
other optimisations. The end result is non-threaded IRQ that
schedules some work then returns IRQ_HANDLED, making the kernel
think that all IRQ processing has completed.
The work-around to prevent a constant stream of interrupts is to
mark the interrupt as edge-sensitive rather than level-sensitive,
but interpreting an active-low source as a falling-edge source
requires care to prevent a total cessation of interrupts. Whereas
an edge-triggering source will generate a new edge for every interrupt
condition a level-triggering source will keep the signal at the
interrupting level until it no longer requires attention; in other
words, the host won't see another edge until all interrupt conditions
are cleared. It is therefore vital that the interrupt handler does not
exit with an outstanding interrupt condition, otherwise the kernel
will not receive another interrupt unless some other operation causes
the interrupt state on the device to be cleared.
The existing sc16is7xx driver has a very simple interrupt "thread"
(kthread_work job) that processes interrups on each channel in turn
until there are no more. If both channels are active and the first
channel starts interrupting while the handler for the second channel
is running then it will not be detected and an IRQ stall ensues. This
could be handled easily if there was a shared IRQ status register, or
a convenient way to determine if the IRQ had been deasserted for any
length of time, but both appear to be lacking.
Avoid this problem (or at least make it much less likely to happen)
by reducing the granularity of per-channel interrupt processing
to one condition per iteration, only exiting the overall loop when
both channels are no longer interrupting.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
I2C busses can be assigned specific bus numbers using aliases in
Device Tree - string properties where the name is the alias and the
value is the path to the node. The current DT parameter mechanism
does not allow property names to be derived from a parameter value
in any way, so it isn't possible to generate unique or matching
aliases for nodes from an overlay that can generate multiple
instances, e.g. i2c-gpio.
Work around this limitation (at least temporarily) by allowing
the i2c adapter number to be initialised from the "reg" property
if present.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
This ensures that the screen goes blank during DPMS (screensaver),
including the cursor. Planes don't necessarily get disabled during
CRTC disable, so we need to be careful to not leave them on or turn
them back on early.
Signed-off-by: Eric Anholt <eric@anholt.net>
In the rewrite of vc4_crtc.c for fkms, I dropped the part of the
CRTC's atomic flush handler that moved the completion event from the
proposed atomic state change to the CRTC's current state. That meant
that when full screen pageflipping happened (glxgears -fullscreen in
X, compton, por weston), the app would end up blocked firever waiting
to draw its next frame.
Signed-off-by: Eric Anholt <eric@anholt.net>
There is a new test in __irq_startup that the IRQ is activated, which
hasn't been the case for FIQs since they bypass some of the usual setup.
Augment enable_fiq to include a call to irq_activate to avoid the
warning.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
The VideoCore bootloader passes in Serial number and
Revision number through Device Tree. Make these available to
userspace through /proc/cpuinfo.
Mainline status:
There is a commit in linux-next that standardize passing the serial
number through Device Tree (string: /serial-number):
ARM: 8355/1: arch: Show the serial number from devicetree in cpuinfo
There was an attempt to do the same with the revision number, but it
didn't get in:
[PATCH v2 1/2] arm: devtree: Set system_rev from DT revision
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
Driver was using a fixed resolution, this commit
adds touchscreen size, and coordinate flip and swap
features via device tree overlays.
Adds overrides so the VC4 can adjust the DT parameters
appropriately; there is a newer version of the VC4 side
driver that can now set up the appropriate DT values
if required.
Signed-off-by: James Hughes <james.hughes@raspberrypi.org>
The MCP2515 datasheet clearly describes a level-triggered interrupt
pin. Therefore the receiving interrupt controller must also be
configured for level-triggered operation otherwise there is a danger
of a missed interrupt condition blocking all subsequent interrupts.
The ONESHOT flag ensures that the interrupt is masked until the
threaded interrupt handler exits.
Rather than change the flags globally (they must have worked for at
least one user), allow the flags to be overridden from Device Tree
in the event that the device has a DT node.
See: https://github.com/raspberrypi/linux/issues/2175https://github.com/raspberrypi/linux/issues/2263
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
Uses the debugfs I/F to provide access to the AXI
bus performance monitors.
Requires the new mailbox peripheral access for access
to the VPU performance registers, system bus access
is done using direct register reads.
Signed-off-by: James Hughes <james.hughes@raspberrypi.org>
IRQ-CPU mapping is round robined on ARM64 to increase
concurrency and allow multiple interrupts to be serviced
at a time. This reduces the need for FIQ.
Signed-off-by: Michael Zoran <mzoran@crowfest.net>
In ARM64, the FIQ mechanism used by this driver is not current
implemented. As a workaround, reqular IRQ is used instead
of FIQ.
In a separate change, the IRQ-CPU mapping is round robined
on ARM64 to increase concurrency and allow multiple interrupts
to be serviced at a time. This reduces the need for FIQ.
Tests Run:
This mechanism is most likely to break when multiple USB devices
are attached at the same time. So the system was tested under
stress.
Devices:
1. USB Speakers playing back a FLAC audio through VLC
at 96KHz.(Higher then typically, but supported on my speakers).
2. sftp transferring large files through the buildin ethernet
connection which is connected through USB.
3. Keyboard and mouse attached and being used.
Although I do occasionally hear some glitches, the music seems to
play quite well.
Signed-off-by: Michael Zoran <mzoran@crowfest.net>
ARM64: Modify default config to get raspbian to boot (#1686)
1. Enable emulation of deprecated instructions.
2. Enable ARM 8.1 and 8.2 features which are not detected at runtime.
3. Switch the default governer to powersave.
4. Include the watchdog timer driver in the kernel image rather then a module.
Tested with raspbian-jessie 2016-09-23.
ARM64: Make it work again on 4.9 (#1790)
* Invoke the dtc compiler with the same options used in arm mode.
* ARM64 now uses the bcm2835 platform just like ARM32.
* ARM64: Update bcmrpi3_defconfig
Signed-off-by: Michael Zoran <mzoran@crowfest.net>
Update arm64 Makefile to compile bcm2710 dtb file
The line 'dtb-$(CONFIG_ARCH_BCM2835) += bcm2710-rpi-3-b.dtb' has been copied from previous rpi-4.14.y version into rpi-4.15.y arch/arm64/boot/dts/broadcom/Makefile to restore compilation of bcm2710-rpi-3-b.dtb device tree blob under 'make ARCH=arm64 dtbs' command.
Since we don't have any CLM-capable firmware yet, silence the warning
of its absence by using request_firmware_direct, which should also
be marginally quicker.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
The brcmfmac WiFi driver always complains about the '00' country code.
Modify the driver to ignore '00' silently.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
brcmfmac: Disable power management
Disable wireless power saving in the brcmfmac WLAN driver. This is a
temporary measure until the connectivity loss resulting from power
saving is resolved.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
brcmfmac: Use original country code as a fallback
Commit 73345fd212:
brcmfmac: Configure country code using device specific settings
prevents region codes from working on devices that lack a region code
translation table. In the event of an absent table, preserve the old
behaviour of using the provided code as-is.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
brcmfmac: Plug memory leak in brcmf_fill_bss_param
See: https://github.com/raspberrypi/linux/issues/1471
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
brcmfmac: do not use internal roaming engine by default
Some evidence of curing disconnects with this disabled, so make it a default.
Can be overridden with module parameter roamoff=0
See: http://projectable.me/optimize-my-pi-wi-fi/
brcmfmac: Change stop_ap sequence
Patch from Broadcom/Cypress to resolve a customer error
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
This is a port of Pantelis Antoniou's v3 port that makes use of the
new upstreamed configfs support for binary attributes.
Original commit message:
Add a runtime interface to using configfs for generic device tree overlay
usage. With it its possible to use device tree overlays without having
to use a per-platform overlay manager.
Please see Documentation/devicetree/configfs-overlays.txt for more info.
Changes since v2:
- Removed ifdef CONFIG_OF_OVERLAY (since for now it's required)
- Created a documentation entry
- Slight rewording in Kconfig
Changes since v1:
- of_resolve() -> of_resolve_phandles().
Originally-signed-off-by: Pantelis Antoniou <pantelis.antoniou@konsulko.com>
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
DT configfs: Fix build errors on other platforms
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
DT configfs: fix build error
There is an error when compiling rpi-4.6.y branch:
CC drivers/of/configfs.o
drivers/of/configfs.c:291:21: error: initialization from incompatible pointer type [-Werror=incompatible-pointer-types]
.default_groups = of_cfs_def_groups,
^
drivers/of/configfs.c:291:21: note: (near initialization for 'of_cfs_subsys.su_group.default_groups.next')
The .default_groups is linked list since commit
1ae1602de0.
This commit uses configfs_add_default_group to fix this problem.
Signed-off-by: Slawomir Stepien <sst@poczta.fm>
configfs: New of_overlay API
These warnings appear with GCC 7.3.0 from toolchains.bootlin.com:
../drivers/net/wireless/realtek/rtl8192cu/core/rtw_mlme_ext.c: In function ‘mgt_dispatcher’:
../drivers/net/wireless/realtek/rtl8192cu/core/rtw_mlme_ext.c:734:6: warning: this statement may fall through [-Wimplicit-fallthrough=]
if(check_fwstate(pmlmepriv, WIFI_AP_STATE) == _TRUE)
^
../drivers/net/wireless/realtek/rtl8192cu/core/rtw_mlme_ext.c:739:3: note: here
case WIFI_ASSOCREQ:
^~~~
../drivers/net/wireless/realtek/rtl8192cu/hal/rtl8192c/rtl8192c_phycfg.c: In function ‘phy_TxPwrIdxToDbm’:
../drivers/net/wireless/realtek/rtl8192cu/hal/rtl8192c/rtl8192c_phycfg.c:2365:10: warning: this statement may fall through [-Wimplicit-fallthrough=]
Offset = -8;
~~~~~~~^~~~
../drivers/net/wireless/realtek/rtl8192cu/hal/rtl8192c/rtl8192c_phycfg.c:2366:2: note: here
default:
^~~~~~~
../drivers/net/wireless/realtek/rtl8192cu/hal/rtl8192c/usb/usb_halinit.c: In function ‘GetHwReg8192CU’:
../drivers/net/wireless/realtek/rtl8192cu/hal/rtl8192c/usb/usb_halinit.c:5694:20: warning: this statement may fall through [-Wimplicit-fallthrough=]
*((u16 *)(val)) = pHalData->BasicRateSet;
~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~
../drivers/net/wireless/realtek/rtl8192cu/hal/rtl8192c/usb/usb_halinit.c:5695:3: note: here
case HW_VAR_TXPAUSE:
^~~~
../drivers/net/wireless/realtek/rtl8192cu/os_dep/linux/ioctl_linux.c: In function ‘set_group_key’:
../drivers/net/wireless/realtek/rtl8192cu/os_dep/linux/ioctl_linux.c:7383:11: warning: this statement may fall through [-Wimplicit-fallthrough=]
keylen = 16;
~~~~~~~^~~~
../drivers/net/wireless/realtek/rtl8192cu/os_dep/linux/ioctl_linux.c:7384:3: note: here
default:
^~~~~~~
../drivers/net/wireless/realtek/rtl8192cu/os_dep/linux/ioctl_cfg80211.c: In function ‘set_group_key’:
../drivers/net/wireless/realtek/rtl8192cu/os_dep/linux/ioctl_cfg80211.c:822:11: warning: this statement may fall through [-Wimplicit-fallthrough=]
keylen = 16;
~~~~~~~^~~~
../drivers/net/wireless/realtek/rtl8192cu/os_dep/linux/ioctl_cfg80211.c:823:3: note: here
default:
^~~~~~~
None of them appear to be a real issue but it is trivial to make the
warnings go away.
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
These warnings appear with GCC 6.4.0 from toolchains.bootlin.com:
../drivers/net/wireless/realtek/rtl8192cu/core/rtw_security.c: In function ‘aes_cipher’:
../drivers/net/wireless/realtek/rtl8192cu/core/rtw_security.c:1504:5: warning: this ‘for’ clause does not guard... [-Wmisleading-indentation]
for (j = 0; j < 8; j++)
^~~
../drivers/net/wireless/realtek/rtl8192cu/core/rtw_security.c:1507:2: note: ...this statement, but the latter is misleadingly indented as if it is guarded by the ‘for’
payload_index = hdrlen + 8;
^~~~~~~~~~~~~
../drivers/net/wireless/realtek/rtl8192cu/core/rtw_security.c: In function ‘aes_decipher’:
../drivers/net/wireless/realtek/rtl8192cu/core/rtw_security.c:1878:5: warning: this ‘for’ clause does not guard... [-Wmisleading-indentation]
for (j = 0; j < 8; j++)
^~~
../drivers/net/wireless/realtek/rtl8192cu/core/rtw_security.c:1881:2: note: ...this statement, but the latter is misleadingly indented as if it is guarded by the ‘for’
payload_index = hdrlen + 8;
^~~~~~~~~~~~~
../drivers/net/wireless/realtek/rtl8192cu/core/rtw_mlme_ext.c:5666:5: warning: this ‘if’ clause does not guard... [-Wmisleading-indentation]
if( _rtw_memcmp(pwdinfo->rx_prov_disc_info.peerDevAddr, empty_addr, ETH_ALEN) );
^~
../drivers/net/wireless/realtek/rtl8192cu/core/rtw_mlme_ext.c:5667:6: note: ...this statement, but the latter is misleadingly indented as if it is guarded by the ‘if’
_rtw_memcpy(pwdinfo->rx_prov_disc_info.peerDevAddr, GetAddr2Ptr(pframe), ETH_ALEN);
^~~~~~~~~~~
It appears to be due to tabs versus spaces.
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
suppress spurious messages
Add #if for 3.14 kernel change (#87)
Fixes compiling after changes in f663dd9aaf and 99932d4fc0Fixes#86
Set dev_type to wlan
Fixes#23
Tentatively added support for more 8188CUS based devices.
Add support for more 8188CUS and 8192CUS devices
Add ProductId for the Netgear N150 - WNA1000M
Fixes CONFIG_CONCURRENT_MODE CONFIG_MULTI_VIR_IFACES
Fixes compatibility with 3.13
Enables warning in the compiler and fixes some issues, reference => https://github.com/diederikdehaas/rtl8812AU
Starts device in station mode instead of monitor, fixes NetworkManager issues
Enable cfg80211 support
Fix cfg80211 for kernel >= 4.7
Fixes rtl8192cu for kernel >= 4.8
rtl8192: Fixup build
fixup: rtl8192cu fixes from milhouse
rtl8192: switch to netdev->priv_destructor()
When trying to build from the rpi-4.11.y branch, I'm getting the
following error :
drivers/net/wireless/realtek/rtl8192cu/os_dep/linux/ioctl_cfg80211.c:3464:10: error: 'struct net_device' has no member named 'destructor'
It seems to occur since this upstream commit :
cf124db566
[...]
netdev->priv_destructor() performs all actions to free up the private
resources that used to be freed by netdev->destructor(), except for
free_netdev().
netdev->needs_free_netdev is a boolean that indicates whether
free_netdev() should be done at the end of unregister_netdevice().
Signed-off-by: Bilal Amarni <bilal.amarni@gmail.com>
ARM64: Fix build break for RTL8187/RTL8192CU wifi
These drivers use an ASM function from the base
system to compute the ipv6 checksum. These functions
are not available on ARM64, probably because nobody
has bother to write them. The base system does have
a generic "C" version, so a simple fix is to include
the header to use the generic version on ARM64 only.
A longer term solution would be to submit the necessary
ASM function to the upstream source.
With this change, these drivers now compile without
any errors on ARM64.
Signed-off-by: Michael Zoran <mzoran@crowfest.net>
Add non-mainline source for rtl8192cu wireless driver version v4.0.2_9000 as
this is widely used. Disable older rtlwifi driver.
8192cu needs old wireless extensions
The obsolete WIRELESS_EXT configuration is used
by the old Realtek code and is needed for AP support.
8192cu: CONFIG_AP_MODE hardcoded in autoconf.h
rtl8192c_rf6052: PHY_RFShadowRefresh(): fix off-by-one
Signed-off-by: Marc Kleine-Budde <mkl@blackshift.org>
rtl8192cu: Add PID for D-Link DWA 131
Pi3 and Compute Module 3 have a GPIO expander that the
VPU communicates with.
There is a mailbox service that now allows control of this
expander, so add a kernel driver that can make use of it.
Pwr_led node added to device-tree for Pi3.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>
Add a mailbox-driven backlight controller for the Raspberry Pi DSI
touchscreen display. Requires updated GPU firmware to recognise the
mailbox request.
Signed-off-by: Gordon Hollingworth <gordon@raspberrypi.org>
Add Raspberry Pi firmware driver to the dependencies of backlight driver
Otherwise the backlight driver fails to build if the firmware
loading driver is not in the kernel
Signed-off-by: Alex Riesen <alexander.riesen@cetitec.com>
AudioInjector Octo: sample rates, regulators, reset
This patch adds new sample rates to the Audioinjector Octo sound card. The
new supported rates are (in kHz) :
96, 48, 32, 24, 16, 8, 88.2, 44.1, 29.4, 22.05, 14.7
Reference the bcm270x DT regulators in the overlay.
This patch adds a reset GPIO for the AudioInjector.net octo sound card.
Audioinjector octo : Make the playback and capture symmetric
This patch ensures that the sample rate and channel count of the audioinjector
octo sound card are symmetric.
Fe-Pi Audio Sound Card is based on NXP SGTL5000 codec.
Mechanical specification of the board is the same the Raspberry Pi Zero.
3.5mm jacks for Headphone/Mic, Line In, and Line Out.
Signed-off-by: Henry Kupis <fe-pi@cox.net>
Signed-off-by: Miquel Blauw <info@dionaudio.nl>
ASoC: dionaudio_loco-v2: fix S24_LE format
Remove set_bclk_ratio call so 24-bit data is transmitted in
24 bclk cycles.
Also remove hw_params and ops as they are no longer needed.
Signed-off-by: Matthias Reichl <hias@horus.com>
Note: due to problems with deferred probing of regulators
the following softdep should be added to a modprobe.d file
softdep arizona-spi pre: arizona-ldo1
Signed-off-by: Matthias Reichl <hias@horus.com>
Pisound dynamic overlay (#1760)
Restructuring pisound-overlay.dts, so it can be loaded and unloaded dynamically using dtoverlay.
Print a logline when the kernel module is removed.
pisound improvements:
* Added a writable sysfs object to enable scripts / user space software
to blink MIDI activity LEDs for variable duration.
* Improved hw_param constraints setting.
* Added compatibility with S16_LE sample format.
* Exposed some simple placeholder volume controls, so the card appears
in volumealsa widget.
Add missing SND_PISOUND selects dependency to SND_RAWMIDI
Without it the Pisound module fails to compile.
See https://github.com/raspberrypi/linux/issues/2366
Updates for Pisound module code:
* Merged 'Fix a warning in DEBUG builds' (1c8b82b).
* Updating some strings and copyright information.
* Fix for handling high load of MIDI input and output.
* Use dual rate oversampling ratio for 96kHz instead of single
rate one.
Signed-off-by: Giedrius Trainavicius <giedrius@blokas.io>
Fixing memset call in pisound.c
Signed-off-by: Giedrius Trainavicius <giedrius@blokas.io>
Fix for Pisound's MIDI Input getting blocked for a while in rare cases.
There was a possible race condition which could lead to Input's FIFO queue
to be underflown, causing high amount of processing in the worker thread for
some period of time.
Signed-off-by: Giedrius Trainavicius <giedrius@blokas.io>
The Piano DAC 2.1 has support for 4 channels with subwoofer.
Signed-off-by: Baswaraj K <jaikumar@cem-solutions.net>
Reviewed-by: Vijay Kumar B. <vijaykumar@zilogic.com>
Reviewed-by: Raashid Muhammed <raashidmuhammed@zilogic.com>
Add clock changes and mute gpios (#1938)
Also improve code style and adhere to ALSA coding conventions.
Signed-off-by: Baswaraj K <jaikumar@cem-solutions.net>
Reviewed-by: Vijay Kumar B. <vijaykumar@zilogic.com>
Reviewed-by: Raashid Muhammed <raashidmuhammed@zilogic.com>
PianoPlus: Dual Mono & Dual Stereo features added (#2069)
allo-piano-dac-plus: Master volume added + fixes
Master volume added, which controls both DACs volumes.
See: https://github.com/raspberrypi/linux/pull/2149
Also fix initial max volume, default mode value, and unmute.
Signed-off-by: allocom <sparky-dev@allo.com>
ASoC: allo-piano-dac-plus: fix S24_LE format
Remove set_bclk_ratio call so 24-bit data is transmitted in
24 bclk cycles.
Signed-off-by: Matthias Reichl <hias@horus.com>
Add initial 2 channel (stereo) support for Allo Piano DAC (2.0/2.1) boards,
using allo-piano-dac-pcm512x-audio overlay and allo-piano-dac ALSA ASoC
machine driver.
NB. The initial support is 2 channel (stereo) ONLY!
(The Piano DAC 2.1 will only support 2 channel (stereo) left/right output,
pending an update to the upstream pcm512x codec driver, which will have
to be submitted via upstream. With the initial downstream support,
provided by this patch, the Piano DAC 2.1 subwoofer outputs will
not function.)
Signed-off-by: Baswaraj K <jaikumar@cem-solutions.net>
Signed-off-by: Clive Messer <clive.messer@digitaldreamtime.co.uk>
Tested-by: Clive Messer <clive.messer@digitaldreamtime.co.uk>
ASoC: allo-piano-dac: fix S24_LE format
Remove set_bclk_ratio call so 24-bit data is transmitted in
24 bclk cycles.
Also remove hw_params and ops as they are no longer needed.
Signed-off-by: Matthias Reichl <hias@horus.com>
Support IQAudIO Digi board with iqaudio_digi machine driver and
iqaudio-digi-wm8804-audio overlay.
NB. Machine driver is a cut and paste of hifiberry_digi code, with format
and general cleanup to comply with kernel coding standards.
Signed-off-by: DigitalDreamtime <clive.messer@digitaldreamtime.co.uk>
Contains the sound/soc/bcm ALSA machine driver and necessary alterations to the Kconfig and Makefile.
Adds the dts overlay and updates the Makefile and README.
Updates the relevant defconfig files to enable building for the Raspberry Pi.
Thanks to Phil Elwell (pelwell) for the review, simple-card concepts and discussion. Thanks to Clive Messer for overlay naming suggestions.
Added support for headphones, microphone and bclk_ratio settings.
This patch adds headphone and microphone capability to the Audio Injector sound card. The patch also sets the bit clock ratio for use in the bcm2835-i2s driver. The bcm2835-i2s can't handle an 8 kHz sample rate when the bit clock is at 12 MHz because its register is only 10 bits wide which can't represent the ch2 offset of 1508. For that reason, the rate constraint is added.
This commit adds basic support for the codec usage including: Device tree overlay,
binding I2S bus and setting I2S mode, clock source and frequency setting according
to spec.
Signed-off-by: Andrey Grodzovsky <andrey2805@gmail.com>
justboom-dac: Adjust for ALSA API change
As of 4.4, snd_soc_limit_volume now takes a struct snd_soc_card *
rather than a struct snd_soc_codec *.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
ASoC: justboom-dac: fix S24_LE format
Remove set_bclk_ratio call so 24-bit data is transmitted in
24 bclk cycles.
Also remove hw_params as it's no longer needed.
Signed-off-by: Matthias Reichl <hias@horus.com>
The driver contains a low-level hardware driver for the TAS5713 and the
drivers for the Raspberry Pi I2S subsystem.
TAS5713: return error if initialisation fails
Existing TAS5713 driver logs errors during initialisation, but does not return
an error code. Therefore even if initialisation fails, the driver will still be
loaded, but won't work. This patch fixes this. I2C communication error will now
reported correctly by a non-zero return code.
HiFiBerry Amp: fix device-tree problems
Some code to load the driver based on device-tree-overlays was missing. This is added by this patch.
The driver is based on the HiFiBerry DAC driver. However HiFiBerry DAC+ uses
a different codec chip (PCM5122), therefore a new driver is necessary.
Add support for the HiFiBerry DAC+ Pro.
The HiFiBerry DAC+ and DAC+ Pro products both use the existing bcm sound driver with the DAC+ Pro having a special clock device driver representing the two high precision oscillators.
An addition bug fix is included for the PCM512x codec where by the physical size of the sample frame is used in the calculation of the LRCK divisor as it was found to be wrong when using 24-bit depth sample contained in a little endian 4-byte sample frame.
Limit PCM512x "Digital" gain to 0dB by default with HiFiBerry DAC+
24db_digital_gain DT param can be used to specify that PCM512x
codec "Digital" volume control should not be limited to 0dB gain,
and if specified will allow the full 24dB gain.
Add dt param to force HiFiBerry DAC+ Pro into slave mode
"dtoverlay=hifiberry-dacplus,slave"
Add 'slave' param to use HiFiBerry DAC+ Pro in slave mode,
with Pi as master for bit and frame clock.
Signed-off-by: DigitalDreamtime <clive.messer@digitaldreamtime.co.uk>
Fixed a bug when using 352.8kHz sample rate
Signed-off-by: Daniel Matuschek <daniel@hifiberry.com>
ASoC: pcm512x: revert downstream changes
This partially reverts commit 185ea05465
which was added by https://github.com/raspberrypi/linux/pull/1152
The downstream pcm512x changes caused a regression, it broke normal
use of the 24bit format with the codec, eg when using simple-audio-card.
The actual bug with 24bit playback is the incorrect usage
of physical_width in various drivers in the downstream tree
which causes 24bit data to be transmitted with 32 clock
cycles. So it's not the pcm512x that needs fixing, it's the
soundcard drivers.
Signed-off-by: Matthias Reichl <hias@horus.com>
ASoC: hifiberry_dacplus: fix S24_LE format
Remove set_bclk_ratio call so 24-bit data is transmitted in
24 bclk cycles.
Signed-off-by: Matthias Reichl <hias@horus.com>
ASoC: hifiberry_dacplus: transmit S24_LE with 64 BCLK cycles
Signed-off-by: Matthias Reichl <hias@horus.com>
Set a limit of 0dB on Digital Volume Control
The main volume control in the PCM512x DAC has a range up to
+24dB. This is dangerously loud and can potentially cause massive
clipping in the output stages. Therefore this sets a sensible
limit of 0dB for this control.
Allow up to 24dB digital gain to be applied when using IQAudIO DAC+
24db_digital_gain DT param can be used to specify that PCM512x
codec "Digital" volume control should not be limited to 0dB gain,
and if specified will allow the full 24dB gain.
Modify IQAudIO DAC+ ASoC driver to set card/dai config from dt
Add the ability to set the card name, dai name and dai stream name, from
dt config.
Signed-off-by: DigitalDreamtime <clive.messer@digitaldreamtime.co.uk>
IQaudIO: auto-mute for AMP+ and DigiAMP+
IQAudIO amplifier mute via GPIO22. Add dt params for "one-shot" unmute
and auto mute.
Revision 2, auto mute implementing HiassofT suggestion to mute/unmute
using set_bias_level, rather than startup/shutdown....
"By default DAPM waits 5 seconds (pmdown_time) before shutting down
playback streams so a close/stop immediately followed by open/start
doesn't trigger an amp mute+unmute."
Tested on both AMP+ (via DAC+) and DigiAMP+, with both options...
dtoverlay=iqaudio-dacplus,unmute_amp
"one-shot" unmute when kernel module loads.
dtoverlay=iqaudio-dacplus,auto_mute_amp
Unmute amp when ALSA device opened by a client. Mute, with 5 second delay
when ALSA device closed. (Re-opening the device within the 5 second close
window, will cancel mute.)
Revision 4, using gpiod.
Revision 5, clean-up formatting before adding mute code.
- Convert tab plus 4 space formatting to 2x tab
- Remove '// NOT USED' commented code
Revision 6, don't attempt to "one-shot" unmute amp, unless card is
successfully registered.
Signed-off-by: DigitalDreamtime <clive.messer@digitaldreamtime.co.uk>
ASoC: iqaudio-dac: fix S24_LE format
Remove set_bclk_ratio call so 24-bit data is transmitted in
24 bclk cycles.
Signed-off-by: Matthias Reichl <hias@horus.com>
Signed-off-by: Daniel Matuschek <daniel@matuschek.net>
Add a parameter to turn off SPDIF output if no audio is playing
This patch adds the paramater auto_shutdown_output to the kernel module.
Default behaviour of the module is the same, but when auto_shutdown_output
is set to 1, the SPDIF oputput will shutdown if no stream is playing.
bugfix for 32kHz sample rate, was missing
HiFiBerry Digi: set SPDIF status bits for sample rate
The HiFiBerry Digi driver did not signal the sample rate in the SPDIF status bits.
While this is optional, some DACs and receivers do not accept this signal. This patch
adds the sample rate bits in the SPDIF status block.
Added HiFiBerry Digi+ Pro driver
Signed-off-by: Daniel Matuschek <daniel@hifiberry.com>
This adds a machine driver for the HifiBerry DAC.
It is a sound card that can
be stacked onto the Raspberry Pi.
Signed-off-by: Florian Meier <florian.meier@koalo.de>
PCM512x can accept data padded with additional BCLK cycles
but the driver currently lacks an interface to configure this.
This leads to the problem that S24_LE format in master mode
can result in non-integer clock divisors and pcm512x running
at a rather off rate.
For example 48kHz with 48fs BCLK and SCLK at 24.576MHz uses
a divisor of 10 (rounded down from 10.6666) and results in a
51.2kHz LRCLK. With 64fs BCLK a divisor of 8 is used and
LRCLK runs at exactly 48kHz.
Fix this by providing a minimal set_tdm_slot implementation
so machine drivers can optionally configure custom BCLK ratios.
Signed-off-by: Matthias Reichl <hias@horus.com>
The Raspberry Pi firmware manages the power-down and reboot
process. To do this it installs a pm_power_off handler, causing
the gpio-poweroff module to abort the probe function.
This patch introduces a "force" DT property that overrides that
behaviour, and also adds a DT overlay to enable and control it.
Note that running in an active-low configuration (DT parameter
"active_low") requires a custom dt-blob.bin and probably won't
allow a reboot without switching off, so an external inversion
of the trigger signal may be preferable.
Provide a __copy_from_user that uses memcpy. On BCM2708, use
optimised memcpy/memmove/memcmp/memset implementations.
arch/arm: Add mmiocpy/set aliases for memcpy/set
See: https://github.com/raspberrypi/linux/issues/1082
copy_from_user: CPU_SW_DOMAIN_PAN compatibility
The downstream copy_from_user acceleration must also play nice with
CONFIG_CPU_SW_DOMAIN_PAN.
See: https://github.com/raspberrypi/linux/issues/1381
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
Fix driver detection failure Check that the buffer response is non-zero meaning the touchscreen was detected
rpi-ft5406: Use firmware API
RPI-FT5406: Enable aarch64 support through explicit iomem interface
Signed-off-by: Gerhard de Clercq <gerharddeclercq@outlook.com>
Based on the patch authored by Ali Gholami Rudi at
https://lkml.org/lkml/2009/7/13/153
Provide an ioctl for userspace applications, but only if this operation
is hardware accelerated (otherwide it does not make any sense).
Signed-off-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
bcm2708_fb: Add ioctl for reading gpu memory through dma
The "input" trigger makes the associated GPIO an input. This is to support
the Raspberry Pi PWR LED, which is driven by external hardware in normal use.
N.B. pwr_led is not available on Model A or B boards.
leds-gpio: Implement the brightness_get method
The power LED uses some clever logic that means it is driven
by a voltage measuring circuit when configured as input, otherwise
it is driven by the GPIO output value. This patch wires up the
brightness_get method for leds-gpio so that user-space can monitor
the LED value via /sys/class/gpio/led1/brightness. Using the input
trigger this returns an indication of the system power health,
otherwise it is just whatever value the trigger has written most
recently.
See: https://github.com/raspberrypi/linux/issues/1064
Add the bare minimum needed to boot BCM2708 from a Device Tree.
Signed-off-by: Noralf Tronnes <notro@tronnes.org>
BCM2708: DT: change 'axi' nodename to 'soc'
Change DT node named 'axi' to 'soc' so it matches ARCH_BCM2835.
The VC4 bootloader fills in certain properties in the 'axi' subtree,
but since this is part of an upstreaming effort, the name is changed.
Signed-off-by: Noralf Tronnes notro@tronnes.org
BCM2708_DT: Correct length of the peripheral space
Use dts-dirs feature for overlays.
The kernel makefiles have a dts-dirs target that is for vendor subdirectories.
Using this fixes the install_dtbs target, which previously did not install the overlays.
BCM270X_DT: configure I2S DMA channels
Signed-off-by: Matthias Reichl <hias@horus.com>
BCM270X_DT: switch to bcm2835-i2s
I2S soundcard drivers with proper devicetree support (i.e. not linking
to the cpu_dai/platform via name but to cpu/platform via of_node)
will work out of the box without any modifications.
When the kernel is compiled without devicetree support the platform
code will instantiate the bcm2708-i2s driver and I2S soundcard drivers
will link to it via name, as before.
Signed-off-by: Matthias Reichl <hias@horus.com>
SDIO-overlay: add poll_once-boolean parameter
Add paramter to toggle sdio-device-polling
done every second or once at boot-time.
Signed-off-by: Patrick Boettcher <patrick.boettcher@posteo.de>
BCM270X_DT: Make mmc overlay compatible with current firmware
The original DT overlay logic followed a merge-then-patch procedure,
i.e. parameters are applied to the loaded overlay before the overlay
is merged into the base DTB. This sequence has been changed to
patch-then-merge, in order to support parameterised node names, and
to protect against bad overlays. As a result, overrides (parameters)
must only target labels in the overlay, but the overlay can obviously target nodes in the base DTB.
mmc-overlay.dts (that switches back to the original mmc sdcard
driver) is the only overlay violating that rule, and this patch
fixes it.
bcm270x_dt: Use the sdhost MMC controller by default
The "mmc" overlay reverts to using the other controller.
squash: Add cprman to dt
BCM270X_DT: Use clk_core for I2C interfaces
BCM270X_DT: Use bcm283x.dtsi, bcm2835.dtsi and bcm2836.dtsi
The mainline Device Tree files are quite close to downstream now.
Let's use bcm283x.dtsi, bcm2835.dtsi and bcm2836.dtsi as base files
for our dts files.
Mainline dts files are based on these files:
bcm2835-rpi.dtsi
bcm2835.dtsi bcm2836.dtsi
bcm283x.dtsi
Current downstream are based on these:
bcm2708.dtsi bcm2709.dtsi bcm2710.dtsi
bcm2708_common.dtsi
This patch introduces this dependency:
bcm2708.dtsi bcm2709.dtsi
bcm2708-rpi.dtsi
bcm270x.dtsi
bcm2835.dtsi bcm2836.dtsi
bcm283x.dtsi
And:
bcm2710.dtsi
bcm2708-rpi.dtsi
bcm270x.dtsi
bcm283x.dtsi
bcm270x.dtsi contains the downstream bcm283x.dtsi diff.
bcm2708-rpi.dtsi is the downstream version of bcm2835-rpi.dtsi.
Other changes:
- The led node has moved from /soc/leds to /leds. This is not a problem
since the label is used to reference it.
- The clk_osc reg property changes from 6 to 3.
- The gpu nodes has their interrupt property set in the base file.
- the clocks label does not point to the /clocks node anymore, but
points to the cprman node. This is not a problem since the overlays
that use the clock node refer to it directly: target-path = "/clocks";
- some nodes now have 2 labels since mainline and downstream differs in
this respect: cprman/clocks, spi0/spi, gpu/vc4.
- some nodes doesn't have an explicit status = "okay" since they're not
disabled in the base file: watchdog and random.
- gpiomem doesn't need an explicit status = "okay".
- bcm2708-rpi-cm.dts got the hpd-gpios property from bcm2708_common.dtsi,
it's now set directly in that file.
- bcm2709-rpi-2-b.dts has the timer node moved from /soc/timer to /timer.
- Removed clock-frequency property on the bcm{2709,2710}.dtsi timer nodes.
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
BCM270X_DT: Use raspberrypi-power to turn on USB power
Use the raspberrypi-power driver to turn on USB power.
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
BCM270X_DT: Add a .dtbo target, use for overlays
Change the filenames and extensions to keep the pre-DDT style of
overlay (<name>-overlay.dtb) distinct from new ones that use a
different style of local fixups (<name>.dtbo), and to match other
platforms.
The RPi firmware uses the DDTK trailer atom to choose which type of
overlay to use for each kernel.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
BCM270X_DT: Don't generate "linux,phandle" props
The EPAPR standard says to use "phandle" properties to store phandles,
rather than the deprecated "linux,phandle" version. By default, dtc
generates both, but adding "-H epapr" causes it to only generate
"phandle"s, saving some space and clutter.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
BCM270X_DT: Add overlay for enc28j60 on SPI2
Works on SPI2 for compute module
BCM270X_DT: Add midi-uart0 overlay
MIDI requires 31.25kbaud, a baudrate unsupported by Linux. The
midi-uart0 overlay configures uart0 (ttyAMA0) to use a fake clock
so that requesting 38.4kbaud actually gets 31.25kbaud.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
BCM270X_DT: Add i2c-sensor overlay
The i2c-sensor overlay is a container for various pressure and
temperature sensors, currently bmp085 and bmp280. The standalone
bmp085_i2c-sensor overlay is now deprecated.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
BCM270X_DT: overlays/*-overlay.dtb -> overlays/*.dtbo (#1752)
We now create overlays as .dtbo files.
build: support for .dtbo files for dtb overlays
Kernel 4.4.6+ on RaspberryPi support .dtbo files for overlays, instead of .dtb.
Patch the kernel, which has faulty rules to generate .dtbo the way yocto does
Signed-off-by: Herve Jourdain <herve.jourdain@neuf.fr>
Signed-off-by: Khem Raj <raj.khem@gmail.com>
BCM270X: Drop position requirement for CMA in VC4 overlay.
No longer necessary since 2aefcd5761,
and will probably let peeople that want to choose a larger CMA
allocation (particularly on pi0/1).
Signed-off-by: Eric Anholt <eric@anholt.net>
BCM270X_DT: RPi Device Tree tidy
Use the upstream sdhost node, add thermal-zones, and factor out some
common elements.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
kbuild: Silence unhelpful DTC warnings
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
The Raspberry Pi firmware looks for a trailer on the kernel image to
determine whether it was compiled with Device Tree support enabled.
If the firmware finds a kernel without this trailer, or which has a
trailer indicating that it isn't DT-capable, it disables DT support
and reverts to using ATAGs.
The mkknlimg utility adds that trailer, having first analysed the
image to look for signs of DT support and the kernel version string.
knlinfo displays the contents of the trailer in the given kernel image.
scripts/mkknlimg: Add support for ARCH_BCM2835
Add a new trailer field indicating whether this is an ARCH_BCM2835
build, as opposed to MACH_BCM2708/9. If the loader finds this flag
is set it changes the default base dtb file name from bcm270x...
to bcm283y...
Also update knlinfo to show the status of the field.
scripts/mkknlimg: Improve ARCH_BCM2835 detection
The board support code contains sufficient strings to be able to
distinguish 2708 vs. 2835 builds, so remove the check for
bcm2835-pm-wdt which could exist in either.
Also, since the canned configuration is no longer built in (it's
a module), remove the config string checking.
See: https://github.com/raspberrypi/linux/issues/1157
scripts: Multi-platform support for mkknlimg and knlinfo
The firmware uses tags in the kernel trailer to choose which dtb file
to load. Current firmware loads bcm2835-*.dtb if the '283x' tag is true,
otherwise it loads bcm270*.dtb. This scheme breaks if an image supports
multiple platforms.
This patch adds '270X' and '283X' tags to indicate support for RPi and
upstream platforms, respectively. '283x' (note lower case 'x') is left
for old firmware, and is only set if the image only supports upstream
builds.
scripts/mkknlimg: Append a trailer for all input
Now that the firmware assumes an unsigned kernel is DT-capable, it is
helpful to be able to mark a kernel as being non-DT-capable.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
scripts/knlinfo: Decode DDTK atom
Show the DDTK atom as being a boolean.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
mkknlimg: Retain downstream-kernel detection
With the death of ARCH_BCM2708 and ARCH_BCM2709, a new way is needed to
determine if this is a "downstream" build that wants the firmware to
load a bcm27xx .dtb. The vc_cma driver is used downstream but not
upstream, making vc_cma_init a suitable predicate symbol.
mkknlimg: Find some more downstream-only strings
See: https://github.com/raspberrypi/linux/issues/1920
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
scripts: Update mkknlimg, just in case
With the removal of the vc_cma driver, mkknlimg lost an indication that
the user had built a downstream kernel. Update the script, adding a few
more key strings, in case it is still being used.
Note that mkknlimg is now deprecated, except to tag kernels as upstream
(283x), and thus requiring upstream DTBs.
See: https://github.com/raspberrypi/linux/issues/2239
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
Support booting without Device Tree.
Turn on USB power.
Load driver early because of lacking support for deferred probing
in many drivers.
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
firmware: bcm2835: Don't turn on USB power
The raspberrypi-power driver is now used to turn on USB power.
This partly reverts commit:
firmware: bcm2835: Support ARCH_BCM270x
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
Add module for accessing the mailbox property channel through
/dev/vcio. Was previously in bcm2708-vcio.
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
i2c-bcm2708: fixed baudrate
Fixed issue where the wrong CDIV value was set for baudrates below 3815 Hz (for 250MHz bus clock).
In that case the computed CDIV value was more than 0xffff. However the CDIV register width is only 16 bits.
This resulted in incorrect setting of CDIV and higher baudrate than intended.
Example: 3500Hz -> CDIV=0x11704 -> CDIV(16bit)=0x1704 -> 42430Hz
After correction: 3500Hz -> CDIV=0x11704 -> CDIV(16bit)=0xffff -> 3815Hz
The correct baudrate is shown in the log after the cdiv > 0xffff correction.
Perform I2C combined transactions when possible
Perform I2C combined transactions whenever possible, within the
restrictions of the Broadcomm Serial Controller.
Disable DONE interrupt during TA poll
Prevent interrupt from being triggered if poll is missed and transfer
starts and finishes.
i2c: Make combined transactions optional and disabled by default
i2c: bcm2708: add device tree support
Add DT support to driver and add to .dtsi file.
Setup pins in .dts file.
i2c is disabled by default.
Signed-off-by: Noralf Tronnes <notro@tronnes.org>
bcm2708: don't register i2c controllers when using DT
The devices for the i2c controllers are in the Device Tree.
Only register devices when not using DT.
Signed-off-by: Noralf Tronnes <notro@tronnes.org>
I2C: Only register the I2C device for the current board revision
i2c_bcm2708: Fix clock reference counting
Fix grabbing lock from atomic context in i2c driver
2 main changes:
- check for timeouts in the bcm2708_bsc_setup function as indicated by this comment:
/* poll for transfer start bit (should only take 1-20 polls) */
This implies that the setup function can now fail so account for this everywhere it's called
- Removed the clk_get_rate call from inside the setup function as it locks a mutex and that's not ok since we call it from under a spin lock.
i2c-bcm2708: When using DT, leave the GPIO setup to pinctrl
i2c-bcm2708: Increase timeouts to allow larger transfers
Use the timeout value provided by the I2C_TIMEOUT ioctl when waiting
for completion. The default timeout is 1 second.
See: https://github.com/raspberrypi/linux/issues/260
i2c-bcm2708/BCM270X_DT: Add support for I2C2
The third I2C bus (I2C2) is normally reserved for HDMI use. Careless
use of this bus can break an attached display - use with caution.
It is recommended to disable accesses by VideoCore by setting
hdmi_ignore_edid=1 or hdmi_edid_file=1 in config.txt.
The interface is disabled by default - enable using the
i2c2_iknowwhatimdoing DT parameter.
bcm2708-spi: Don't use static pin configuration with DT
Also remove superfluous error checking - the SPI framework ensures the
validity of the chip_select value.
i2c-bcm2708: Remove non-DT support
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
Set the BSC_CLKT clock streching timeout to 35ms as per SMBus specs.
Fixes i2c_bcm2708: Write to FIFO correctly - v2 (#1574)
* i2c: fix i2c_bcm2708: Clear FIFO before sending data
Make sure FIFO gets cleared before trying to send
data in case of a repeated start (COMBINED=Y).
* i2c: fix i2c_bcm2708: Only write to FIFO when not full
Check if FIFO can accept data before writing.
To avoid a peripheral read on the last iteration of a loop,
both bcm2708_bsc_fifo_fill and ~drain are changed as well.
Use clock manager instead of self-made clockmanager.
Also fix some error paths that showd up during development
(especially missing release of dma resources on rmmod)
Signed-off-by: Martin Sperl <kernel@martin.sperl.org>
Add experimental support for the VideoCore shared memory service.
This allows user processes to allocate memory from VideoCore's
GPU relocatable heap and mmap the buffers. Additionally, the memory
handles can passed to other VideoCore services such as MMAL, OpenMax
and DispmanX
TODO
* This driver was originally released for BCM28155 which has a different
cache architecture to BCM2835. Consequently, in this release only
uncached mappings are supported. However, there's no fundamental
reason which cached mappings cannot be support or BCM2835
* More refactoring is required to remove the typedefs.
* Re-enable the some of the commented out debug-fs statistics which were
disabled when migrating code from proc-fs.
* There's a lot of code to support sharing of VCSM in order to support
Android. This could probably done more cleanly or perhaps just
removed.
Signed-off-by: Tim Gover <timgover@gmail.com>
config: Disable VC_SM for now to fix hang with cutdown kernel
vcsm: Use boolean as it cannot be built as module
On building the bcm_vc_sm as a module we get the following error:
v7_dma_flush_range and do_munmap are undefined in vc-sm.ko.
Fix by making it not an option to build as module
vcsm: Add ioctl for custom cache flushing
vc-sm: Move headers out of arch directory
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
vcsm: Treat EBUSY as success rather than SIGBUS
Currently if two cores access the same page concurrently one will return VM_FAULT_NOPAGE
and the other VM_FAULT_SIGBUS crashing the user code.
Also report when mapping fails.
Signed-off-by: popcornmix <popcornmix@gmail.com>
vcsm: Provide new ioctl to clean/invalidate a 2D block
vcsm: Convert to loading via device tree.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>
VCSM: New option to import a DMABUF for VPU use
Takes a dmabuf, and then calls over to the VPU to wrap
it into a suitable handle.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>
vcsm: fix multi-platform build
vcsm: add macros for cache functions
vcsm: use dma APIs for cache functions
* Will handle multi-platform builds
vcsm: Fix up macros to avoid breaking numbers used by existing apps
vcsm: Define cache operation constants in user header
Without this change, users have to use raw values (1, 2, 3) to specify
cache operation.
Signed-off-by: Sugizaki Yukimasa <i.can.speak.c.and.basic@gmail.com>
vcsm: Support for finding user/vc handle in memory pool
vmcs_sm_{usr,vc}_handle_from_pid_and_address() were failing to find
handle if specified user pointer is not exactly the one that the memory
locking call returned even if the pointer is in range of map/resource.
So fixed the functions to match the range.
Signed-off-by: Sugizaki Yukimasa <i.can.speak.c.and.basic@gmail.com>
vcsm: Unify cache manipulating functions
Signed-off-by: Sugizaki Yukimasa <i.can.speak.c.and.basic@gmail.com>
vcsm: Fix obscure conditions
Signed-off-by: Sugizaki Yukimasa <i.can.speak.c.and.basic@gmail.com>
vcsm: Fix memory leaking on clean_invalid2 ioctl handler
Signed-off-by: Sugizaki Yukimasa <i.can.speak.c.and.basic@gmail.com>
vcsm: Describe the use of cache operation constants
Signed-off-by: Sugizaki Yukimasa <i.can.speak.c.and.basic@gmail.com>
vcsm: Fix obscure conditions again
Signed-off-by: Sugizaki Yukimasa <i.can.speak.c.and.basic@gmail.com>
vcsm: Add no-op cache operation constant
Signed-off-by: Sugizaki Yukimasa <i.can.speak.c.and.basic@gmail.com>
vcsm: Revert to do page-table-walk-based cache manipulating on some ioctl calls
On FLUSH, INVALID, CLEAN_INVALID ioctl calls, cache operations based on
page table walk were used in case that the buffer of the cache is not
pinned. So reverted to do page-table-based cache manipulating.
Signed-off-by: Sugizaki Yukimasa <i.can.speak.c.and.basic@gmail.com>
vcsm: Define cache operation constants in user header
Without this change, users have to use raw values (1, 2, 3) to specify
cache operation.
Signed-off-by: Sugizaki Yukimasa <i.can.speak.c.and.basic@gmail.com>
Signed-off-by: popcornmix <popcornmix@gmail.com>
BCM270x: Move vc_mem
Make the vc_mem module available for ARCH_BCM2835 by moving it.
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
BCM2835 has two SD card interfaces. This driver uses the other one.
bcm2835-sdhost: Error handling fix, and code clarification
bcm2835-sdhost: Adding overclocking option
Allow a different clock speed to be substitued for a requested 50MHz.
This option is exposed using the "overclock_50" DT parameter.
Note that the sdhost interface is restricted to integer divisions of
core_freq, and the highest sensible option for a core_freq of 250MHz
is 84 (250/3 = 83.3MHz), the next being 125 (250/2) which is much too
high.
Use at your own risk.
bcm2835-sdhost: Round up the overclock, so 62 works for 62.5Mhz
Also only warn once for each overclock setting.
bcm2835-sdhost: Improve error handling and recovery
1) Expose the hw_reset method to the MMC framework, removing many
internal calls by the driver.
2) Reduce overclock setting on error.
3) Increase timeout to cope with high capacity cards.
4) Add properties and parameters to control pio_limit and debug.
5) Reduce messages at probe time.
bcm2835-sdhost: Further improve overclock back-off
bcm2835-sdhost: Clear HBLC for PIO mode
Also update pio_limit default in overlay README.
bcm2835-sdhost: Add the ERASE capability
See: https://github.com/raspberrypi/linux/issues/1076
bcm2835-sdhost: Ignore CRC7 for MMC CMD1
It seems that the sdhost interface returns CRC7 errors for CMD1,
which is the MMC-specific SEND_OP_COND. Returning these errors to
the MMC layer causes a downward spiral, but ignoring them seems
to be harmless.
bcm2835-mmc/sdhost: Remove ARCH_BCM2835 differences
The bcm2835-mmc driver (and -sdhost driver that copied from it)
contains code to handle SDIO interrupts in a threaded interrupt
handler rather than waking the MMC framework thread. The change
follows a patch from Russell King that adds the facility as the
preferred way of working.
However, the new code path is only present in ARCH_BCM2835
builds, which I have taken to be a way of testing the waters
rather than making the change across the board; I can't see
any technical reason why it wouldn't be enabled for MACH_BCM270X
builds. So this patch standardises on the ARCH_BCM2835 code,
removing the old code paths.
bcm2835-sdhost: Don't log timeout errors unless debug=1
The MMC card-discovery process generates timeouts. This is
expected behaviour, so reporting it to the user serves no purpose.
Suppress the reporting of timeout errors unless the debug flag
is on.
bcm2835-sdhost: Add workaround for odd behaviour on some cards
For reasons not understood, the sdhost driver fails when reading
sectors very near the end of some SD cards. The problem could
be related to the similar issue that reading the final sector
of any card as part of a multiple read never completes, and the
workaround is an extension of the mechanism introduced to solve
that problem which ensures those sectors are always read singly.
bcm2835-sdhost: Major revision
This is a significant revision of the bcm2835-sdhost driver. It
improves on the original in a number of ways:
1) Through the use of CMD23 for reads it appears to avoid problems
reading some sectors on certain high speed cards.
2) Better atomicity to prevent crashes.
3) Higher performance.
4) Activity logging included, for easier diagnosis in the event
of a problem.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
bcm2835-sdhost: Restore ATOMIC flag to PIO sg mapping
Allocation problems have been seen in a wireless driver, and
this is the only change which might have been responsible.
SQUASH: bcm2835-sdhost: Only claim one DMA channel
With both MMC controllers enabled there are few DMA channels left. The
bcm2835-sdhost driver only uses DMA in one direction at a time, so it
doesn't need to claim two channels.
See: https://github.com/raspberrypi/linux/issues/1327
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
bcm2835-sdhost: Workaround for "slow" sectors
Some cards have been seen to cause timeouts after certain sectors are
read. This workaround enforces a minimum delay between the stop after
reading one of those sectors and a subsequent data command.
Using CMD23 (SET_BLOCK_COUNT) avoids this problem, so good cards will
not be penalised by this workaround.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
bcm2835-sdhost: Firmware manages the clock divisor
The bcm2835-sdhost driver hands control of the CDIV clock divisor
register to matching firmware, allowing it to adjust to a changing
core clock. This removes the need to use the performance governor or
to enable io_is_busy on the on-demand governor in order to get the
best SD performance.
N.B. As SD clocks must be an integer divisor of the core clock, it is
possible that the SD clock for "turbo" mode can be different (even
lower) than "normal" mode.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
bcm2835-sdhost: Reset the clock in task context
Since reprogramming the clock can now involve a round-trip to the
firmware it must not be done at atomic context, and a tasklet
is not a task.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
bcm2835-sdhost: Don't exit cmd wait loop on error
The FAIL flag can be set in the CMD register before command processing
is complete, leading to spurious "failed to complete" errors. This has
the effect of promoting harmless CRC7 errors during CMD1 processing
into errors that can delay and even prevent booting.
Also:
1) Convert the last KERN_ERROR message in the register dumping to
KERN_INFO.
2) Remove an unnecessary reset call from bcm2835_sdhost_add_host.
See: https://github.com/raspberrypi/linux/pull/1492
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
bcm2835-sdhost: mmc_card_blockaddr fix
Get the definition of mmc_card_blockaddr from drivers/mmc/core/card.h.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
bcm2835-sdhost: New timer API
mmc: bcm2835-sdhost: Support underclocking
Support underclocking of the SD bus in two ways:
1. using the max-frequency DT property (which currently has no DT
parameter), and
2. using the exiting sd_overclock parameter.
The two methods differ slightly - in the former the MMC subsystem is
aware of the underclocking, while in the latter it isn't - but the
end results should be the same.
See: https://github.com/raspberrypi/linux/issues/2350
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
mmc: bcm2835-sdhost: Add include
highmem.h (needed for kmap_atomic) is pulled in by one of the other
include files, but only with some CONFIG settings. Make the inclusion
explicit to cater for cases where the CONFIG setting is absent.
See: https://github.com/raspberrypi/linux/issues/2366
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
mmc: Disable CMD23 transfers on all cards
Pending wire-level investigation of these types of transfers
and associated errors on bcm2835-mmc, disable for now. Fallback of
CMD18/CMD25 transfers will be used automatically by the MMC layer.
Reported/Tested-by: Gellert Weisz <gellert@raspberrypi.org>
mmc: bcm2835-mmc: enable DT support for all architectures
Both ARCH_BCM2835 and ARCH_BCM270x are built with OF now.
Enable Device Tree support for all architectures.
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
mmc: bcm2835-mmc: fix probe error handling
Probe error handling is broken in several places.
Simplify error handling by using device managed functions.
Replace pr_{err,info} with dev_{err,info}.
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
bcm2835-mmc: Add locks when accessing sdhost registers
bcm2835-mmc: Add range of debug options for slowing things down
bcm2835-mmc: Add option to disable some delays
bcm2835-mmc: Add option to disable MMC_QUIRK_BLK_NO_CMD23
bcm2835-mmc: Default to disabling MMC_QUIRK_BLK_NO_CMD23
bcm2835-mmc: Adding overclocking option
Allow a different clock speed to be substitued for a requested 50MHz.
This option is exposed using the "overclock_50" DT parameter.
Note that the mmc interface is restricted to EVEN integer divisions of
250MHz, and the highest sensible option is 63 (250/4 = 62.5), the
next being 125 (250/2) which is much too high.
Use at your own risk.
bcm2835-mmc: Round up the overclock, so 62 works for 62.5Mhz
Also only warn once for each overclock setting.
mmc: bcm2835-mmc: Make available on ARCH_BCM2835
Make the bcm2835-mmc driver available for use on ARCH_BCM2835.
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
BCM270x_DT: add bcm2835-mmc entry
Add Device Tree entry for bcm2835-mmc.
In non-DT mode, don't add the device in the board file.
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
bcm2835-mmc: Don't overwrite MMC capabilities from DT
bcm2835-mmc: Don't override bus width capabilities from devicetree
Take out the force setting of the MMC_CAP_4_BIT_DATA host capability
so that the result read from devicetree via mmc_of_parse() is
preserved.
bcm2835-mmc: Only claim one DMA channel
With both MMC controllers enabled there are few DMA channels left. The
bcm2835-mmc driver only uses DMA in one direction at a time, so it
doesn't need to claim two channels.
See: https://github.com/raspberrypi/linux/issues/1327
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
bcm2835-mmc: New timer API
mmc: bcm2835-mmc: Support underclocking
Support underclocking of the SD bus using the max-frequency DT property
(which currently has no DT parameter). The sd_overclock parameter
already provides another way to achieve the same thing which should be
equivalent in end result, but it is a bug not to support max-frequency
as well.
See: https://github.com/raspberrypi/linux/issues/2350
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
Add support for DMA controller of BCM2708 as used in the Raspberry Pi.
Currently it only supports cyclic DMA.
Signed-off-by: Florian Meier <florian.meier@koalo.de>
dmaengine: expand functionality by supporting scatter/gather transfers sdhci-bcm2708 and dma.c: fix for LITE channels
DMA: fix cyclic LITE length overflow bug
dmaengine: bcm2708: Remove chancnt affectations
Mirror bcm2835-dma.c commit 9eba5536a7:
chancnt is already filled by dma_async_device_register, which uses the channel
list to know how much channels there is.
Since it's already filled, we can safely remove it from the drivers' probe
function.
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
dmaengine: bcm2708: overwrite dreq only if it is not set
dreq is set when the DMA channel is fetched from Device Tree.
slave_id is set using dmaengine_slave_config().
Only overwrite dreq with slave_id if it is not set.
dreq/slave_id in the cyclic DMA case is not touched, because I don't
have hardware to test with.
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
dmaengine: bcm2708: do device registration in the board file
Don't register the device in the driver. Do it in the board file.
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
dmaengine: bcm2708: don't restrict DT support to ARCH_BCM2835
Both ARCH_BCM2835 and ARCH_BCM270x are built with OF now.
Add Device Tree support to the non ARCH_BCM2835 case.
Use the same driver name regardless of architecture.
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
BCM270x_DT: add bcm2835-dma entry
Add Device Tree entry for bcm2835-dma.
The entry doesn't contain any resources since they are handled
by the arch/arm/mach-bcm270x/dma.c driver.
In non-DT mode, don't add the device in the board file.
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
bcm2708-dmaengine: Add debug options
BCM270x: Add memory and irq resources to dmaengine device and DT
Prepare for merging of the legacy DMA API arch driver dma.c
with bcm2708-dmaengine by adding memory and irq resources both
to platform file device and Device Tree node.
Don't use BCM_DMAMAN_DRIVER_NAME so we don't have to include mach/dma.h
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
dmaengine: bcm2708: Merge with arch dma.c driver and disable dma.c
Merge the legacy DMA API driver with bcm2708-dmaengine.
This is done so we can use bcm2708_fb on ARCH_BCM2835 (mailbox
driver is also needed).
Changes to the dma.c code:
- Use BIT() macro.
- Cutdown some comments to one line.
- Add mutex to vc_dmaman and use this, since the dev lock is locked
during probing of the engine part.
- Add global g_dmaman variable since drvdata is used by the engine part.
- Restructure for readability:
vc_dmaman_chan_alloc()
vc_dmaman_chan_free()
bcm_dma_chan_free()
- Restructure bcm_dma_chan_alloc() to simplify error handling.
- Use device irq resources instead of hardcoded bcm_dma_irqs table.
- Remove dev_dmaman_register() and code it directly.
- Remove dev_dmaman_deregister() and code it directly.
- Simplify bcm_dmaman_probe() using devm_* functions.
- Get dmachans from DT if available.
- Keep 'dma.dmachans' module argument name for backwards compatibility.
Make it available on ARCH_BCM2835 as well.
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
dmaengine: bcm2708: set residue_granularity field
bcm2708-dmaengine supports residue reporting at burst level
but didn't report this via the residue_granularity field.
Without this field set properly we get playback issues with I2S cards.
dmaengine: bcm2708-dmaengine: Fix memory leak when stopping a running transfer
bcm2708-dmaengine: Use more DMA channels (but not 12)
1) Only the bcm2708_fb drivers uses the legacy DMA API, and
it requires a BULK-capable channel, so all other types
(FAST, NORMAL and LITE) can be made available to the regular
DMA API.
2) DMA channels 11-14 share an interrupt. The driver can't
handle this, so don't use channels 12-14 (12 was used, probably
because it appears to have an interrupt, but in reality that
interrupt is for activity on ANY channel). This may explain
a lockup encountered when running out of DMA channels.
The combined effect of this patch is to leave 7 DMA channels
available + channel 0 for bcm2708_fb via the legacy API.
See: https://github.com/raspberrypi/linux/issues/1110https://github.com/raspberrypi/linux/issues/1108
dmaengine: bcm2708: Make legacy API available for bcm2835-dma
bcm2708_fb uses the legacy DMA API, so in order to start using
bcm2835-dma, bcm2835-dma has to support the legacy API. Make this
possible by exporting bcm_dmaman_probe() and bcm_dmaman_remove().
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
dmaengine: bcm2708: Change DT compatible string
Both bcm2835-dma and bcm2708-dmaengine have the same compatible string.
So change compatible to "brcm,bcm2708-dma".
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
dmaengine: bcm2708: Remove driver but keep legacy API
Dropping non-DT support means we don't need this driver,
but we still need the legacy DMA API.
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
bcm2708-dmaengine - Fix arm64 portability/build issues
dma-bcm2708: Fix module compilation of CONFIG_DMA_BCM2708
bcm2708-dmaengine.c defines functions like bcm_dma_start which are
defined as well in dma-bcm2708.h as inline versions when
CONFIG_DMA_BCM2708 is not defined. This works fine when
CONFIG_DMA_BCM2708 is built in, but when it is selected as module build
fails with redefinition errors because in the build system when
CONFIG_DMA_BCM2708 is selected as module, the macro becomes
CONFIG_DMA_BCM2708_MODULE.
This patch makes the header use CONFIG_DMA_BCM2708_MODULE too when
available.
Fixes https://github.com/raspberrypi/linux/issues/2056
Signed-off-by: Andrei Gherzan <andrei@gherzan.com>
Especially on platforms with a slower CPU but a relatively high
framebuffer fill bandwidth, like current ARM devices, the existing
console monochrome imageblit function used to draw console text is
suboptimal for common pixel depths such as 16bpp and 32bpp. The existing
code is quite general and can deal with several pixel depths. By creating
special case functions for 16bpp and 32bpp, by far the most common pixel
formats used on modern systems, a significant speed-up is attained
which can be readily felt on ARM-based devices like the Raspberry Pi
and the Allwinner platform, but should help any platform using the
fb layer.
The special case functions allow constant folding, eliminating a number
of instructions including divide operations, and allow the use of an
unrolled loop, eliminating instructions with a variable shift size,
reducing source memory access instructions, and eliminating excessive
branching. These unrolled loops also allow much better code optimization
by the C compiler. The code that selects which optimized variant is used
is also simplified, eliminating integer divide instructions.
The speed-up, measured by timing 'cat file.txt' in the console, varies
between 40% and 70%, when testing on the Raspberry Pi and Allwinner
ARM-based platforms, depending on font size and the pixel depth, with
the greater benefit for 32bpp.
Signed-off-by: Harm Hanemaaijer <fgenfb@yahoo.com>
Signed-off-by: popcornmix <popcornmix@gmail.com>
bcm2708_fb : Implement blanking support using the mailbox property interface
bcm2708_fb: Add pan and vsync controls
bcm2708_fb: DMA acceleration for fb_copyarea
Based on http://www.raspberrypi.org/phpBB3/viewtopic.php?p=62425#p62425
Also used Simon's dmaer_master module as a reference for tweaking DMA
settings for better performance.
For now busylooping only. IRQ support might be added later.
With non-overclocked Raspberry Pi, the performance is ~360 MB/s
for simple copy or ~260 MB/s for two-pass copy (used when dragging
windows to the right).
In the case of using DMA channel 0, the performance improves
to ~440 MB/s.
For comparison, VFP optimized CPU copy can only do ~114 MB/s in
the same conditions (hindered by reading uncached source buffer).
Signed-off-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
bcm2708_fb: report number of dma copies
Add a counter (exported via debugfs) reporting the
number of dma copies that the framebuffer driver
has done, in order to help evaluate different
optimization strategies.
Signed-off-by: Luke Diamand <luked@broadcom.com>
bcm2708_fb: use IRQ for DMA copies
The copyarea ioctl() uses DMA to speed things along. This
was busy-waiting for completion. This change supports using
an interrupt instead for larger transfers. For small
transfers, busy-waiting is still likely to be faster.
Signed-off-by: Luke Diamand <luke@diamand.org>
bcm2708: Make ioctl logging quieter
video: fbdev: bcm2708_fb: Don't panic on error
No need to panic the kernel if the video driver fails.
Just print a message and return an error.
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
fbdev: bcm2708_fb: Add ARCH_BCM2835 support
Add Device Tree support.
Pass the device to dma_alloc_coherent() in order to get the
correct bus address on ARCH_BCM2835.
Use the new DMA legacy API header file.
Including <mach/platform.h> is not necessary.
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
BCM270x_DT: Add bcm2708-fb device
Add bcm2708-fb to Device Tree and don't add the
platform device when booting in DT mode.
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
Cleanup of bcm2708_fb file to kernel coding standards
Some minor change to function - remove a use of
in_atomic, plus replacing various debug messages
that manually specify the function name with
("%s",.__func__)
Signed-off-by: James Hughes <james.hughes@raspberrypi.org>
Signed-off-by: popcornmix <popcornmix@gmail.com>
usb: dwc: fix lockdep false positive
Signed-off-by: Kari Suvanto <karis79@gmail.com>
usb: dwc: fix inconsistent lock state
Signed-off-by: Kari Suvanto <karis79@gmail.com>
Add FIQ patch to dwc_otg driver. Enable with dwc_otg.fiq_fix_enable=1. Should give about 10% more ARM performance.
Thanks to Gordon and Costas
Avoid dynamic memory allocation for channel lock in USB driver. Thanks ddv2005.
Add NAK holdoff scheme. Enabled by default, disable with dwc_otg.nak_holdoff_enable=0. Thanks gsh
Make sure we wait for the reset to finish
dwc_otg: fix bug in dwc_otg_hcd.c resulting in silent kernel
memory corruption, escalating to OOPS under high USB load.
dwc_otg: Fix unsafe access of QTD during URB enqueue
In dwc_otg_hcd_urb_enqueue during qtd creation, it was possible that the
transaction could complete almost immediately after the qtd was assigned
to a host channel during URB enqueue, which meant the qtd pointer was no
longer valid having been completed and removed. Usually, this resulted in
an OOPS during URB submission. By predetermining whether transactions
need to be queued or not, this unsafe pointer access is avoided.
This bug was only evident on the Pi model A where a device was attached
that had no periodic endpoints (e.g. USB pendrive or some wlan devices).
dwc_otg: Fix incorrect URB allocation error handling
If the memory allocation for a dwc_otg_urb failed, the kernel would OOPS
because for some reason a member of the *unallocated* struct was set to
zero. Error handling changed to fail correctly.
dwc_otg: fix potential use-after-free case in interrupt handler
If a transaction had previously aborted, certain interrupts are
enabled to track error counts and reset where necessary. On IN
endpoints the host generates an ACK interrupt near-simultaneously
with completion of transfer. In the case where this transfer had
previously had an error, this results in a use-after-free on
the QTD memory space with a 1-byte length being overwritten to
0x00.
dwc_otg: add handling of SPLIT transaction data toggle errors
Previously a data toggle error on packets from a USB1.1 device behind
a TT would result in the Pi locking up as the driver never handled
the associated interrupt. Patch adds basic retry mechanism and
interrupt acknowledgement to cater for either a chance toggle error or
for devices that have a broken initial toggle state (FT8U232/FT232BM).
dwc_otg: implement tasklet for returning URBs to usbcore hcd layer
The dwc_otg driver interrupt handler for transfer completion will spend
a very long time with interrupts disabled when a URB is completed -
this is because usb_hcd_giveback_urb is called from within the handler
which for a USB device driver with complicated processing (e.g. webcam)
will take an exorbitant amount of time to complete. This results in
missed completion interrupts for other USB packets which lead to them
being dropped due to microframe overruns.
This patch splits returning the URB to the usb hcd layer into a
high-priority tasklet. This will have most benefit for isochronous IN
transfers but will also have incidental benefit where multiple periodic
devices are active at once.
dwc_otg: fix NAK holdoff and allow on split transactions only
This corrects a bug where if a single active non-periodic endpoint
had at least one transaction in its qh, on frnum == MAX_FRNUM the qh
would get skipped and never get queued again. This would result in
a silent device until error detection (automatic or otherwise) would
either reset the device or flush and requeue the URBs.
Additionally the NAK holdoff was enabled for all transactions - this
would potentially stall a HS endpoint for 1ms if a previous error state
enabled this interrupt and the next response was a NAK. Fix so that
only split transactions get held off.
dwc_otg: Call usb_hcd_unlink_urb_from_ep with lock held in completion handler
usb_hcd_unlink_urb_from_ep must be called with the HCD lock held. Calling it
asynchronously in the tasklet was not safe (regression in
c4564d4a1a).
This change unlinks it from the endpoint prior to queueing it for handling in
the tasklet, and also adds a check to ensure the urb is OK to be unlinked
before doing so.
NULL pointer dereference kernel oopses had been observed in usb_hcd_giveback_urb
when a USB device was unplugged/replugged during data transfer. This effect
was reproduced using automated USB port power control, hundreds of replug
events were performed during active transfers to confirm that the problem was
eliminated.
USB fix using a FIQ to implement split transactions
This commit adds a FIQ implementaion that schedules
the split transactions using a FIQ so we don't get
held off by the interrupt latency of Linux
dwc_otg: fix device attributes and avoid kernel warnings on boot
dcw_otg: avoid logging function that can cause panics
See: https://github.com/raspberrypi/firmware/issues/21
Thanks to cleverca22 for fix
dwc_otg: mask correct interrupts after transaction error recovery
The dwc_otg driver will unmask certain interrupts on a transaction
that previously halted in the error state in order to reset the
QTD error count. The various fine-grained interrupt handlers do not
consider that other interrupts besides themselves were unmasked.
By disabling the two other interrupts only ever enabled in DMA mode
for this purpose, we can avoid unnecessary function calls in the
IRQ handler. This will also prevent an unneccesary FIQ interrupt
from being generated if the FIQ is enabled.
dwc_otg: fiq: prevent FIQ thrash and incorrect state passing to IRQ
In the case of a transaction to a device that had previously aborted
due to an error, several interrupts are enabled to reset the error
count when a device responds. This has the side-effect of making the
FIQ thrash because the hardware will generate multiple instances of
a NAK on an IN bulk/interrupt endpoint and multiple instances of ACK
on an OUT bulk/interrupt endpoint. Make the FIQ mask and clear the
associated interrupts.
Additionally, on non-split transactions make sure that only unmasked
interrupts are cleared. This caused a hard-to-trigger but serious
race condition when you had the combination of an endpoint awaiting
error recovery and a transaction completed on an endpoint - due to
the sequencing and timing of interrupts generated by the dwc_otg core,
it was possible to confuse the IRQ handler.
Fix function tracing
dwc_otg: whitespace cleanup in dwc_otg_urb_enqueue
dwc_otg: prevent OOPSes during device disconnects
The dwc_otg_urb_enqueue function is thread-unsafe. In particular the
access of urb->hcpriv, usb_hcd_link_urb_to_ep, dwc_otg_urb->qtd and
friends does not occur within a critical section and so if a device
was unplugged during activity there was a high chance that the
usbcore hub_thread would try to disable the endpoint with partially-
formed entries in the URB queue. This would result in BUG() or null
pointer dereferences.
Fix so that access of urb->hcpriv, enqueuing to the hardware and
adding to usbcore endpoint URB lists is contained within a single
critical section.
dwc_otg: prevent BUG() in TT allocation if hub address is > 16
A fixed-size array is used to track TT allocation. This was
previously set to 16 which caused a crash because
dwc_otg_hcd_allocate_port would read past the end of the array.
This was hit if a hub was plugged in which enumerated as addr > 16,
due to previous device resets or unplugs.
Also add #ifdef FIQ_DEBUG around hcd->hub_port_alloc[], which grows
to a large size if 128 hub addresses are supported. This field is
for debug only for tracking which frame an allocate happened in.
dwc_otg: make channel halts with unknown state less damaging
If the IRQ received a channel halt interrupt through the FIQ
with no other bits set, the IRQ would not release the host
channel and never complete the URB.
Add catchall handling to treat as a transaction error and retry.
dwc_otg: fiq_split: use TTs with more granularity
This fixes certain issues with split transaction scheduling.
- Isochronous multi-packet OUT transactions now hog the TT until
they are completed - this prevents hubs aborting transactions
if they get a periodic start-split out-of-order
- Don't perform TT allocation on non-periodic endpoints - this
allows simultaneous use of the TT's bulk/control and periodic
transaction buffers
This commit will mainly affect USB audio playback.
dwc_otg: fix potential sleep while atomic during urb enqueue
Fixes a regression introduced with eb1b482a. Kmalloc called from
dwc_otg_hcd_qtd_add / dwc_otg_hcd_qtd_create did not always have
the GPF_ATOMIC flag set. Force this flag when inside the larger
critical section.
dwc_otg: make fiq_split_enable imply fiq_fix_enable
Failing to set up the FIQ correctly would result in
"IRQ 32: nobody cared" errors in dmesg.
dwc_otg: prevent crashes on host port disconnects
Fix several issues resulting in crashes or inconsistent state
if a Model A root port was disconnected.
- Clean up queue heads properly in kill_urbs_in_qh_list by
removing the empty QHs from the schedule lists
- Set the halt status properly to prevent IRQ handlers from
using freed memory
- Add fiq_split related cleanup for saved registers
- Make microframe scheduling reclaim host channels if
active during a disconnect
- Abort URBs with -ESHUTDOWN status response, informing
device drivers so they respond in a more correct fashion
and don't try to resubmit URBs
- Prevent IRQ handlers from attempting to handle channel
interrupts if the associated URB was dequeued (and the
driver state was cleared)
dwc_otg: prevent leaking URBs during enqueue
A dwc_otg_urb would get leaked if the HCD enqueue function
failed for any reason. Free the URB at the appropriate points.
dwc_otg: Enable NAK holdoff for control split transactions
Certain low-speed devices take a very long time to complete a
data or status stage of a control transaction, producing NAK
responses until they complete internal processing - the USB2.0
spec limit is up to 500mS. This causes the same type of interrupt
storm as seen with USB-serial dongles prior to c8edb238.
In certain circumstances, usually while booting, this interrupt
storm could cause SD card timeouts.
dwc_otg: Fix for occasional lockup on boot when doing a USB reset
dwc_otg: Don't issue traffic to LS devices in FS mode
Issuing low-speed packets when the root port is in full-speed mode
causes the root port to stop responding. Explicitly fail when
enqueuing URBs to a LS endpoint on a FS bus.
Fix ARM architecture issue with local_irq_restore()
If local_fiq_enable() is called before a local_irq_restore(flags) where
the flags variable has the F bit set, the FIQ will be erroneously disabled.
Fixup arch_local_irq_restore to avoid trampling the F bit in CPSR.
Also fix some of the hacks previously implemented for previous dwc_otg
incarnations.
dwc_otg: fiq_fsm: Base commit for driver rewrite
This commit removes the previous FIQ fixes entirely and adds fiq_fsm.
This rewrite features much more complete support for split transactions
and takes into account several OTG hardware bugs. High-speed
isochronous transactions are also capable of being performed by fiq_fsm.
All driver options have been removed and replaced with:
- dwc_otg.fiq_enable (bool)
- dwc_otg.fiq_fsm_enable (bool)
- dwc_otg.fiq_fsm_mask (bitmask)
- dwc_otg.nak_holdoff (unsigned int)
Defaults are specified such that fiq_fsm behaves similarly to the
previously implemented FIQ fixes.
fiq_fsm: Push error recovery into the FIQ when fiq_fsm is used
If the transfer associated with a QTD failed due to a bus error, the HCD
would retry the transfer up to 3 times (implementing the USB2.0
three-strikes retry in software).
Due to the masking mechanism used by fiq_fsm, it is only possible to pass
a single interrupt through to the HCD per-transfer.
In this instance host channels would fall off the radar because the error
reset would function, but the subsequent channel halt would be lost.
Push the error count reset into the FIQ handler.
fiq_fsm: Implement timeout mechanism
For full-speed endpoints with a large packet size, interrupt latency
runs the risk of the FIQ starting a transaction too late in a full-speed
frame. If the device is still transmitting data when EOF2 for the
downstream frame occurs, the hub will disable the port. This change is
not reflected in the hub status endpoint and the device becomes
unresponsive.
Prevent high-bandwidth transactions from being started too late in a
frame. The mechanism is not guaranteed: a combination of bit stuffing
and hub latency may still result in a device overrunning.
fiq_fsm: fix bounce buffer utilisation for Isochronous OUT
Multi-packet isochronous OUT transactions were subject to a few bounday
bugs. Fix them.
Audio playback is now much more robust: however, an issue stands with
devices that have adaptive sinks - ALSA plays samples too fast.
dwc_otg: Return full-speed frame numbers in HS mode
The frame counter increments on every *microframe* in high-speed mode.
Most device drivers expect this number to be in full-speed frames - this
caused considerable confusion to e.g. snd_usb_audio which uses the
frame counter to estimate the number of samples played.
fiq_fsm: save PID on completion of interrupt OUT transfers
Also add edge case handling for interrupt transports.
Note that for periodic split IN, data toggles are unimplemented in the
OTG host hardware - it unconditionally accepts any PID.
fiq_fsm: add missing case for fiq_fsm_tt_in_use()
Certain combinations of bitrate and endpoint activity could
result in a periodic transaction erroneously getting started
while the previous Isochronous OUT was still active.
fiq_fsm: clear hcintmsk for aborted transactions
Prevents the FIQ from erroneously handling interrupts
on a timed out channel.
fiq_fsm: enable by default
fiq_fsm: fix dequeues for non-periodic split transactions
If a dequeue happened between the SSPLIT and CSPLIT phases of the
transaction, the HCD would never receive an interrupt.
fiq_fsm: Disable by default
fiq_fsm: Handle HC babble errors
The HCTSIZ transfer size field raises a babble interrupt if
the counter wraps. Handle the resulting interrupt in this case.
dwc_otg: fix interrupt registration for fiq_enable=0
Additionally make the module parameter conditional for wherever
hcd->fiq_state is touched.
fiq_fsm: Enable by default
dwc_otg: Fix various issues with root port and transaction errors
Process the host port interrupts correctly (and don't trample them).
Root port hotplug now functional again.
Fix a few thinkos with the transaction error passthrough for fiq_fsm.
fiq_fsm: Implement hack for Split Interrupt transactions
Hubs aren't too picky about which endpoint we send Control type split
transactions to. By treating Interrupt transfers as Control, it is
possible to use the non-periodic queue in the OTG core as well as the
non-periodic FIFOs in the hub itself. This massively reduces the
microframe exclusivity/contention that periodic split transactions
otherwise have to enforce.
It goes without saying that this is a fairly egregious USB specification
violation, but it works.
Original idea by Hans Petter Selasky @ FreeBSD.org.
dwc_otg: FIQ support on SMP. Set up FIQ stack and handler on Core 0 only.
dwc_otg: introduce fiq_fsm_spin(un|)lock()
SMP safety for the FIQ relies on register read-modify write cycles being
completed in the correct order. Several places in the DWC code modify
registers also touched by the FIQ. Protect these by a bare-bones lock
mechanism.
This also makes it possible to run the FIQ and IRQ handlers on different
cores.
fiq_fsm: fix build on bcm2708 and bcm2709 platforms
dwc_otg: put some barriers back where they should be for UP
bcm2709/dwc_otg: Setup FIQ on core 1 if >1 core active
dwc_otg: fixup read-modify-write in critical paths
Be more careful about read-modify-write on registers that the FIQ
also touches.
Guard fiq_fsm_spin_lock with fiq_enable check
fiq_fsm: Falling out of the state machine isn't fatal
This edge case can be hit if the port is disabled while the FIQ is
in the middle of a transaction. Make the effects less severe.
Also get rid of the useless return value.
squash: dwc_otg: Allow to build without SMP
usb: core: make overcurrent messages more prominent
Hub overcurrent messages are more serious than "debug". Increase loglevel.
usb: dwc_otg: Don't use dma_to_virt()
Commit 6ce0d20 changes dma_to_virt() which breaks this driver.
Open code the old dma_to_virt() implementation to work around this.
Limit the use of __bus_to_virt() to cases where transfer_buffer_length
is set and transfer_buffer is not set. This is done to increase the
chance that this driver will also work on ARCH_BCM2835.
transfer_buffer should not be NULL if the length is set, but the
comment in the code indicates that there are situations where this
might happen. drivers/usb/isp1760/isp1760-hcd.c also has a similar
comment pointing to a possible: 'usb storage / SCSI bug'.
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
dwc_otg: Fix crash when fiq_enable=0
dwc_otg: fiq_fsm: Make high-speed isochronous strided transfers work properly
Certain low-bandwidth high-speed USB devices (specialist audio devices,
compressed-frame webcams) have packet intervals > 1 microframe.
Stride these transfers in the FIQ by using the start-of-frame interrupt
to restart the channel at the right time.
dwc_otg: Force host mode to fix incorrect compute module boards
dwc_otg: Add ARCH_BCM2835 support
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
dwc_otg: Simplify FIQ irq number code
Dropping ATAGS means we can simplify the FIQ irq number code.
Also add error checking on the returned irq number.
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
dwc_otg: Remove duplicate gadget probe/unregister function
dwc_otg: Properly set the HFIR
Douglas Anderson reported:
According to the most up to date version of the dwc2 databook, the FRINT
field of the HFIR register should be programmed to:
* 125 us * (PHY clock freq for HS) - 1
* 1000 us * (PHY clock freq for FS/LS) - 1
This is opposed to older versions of the doc that claimed it should be:
* 125 us * (PHY clock freq for HS)
* 1000 us * (PHY clock freq for FS/LS)
and reported lower timing jitter on a USB analyser
dcw_otg: trim xfer length when buffer larger than allocated size is received
dwc_otg: Don't free qh align buffers in atomic context
dwc_otg: Enable the hack for Split Interrupt transactions by default
dwc_otg.fiq_fsm_mask=0xF has long been a suggestion for users with audio stutters or other USB bandwidth issues.
So far we are aware of many success stories but no failure caused by this setting.
Make it a default to learn more.
See: https://www.raspberrypi.org/forums/viewtopic.php?f=28&t=70437
Signed-off-by: popcornmix <popcornmix@gmail.com>
dwc_otg: Use kzalloc when suitable
dwc_otg: Pass struct device to dma_alloc*()
This makes it possible to get the bus address from Device Tree.
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
dwc_otg: fix summarize urb->actual_length for isochronous transfers
Kernel does not copy input data of ISO transfers to userspace
if actual_length is set only in ISO transfers and not summarized
in urb->actual_length. Fixesraspberrypi/linux#903
fiq_fsm: Use correct states when starting isoc OUT transfers
In fiq_fsm_start_next_periodic() if an isochronous OUT transfer
was selected, no regard was given as to whether this was a single-packet
transfer or a multi-packet staged transfer.
For single-packet transfers, this had the effect of repeatedly sending
OUT packets with bogus data and lengths.
Eventually if the channel was repeatedly enabled enough times, this
would lock up the OTG core and no further bus transfers would happen.
Set the FSM state up properly if we select a single-packet transfer.
Fixes https://github.com/raspberrypi/linux/issues/1842
dwc_otg: make nak_holdoff work as intended with empty queues
If URBs reading from non-periodic split endpoints were dequeued and
the last transfer from the endpoint was a NAK handshake, the resulting
qh->nak_frame value was stale which would result in unnecessarily long
polling intervals for the first subsequent transfer with a fresh URB.
Fixup qh->nak_frame in dwc_otg_hcd_urb_dequeue and also guard against
a case where a single URB is submitted to the endpoint, a NAK was
received on the transfer immediately prior to receiving data and the
device subsequently resubmits another URB past the qh->nak_frame interval.
Fixes https://github.com/raspberrypi/linux/issues/1709
dwc_otg: fix split transaction data toggle handling around dequeues
See https://github.com/raspberrypi/linux/issues/1709
Fix several issues regarding endpoint state when URBs are dequeued
- If the HCD is disconnected, flush FIQ-enabled channels properly
- Save the data toggle state for bulk endpoints if the last transfer
from an endpoint where URBs were dequeued returned a data packet
- Reset hc->start_pkt_count properly in assign_and_init_hc()
dwc_otg: fix several potential crash sources
On root port disconnect events, the host driver state is cleared and
in-progress host channels are forcibly stopped. This doesn't play
well with the FIQ running in the background, so:
- Guard the disconnect callback with both the host spinlock and FIQ
spinlock
- Move qtd dereference in dwc_otg_handle_hc_fsm() after the early-out
so we don't dereference a qtd that has gone away
- Turn catch-all BUG()s in dwc_otg_handle_hc_fsm() into warnings.
dwc_otg: delete hcd->channel_lock
The lock serves no purpose as it is only held while the HCD spinlock
is already being held.
dwc_otg: remove unnecessary dma-mode channel halts on disconnect interrupt
Host channels are already halted in kill_urbs_in_qh_list() with the
subsequent interrupt processing behaving as if the URB was dequeued
via HCD callback.
There's no need to clobber the host channel registers a second time
as this exposes races between the driver and host channel resulting
in hcd->free_hc_list becoming corrupted.
dwcotg: Allow to build without FIQ on ARM64
Signed-off-by: popcornmix <popcornmix@gmail.com>
dwc_otg: make periodic scheduling behave properly for FS buses
If the root port is in full-speed mode, transfer times at 12mbit/s
would be calculated but matched against high-speed quotas.
Reinitialise hcd->frame_usecs[i] on each port enable event so that
full-speed bandwidth can be tracked sensibly.
Also, don't bother using the FIQ for transfers when in full-speed
mode - at the slower bus speed, interrupt frequency is reduced by
an order of magnitude.
Related issue: https://github.com/raspberrypi/linux/issues/2020
dwc_otg: fiq_fsm: Make isochronous compatibility checks work properly
Get rid of the spammy printk and local pointer mangling.
Also, there is a nominal benefit for using fiq_fsm for isochronous
transfers in FS mode (~1.1k IRQs per second vs 2.1k IRQs per second)
so remove the root port speed check.
dwc_otg: add module parameter int_ep_interval_min
Add a module parameter (defaulting to ignored) that clamps the polling rate
of high-speed Interrupt endpoints to a minimum microframe interval.
The parameter is modifiable at runtime as it is used when activating new
endpoints (such as on device connect).
dwc_otg: fiq_fsm: Add non-periodic TT exclusivity constraints
Certain hub types do not discriminate between pipe direction (IN or OUT)
when considering non-periodic transfers. Therefore these hubs get confused
if multiple transfers are issued in different directions with the same
device address and endpoint number.
Constrain queuing non-periodic split transactions so they are performed
serially in such cases.
Related: https://github.com/raspberrypi/linux/issues/2024
dwc_otg: Fixup change to DRIVER_ATTR interface
dwc_otg: Fix compilation warnings
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
USB_DWCOTG: Disable building dwc_otg as a module (#2265)
When dwc_otg is built as a module, build will fail with the following
error:
ERROR: "DWC_TASK_HI_SCHEDULE" [drivers/usb/host/dwc_otg/dwc_otg.ko] undefined!
scripts/Makefile.modpost:91: recipe for target '__modpost' failed
make[1]: *** [__modpost] Error 1
Makefile:1199: recipe for target 'modules' failed
make: *** [modules] Error 2
Even if the error is solved by including the missing
DWC_TASK_HI_SCHEDULE function, the kernel will panic when loading
dwc_otg.
As a workaround, simply prevent user from building dwc_otg as a module
as the current kernel does not support it.
See: https://github.com/raspberrypi/linux/issues/2258
Signed-off-by: Malik Olivier Boussejra <malik@boussejra.com>
dwc_otg: New timer API
dwc_otg: Fix removed ACCESS_ONCE->READ_ONCE
dwc_otg: don't unconditionally force host mode in dwc_otg_cil_init()
Add the ability to disable force_host_mode for those that want to use
dwc_otg in both device and host modes.
dwc_otg: Fix a regression when dequeueing isochronous transfers
In 282bed95 (dwc_otg: make nak_holdoff work as intended with empty queues)
the dequeue mechanism was changed to leave FIQ-enabled transfers to run
to completion - to avoid leaving hub TT buffers with stale packets lying
around.
This broke FIQ-accelerated isochronous transfers, as this then meant that
dozens of transfers were performed after the dequeue function returned.
Restore the state machine fence for isochronous transfers.
fiq_fsm: rewind DMA pointer for OUT transactions that fail (#2288)
See: https://github.com/raspberrypi/linux/issues/2140
dwc_otg: add smp_mb() to prevent driver state corruption on boot
Occasional crashes have been seen where the FIQ code dereferences
invalid/random pointers immediately after being set up, leading to
panic on boot.
The crash occurs as the FIQ code races against hcd_init_fiq() and
the hcd_init_fiq() code races against the outstanding memory stores
from dwc_otg_hcd_init(). Use explicit barriers after touching
driver state.
usb: dwc_otg: fix memory corruption in dwc_otg driver
[Upstream commit 51b1b64917]
The move from the staging tree to the main tree exposed a
longstanding memory corruption bug in the dwc2 driver. The
reordering of the driver initialization caused the dwc2 driver
to corrupt the initialization data of the sdhci driver on the
Raspberry Pi platform, which made the bug show up.
The error is in calling to_usb_device(hsotg->dev), since ->dev
is not a member of struct usb_device. The easiest fix is to
just remove the offending code, since it is not really needed.
Thanks to Stephen Warren for tracking down the cause of this.
Reported-by: Andre Heider <a.heider@gmail.com>
Tested-by: Stephen Warren <swarren@wwwdotorg.org>
Signed-off-by: Paul Zimmerman <paulz@synopsys.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[lukas: port from upstream dwc2 to out-of-tree dwc_otg driver]
Signed-off-by: Lukas Wunner <lukas@wunner.de>
usb: dwb_otg: Fix unreachable switch statement warning
This warning appears with GCC 7.3.0 from toolchains.bootlin.com:
../drivers/usb/host/dwc_otg/dwc_otg_fiq_fsm.c: In function ‘fiq_fsm_update_hs_isoc’:
../drivers/usb/host/dwc_otg/dwc_otg_fiq_fsm.c:595:61: warning: statement will never be executed [-Wswitch-unreachable]
st->hctsiz_copy.b.xfersize = nrpackets * st->hcchar_copy.b.mps;
~~~~~~~~~~~~~~~~~^~~~
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: popcornmix <popcornmix@gmail.com>
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
bcm2709: Drop platform smp and timer init code
irq-bcm2836 handles this through these functions:
bcm2835_init_local_timer_frequency()
bcm2836_arm_irqchip_smp_init()
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
bcm270x: Use watchdog for reboot/poweroff
The watchdog driver already has support for reboot/poweroff.
Make use of this and remove the code from the platform files.
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
board_bcm2835: Remove coherent dma pool increase - API has gone
It can be useful to be able to open multiple vchiq instances in a
single process. This currently fails due to a debugfs collision,
so make such a failure non-fatal.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
The BCM2835 PL011 implementation seems to have a bug that can lead to a
transmission lockup if CTS changes frequently. A workaround was added to
the driver with a vendor-specific flag to enable it, but this flag is
currently not set for ARM implementations.
Add a "cts-event-workaround" property to Pi DTBs and use the presence
of that property to force the flag to be enabled in the driver.
See: https://github.com/raspberrypi/linux/issues/1280
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
The pl011 register accessor functions use the _relaxed versions of the
standard readl() and writel() functions, meaning that there are no
automatic memory barriers. When polling a FIFO status register to check
for fullness, it is necessary to ensure that any outstanding writes have
completed; otherwise the flags are effectively stale, making it possible
that the next write is to a full FIFO.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
The UART clock is initialised to be as close to the requested
frequency as possible without exceeding it. Now that there is a
clock manager that returns the actual frequencies, an expected
48MHz clock is reported as 47999625. If the requested baudrate
== requested clock/16, there is no headroom and the slight
reduction in actual clock rate results in failure.
Detect cases where it looks like a "round" clock was chosen and
adjust the reported clock to match that "round" value. As the
code comment says:
/*
* If increasing a clock by less than 0.1% changes it
* from ..999.. to ..000.., round up.
*/
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
The pl011 driver looks for DT aliases of the form "serial<n>",
and if found uses <n> as the device ID. This can cause
/dev/ttyAMA0 to become /dev/ttyAMA1, which is confusing if the
other serial port is provided by the 8250 driver which doesn't
use the same logic.
lan78xx_defer_event generates an error message whenever the work item
is already scheduled. lan78xx_open defers three events -
EVENT_STAT_UPDATE, EVENT_DEV_OPEN and EVENT_LINK_RESET. Being aware
of the likelihood (or certainty) of an error message, the DEV_OPEN
event is added to the set of pending events directly, relying on
the subsequent deferral of the EVENT_LINK_RESET call to schedule the
work. Take the same precaution with EVENT_STAT_UPDATE to avoid a
totally unnecessary error message.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
For applications of the LAN78xx that don't have valid programmed
EEPROMs or OTPs, enabling both LEDs and auto-negotiation by default
seems reasonable.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
There is a standard mechanism for locating and using a MAC address from
the Device Tree. Use this facility in the lan78xx driver to support
applications without programmed EEPROM or OTP.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
The syscon node defines a register range that duplicates that used by
the local_intc node on bcm2836/7. Since irq-bcm2835 and irq-bcm2836 are
built in and always present together (both drivers are enabled by
CONFIG_ARCH_BCM2835), it is possible to replace the syscon usage with a
global variable that simplifies the code. Doing so does lose the
locking provided by regmap, but as only one side is using the regmap
interface (irq-bcm2835 uses readl and write) there is no loss of
atomicity.
See: https://github.com/raspberrypi/firmware/issues/926
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
Without a prompt string, a config setting can't be included in a
defconfig. Give CONFIG_SND_SOC_ICS43432 a prompt so that Pi soundcards
can use the driver.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
See commit dae803e165 -- the warning is
expected sometimes when using CMA. However, that commit still spams
my kernel log with these warnings.
Signed-off-by: Eric Anholt <eric@anholt.net>
This adds a debug module parameter to aid in debugging transfer issues
by printing info to the kernel log. When enabled, status values are
collected in the interrupt routine and msg info in
bcm2835_i2c_start_transfer(). This is done in a way that tries to avoid
affecting timing. Having printk in the isr can mask issues.
debug values (additive):
1: Print info on error
2: Print info on all transfers
3: Print messages before transfer is started
The value can be changed at runtime:
/sys/module/i2c_bcm2835/parameters/debug
Example output, debug=3:
[ 747.114448] bcm2835_i2c_xfer: msg(1/2) write addr=0x54, len=2 flags= [i2c1]
[ 747.114463] bcm2835_i2c_xfer: msg(2/2) read addr=0x54, len=32 flags= [i2c1]
[ 747.117809] start_transfer: msg(1/2) write addr=0x54, len=2 flags= [i2c1]
[ 747.117825] isr: remain=2, status=0x30000055 : TA TXW TXD TXE [i2c1]
[ 747.117839] start_transfer: msg(2/2) read addr=0x54, len=32 flags= [i2c1]
[ 747.117849] isr: remain=32, status=0xd0000039 : TA RXR TXD RXD [i2c1]
[ 747.117861] isr: remain=20, status=0xd0000039 : TA RXR TXD RXD [i2c1]
[ 747.117870] isr: remain=8, status=0x32 : DONE TXD RXD [i2c1]
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
Christopher Alexander Tobias Schulze - May 2, 2015, 11:57 a.m.
This patch fixes a problem with VFP state save and restore related
to exception handling (panic with message "BUG: unsupported FP
instruction in kernel mode") present on VFP11 floating point units
(as used with ARM1176JZF-S CPUs, e.g. on first generation Raspberry
Pi boards). This patch was developed and discussed on
https://github.com/raspberrypi/linux/issues/859
A precondition to see the crashes is that floating point exception
traps are enabled. In this case, the VFP11 might determine that a FPU
operation needs to trap at a point in time when it is not possible to
signal this to the ARM11 core any more. The VFP11 will then set the
FPEXC.EX bit and store the trapped opcode in FPINST. (In some cases,
a second opcode might have been accepted by the VFP11 before the
exception was detected and could be reported to the ARM11 - in this
case, the VFP11 also sets FPEXC.FP2V and stores the second opcode in
FPINST2.)
If FPEXC.EX is set, the VFP11 will "bounce" the next FPU opcode issued
by the ARM11 CPU, which will be seen by the ARM11 as an undefined opcode
trap. The VFP support code examines the FPEXC.EX and FPEXC.FP2V bits
to decide what actions to take, i.e., whether to emulate the opcodes
found in FPINST and FPINST2, and whether to retry the bounced instruction.
If a user space application has left the VFP11 in this "pending trap"
state, the next FPU opcode issued to the VFP11 might actually be the
VSTMIA operation vfp_save_state() uses to store the FPU registers
to memory (in our test cases, when building the signal stack frame).
In this case, the kernel crashes as described above.
This patch fixes the problem by making sure that vfp_save_state() is
always entered with FPEXC.EX cleared. (The current value of FPEXC has
already been saved, so this does not corrupt the context. Clearing
FPEXC.EX has no effects on FPINST or FPINST2. Also note that many
callers already modify FPEXC by setting FPEXC.EN before invoking
vfp_save_state().)
This patch also addresses a second problem related to FPEXC.EX: After
returning from signal handling, the kernel reloads the VFP context
from the user mode stack. However, the current code explicitly clears
both FPEXC.EX and FPEXC.FP2V during reload. As VFP11 requires these
bits to be preserved, this patch disables clearing them for VFP
implementations belonging to architecture 1. There should be no
negative side effects: the user can set both bits by executing FPU
opcodes anyway, and while user code may now place arbitrary values
into FPINST and FPINST2 (e.g., non-VFP ARM opcodes) the VFP support
code knows which instructions can be emulated, and rejects other
opcodes with "unhandled bounce" messages, so there should be no
security impact from allowing reloading FPEXC.EX and FPEXC.FP2V.
Signed-off-by: Christopher Alexander Tobias Schulze <cat.schulze@alice-dsl.net>
At present there is no mechanism to specify driver load order,
which can lead to deferrals and repeated retries until successful.
Since this situation is expected, reduce the dmesg level to
INFO and mention that the operation will be retried.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
These divide off of PLLD_PER and are used for the ethernet and wifi
PHYs source PLLs. Neither of them is currently represented by a phy
device that would grab the clock for us.
This keeps other drivers from killing the networking PHYs when they
disable their own clocks and trigger PLLD_PER's refcount going to 0.
v2: Skip marking as critical if they aren't on at boot.
Signed-off-by: Eric Anholt <eric@anholt.net>
The VPU is responsible for managing the core clock, usually under
direction from the bcm2835-cpufreq driver but not via the clk-bcm2835
driver. Since the core frequency can change without warning, it is
safer to report the maximum clock rate to users of the core clock -
I2C, SPI and the mini UART - to err on the safe side when calculating
clock divisors.
If the DT node for the clock driver includes a reference to the
firmware node, use the firmware API to query the maximum core clock
instead of reading the divider registers.
Prior to this patch, a "100KHz" I2C bus was sometimes clocked at about
160KHz. In particular, switching to the 4.9 kernel was likely to break
SenseHAT usage on a Pi3.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
The claim-clocks property can be used to prevent PLLs and dividers
from being marked as critical. It contains a vector of clock IDs,
as defined by dt-bindings/clock/bcm2835.h.
Use this mechanism to claim PLLD_DSI0, PLLD_DSI1, PLLH_AUX and
PLLH_PIX for the vc4_kms_v3d driver.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
The VPU configures and relies on several PLLs and dividers. Mark all
enabled dividers and their PLLs as CRITICAL to prevent the kernel from
switching them off.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
The Raspberry Pi firmware looks at the RSTS register to know which
partition to boot from. The reboot syscall command
LINUX_REBOOT_CMD_RESTART2 supports passing in a string argument.
Add support for passing in a partition number 0..63 to boot from.
Partition 63 is a special partiton indicating halt.
If the partition doesn't exist, the firmware falls back to partition 0.
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
Load driver early since at least bcm2708_fb doesn't support deferred
probing and even if it did, we don't want the video driver deferred.
Support the legacy DMA API which is needed by bcm2708_fb.
Don't mask out channel 2.
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
The spi-bcm2835 driver automatically uses GPIO chip-selects due to
some unreliability of the native ones. In doing so it chooses the
same pins as the native chip-selects would use, but the existing
code always uses pins 7 and 8, wherever the SPI function is mapped.
Search the pinctrl group assigned to the driver for pins that
correspond to native chip-selects, and use those for GPIO chip-
selects.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
Add a duplicate irq range with an offset on the hwirq's so the
driver can detect that enable_fiq() is used.
Tested with downstream dwc_otg USB controller driver.
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Acked-by: Stephen Warren <swarren@wwwdotorg.org>
The old arch-specific IRQ macros included a dsb to ensure the
write to clear the mailbox interrupt completed before returning
from the interrupt. The BCM2836 irqchip driver needs the same
precaution to avoid spurious interrupts.
Spurious interrupts are still possible for other reasons,
though, so trap them early.
smsc95xx is adjusting truesize when it shouldn't, and following a recent patch from Eric this is now triggering warnings.
This patch stops smsc95xx from changing truesize.
Signed-off-by: Steve Glendinning <steve.glendinning@smsc.com>
commit 7810e6781e upstream.
In __alloc_pages_slowpath() we reset zonelist and preferred_zoneref for
allocations that can ignore memory policies. The zonelist is obtained
from current CPU's node. This is a problem for __GFP_THISNODE
allocations that want to allocate on a different node, e.g. because the
allocating thread has been migrated to a different CPU.
This has been observed to break SLAB in our 4.4-based kernel, because
there it relies on __GFP_THISNODE working as intended. If a slab page
is put on wrong node's list, then further list manipulations may corrupt
the list because page_to_nid() is used to determine which node's
list_lock should be locked and thus we may take a wrong lock and race.
Current SLAB implementation seems to be immune by luck thanks to commit
511e3a0588 ("mm/slab: make cache_grow() handle the page allocated on
arbitrary node") but there may be others assuming that __GFP_THISNODE
works as promised.
We can fix it by simply removing the zonelist reset completely. There
is actually no reason to reset it, because memory policies and cpusets
don't affect the zonelist choice in the first place. This was different
when commit 183f6371aa ("mm: ignore mempolicies when using
ALLOC_NO_WATERMARK") introduced the code, as mempolicies provided their
own restricted zonelists.
We might consider this for 4.17 although I don't know if there's
anything currently broken.
SLAB is currently not affected, but in kernels older than 4.7 that don't
yet have 511e3a0588 ("mm/slab: make cache_grow() handle the page
allocated on arbitrary node") it is. That's at least 4.4 LTS. Older
ones I'll have to check.
So stable backports should be more important, but will have to be
reviewed carefully, as the code went through many changes. BTW I think
that also the ac->preferred_zoneref reset is currently useless if we
don't also reset ac->nodemask from a mempolicy to NULL first (which we
probably should for the OOM victims etc?), but I would leave that for a
separate patch.
Link: http://lkml.kernel.org/r/20180525130853.13915-1-vbabka@suse.cz
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Fixes: 183f6371aa ("mm: ignore mempolicies when using ALLOC_NO_WATERMARK")
Acked-by: Mel Gorman <mgorman@techsingularity.net>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5cc41e0995 upstream.
WHen registering a new binfmt_misc handler, it is possible to overflow
the offset to get a negative value, which might crash the system, or
possibly leak kernel data.
Here is a crash log when 2500000000 was used as an offset:
BUG: unable to handle kernel paging request at ffff989cfd6edca0
IP: load_misc_binary+0x22b/0x470 [binfmt_misc]
PGD 1ef3e067 P4D 1ef3e067 PUD 0
Oops: 0000 [#1] SMP NOPTI
Modules linked in: binfmt_misc kvm_intel ppdev kvm irqbypass joydev input_leds serio_raw mac_hid parport_pc qemu_fw_cfg parpy
CPU: 0 PID: 2499 Comm: bash Not tainted 4.15.0-22-generic #24-Ubuntu
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.1-1 04/01/2014
RIP: 0010:load_misc_binary+0x22b/0x470 [binfmt_misc]
Call Trace:
search_binary_handler+0x97/0x1d0
do_execveat_common.isra.34+0x667/0x810
SyS_execve+0x31/0x40
do_syscall_64+0x73/0x130
entry_SYSCALL_64_after_hwframe+0x3d/0xa2
Use kstrtoint instead of simple_strtoul. It will work as the code
already set the delimiter byte to '\0' and we only do it when the field
is not empty.
Tested with offsets -1, 2500000000, UINT_MAX and INT_MAX. Also tested
with examples documented at Documentation/admin-guide/binfmt-misc.rst
and other registrations from packages on Ubuntu.
Link: http://lkml.kernel.org/r/20180529135648.14254-1-cascardo@canonical.com
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 670ae9caac upstream.
struct vhost_msg within struct vhost_msg_node is copied to userspace.
Unfortunately it turns out on 64 bit systems vhost_msg has padding after
type which gcc doesn't initialize, leaking 4 uninitialized bytes to
userspace.
This padding also unfortunately means 32 bit users of this interface are
broken on a 64 bit kernel which will need to be fixed separately.
Fixes: CVE-2018-1118
Cc: stable@vger.kernel.org
Reported-by: Kevin Easton <kevin@guarana.org>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reported-by: syzbot+87cfa083e727a224754b@syzkaller.appspotmail.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d471b6b22d upstream.
The HID descriptor for the 2nd-gen Intuos Pro large (PTH-860) contains
a typo which defines an incorrect logical maximum Y value. This causes
a small portion of the bottom of the tablet to become unusable (both
because the area is below the "bottom" of the tablet and because
'wacom_wac_event' ignores out-of-range values). It also results in a
skewed aspect ratio.
To fix this, we add a quirk to 'wacom_usage_mapping' which overwrites
the data with the correct value.
Signed-off-by: Jason Gerecke <jason.gerecke@wacom.com>
CC: stable@vger.kernel.org # v4.10+
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ebeaa36754 upstream.
Current ISH driver only registers suspend/resume PM callbacks which don't
support hibernation (suspend to disk). Basically after hiberation, the ISH
can't resume properly and user may not see sensor events (for example: screen
rotation may not work).
User will not see a crash or panic or anything except the following message
in log:
hid-sensor-hub 001F:8086:22D8.0001: timeout waiting for response from ISHTP device
So this patch adds support for S4/hiberbation to ISH by using the
SIMPLE_DEV_PM_OPS() MACRO instead of struct dev_pm_ops directly. The suspend
and resume functions will now be used for both suspend to RAM and hibernation.
If power management is disabled, SIMPLE_DEV_PM_OPS will do nothing, the suspend
and resume related functions won't be used, so mark them as __maybe_unused to
clarify that this is the intended behavior, and remove #ifdefs for power
management.
Cc: stable@vger.kernel.org
Signed-off-by: Even Xu <even.xu@intel.com>
Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f6a4b4c9d0 upstream.
As long as a symlink inode remains in-core, the destination (and
therefore size) will not be re-fetched from the server, as it cannot
change. The original implementation of the attribute cache assumed that
setting the expiry time in the past was sufficient to cause a re-fetch
of all attributes on the next getattr. That does not work in this case.
The bug manifested itself as follows. When the command sequence
touch foo; ln -s foo bar; ls -l bar
is run, the output was
lrwxrwxrwx. 1 fedora fedora 4906 Apr 24 19:10 bar -> foo
However, after a re-mount, ls -l bar produces
lrwxrwxrwx. 1 fedora fedora 3 Apr 24 19:10 bar -> foo
After this commit, even before a re-mount, the output is
lrwxrwxrwx. 1 fedora fedora 3 Apr 24 19:10 bar -> foo
Reported-by: Becky Ligon <ligon@clemson.edu>
Signed-off-by: Martin Brandenburg <martin@omnibond.com>
Fixes: 71680c18c8 ("orangefs: Cache getattr results.")
Cc: stable@vger.kernel.org
Cc: hubcap@omnibond.com
Signed-off-by: Mike Marshall <hubcap@omnibond.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9039d98581 upstream.
The page loading code trusts the data provided in the firmware images
a bit too much and may cause a buffer overflow or copy unknown data if
the block sizes don't match what we expect.
To prevent potential problems, harden the code by checking if the
sizes we are copying are what we expect.
Cc: stable@vger.kernel.org
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 12f47073a4 upstream.
The case that interrupt affinity setting fails with -EBUSY can be handled
in the kernel completely by using the already available generic pending
infrastructure.
If a irq_chip::set_affinity() fails with -EBUSY, handle it like the
interrupts for which irq_chip::set_affinity() can only be invoked from
interrupt context. Copy the new affinity mask to irq_desc::pending_mask and
set the affinity pending bit. The next raised interrupt for the affected
irq will check the pending bit and try to set the new affinity from the
handler. This avoids that -EBUSY is returned when an affinity change is
requested from user space and the previous change has not been cleaned
up. The new affinity will take effect when the next interrupt is raised
from the device.
Fixes: dccfe3147b ("x86/vector: Simplify vector move cleanup")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Song Liu <songliubraving@fb.com>
Cc: Joerg Roedel <jroedel@suse.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <liu.song.a23@gmail.com>
Cc: Dmitry Safonov <0x7f454c46@gmail.com>
Cc: stable@vger.kernel.org
Cc: Mike Travis <mike.travis@hpe.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Tariq Toukan <tariqt@mellanox.com>
Link: https://lkml.kernel.org/r/20180604162224.819273597@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a33a5d2d16 upstream.
The generic pending interrupt mechanism moves interrupts from the interrupt
handler on the original target CPU to the new destination CPU. This is
required for x86 and ia64 due to the way the interrupt delivery and
acknowledge works if the interrupts are not remapped.
However that update can fail for various reasons. Some of them are valid
reasons to discard the pending update, but the case, when the previous move
has not been fully cleaned up is not a legit reason to fail.
Check the return value of irq_do_set_affinity() for -EBUSY, which indicates
a pending cleanup, and rearm the pending move in the irq dexcriptor so it's
tried again when the next interrupt arrives.
Fixes: 996c591227 ("x86/irq: Plug vector cleanup race")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Song Liu <songliubraving@fb.com>
Cc: Joerg Roedel <jroedel@suse.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <liu.song.a23@gmail.com>
Cc: Dmitry Safonov <0x7f454c46@gmail.com>
Cc: stable@vger.kernel.org
Cc: Mike Travis <mike.travis@hpe.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Tariq Toukan <tariqt@mellanox.com>
Link: https://lkml.kernel.org/r/20180604162224.386544292@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c0255770cc upstream.
apic_ack_edge() is explicitely for handling interrupt affinity cleanup when
interrupt remapping is not available or disable.
Remapped interrupts and also some of the platform specific special
interrupts, e.g. UV, invoke ack_APIC_irq() directly.
To address the issue of failing an affinity update with -EBUSY the delayed
affinity mechanism can be reused, but ack_APIC_irq() does not handle
that. Adding this to ack_APIC_irq() is not possible, because that function
is also used for exceptions and directly handled interrupts like IPIs.
Create a new function, which just contains the conditional invocation of
irq_move_irq() and the final ack_APIC_irq().
Reuse the new function in apic_ack_edge().
Preparatory change for the real fix.
Fixes: dccfe3147b ("x86/vector: Simplify vector move cleanup")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Song Liu <songliubraving@fb.com>
Cc: Joerg Roedel <jroedel@suse.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <liu.song.a23@gmail.com>
Cc: Dmitry Safonov <0x7f454c46@gmail.com>
Cc: stable@vger.kernel.org
Cc: Mike Travis <mike.travis@hpe.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Tariq Toukan <tariqt@mellanox.com>
Link: https://lkml.kernel.org/r/20180604162224.471925894@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 80ae7b1a91 upstream.
Several people observed the WARN_ON() in irq_matrix_free() which triggers
when the caller tries to free an vector which is not in the allocation
range. Song provided the trace information which allowed to decode the root
cause.
The rework of the vector allocation mechanism failed to preserve a sanity
check, which prevents setting a new target vector/CPU when the previous
affinity change has not fully completed.
As a result a half finished affinity change can be overwritten, which can
cause the leak of a irq descriptor pointer on the previous target CPU and
double enqueue of the hlist head into the cleanup lists of two or more
CPUs. After one CPU cleaned up its vector the next CPU will invoke the
cleanup handler with vector 0, which triggers the out of range warning in
the matrix allocator.
Prevent this by checking the apic_data of the interrupt whether the
move_in_progress flag is false and the hlist node is not hashed. Return
-EBUSY if not.
This prevents the damage and restores the behaviour before the vector
allocation rework, but due to other changes in that area it also widens the
chance that user space can observe -EBUSY. In theory this should be fine,
but actually not all user space tools handle -EBUSY correctly. Addressing
that is not part of this fix, but will be addressed in follow up patches.
Fixes: 69cde0004a ("x86/vector: Use matrix allocator for vector assignment")
Reported-by: Dmitry Safonov <0x7f454c46@gmail.com>
Reported-by: Tariq Toukan <tariqt@mellanox.com>
Reported-by: Song Liu <liu.song.a23@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Song Liu <songliubraving@fb.com>
Cc: Joerg Roedel <jroedel@suse.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: stable@vger.kernel.org
Cc: Mike Travis <mike.travis@hpe.com>
Cc: Borislav Petkov <bp@alien8.de>
Link: https://lkml.kernel.org/r/20180604162224.303870257@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 955bc61328 upstream.
According to the API, you may only call clk_get_rate() after actually
enabling it.
Found by Linux Driver Verification project (linuxtesting.org).
Fixes: a5fd9139f7 ("w1: add 1-wire master driver for i.MX27 / i.MX31")
Signed-off-by: Stefan Potyra <Stefan.Potyra@elektrobit.com>
Acked-by: Evgeniy Polyakov <zbr@ioremap.net>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit cc1d5e749a upstream.
AER handling expects a successful return from slot_reset means the
driver made the device functional again. The nvme driver had been using
an asynchronous reset to recover the device, so the device
may still be initializing after control is returned to the
AER handler. This creates problems for subsequent event handling,
causing the initializion to fail.
This patch fixes that by syncing the controller reset before returning
to the AER driver, and reporting the true state of the reset.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=199657
Reported-by: Alex Gagniuc <mr.nuke.me@gmail.com>
Cc: Sinan Kaya <okaya@codeaurora.org>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: stable@vger.kernel.org
Tested-by: Alex Gagniuc <mr.nuke.me@gmail.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2cfce3a86b upstream.
Commit 184add2ca2 ("libata: Apply NOLPM quirk for SanDisk
SD7UB3Q*G1001 SSDs") disabled LPM for SanDisk SD7UB3Q*G1001 SSDs.
This has lead to several reports of users of that SSD where LPM
was working fine and who know have a significantly increased idle
power consumption on their laptops.
Likely there is another problem on the T450s from the original
reporter which gets exposed by the uncore reaching deeper sleep
states (higher PC-states) due to LPM being enabled. The problem as
reported, a hardfreeze about once a day, already did not sound like
it would be caused by LPM and the reports of the SSD working fine
confirm this. The original reporter is ok with dropping the quirk.
A X250 user has reported the same hard freeze problem and for him
the problem went away after unrelated updates, I suspect some GPU
driver stack changes fixed things.
TL;DR: The original reporters problem were triggered by LPM but not
an LPM issue, so drop the quirk for the SSD in question.
BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=1583207
Cc: stable@vger.kernel.org
Cc: Richard W.M. Jones <rjones@redhat.com>
Cc: Lorenzo Dalrio <lorenzo.dalrio@gmail.com>
Reported-by: Lorenzo Dalrio <lorenzo.dalrio@gmail.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: "Richard W.M. Jones" <rjones@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7592019634 upstream.
According to current code implementation, detecting the long
idle period is done by checking if the interval between two
adjacent utilization update handlers is long enough. Although
this mechanism can detect if the idle period is long enough
(no utilization hooks invoked during idle period), it might
not cover a corner case: if the task has occupied the CPU
for too long which causes no context switches during that
period, then no utilization handler will be launched until this
high prio task is scheduled out. As a result, the idle_periods
field might be calculated incorrectly because it regards the
100% load as 0% and makes the conservative governor who uses
this field confusing.
Change the detection to compare the idle_time with sampling_rate
directly.
Reported-by: Artem S. Tashkinov <t.artem@mailcity.com>
Signed-off-by: Chen Yu <yu.c.chen@intel.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Cc: All applicable <stable@vger.kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e5d295b06d upstream.
Commit 05829d9431 (cpufreq: ti-cpufreq: kfree opp_data when
failure) has fixed a memory leak in the failure path, however
the patch returned a positive value on get_cpu_device() failure
instead of the previous negative value. Fix this incorrect error
return value properly.
Fixes: 05829d9431 (cpufreq: ti-cpufreq: kfree opp_data when failure)
Cc: 4.14+ <stable@vger.kernel.org> # v4.14+
Signed-off-by: Suman Anna <s-anna@ti.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c7d1f119c4 upstream.
If the policy limits are updated via cpufreq_update_policy() and
subsequently via sysfs, the limits stored in user_policy may be
set incorrectly.
For example, if both min and max are set via sysfs to the maximum
available frequency, user_policy.min and user_policy.max will also
be the maximum. If a policy notifier triggered by
cpufreq_update_policy() lowers both the min and the max at this
point, that change is not reflected by the user_policy limits, so
if the max is updated again via sysfs to the same lower value,
then user_policy.max will be lower than user_policy.min which
shouldn't happen. In particular, if one of the policy CPUs is
then taken offline and back online, cpufreq_set_policy() will
fail for it due to a failing limits check.
To prevent that from happening, initialize the min and max fields
of the new_policy object to the ones stored in user_policy that
were previously set via sysfs.
Signed-off-by: Kevin Wangtao <kevin.wangtao@hisilicon.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
[ rjw: Subject & changelog ]
Cc: All applicable <stable@vger.kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f183464684 upstream.
From 0aa2e9b921d6db71150633ff290199554f0842a8 Mon Sep 17 00:00:00 2001
From: Tejun Heo <tj@kernel.org>
Date: Wed, 23 May 2018 10:29:00 -0700
cgwb_release() punts the actual release to cgwb_release_workfn() on
system_wq. Depending on the number of cgroups or block devices, there
can be a lot of cgwb_release_workfn() in flight at the same time.
We're periodically seeing close to 256 kworkers getting stuck with the
following stack trace and overtime the entire system gets stuck.
[<ffffffff810ee40c>] _synchronize_rcu_expedited.constprop.72+0x2fc/0x330
[<ffffffff810ee634>] synchronize_rcu_expedited+0x24/0x30
[<ffffffff811ccf23>] bdi_unregister+0x53/0x290
[<ffffffff811cd1e9>] release_bdi+0x89/0xc0
[<ffffffff811cd645>] wb_exit+0x85/0xa0
[<ffffffff811cdc84>] cgwb_release_workfn+0x54/0xb0
[<ffffffff810a68d0>] process_one_work+0x150/0x410
[<ffffffff810a71fd>] worker_thread+0x6d/0x520
[<ffffffff810ad3dc>] kthread+0x12c/0x160
[<ffffffff81969019>] ret_from_fork+0x29/0x40
[<ffffffffffffffff>] 0xffffffffffffffff
The events leading to the lockup are...
1. A lot of cgwb_release_workfn() is queued at the same time and all
system_wq kworkers are assigned to execute them.
2. They all end up calling synchronize_rcu_expedited(). One of them
wins and tries to perform the expedited synchronization.
3. However, that invovles queueing rcu_exp_work to system_wq and
waiting for it. Because #1 is holding all available kworkers on
system_wq, rcu_exp_work can't be executed. cgwb_release_workfn()
is waiting for synchronize_rcu_expedited() which in turn is waiting
for cgwb_release_workfn() to free up some of the kworkers.
We shouldn't be scheduling hundreds of cgwb_release_workfn() at the
same time. There's nothing to be gained from that. This patch
updates cgwb release path to use a dedicated percpu workqueue with
@max_active of 1.
While this resolves the problem at hand, it might be a good idea to
isolate rcu_exp_work to its own workqueue too as it can be used from
various paths and is prone to this sort of indirect A-A deadlocks.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: stable@vger.kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a347c7ad8e upstream.
It is not allowed to reinit q->tag_set_list list entry while RCU grace
period has not completed yet, otherwise the following soft lockup in
blk_mq_sched_restart() happens:
[ 1064.252652] watchdog: BUG: soft lockup - CPU#12 stuck for 23s! [fio:9270]
[ 1064.254445] task: ffff99b912e8b900 task.stack: ffffa6d54c758000
[ 1064.254613] RIP: 0010:blk_mq_sched_restart+0x96/0x150
[ 1064.256510] Call Trace:
[ 1064.256664] <IRQ>
[ 1064.256824] blk_mq_free_request+0xea/0x100
[ 1064.256987] msg_io_conf+0x59/0xd0 [ibnbd_client]
[ 1064.257175] complete_rdma_req+0xf2/0x230 [ibtrs_client]
[ 1064.257340] ? ibtrs_post_recv_empty+0x4d/0x70 [ibtrs_core]
[ 1064.257502] ibtrs_clt_rdma_done+0xd1/0x1e0 [ibtrs_client]
[ 1064.257669] ib_create_qp+0x321/0x380 [ib_core]
[ 1064.257841] ib_process_cq_direct+0xbd/0x120 [ib_core]
[ 1064.258007] irq_poll_softirq+0xb7/0xe0
[ 1064.258165] __do_softirq+0x106/0x2a2
[ 1064.258328] irq_exit+0x92/0xa0
[ 1064.258509] do_IRQ+0x4a/0xd0
[ 1064.258660] common_interrupt+0x7a/0x7a
[ 1064.258818] </IRQ>
Meanwhile another context frees other queue but with the same set of
shared tags:
[ 1288.201183] INFO: task bash:5910 blocked for more than 180 seconds.
[ 1288.201833] bash D 0 5910 5820 0x00000000
[ 1288.202016] Call Trace:
[ 1288.202315] schedule+0x32/0x80
[ 1288.202462] schedule_timeout+0x1e5/0x380
[ 1288.203838] wait_for_completion+0xb0/0x120
[ 1288.204137] __wait_rcu_gp+0x125/0x160
[ 1288.204287] synchronize_sched+0x6e/0x80
[ 1288.204770] blk_mq_free_queue+0x74/0xe0
[ 1288.204922] blk_cleanup_queue+0xc7/0x110
[ 1288.205073] ibnbd_clt_unmap_device+0x1bc/0x280 [ibnbd_client]
[ 1288.205389] ibnbd_clt_unmap_dev_store+0x169/0x1f0 [ibnbd_client]
[ 1288.205548] kernfs_fop_write+0x109/0x180
[ 1288.206328] vfs_write+0xb3/0x1a0
[ 1288.206476] SyS_write+0x52/0xc0
[ 1288.206624] do_syscall_64+0x68/0x1d0
[ 1288.206774] entry_SYSCALL_64_after_hwframe+0x3d/0xa2
What happened is the following:
1. There are several MQ queues with shared tags.
2. One queue is about to be freed and now task is in
blk_mq_del_queue_tag_set().
3. Other CPU is in blk_mq_sched_restart() and loops over all queues in
tag list in order to find hctx to restart.
Because linked list entry was modified in blk_mq_del_queue_tag_set()
without proper waiting for a grace period, blk_mq_sched_restart()
never ends, spining in list_for_each_entry_rcu_rr(), thus soft lockup.
Fix is simple: reinit list entry after an RCU grace period elapsed.
Fixes: Fixes: 705cda97ee ("blk-mq: Make it safe to use RCU to iterate over blk_mq_tag_set.tag_list")
Cc: stable@vger.kernel.org
Cc: Sagi Grimberg <sagi@grimberg.me>
Cc: linux-block@vger.kernel.org
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Bart Van Assche <bart.vanassche@wdc.com>
Signed-off-by: Roman Pen <roman.penyaev@profitbricks.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9e2b19675d upstream.
When we stopped relying on the bdev everywhere I broke updating the
block device size on the fly, which ceph relies on. We can't just do
set_capacity, we also have to do bd_set_size so things like parted will
notice the device size change.
Fixes: 29eaadc ("nbd: stop using the bdev everywhere")
cc: stable@vger.kernel.org
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c3f7c93976 upstream.
I messed up changing the size of an NBD device while it was connected by
not actually updating the device or doing the uevent. Fix this by
updating everything if we're connected and we change the size.
cc: stable@vger.kernel.org
Fixes: 639812a ("nbd: don't set the device size until we're connected")
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8364da4751 upstream.
This fixes a use after free bug, we shouldn't be doing disk->queue right
after we do del_gendisk(disk). Save the queue and do the cleanup after
the del_gendisk.
Fixes: c6a4759ea0 ("nbd: add device refcounting")
cc: stable@vger.kernel.org
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b2adf22fdf upstream.
The server detects reconnect by the (non-zero) value in PreviousSessionId
of SMB2/SMB3 SessionSetup request, but this behavior regressed due
to commit 166cea4dc3
("SMB2: Separate RawNTLMSSP authentication from SMB2_sess_setup")
CC: Stable <stable@vger.kernel.org>
CC: Sachin Prabhu <sprabhu@redhat.com>
Signed-off-by: Steve French <smfrench@gmail.com>
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 985c78d3ff upstream.
Each of the strings that we want to put into the buf[MAX_FLAG_OPT_SIZE]
in flags_read() is two characters long. But the sprintf() adds
a trailing newline and will add a terminating NUL byte. So
MAX_FLAG_OPT_SIZE needs to be 4.
sprintf() calls vsnprintf() and *that* does return:
" * The return value is the number of characters which would
* be generated for the given input, excluding the trailing
* '\0', as per ISO C99."
Note the "excluding".
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/20180427163707.ktaiysvbk3yhk4wm@agluck-desk
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a3aa60d511 upstream.
When 'kzalloc()' fails in 'snd_hda_attach_pcm_stream()', a new pcm instance is
created without setting its operators via 'snd_pcm_set_ops()'. Following
operations on the new pcm instance can trigger kernel null pointer dereferences
and cause kernel oops.
This bug was found with my work on building a gray-box fault-injection tool for
linux-kernel-module binaries. A kernel null pointer dereference was confirmed
from line 'substream->ops->open()' in function 'snd_pcm_open_substream()' in
file 'sound/core/pcm_native.c'.
This patch fixes the bug by calling 'snd_device_free()' in the error handling
path of 'kzalloc()', which removes the new pcm instance from the snd card before
returns with an error code.
Signed-off-by: Bo Chen <chenbo@pdx.edu>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 986376b68d upstream.
We have several Lenovo AIOs like M810z, M820z and M920z, they have
the same design for mic-mute hotkey and led and they use the same
codec with the same pin configuration, so use the pin conf table to
apply fix to all of them.
Fixes: 29693efcea ("ALSA: hda - Fix micmute hotkey problem for a lenovo AIO machine")
Cc: <stable@vger.kernel.org>
Signed-off-by: Hui Wang <hui.wang@canonical.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5ebf6b1e45 upstream.
The commit 33193dca67 ("ALSA: usb-audio: Add a quirk for Nura's
first gen headset") added a quirk for Nura headset with USB ID
0a12:1243, with a hope that it doesn't conflict with others.
Unfortunately, other devices (e.g. Philips Wecall) with the very same
ID got broken by this change, spewing an error like:
usb 2-1.8.2: 2:1: cannot set freq 48000 to ep 0x3
Until we find a proper solution, fix the regression at first by
disabling the added quirk entry.
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=199905
Fixes: 33193dca67 ("ALSA: usb-audio: Add a quirk for Nura's first gen headset")
Reviewed-by: Martin Peres <martin.peres@free.fr>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ac0b4145d6 upstream.
[BUG]
Btrfs can create compressed extent without checksum (even though it
shouldn't), and if we then try to replace device containing such extent,
the result device will contain all the uncompressed data instead of the
compressed one.
Test case already submitted to fstests:
https://patchwork.kernel.org/patch/10442353/
[CAUSE]
When handling compressed extent without checksum, device replace will
goe into copy_nocow_pages() function.
In that function, btrfs will get all inodes referring to this data
extents and then use find_or_create_page() to get pages direct from that
inode.
The problem here is, pages directly from inode are always uncompressed.
And for compressed data extent, they mismatch with on-disk data.
Thus this leads to corrupted compressed data extent written to replace
device.
[FIX]
In this attempt, we could just remove the "optimization" branch, and let
unified scrub_pages() to handle it.
Although scrub_pages() won't bother reusing page cache, it will be a
little slower, but it does the correct csum checking and won't cause
such data corruption caused by "optimization".
Note about the fix: this is the minimal fix that can be backported to
older stable trees without conflicts. The whole callchain from
copy_nocow_pages() can be deleted, and will be in followup patches.
Fixes: ff023aac31 ("Btrfs: add code to scrub to copy read data to another disk")
CC: stable@vger.kernel.org # 4.4+
Reported-by: James Harvey <jamespharvey20@gmail.com>
Reviewed-by: James Harvey <jamespharvey20@gmail.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
[ remove code removal, add note why ]
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 090a127afa upstream.
In cow_file_range(), create_io_em() may fail, but its return value is
not recorded. Then return value may be 0 even it failed which is a
wrong behavior.
Let cow_file_range() return PTR_ERR(em) if create_io_em() failed.
Fixes: 6f9994dbab ("Btrfs: create a helper to create em for IO")
CC: stable@vger.kernel.org # 4.11+
Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit fd4e994bd1 upstream.
If we have invalid flags set, when we error out we must drop our writer
counter and free the buffer we allocated for the arguments. This bug is
trivially reproduced with the following program on 4.7+:
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <linux/btrfs.h>
#include <linux/btrfs_tree.h>
int main(int argc, char **argv)
{
struct btrfs_ioctl_vol_args_v2 vol_args = {
.flags = UINT64_MAX,
};
int ret;
int fd;
if (argc != 2) {
fprintf(stderr, "usage: %s PATH\n", argv[0]);
return EXIT_FAILURE;
}
fd = open(argv[1], O_WRONLY);
if (fd == -1) {
perror("open");
return EXIT_FAILURE;
}
ret = ioctl(fd, BTRFS_IOC_RM_DEV_V2, &vol_args);
if (ret == -1)
perror("ioctl");
close(fd);
return EXIT_SUCCESS;
}
When unmounting the filesystem, we'll hit the
WARN_ON(mnt_get_writers(mnt)) in cleanup_mnt() and also may prevent the
filesystem to be remounted read-only as the writer count will stay
lifted.
Fixes: 6b526ed70c ("btrfs: introduce device delete by devid")
CC: stable@vger.kernel.org # 4.9+
Signed-off-by: Omar Sandoval <osandov@fb.com>
Reviewed-by: Su Yue <suy.fnst@cn.fujitsu.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b5c40d598f upstream.
In btrfs_clone_files(), we must check the NODATASUM flag while the
inodes are locked. Otherwise, it's possible that btrfs_ioctl_setflags()
will change the flags after we check and we can end up with a party
checksummed file.
The race window is only a few instructions in size, between the if and
the locks which is:
3834 if (S_ISDIR(src->i_mode) || S_ISDIR(inode->i_mode))
3835 return -EISDIR;
where the setflags must be run and toggle the NODATASUM flag (provided
the file size is 0). The clone will block on the inode lock, segflags
takes the inode lock, changes flags, releases log and clone continues.
Not impossible but still needs a lot of bad luck to hit unintentionally.
Fixes: 0e7b824c4e ("Btrfs: don't make a file partly checksummed through file clone")
CC: stable@vger.kernel.org # 4.4+
Signed-off-by: Omar Sandoval <osandov@fb.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
[ update changelog ]
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4f2f76f751 upstream.
ext4_resize_fs() has an off-by-one bug when checking whether growing of
a filesystem will not overflow inode count. As a result it allows a
filesystem with 8192 inodes per group to grow to 64TB which overflows
inode count to 0 and makes filesystem unusable. Fix it.
Cc: stable@vger.kernel.org
Fixes: 3f8a6411fb
Reported-by: Jaco Kroon <jaco@uls.co.za>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Andreas Dilger <adilger@dilger.ca>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8a2b307c21 upstream.
Ext4 will always create ext4 extended attributes which do not have a
value (where e_value_size is zero) with e_value_offs set to zero. In
most places e_value_offs will not be used in a substantive way if
e_value_size is zero.
There was one exception to this, which is in ext4_xattr_set_entry(),
where if there is a maliciously crafted file system where there is an
extended attribute with e_value_offs is non-zero and e_value_size is
0, the attempt to remove this xattr will result in a negative value
getting passed to memmove, leading to the following sadness:
[ 41.225365] EXT4-fs (loop0): mounted filesystem with ordered data mode. Opts: (null)
[ 44.538641] BUG: unable to handle kernel paging request at ffff9ec9a3000000
[ 44.538733] IP: __memmove+0x81/0x1a0
[ 44.538755] PGD 1249bd067 P4D 1249bd067 PUD 1249c1067 PMD 80000001230000e1
[ 44.538793] Oops: 0003 [#1] SMP PTI
[ 44.539074] CPU: 0 PID: 1470 Comm: poc Not tainted 4.16.0-rc1+ #1
...
[ 44.539475] Call Trace:
[ 44.539832] ext4_xattr_set_entry+0x9e7/0xf80
...
[ 44.539972] ext4_xattr_block_set+0x212/0xea0
...
[ 44.540041] ext4_xattr_set_handle+0x514/0x610
[ 44.540065] ext4_xattr_set+0x7f/0x120
[ 44.540090] __vfs_removexattr+0x4d/0x60
[ 44.540112] vfs_removexattr+0x75/0xe0
[ 44.540132] removexattr+0x4d/0x80
...
[ 44.540279] path_removexattr+0x91/0xb0
[ 44.540300] SyS_removexattr+0xf/0x20
[ 44.540322] do_syscall_64+0x71/0x120
[ 44.540344] entry_SYSCALL_64_after_hwframe+0x21/0x86
https://bugzilla.kernel.org/show_bug.cgi?id=199347
This addresses CVE-2018-10840.
Reported-by: "Xu, Wen" <wen.xu@gatech.edu>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Andreas Dilger <adilger@dilger.ca>
Cc: stable@kernel.org
Fixes: dec214d00e ("ext4: xattr inode deduplication")
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit eb9b5f01c3 upstream.
If ext4_find_inline_data_nolock() returns an error it needs to get
reflected up to ext4_iget(). In order to fix this,
ext4_iget_extra_inode() needs to return an error (and not return
void).
This is related to "ext4: do not allow external inodes for inline
data" (which fixes CVE-2018-11412) in that in the errors=continue
case, it would be useful to for userspace to receive an error
indicating that file system is corrupted.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Andreas Dilger <adilger@dilger.ca>
Cc: stable@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 117166efb1 upstream.
The inline data feature was implemented before we added support for
external inodes for xattrs. It makes no sense to support that
combination, but the problem is that there are a number of extended
attribute checks that are skipped if e_value_inum is non-zero.
Unfortunately, the inline data code is completely e_value_inum
unaware, and attempts to interpret the xattr fields as if it were an
inline xattr --- at which point, Hilarty Ensues.
This addresses CVE-2018-11412.
https://bugzilla.kernel.org/show_bug.cgi?id=199803
Reported-by: Jann Horn <jannh@google.com>
Reviewed-by: Andreas Dilger <adilger@dilger.ca>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Fixes: e50e5129f3 ("ext4: xattr-in-inode support")
Cc: stable@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit eee597ac93 upstream.
Currently in ext4_punch_hole we're going to skip the mtime update if
there are no actual blocks to release. However we've actually modified
the file by zeroing the partial block so the mtime should be updated.
Moreover the sync and datasync handling is skipped as well, which is
also wrong. Fix it.
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reported-by: Joe Habermann <joe.habermann@quantum.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2ee3ee06a8 upstream.
When ext4_ind_map_blocks() computes a length of a hole, it doesn't count
with the fact that mapped offset may be somewhere in the middle of the
completely empty subtree. In such case it will return too large length
of the hole which then results in lseek(SEEK_DATA) to end up returning
an incorrect offset beyond the end of the hole.
Fix the problem by correctly taking offset within a subtree into account
when computing a length of a hole.
Fixes: facab4d971
CC: stable@vger.kernel.org
Reported-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit a447da7d00 ]
syzkaller managed to trigger a use-after-free in tls like the
following:
BUG: KASAN: use-after-free in tls_push_record.constprop.15+0x6a2/0x810 [tls]
Write of size 1 at addr ffff88037aa08000 by task a.out/2317
CPU: 3 PID: 2317 Comm: a.out Not tainted 4.17.0+ #144
Hardware name: LENOVO 20FBCTO1WW/20FBCTO1WW, BIOS N1FET47W (1.21 ) 11/28/2016
Call Trace:
dump_stack+0x71/0xab
print_address_description+0x6a/0x280
kasan_report+0x258/0x380
? tls_push_record.constprop.15+0x6a2/0x810 [tls]
tls_push_record.constprop.15+0x6a2/0x810 [tls]
tls_sw_push_pending_record+0x2e/0x40 [tls]
tls_sk_proto_close+0x3fe/0x710 [tls]
? tcp_check_oom+0x4c0/0x4c0
? tls_write_space+0x260/0x260 [tls]
? kmem_cache_free+0x88/0x1f0
inet_release+0xd6/0x1b0
__sock_release+0xc0/0x240
sock_close+0x11/0x20
__fput+0x22d/0x660
task_work_run+0x114/0x1a0
do_exit+0x71a/0x2780
? mm_update_next_owner+0x650/0x650
? handle_mm_fault+0x2f5/0x5f0
? __do_page_fault+0x44f/0xa50
? mm_fault_error+0x2d0/0x2d0
do_group_exit+0xde/0x300
__x64_sys_exit_group+0x3a/0x50
do_syscall_64+0x9a/0x300
? page_fault+0x8/0x30
entry_SYSCALL_64_after_hwframe+0x44/0xa9
This happened through fault injection where aead_req allocation in
tls_do_encryption() eventually failed and we returned -ENOMEM from
the function. Turns out that the use-after-free is triggered from
tls_sw_sendmsg() in the second tls_push_record(). The error then
triggers a jump to waiting for memory in sk_stream_wait_memory()
resp. returning immediately in case of MSG_DONTWAIT. What follows is
the trim_both_sgl(sk, orig_size), which drops elements from the sg
list added via tls_sw_sendmsg(). Now the use-after-free gets triggered
when the socket is being closed, where tls_sk_proto_close() callback
is invoked. The tls_complete_pending_work() will figure that there's
a pending closed tls record to be flushed and thus calls into the
tls_push_pending_closed_record() from there. ctx->push_pending_record()
is called from the latter, which is the tls_sw_push_pending_record()
from sw path. This again calls into tls_push_record(). And here the
tls_fill_prepend() will panic since the buffer address has been freed
earlier via trim_both_sgl(). One way to fix it is to move the aead
request allocation out of tls_do_encryption() early into tls_push_record().
This means we don't prep the tls header and advance state to the
TLS_PENDING_CLOSED_RECORD before allocation which could potentially
fail happened. That fixes the issue on my side.
Fixes: 3c4d755915 ("tls: kernel TLS support")
Reported-by: syzbot+5c74af81c547738e1684@syzkaller.appspotmail.com
Reported-by: syzbot+709f2810a6a05f11d4d3@syzkaller.appspotmail.com
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Dave Watson <davejwatson@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 52acf73b6e ]
Recently people reported the NIC stops working after
"ifdown eth0; ifup eth0". It turns out in this case the TX queues are not
enabled, after the refactoring of the common detach logic: when the NIC
has sub-channels, usually we enable all the TX queues after all
sub-channels are set up: see rndis_set_subchannel() ->
netif_device_attach(), but in the case of "ifdown eth0; ifup eth0" where
the number of channels doesn't change, we also must make sure the TX queues
are enabled. The patch fixes the regression.
Fixes: 7b2ee50c0c ("hv_netvsc: common detach logic")
Signed-off-by: Dexuan Cui <decui@microsoft.com>
Cc: Stephen Hemminger <sthemmin@microsoft.com>
Cc: K. Y. Srinivasan <kys@microsoft.com>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit fd3a886258 ]
Tun, tap, virtio, packet and uml vector all use struct virtio_net_hdr
to communicate packet metadata to userspace.
For skbuffs with vlan, the first two return the packet as it may have
existed on the wire, inserting the VLAN tag in the user buffer. Then
virtio_net_hdr.csum_start needs to be adjusted by VLAN_HLEN bytes.
Commit f09e2249c4 ("macvtap: restore vlan header on user read")
added this feature to macvtap. Commit 3ce9b20f19 ("macvtap: Fix
csum_start when VLAN tags are present") then fixed up csum_start.
Virtio, packet and uml do not insert the vlan header in the user
buffer.
When introducing virtio_net_hdr_from_skb to deduplicate filling in
the virtio_net_hdr, the variant from macvtap which adds VLAN_HLEN was
applied uniformly, breaking csum offset for packets with vlan on
virtio and packet.
Make insertion of VLAN_HLEN optional. Convert the callers to pass it
when needed.
Fixes: e858fae2b0 ("virtio_net: use common code for virtio_net_hdr and skb GSO conversion")
Fixes: 1276f24eee ("packet: use common code for virtio_net_hdr and skb GSO conversion")
Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 6c206b2009 ]
After commit 6b229cf77d ("udp: add batching to udp_rmem_release()")
the sk_rmem_alloc field does not measure exactly anymore the
receive queue length, because we batch the rmem release. The issue
is really apparent only after commit 0d4a6608f6 ("udp: do rmem bulk
free even if the rx sk queue is empty"): the user space can easily
check for an empty socket with not-0 queue length reported by the 'ss'
tool or the procfs interface.
We need to use a custom UDP helper to report the correct queue length,
taking into account the forward allocation deficit.
Reported-by: trevor.francis@46labs.com
Fixes: 6b229cf77d ("UDP: add batching to udp_rmem_release()")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 6d8c50dcb0 ]
fchownat() doesn't even hold refcnt of fd until it figures out
fd is really needed (otherwise is ignored) and releases it after
it resolves the path. This means sock_close() could race with
sockfs_setattr(), which leads to a NULL pointer dereference
since typically we set sock->sk to NULL in ->release().
As pointed out by Al, this is unique to sockfs. So we can fix this
in socket layer by acquiring inode_lock in sock_close() and
checking against NULL in sockfs_setattr().
sock_release() is called in many places, only the sock_close()
path matters here. And fortunately, this should not affect normal
sock_close() as it is only called when the last fd refcnt is gone.
It only affects sock_close() with a parallel sockfs_setattr() in
progress, which is not common.
Fixes: 86741ec254 ("net: core: Add a UID field to struct sock.")
Reported-by: shankarapailoor <shankarapailoor@gmail.com>
Cc: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Cc: Lorenzo Colitti <lorenzo@google.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 4fd44a98ff ]
commit 079096f103 ("tcp/dccp: install syn_recv requests into ehash
table") introduced an optimization for the handling of child sockets
created for a new TCP connection.
But this optimization passes any data associated with the last ACK of the
connection handshake up the stack without verifying its checksum, because it
calls tcp_child_process(), which in turn calls tcp_rcv_state_process()
directly. These lower-level processing functions do not do any checksum
verification.
Insert a tcp_checksum_complete call in the TCP_NEW_SYN_RECEIVE path to
fix this.
Fixes: 079096f103 ("tcp/dccp: install syn_recv requests into ehash table")
Signed-off-by: Frank van der Linden <fllinden@amazon.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Tested-by: Balbir Singh <bsingharora@gmail.com>
Reviewed-by: Balbir Singh <bsingharora@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 8d499533e0 ]
use nla_strlcpy() to avoid copying data beyond the length of TCA_DEF_DATA
netlink attribute, in case it is less than SIMP_MAX_DATA and it does not
end with '\0' character.
v2: fix errors in the commit message, thanks Hangbin Liu
Fixes: fa1b1cff3d ("net_cls_act: Make act_simple use of netlink policy.")
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit b718e8c8f4 ]
DP83620 register set is compatible with the DP83848, but it also supports
100base-FX. When the hardware is configured such as that fiber mode is
enabled, autonegotiation is not possible.
The chip, however, doesn't expose this information via BMSR_ANEGCAPABLE.
Instead, this bit is always set high, even if the particular hardware
configuration makes it so that auto negotiation is not possible [1]. Under
these circumstances, the phy subsystem keeps trying for autonegotiation to
happen, without success.
Hereby, we inspect BMCR_ANENABLE bit after genphy_config_init, which on
reset is set to 0 when auto negotiation is disabled, and so we use this
value instead of BMSR_ANEGCAPABLE.
[1] https://e2e.ti.com/support/interface/ethernet/f/903/p/697165/2571170
Signed-off-by: Alvaro Gamez Machado <alvaro.gamez@hazent.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 349b71d6f4 ]
When pskb_trim_rcsum fails, the lack of error-handling code may
cause unexpected results.
This patch adds error-handling code after calling pskb_trim_rcsum.
Signed-off-by: Zhouyang Jia <jiazhouyang09@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 0975764684 ]
IPVS setups with local client and remote tunnel server need
to create exception for the local virtual IP. What we do is to
change PMTU from 64KB (on "lo") to 1460 in the common case.
Suggested-by: Martin KaFai Lau <kafai@fb.com>
Fixes: 45e4fd2668 ("ipv6: Only create RTF_CACHE routes after encountering pmtu exception")
Fixes: 7343ff31eb ("ipv6: Don't create clones of host routes.")
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Acked-by: David Ahern <dsahern@gmail.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit eb55bbf865 ]
There is a timing issue under active-standy mode, when bond_enslave() is
called, bond->params.primary might not be initialized yet.
Any time the primary slave string changes, bond->force_primary should be
set to true to make sure the primary becomes the active slave.
Signed-off-by: Xiangning Yu <yuxiangning@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 92d44a42af ]
Commit 7771c66457 ("signal/arm: Document conflicts with SI_USER and
SIGFPE") broke the siginfo structure for userspace triggered signals,
causing the strace testsuite to regress. Fix this by eliminating
the FPE_FIXME definition (which is at the root of the breakage) and
use FPE_FLTINV instead for the case where the hardware appears to be
reporting nonsense.
Fixes: 7771c66457 ("signal/arm: Document conflicts with SI_USER and SIGFPE")
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 6cea14f554 ]
You can build a kernel in a cross compiling environment that doesn't
have perl in the $PATH. Commit 429f7a062e broke that for 32 bit
ARM. Fix it.
As reported by Stephen Rothwell, it appears that the symbols can be
either part of the BSS section or absolute symbols depending on the
binutils version. When they're an absolute symbol, the $(( ))
operator errors out and the build fails. Fix this as well.
Fixes: 429f7a062e ("ARM: decompressor: fix BSS size calculation")
Reported-by: Rob Landley <rob@landley.net>
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Acked-by: Rob Landley <rob@landley.net>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 2d7b3c6443 ]
When a panic() occurs, the kexec code uses smp_send_stop() to stop
the other CPUs, but this results in the CPU register state not being
saved, and gdb is unable to inspect the state of other CPUs.
Commit 0ee59413c9 ("x86/panic: replace smp_send_stop() with kdump
friendly version in panic path") addressed the issue on x86, but
ignored other architectures. Address the issue on ARM by splitting
out the crash stop implementation to crash_smp_send_stop() and
adding the necessary protection.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit e07e3c33b9 ]
In commit 639da5ee37 ("ARM: add an extra temp register to the low
level debugging addruart macro") an additional temporary register was
added to the addruart macro, but the decompressor code wasn't updated.
Fixes: 639da5ee37 ("ARM: add an extra temp register to the low level debugging addruart macro")
Signed-off-by: Łukasz Stelmach <l.stelmach@samsung.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 4f74d72aa7 ]
When CONFIG_RANDOMIZE_TEXT_OFFSET=y, TEXT_OFFSET is an arbitrary
multiple of PAGE_SIZE in the interval [0, 2MB).
The EFI stub does not account for the potential misalignment of
TEXT_OFFSET relative to EFI_KIMG_ALIGN, and produces a randomized
physical offset which is always a round multiple of EFI_KIMG_ALIGN.
This may result in statically allocated objects whose alignment exceeds
PAGE_SIZE to appear misaligned in memory. This has been observed to
result in spurious stack overflow reports and failure to make use of
the IRQ stacks, and theoretically could result in a number of other
issues.
We can OR in the low bits of TEXT_OFFSET to ensure that we have the
necessary offset (and hence preserve the misalignment of TEXT_OFFSET
relative to EFI_KIMG_ALIGN), so let's do that.
Reported-by: Kim Phillips <kim.phillips@arm.com>
Tested-by: Kim Phillips <kim.phillips@arm.com>
[ardb: clarify comment and commit log, drop unneeded parens]
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-efi@vger.kernel.org
Fixes: 6f26b36711 ("arm64: kaslr: increase randomization granularity")
Link: http://lkml.kernel.org/r/20180518140841.9731-2-ard.biesheuvel@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 54940fa60a ]
If DELL_WMI "select"s DELL_SMBIOS, the DELL_SMBIOS dependencies are
ignored and it is still possible to end up with unmet direct
dependencies.
Change the select to a depends on.
Tested-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Darren Hart (VMware) <dvhart@infradead.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 01f56832cf ]
No other architecture has setup_profiling_timer() in the init section,
thus on parisc we face this section mismatch warning:
Reference from the function devm_device_add_group() to the function .init.text:setup_profiling_timer()
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit f6a3463063 ]
In the following commit:
6b55c9654f ("sched/debug: Move print_cfs_rq() declaration to kernel/sched/sched.h")
the print_cfs_rq() prototype was added to <kernel/sched/sched.h>,
right next to the prototypes for print_cfs_stats(), print_rt_stats()
and print_dl_stats().
Finish this previous commit and also move related prototypes for
print_rt_rq() and print_dl_rq().
Remove existing extern declarations now that they not needed anymore.
Silences the following GCC warning, triggered by W=1:
kernel/sched/debug.c:573:6: warning: no previous prototype for ‘print_rt_rq’ [-Wmissing-prototypes]
kernel/sched/debug.c:603:6: warning: no previous prototype for ‘print_dl_rq’ [-Wmissing-prototypes]
Signed-off-by: Mathieu Malaterre <malat@debian.org>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20180516195348.30426-1-malat@debian.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 2b6207291b ]
There is a comment here which says that DIV_ROUND_UP() and that's where
the problem comes from. Say you pick:
args->bpp = UINT_MAX - 7;
args->width = 4;
args->height = 1;
The integer overflow in DIV_ROUND_UP() means "cpp" is UINT_MAX / 8 and
because of how we picked args->width that means cpp < UINT_MAX / 4.
I've fixed it by preventing the integer overflow in DIV_ROUND_UP(). I
removed the check for !cpp because it's not possible after this change.
I also changed all the 0xffffffffU references to U32_MAX.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20180516140026.GA19340@mwanda
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 5a817641f6 ]
The filesystem freezing code needs to transfer ownership of a rwsem
embedded in a percpu-rwsem from the task that does the freezing to
another one that does the thawing by calling percpu_rwsem_release()
after freezing and percpu_rwsem_acquire() before thawing.
However, the new rwsem debug code runs afoul with this scheme by warning
that the task that releases the rwsem isn't the one that acquires it,
as reported by Amir Goldstein:
DEBUG_LOCKS_WARN_ON(sem->owner != get_current())
WARNING: CPU: 1 PID: 1401 at /home/amir/build/src/linux/kernel/locking/rwsem.c:133 up_write+0x59/0x79
Call Trace:
percpu_up_write+0x1f/0x28
thaw_super_locked+0xdf/0x120
do_vfs_ioctl+0x270/0x5f1
ksys_ioctl+0x52/0x71
__x64_sys_ioctl+0x16/0x19
do_syscall_64+0x5d/0x167
entry_SYSCALL_64_after_hwframe+0x49/0xbe
To work properly with the rwsem debug code, we need to annotate that the
rwsem ownership is unknown during the tranfer period until a brave soul
comes forward to acquire the ownership. During that period, optimistic
spinning will be disabled.
Reported-by: Amir Goldstein <amir73il@gmail.com>
Tested-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Waiman Long <longman@redhat.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Jan Kara <jack@suse.cz>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Theodore Y. Ts'o <tytso@mit.edu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will.deacon@arm.com>
Cc: linux-fsdevel@vger.kernel.org
Link: http://lkml.kernel.org/r/1526420991-21213-3-git-send-email-longman@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit d7d760efad ]
There are use cases where a rwsem can be acquired by one task, but
released by another task. In thess cases, optimistic spinning may need
to be disabled. One example will be the filesystem freeze/thaw code
where the task that freezes the filesystem will acquire a write lock
on a rwsem and then un-owns it before returning to userspace. Later on,
another task will come along, acquire the ownership, thaw the filesystem
and release the rwsem.
Bit 0 of the owner field was used to designate that it is a reader
owned rwsem. It is now repurposed to mean that the owner of the rwsem
is not known. If only bit 0 is set, the rwsem is reader owned. If bit
0 and other bits are set, it is writer owned with an unknown owner.
One such value for the latter case is (-1L). So we can set owner to 1 for
reader-owned, -1 for writer-owned. The owner is unknown in both cases.
To handle transfer of rwsem ownership, the higher level code should
set the owner field to -1 to indicate a write-locked rwsem with unknown
owner. Optimistic spinning will be disabled in this case.
Once the higher level code figures who the new owner is, it can then
set the owner field accordingly.
Tested-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Waiman Long <longman@redhat.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Jan Kara <jack@suse.cz>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Theodore Y. Ts'o <tytso@mit.edu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will.deacon@arm.com>
Cc: linux-fsdevel@vger.kernel.org
Link: http://lkml.kernel.org/r/1526420991-21213-2-git-send-email-longman@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 2e5be528ab ]
On i.MX6 ULL using PLL3 seems to cause a freeze when setting
the parent to IMX6UL_CLK_PLL3_USB_OTG. This only seems to appear
since commit 6f9575e556 ("clk: imx: Add CLK_IS_CRITICAL flag
for busy divider and busy mux"), probably because the clock is
now forced to be on.
Fixes: 6f9575e55632("clk: imx: Add CLK_IS_CRITICAL flag for busy divider and busy mux")
Signed-off-by: Stefan Agner <stefan@agner.ch>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 9f825e74d7 ]
The __DIVIDE() macro checks whether it is called with a 32-bit or 64-bit
dividend, to select the appropriate divide-and-round-up routine.
As the check uses the ternary operator, the result will always be
promoted to a type that can hold both results, i.e. unsigned long long.
When using this result in a division on a 32-bit system, this may lead
to link errors like:
ERROR: "__udivdi3" [drivers/mtd/nand/raw/nand.ko] undefined!
Fix this by casting the result of the division to the type of the
dividend.
Fixes: 8878b126df ("mtd: nand: add ->exec_op() implementation")
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit bb7298a7e8 ]
VPIF capture driver expects card name to be set since it
uses it without checking for NULL. The commit which
introduced VPIF display and capture support added card
name only for display, not for capture.
Set it in platform data to probe driver successfully.
While at it, also fix the display card name to something more
appropriate.
Fixes: 85609c1ccd ("DaVinci: DM646x - platform changes for vpif capture and display drivers")
Signed-off-by: Sekhar Nori <nsekhar@ti.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 7d46899d57 ]
commit a16cb91ad9 ("[media] media: vpif: use a configurable
i2c_adapter_id for vpif display") removed hardcoded I2C adaptor
setting in VPIF driver, but missed updating platform data passed
from DM646x board.
Fix it.
Fixes: a16cb91ad9 ("[media] media: vpif: use a configurable i2c_adapter_id for vpif display")
Signed-off-by: Sekhar Nori <nsekhar@ti.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 73d4337ed9 ]
commit b38434145b ("ARM: davinci: irqs: Correct McASP1 TX interrupt
definition for DM646x") inadvertently removed priority setting for
timer0_12 (bottom half of timer0). This timer is used as clockevent.
When INTPRIn register setting for an interrupt is left at 0, it is
mapped to FIQ by the AINTC causing the timer interrupt to not get
generated.
Fix it by including an entry for timer0_12 in interrupt priority map
array. While at it, move the clockevent comment to the right place.
Fixes: b38434145b ("ARM: davinci: irqs: Correct McASP1 TX interrupt definition for DM646x")
Signed-off-by: Sekhar Nori <nsekhar@ti.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 9954b80b8c ]
platform_domain_notifier contains a variable sized array, which the
pm_clk_notify() notifier treats as a NULL terminated array:
for (con_id = clknb->con_ids; *con_id; con_id++)
pm_clk_add(dev, *con_id);
Omitting the initialiser for con_ids means that the array is zero
sized, and there is no NULL terminator. This leads to pm_clk_notify()
overrunning into what ever structure follows, which may not be NULL.
This leads to an oops:
Unable to handle kernel NULL pointer dereference at virtual address 0000008c
pgd = c0003000
[0000008c] *pgd=80000800004003c, *pmd=00000000c
Internal error: Oops: 206 [#1] PREEMPT SMP ARM
Modules linked in:c
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.16.0+ #9
Hardware name: Keystone
PC is at strlen+0x0/0x34
LR is at kstrdup+0x18/0x54
pc : [<c0623340>] lr : [<c0111d6c>] psr: 20000013
sp : eec73dc0 ip : eed780c0 fp : 00000001
r10: 00000000 r9 : 00000000 r8 : eed71e10
r7 : 0000008c r6 : 0000008c r5 : 014000c0 r4 : c03a6ff4
r3 : c09445d0 r2 : 00000000 r1 : 014000c0 r0 : 0000008c
Flags: nzCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment user
Control: 30c5387d Table: 00003000 DAC: fffffffd
Process swapper/0 (pid: 1, stack limit = 0xeec72210)
Stack: (0xeec73dc0 to 0xeec74000)
...
[<c0623340>] (strlen) from [<c0111d6c>] (kstrdup+0x18/0x54)
[<c0111d6c>] (kstrdup) from [<c03a6ff4>] (__pm_clk_add+0x58/0x120)
[<c03a6ff4>] (__pm_clk_add) from [<c03a731c>] (pm_clk_notify+0x64/0xa8)
[<c03a731c>] (pm_clk_notify) from [<c004614c>] (notifier_call_chain+0x44/0x84)
[<c004614c>] (notifier_call_chain) from [<c0046320>] (__blocking_notifier_call_chain+0x48/0x60)
[<c0046320>] (__blocking_notifier_call_chain) from [<c0046350>] (blocking_notifier_call_chain+0x18/0x20)
[<c0046350>] (blocking_notifier_call_chain) from [<c0390234>] (device_add+0x36c/0x534)
[<c0390234>] (device_add) from [<c047fc00>] (of_platform_device_create_pdata+0x70/0xa4)
[<c047fc00>] (of_platform_device_create_pdata) from [<c047fea0>] (of_platform_bus_create+0xf0/0x1ec)
[<c047fea0>] (of_platform_bus_create) from [<c047fff8>] (of_platform_populate+0x5c/0xac)
[<c047fff8>] (of_platform_populate) from [<c08b1f04>] (of_platform_default_populate_init+0x8c/0xa8)
[<c08b1f04>] (of_platform_default_populate_init) from [<c000a78c>] (do_one_initcall+0x3c/0x164)
[<c000a78c>] (do_one_initcall) from [<c087bd9c>] (kernel_init_freeable+0x10c/0x1d0)
[<c087bd9c>] (kernel_init_freeable) from [<c0628db0>] (kernel_init+0x8/0xf0)
[<c0628db0>] (kernel_init) from [<c00090d8>] (ret_from_fork+0x14/0x3c)
Exception stack(0xeec73fb0 to 0xeec73ff8)
3fa0: 00000000 00000000 00000000 00000000
3fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
3fe0: 00000000 00000000 00000000 00000000 00000013 00000000
Code: e3520000 1afffff7 e12fff1e c0801730 (e5d02000)
---[ end trace cafa8f148e262e80 ]---
Fix this by adding the necessary initialiser.
Fixes: fc20ffe121 ("ARM: keystone: add PM domain support for clock management")
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Acked-by: Santosh Shilimkar <ssantosh@kernel.org>
Signed-off-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit ebc3dd688c ]
It has been observed that writing 0xF2 to the power register while it
reads as 0xF4 results in the register having the value 0xF0, i.e. clearing
RESUME and setting SUSPENDM in one go does not work. It might also violate
the USB spec to transition directly from resume to suspend, especially
when not taking T_DRSMDN into account. But this is what happens when a
remote wakeup occurs between SetPortFeature USB_PORT_FEAT_SUSPEND on the
root hub and musb_bus_suspend being called.
This commit returns -EBUSY when musb_bus_suspend is called while remote
wakeup is signalled and thus avoids to reset the RESUME bit. Ignoring
this error when musb_port_suspend is called from musb_hub_control is ok.
Signed-off-by: Daniel Glöckner <dg@emlix.com>
Signed-off-by: Bin Liu <b-liu@ti.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 4776cab43f ]
Some AFS servers refuse to accept unencrypted traffic, so can't be accessed
with kAFS. Set the AF_RXRPC security level to encrypt client calls to deal
with this.
Note that incoming service calls are set by the remote client and so aren't
affected by this.
This requires an AF_RXRPC patch to pass the value set by setsockopt to calls
begun by the kernel.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit f9c1bba3d3 ]
The code that looks up servers by addresses makes the assumption
that the list of addresses for a server is sorted. It exits the
loop if it finds that the target address is larger than the
current candidate. As the list is not currently sorted, this
can lead to a failure to find a matching server, which can cause
callbacks from that server to be ignored.
Remove the early exit case so that the complete list is searched.
Fixes: d2ddc776a4 ("afs: Overhaul volume and server record caching and fileserver rotation")
Signed-off-by: Marc Dionne <marc.dionne@auristor.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 001ab5a67e ]
Fix the handling of the CB.InitCallBackState3 service call to find the
record of a server that we're using by looking it up by the UUID passed as
the parameter rather than by its address (of which it might have many, and
which may change).
Fixes: c35eccb1f6 ("[AFS]: Implement the CB.InitCallBackState3 operation.")
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 3d9fa91161 ]
If a volume location record lists multiple file servers for a volume, then
it's possible that due to a misconfiguration or a changing configuration
that one of the file servers doesn't know about it yet and will abort
VNOVOL. Currently, the rotation algorithm will stop with EREMOTEIO.
Fix this by moving on to try the next server if VNOVOL is returned. Once
all the servers have been tried and the record rechecked, the algorithm
will stop with EREMOTEIO or ENOMEDIUM.
Fixes: d2ddc776a4 ("afs: Overhaul volume and server record caching and fileserver rotation")
Reported-by: Marc Dionne <marc.dionne@auristor.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit ea739a287f ]
Commit 9e343e87d2 ("mtd: cfi: convert inline functions to macros")
changed map_word_andequal() into a macro, but also changed the right
hand side of the comparison from val3 to val2. Change it back to use
val3 on the right hand side.
Thankfully this did not cause a regression because all callers
currently pass the same argument for val2 and val3.
Fixes: 9e343e87d2 ("mtd: cfi: convert inline functions to macros")
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit ec5a3b4b50 ]
The server rotation algorithm just gives up if it fails to probe a
fileserver. Fix this by rotating to the next fileserver instead.
Fixes: d2ddc776a4 ("afs: Overhaul volume and server record caching and fileserver rotation")
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit d4a96bec7a ]
The refcounting on afs_cb_interest struct objects in
afs_register_server_cb_interest() is wrong as it uses the server list
entry's call back interest pointer without regard for the fact that it
might be replaced at any time and the object thrown away.
Fix this by:
(1) Put a lock on the afs_server_list struct that can be used to
mediate access to the callback interest pointers in the servers array.
(2) Keep a ref on the callback interest that we get from the entry.
(3) Dropping the old reference held by vnode->cb_interest if we replace
the pointer.
Fixes: c435ee3455 ("afs: Overhaul the callback handling")
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 01fd79e6de ]
The parsing of port specifiers in the address list obtained from the DNS
resolution upcall doesn't work as in4_pton() and in6_pton() will fail on
encountering an unexpected delimiter (in this case, the '+' marking the
port number). However, in*_pton() can't be given multiple specifiers.
Fix this by finding the delimiter in advance and not relying on in*_pton()
to find the end of the address for us.
Fixes: 8b2a464ced ("afs: Add an address list concept")
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit e438302920 ]
While reflinking an inode, we create a new inode in orphan directory,
then take EX lock on it, reflink the original inode to orphan inode and
release EX lock. Once the lock is released another node could request
it in EX mode from ocfs2_recover_orphans() which causes downconvert of
the lock, on this node, to NL mode.
Later we attempt to initialize security acl for the orphan inode and
move it to the reflink destination. However, while doing this we dont
take EX lock on the inode. This could potentially cause problems
because we could be starting transaction, accessing journal and
modifying metadata of the inode while holding NL lock and with another
node holding EX lock on the inode.
Fix this by taking orphan inode cluster lock in EX mode before
initializing security and moving orphan inode to reflink destination.
Use the __tracker variant while taking inode lock to avoid recursive
locking in the ocfs2_init_security_and_acl() call chain.
Link: http://lkml.kernel.org/r/1523475107-7639-1-git-send-email-ashish.samant@oracle.com
Signed-off-by: Ashish Samant <ashish.samant@oracle.com>
Reviewed-by: Joseph Qi <jiangqi903@gmail.com>
Reviewed-by: Junxiao Bi <junxiao.bi@oracle.com>
Acked-by: Jun Piao <piaojun@huawei.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Changwei Ge <ge.changwei@h3c.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit c89ebb968f ]
The error clean up path kfree's adapter->ipsec and should be
instead kfree'ing ipsec. Fix this. Also, the err1 error exit path
does not need to kfree ipsec because this failure path was for
the failed allocation of ipsec.
Detected by CoverityScan, CID#146424 ("Resource Leak")
Fixes: 63a67fe229 ("ixgbe: add ipsec offload add and remove SA")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Shannon Nelson <shannon.nelson@oracle.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit ac21fc2dcb ]
Commit 0fa1c57934 ("of/fdt: use memblock_virt_alloc for early alloc")
inadvertently switched the DT unflattening allocations from memblock to
bootmem which doesn't work because the unflattening happens before
bootmem is initialized. Swapping the order of bootmem init and
unflattening could also fix this, but removing bootmem is desired. So
enable NO_BOOTMEM on SH like other architectures have done.
Fixes: 0fa1c57934 ("of/fdt: use memblock_virt_alloc for early alloc")
Reported-by: Rich Felker <dalias@libc.org>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Signed-off-by: Rob Herring <robh@kernel.org>
Signed-off-by: Rich Felker <dalias@libc.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 6356ee0c96 ]
The IP increment should be done after the hypercall emulation, after
calling the various handlers. In this way, these handlers can accurately
identify the the IP of the VMCALL if they need it.
This patch keeps the same functionality for the Hyper-V handler which does
not use the return code of the standard kvm_skip_emulated_instruction()
call.
Signed-off-by: Marian Rotariu <mrotariu@bitdefender.com>
[Hyper-V hypercalls also need kvm_skip_emulated_instruction() - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit ddc9cfb79c ]
Our virtual machines make use of device assignment by configuring
12 NVMe disks for high I/O performance. Each NVMe device has 129
MSI-X Table entries:
Capabilities: [50] MSI-X: Enable+ Count=129 Masked-Vector table: BAR=0 offset=00002000
The windows virtual machines fail to boot since they will map the number of
MSI-table entries that the NVMe hardware reported to the bus to msi routing
table, this will exceed the 1024. This patch extends MAX_IRQ_ROUTES to 4096
for all archs, in the future this might be extended again if needed.
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim KrÄmář <rkrcmar@redhat.com>
Cc: Cornelia Huck <cohuck@redhat.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
Signed-off-by: Tonny Lu <tonnylu@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 93864fc3ff ]
Fix the kernel call initiation to set the minimum security level for kernel
initiated calls (such as from kAFS) from the sockopt value.
Fixes: 19ffa01c9c ("rxrpc: Use structs to hold connection params and protocol info")
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit f2aeed3a59 ]
AF_RXRPC tries to turn on IP_RECVERR and IP_MTU_DISCOVER on the UDP socket
it just opened for communications with the outside world, regardless of the
type of socket. Unfortunately, this doesn't work with an AF_INET6 socket.
Fix this by turning on IPV6_RECVERR and IPV6_MTU_DISCOVER instead if the
socket is of the AF_INET6 family.
Without this, kAFS server and address rotation doesn't work correctly
because the algorithm doesn't detect received network errors.
Fixes: 75b54cb57c ("rxrpc: Add IPv6 support")
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit c54e43d752 ]
The expect_rx_by call timeout is supposed to be set when a call is started
to indicate that we need to receive a packet by that point. This is
currently put back every time we receive a packet, but it isn't started
when we first send a packet. Without this, the call may wait forever if
the server doesn't deign to reply.
Fix this by setting the timeout upon a successful UDP sendmsg call for the
first DATA packet. The timeout is initiated only for initial transmission
and not for subsequent retries as we don't want the retry mechanism to
extend the timeout indefinitely.
Fixes: a158bdd324 ("rxrpc: Fix call timeouts")
Reported-by: Marc Dionne <marc.dionne@auristor.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit dec60f3a9b ]
Both ‘uninorth_remove_memory’ and ‘null_cache_flush’ can be made
static. So make them.
Silence the following gcc warning (W=1):
drivers/char/agp/uninorth-agp.c:198:5: warning: no previous prototype for ‘uninorth_remove_memory’ [-Wmissing-prototypes]
and
drivers/char/agp/uninorth-agp.c:473:6: warning: no previous prototype for ‘null_cache_flush’ [-Wmissing-prototypes]
Signed-off-by: Mathieu Malaterre <malat@debian.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit ae2cd7fb47 ]
As per listxattr(2):
On success, a nonnegative number is returned indicating the size
of the extended attribute name list. On failure, -1 is returned
and errno is set appropriately.
In SMB1, when the server returns an empty EA list through a listxattr(),
it will correctly return 0 as there are no EAs for the given file.
However, in SMB2+, it returns -ENODATA in listxattr() which is wrong since
the request and response were sent successfully, although there's no actual
EA for the given file.
This patch fixes listxattr() for SMB2+ by returning 0 in cifs_listxattr()
when the server returns an empty list of EAs.
Signed-off-by: Paulo Alcantara <palcantara@suse.de>
Reviewed-by: Aurelien Aptel <aaptel@suse.com>
Signed-off-by: Steve French <smfrench@gmail.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 2796d303e3 ]
The data buffer allocated on the stack can't be DMA'ed, ib_dma_map_page will
return an invalid DMA address for a buffer on stack. Even worse, this
incorrect address can't be detected by ib_dma_mapping_error. Sending data
from this address to hardware will not fail, but the remote peer will get
junk data.
Fix this by allocating the request on the heap in smb3_validate_negotiate.
Changes in v2:
Removed duplicated code on freeing buffers on function exit.
(Thanks to Parav Pandit <parav@mellanox.com>)
Fixed typo in the patch title.
Changes in v3:
Added "Fixes" to the patch.
Changed several sizeof() to use *pointer in place of struct.
Changes in v4:
Added detailed comments on the failure through RDMA.
Allocate request buffer using GPF_NOFS.
Fixed possible memory leak.
Changes in v5:
Removed variable ret for checking return value.
Changed to use pneg_inbuf->Dialects[0] to calculate unused space in pneg_inbuf.
Fixes: ff1c038add ("Check SMB3 dialects against downgrade attacks")
Signed-off-by: Long Li <longli@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
Reviewed-by: Tom Talpey <ttalpey@microsoft.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 71c23a821c ]
bus-off is usually caused by hardware malfunction or configuration error
(baud rate mismatch) and causes a complete loss of communication.
Increase the "bus-off" message's severity from netdev_dbg() to
netdev_info() to make it visible to the user.
A can interface going into bus-off is similar in severity to ethernet's
"Link is Down" message, which is also printed at info level.
It is debatable whether the the "restarted" message should also be
changed to netdev_info() to make the interface state changes
comprehensible from the kernel log. I have chosen to keep the
"restarted" message at dbg for now as the "bus-off" message should be
enough for the user to notice and investigate the problem.
Signed-off-by: Jakob Unterwurzacher <jakob.unterwurzacher@theobroma-systems.com>
Cc: linux-can@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 6c0a8f6b5a ]
The build is failing with CONFIG_NUMA=n and some compiler versions:
arch/powerpc/platforms/pseries/hotplug-cpu.o: In function `dlpar_online_cpu':
hotplug-cpu.c:(.text+0x12c): undefined reference to `timed_topology_update'
arch/powerpc/platforms/pseries/hotplug-cpu.o: In function `dlpar_cpu_remove':
hotplug-cpu.c:(.text+0x400): undefined reference to `timed_topology_update'
Fix it by moving the empty version of timed_topology_update() into the
existing #ifdef block, which has the right guard of SPLPAR && NUMA.
Fixes: cee5405da4 ("powerpc/hotplug: Improve responsiveness of hotplug change")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 58d813afbe upstream.
This was originally mistakenly submitted to net-next. Resubmitting to net.
The comparison of numvecs < 0 is always false because numvecs is a u32
and hence the error return from a failed call to pci_alloc_irq_vectores
is never detected. Fix this by using the signed int ret to handle the
error return and assign numvecs to err.
Detected by CoverityScan, CID#1468650 ("Unsigned compared against 0")
Fixes: a09bd81b54 ("net: aquantia: Limit number of vectors to actually allocated irqs")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cc: Holger Hoffstätte <holger@applied-asynchrony.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit a09bd81b54 ]
Driver should use pci_alloc_irq_vectors return value to correct number
of allocated vectors and napi instances. Otherwise it'll panic later
in pci_irq_vector.
Driver also should allow more than one MSI vectors to be allocated.
Error return path from pci_alloc_irq_vectors is also fixed to revert
resources in a correct sequence when error happens.
Reported-by: Long, Nicholas <nicholas.a.long@baesystems.com>
Fixes: 23ee07a ("net: aquantia: Cleanup pci functions module")
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit d1ecfa9d1f ]
This patch fixes crashes during boot for HVM guests on older (pre HVM
vector callback) Xen versions. Without this, current kernels will always
fail to boot on those Xen versions.
Sample stack trace:
BUG: unable to handle kernel paging request at ffffffffff200000
IP: __xen_evtchn_do_upcall+0x1e/0x80
PGD 1e0e067 P4D 1e0e067 PUD 1e10067 PMD 235c067 PTE 0
Oops: 0002 [#1] SMP PTI
Modules linked in:
CPU: 0 PID: 512 Comm: kworker/u2:0 Not tainted 4.14.33-52.13.amzn1.x86_64 #1
Hardware name: Xen HVM domU, BIOS 3.4.3.amazon 11/11/2016
task: ffff88002531d700 task.stack: ffffc90000480000
RIP: 0010:__xen_evtchn_do_upcall+0x1e/0x80
RSP: 0000:ffff880025403ef0 EFLAGS: 00010046
RAX: ffffffff813cc760 RBX: ffffffffff200000 RCX: ffffc90000483ef0
RDX: ffff880020540a00 RSI: ffff880023c78000 RDI: 000000000000001c
RBP: 0000000000000001 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
R13: ffff880025403f5c R14: 0000000000000000 R15: 0000000000000000
FS: 0000000000000000(0000) GS:ffff880025400000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffffffff200000 CR3: 0000000001e0a000 CR4: 00000000000006f0
Call Trace:
<IRQ>
do_hvm_evtchn_intr+0xa/0x10
__handle_irq_event_percpu+0x43/0x1a0
handle_irq_event_percpu+0x20/0x50
handle_irq_event+0x39/0x60
handle_fasteoi_irq+0x80/0x140
handle_irq+0xaf/0x120
do_IRQ+0x41/0xd0
common_interrupt+0x7d/0x7d
</IRQ>
During boot, the HYPERVISOR_shared_info page gets remapped to make it work
with KASLR. This means that any pointer derived from it needs to be
adjusted.
The only value that this applies to is the vcpu_info pointer for VCPU 0.
For PV and HVM with the callback vector feature, this gets done via the
smp_ops prepare_boot_cpu callback. Older Xen versions do not support the
HVM callback vector, so there is no Xen-specific smp_ops set up in that
scenario. So, the vcpu_info pointer for VCPU 0 never gets set to the proper
value, and the first reference of it will be bad. Fix this by resetting it
immediately after the remap.
Signed-off-by: Frank van der Linden <fllinden@amazon.com>
Reviewed-by: Eduardo Valentin <eduval@amazon.com>
Reviewed-by: Alakesh Haloi <alakeshh@amazon.com>
Reviewed-by: Vallish Vaidyeshwara <vallish@amazon.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: xen-devel@lists.xenproject.org
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 675c7215aa ]
As per ARM documentation
PPI(0) ID27 - global timer interrupt is rising-edge sensitive.
set IRQ triggering type to IRQ_TYPE_EDGE_RISING for ARM Global timers.
Fixes: c9ad7bc5fe ("ARM: dts: Enable Broadcom Cygnus SoC")
Signed-off-by: Clément Péron <peron.clem@gmail.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 0d74d872c3 ]
__printf is useful to verify format and arguments. Remove the following
warning (with W=1):
drivers/ata/libata-eh.c:183:10: warning: function might be possible candidate for ‘gnu_printf’ format attribute [-Wsuggest-attribute=format]
Signed-off-by: Mathieu Malaterre <malat@debian.org>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 12d9f07022 ]
Currently only nvme_ctrl will take a reference counter of
nvme_subsystem, nvme_ns_head also needs it. Otherwise
nvme_free_ns_head will access the nvme_subsystem.ns_ida
which has been freed by __nvme_release_subsystem after all the
reference of nvme_subsystem have been released by nvme_free_ctrl.
This could cause memory corruption.
BUG: KASAN: use-after-free in radix_tree_next_chunk+0x9f/0x4b0
Read of size 8 at addr ffff88036494d2e8 by task fio/1815
CPU: 1 PID: 1815 Comm: fio Kdump: loaded Tainted: G W 4.17.0-rc1+ #18
Hardware name: LENOVO 10MLS0E339/3106, BIOS M1AKT22A 06/27/2017
Call Trace:
dump_stack+0x91/0xeb
print_address_description+0x6b/0x290
kasan_report+0x261/0x360
radix_tree_next_chunk+0x9f/0x4b0
ida_remove+0x8b/0x180
ida_simple_remove+0x26/0x40
nvme_free_ns_head+0x58/0xc0
__blkdev_put+0x30a/0x3a0
blkdev_close+0x44/0x50
__fput+0x184/0x380
task_work_run+0xaf/0xe0
do_exit+0x501/0x1440
do_group_exit+0x89/0x140
__x64_sys_exit_group+0x28/0x30
do_syscall_64+0x72/0x230
Signed-off-by: Jianchao Wang <jianchao.w.wang@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 407879b690 ]
The IEEE P802.11-REVmd D1.0 specification updated the SAE authentication
timeout to be 2000 milliseconds (see dot11RSNASAERetransPeriod). Update
the SAE timeout setting accordingly.
While at it, reduce some code duplication in the timeout configuration.
Signed-off-by: Ilan Peer <ilan.peer@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit ab9d3db5b3 ]
This change prevents userland from referencing TEE shared memory
outside the area initially allocated by its owner. Prior this change an
application could not reference or access memory it did not own but
it could reference memory not explicitly allocated by owner but still
allocated to the owner due to the memory allocation granule.
Reported-by: Alexandre Jutras <alexandre.jutras@nxp.com>
Signed-off-by: Etienne Carriere <etienne.carriere@linaro.org>
Signed-off-by: Jens Wiklander <jens.wiklander@linaro.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 6cb465972c ]
The sh asm/smp.h defines a fallback hard_smp_processor_id macro for
the !SMP case, but linux/smp.h never includes asm/smp.h in the !SMP
case.
Signed-off-by: Rich Felker <dalias@libc.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 642ef99be9 ]
Since commit d677a4d601 ("Makefile: support flag
-fsanitizer-coverage=trace-cmp"), you miss to build the SANCOV
plugin under some circumstances.
CONFIG_KCOV=y
CONFIG_KCOV_ENABLE_COMPARISONS=y
Your compiler does not support -fsanitize-coverage=trace-pc
Your compiler does not support -fsanitize-coverage=trace-cmp
Under this condition, $(CFLAGS_KCOV) is not empty but contains a
space, so the following ifeq-conditional is false.
ifeq ($(CFLAGS_KCOV),)
Then, scripts/Makefile.gcc-plugins misses to add sancov_plugin.so to
gcc-plugin-y while the SANCOV plugin is necessary as an alternative
means.
Fixes: d677a4d601 ("Makefile: support flag -fsanitizer-coverage=trace-cmp")
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Acked-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit b5bf9a90bb ]
Gaurav reported a perceived problem with TASK_PARKED, which turned out
to be a broken wait-loop pattern in __kthread_parkme(), but the
reported issue can (and does) in fact happen for states that do not do
condition based sleeps.
When the 'current->state = TASK_RUNNING' store of a previous
(concurrent) try_to_wake_up() collides with the setting of a 'special'
sleep state, we can loose the sleep state.
Normal condition based wait-loops are immune to this problem, but for
sleep states that are not condition based are subject to this problem.
There already is a fix for TASK_DEAD. Abstract that and also apply it
to TASK_STOPPED and TASK_TRACED, both of which are also without
condition based wait-loop.
Reported-by: Gaurav Kohli <gkohli@codeaurora.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit d0f1a451e3 ]
Commit 9ef09e35e5 ("bpf: fix possible spectre-v1 in find_and_alloc_map()")
converted find_and_alloc_map() over to use array_index_nospec() to sanitize
map type that user space passes on map creation, and this patch does an
analogous conversion for progs in find_prog_type() as it's also passed from
user space when loading progs as attr->prog_type.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 0ccc1c8f02 ]
If an interlaced video mode is selected, a IOMMU pagefault is
triggered by vp_video_buffer().
Fix the most apparent bugs:
- pitch value for chroma plane
- divide by two of height and vpos of source and destination
Signed-off-by: Tobias Jakobi <tjakobi@math.uni-bielefeld.de>
[ a.hajda: Halved also destination height and vpos, updated commit message ]
Signed-off-by: Andrzej Hajda <a.hajda@samsung.com>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 2eced8e917 ]
In case of interlace mode video processor registers and mixer config
register must be check to ensure internal state is in sync with shadow
registers.
This patch fixes page-faults in interlaced mode.
Signed-off-by: Andrzej Hajda <a.hajda@samsung.com>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 9aa169213d ]
When commit [1] was added, SGID was queried to derive the SMAC address.
Then, later on during a refactor [2], SMAC was no longer needed. However,
the now useless GID query remained. Then during additional code changes
later on, the GID query was being done in such a way that it caused iWARP
queries to start breaking. Remove the useless GID query and resolve the
iWARP breakage at the same time.
This is discussed in [3].
[1] commit dd5f03beb4 ("IB/core: Ethernet L2 attributes in verbs/cm structures")
[2] commit 5c266b2304 ("IB/cm: Remove the usage of smac and vid of qp_attr and cm_av")
[3] https://www.spinics.net/lists/linux-rdma/msg63951.html
Suggested-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit b03bcde962 ]
When the kernel was compiled using the UBSAN option,
we saw the following stack trace:
[ 1184.827917] UBSAN: Undefined behaviour in drivers/infiniband/hw/mlx4/mr.c:349:27
[ 1184.828114] signed integer overflow:
[ 1184.828247] -2147483648 - 1 cannot be represented in type 'int'
The problem was caused by calling round_up in procedure
mlx4_ib_umem_calc_optimal_mtt_size (on line 349, as noted in the stack
trace) with the second parameter (1 << block_shift) (which is an int).
The second parameter should have been (1ULL << block_shift) (which
is an unsigned long long).
(1 << block_shift) is treated by the compiler as an int (because 1 is
an integer).
Now, local variable block_shift is initialized to 31.
If block_shift is 31, 1 << block_shift is 1 << 31 = 0x80000000=-214748368.
This is the most negative int value.
Inside the round_up macro, there is a cast applied to ((1 << 31) - 1).
However, this cast is applied AFTER ((1 << 31) - 1) is calculated.
Since (1 << 31) is treated as an int, we get the negative overflow
identified by UBSAN in the process of calculating ((1 << 31) - 1).
The fix is to change (1 << block_shift) to (1ULL << block_shift) on
line 349.
Fixes: 9901abf583 ("IB/mlx4: Use optimal numbers of MTT entries")
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit e9777ad439 ]
When allocating device data, if there's an allocation failure, the
already allocated memory won't be freed such as per-cpu counters.
Fix memory leaks in exception path by creating a common reentrant
clean up function hfi1_clean_devdata() to be used at driver unload
time and device data allocation failure.
To accomplish this, free_platform_config() and clean_up_i2c() are
changed to be reentrant to remove dependencies when they are called
in different order. This helps avoid NULL pointer dereferences
introduced by this patch if those two functions weren't reentrant.
In addition, set dd->int_counter, dd->rcv_limit,
dd->send_schedule and dd->tx_opstats to NULL after they're freed in
hfi1_clean_devdata(), so that hfi1_clean_devdata() is fully reentrant.
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Sebastian Sanchez <sebastian.sanchez@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 1a2f474d32 ]
If the I2C adapter that the PD controller is attached to
does not support SMBus protocol, the driver needs to handle
block reads separately. The first byte returned in block
read protocol will show the total number of bytes. It needs
to be stripped away.
This is handled separately in the driver only because right
now we have no way of requesting the used protocol with
regmap-i2c. This is in practice a workaround for what is
really a problem in regmap-i2c. The other option would have
been to register custom regmap, or not use regmap at all,
however, since the solution is very simple, I choose to use
it in this case for convenience. It is easy to remove once
we figure out how to handle this kind of cases in
regmap-i2c.
Fixes: 0a4c005bd1 ("usb: typec: driver for TI TPS6598x USB Power Delivery controllers")
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 647efef69d ]
The missing "compatible" entries are needed by drivers/clk/ti/clkctrl.c,
and without them the structures initialized in drivers/clk/ti/clk-814x.c
are not passed to configuration code. The result is a "not found from
clkctrl data" error message, although boot proceeds anyway.
The reason why the compatible is not found is because the board specific
files override the SoC compatible without including it. This did not
cause any issues until with the clkctrl nodes got introduced.
Very lightly tested on a (lurching) AM3874 design that's in the middle
of a kernel upgrade from TI's abandoned 2.6.37 tree.
Also tested on j5eco-evm and hp-t410 to verify the clkctrl clocks are
found.
Fixes: bb30465b59 ("ARM: dts: dm814x: add clkctrl nodes")
Fixes: 80a06c0d83 ("ARM: dts: dm816x: add clkctrl nodes")
Signed-off-by: Graeme Smecher <gsmecher@threespeedlogic.com>
[tony: updated to fix for 8168-evm, updated comments]
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit baf64250b4 ]
The deferred_fiq handler used to limit hardware operations to IRQ
unmask only, relying on gpio-omap assigned handler performing the ACKs.
Since commit 80ac93c274 ("gpio: omap: Fix lost edge interrupts") this
is no longer the case as handle_edge_irq() has been replaced with
handle_simmple_irq() which doesn't touch the hardware.
Add single ACK operation per each active IRQ pin to the handler. While
being at it, move unmask operation out of irq_counter loop so it is
also called only once for each active IRQ pin.
Fixes: 80ac93c274 ("gpio: omap: Fix lost edge interrupts")
Signed-off-by: Janusz Krzysztofik <jmkrzyszt@gmail.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit a785dbccd9 ]
When CONFIG_NVME_MULTIPATH is set, but we're not using nvme to multipath,
namespaces with multiple paths were not creating unique names due to
reusing the same instance number from the namespace's head.
This patch fixes this by falling back to the non-multipath naming method
when the parameter disabled using multipath.
Reported-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 5cadde8019 ]
We can't allow the user to change multipath settings at runtime, as this
will create naming conflicts due to the different naming schemes used
for each mode.
Signed-off-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit aa7528fe35 ]
It was noticed that the IRTE configured for guest OS kernel
was over-written while the guest was running. As a result,
vt-d Posted Interrupts configured for the guest are not being
delivered directly, and instead bounces off the host. Every
interrupt delivery takes a VM Exit.
It was noticed that the following stack is doing the over-write:
[ 147.463177] modify_irte+0x171/0x1f0
[ 147.463405] intel_ir_set_affinity+0x5c/0x80
[ 147.463641] msi_domain_set_affinity+0x32/0x90
[ 147.463881] irq_do_set_affinity+0x37/0xd0
[ 147.464125] irq_set_affinity_locked+0x9d/0xb0
[ 147.464374] __irq_set_affinity+0x42/0x70
[ 147.464627] write_irq_affinity.isra.5+0xe1/0x110
[ 147.464895] proc_reg_write+0x38/0x70
[ 147.465150] __vfs_write+0x36/0x180
[ 147.465408] ? handle_mm_fault+0xdf/0x200
[ 147.465671] ? _cond_resched+0x15/0x30
[ 147.465936] vfs_write+0xad/0x1a0
[ 147.466204] SyS_write+0x52/0xc0
[ 147.466472] do_syscall_64+0x74/0x1a0
[ 147.466744] entry_SYSCALL_64_after_hwframe+0x3d/0xa2
reversing the sense of force check in intel_ir_reconfigure_irte()
restores proper posted interrupt functionality
Signed-off-by: Jagannathan Raman <jag.raman@oracle.com>
Fixes: d491bdff88 ('iommu/vt-d: Reevaluate vector configuration on activate()')
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 9df50ba76a ]
Need to configure PHY interrupt as active low for P3310 Tegra186
platform otherwise it results in spurious interrupts.
This issue wasn't seen before because the generic PHY driver without
interrupt support was used.
Signed-off-by: Bhadram Varka <vbhadram@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 85f1abe001 ]
Even with the wait-loop fixed, there is a further issue with
kthread_parkme(). Upon hotplug, when we do takedown_cpu(),
smpboot_park_threads() can return before all those threads are in fact
blocked, due to the placement of the complete() in __kthread_parkme().
When that happens, sched_cpu_dying() -> migrate_tasks() can end up
migrating such a still runnable task onto another CPU.
Normally the task will have hit schedule() and gone to sleep by the
time we do kthread_unpark(), which will then do __kthread_bind() to
re-bind the task to the correct CPU.
However, when we loose the initial TASK_PARKED store to the concurrent
wakeup issue described previously, do the complete(), get migrated, it
is possible to either:
- observe kthread_unpark()'s clearing of SHOULD_PARK and terminate
the park and set TASK_RUNNING, or
- __kthread_bind()'s wait_task_inactive() to observe the competing
TASK_RUNNING store.
Either way the WARN() in __kthread_bind() will trigger and fail to
correctly set the CPU affinity.
Fix this by only issuing the complete() when the kthread has scheduled
out. This does away with all the icky 'still running' nonsense.
The alternative is to promote TASK_PARKED to a special state, this
guarantees wait_task_inactive() cannot observe a 'stale' TASK_RUNNING
and we'll end up doing the right thing, but this preserves the whole
icky business of potentially migating the still runnable thing.
Reported-by: Gaurav Kohli <gkohli@codeaurora.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 741a76b350 ]
Gaurav reported a problem with __kthread_parkme() where a concurrent
try_to_wake_up() could result in competing stores to ->state which,
when the TASK_PARKED store got lost bad things would happen.
The comment near set_current_state() actually mentions this competing
store, but only mentions the case against TASK_RUNNING. This same
store, with different timing, can happen against a subsequent !RUNNING
store.
This normally is not a problem, because as per that same comment, the
!RUNNING state store is inside a condition based wait-loop:
for (;;) {
set_current_state(TASK_UNINTERRUPTIBLE);
if (!need_sleep)
break;
schedule();
}
__set_current_state(TASK_RUNNING);
If we loose the (first) TASK_UNINTERRUPTIBLE store to a previous
(concurrent) wakeup, the schedule() will NO-OP and we'll go around the
loop once more.
The problem here is that the TASK_PARKED store is not inside the
KTHREAD_SHOULD_PARK condition wait-loop.
There is a genuine issue with sleeps that do not have a condition;
this is addressed in a subsequent patch.
Reported-by: Gaurav Kohli <gkohli@codeaurora.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit b819439fea ]
Fix two section mismatches in drivers.c:
1) Section mismatch in reference from the function alloc_tree_node() to
the function .init.text:create_tree_node().
2) Section mismatch in reference from the function walk_native_bus() to
the function .init.text:alloc_pa_dev().
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 39f56ca945 ]
The JIT logic in jit_subprogs() is as follows: for all subprogs we
allocate a bpf_prog_alloc(), populate it (prog->is_func = 1 here),
and pass it to bpf_int_jit_compile(). If a failure occurred during
JIT and prog->jited is not set, then we bail out from attempting to
JIT the whole program, and punt to the interpreter instead. In case
JITing went successful, we fixup BPF call offsets and do another
pass to bpf_int_jit_compile() (extra_pass is true at that point) to
complete JITing calls. Given that requires to pass JIT context around
addrs and jit_data from x86 JIT are freed in the extra_pass in
bpf_int_jit_compile() when calls are involved (if not, they can
be freed immediately). However, if in the original pass, the JIT
image didn't converge then we leak addrs and jit_data since image
itself is NULL, the prog->is_func is set and extra_pass is false
in that case, meaning both will become unreachable and are never
cleaned up, therefore we need to free as well on !image. Only x64
JIT is affected.
Fixes: 1c2a088a66 ("bpf: x64: add JIT support for multi-function programs")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 3aab8884c9 ]
While reviewing x64 JIT code, I noticed that we leak the prior allocated
JIT image in the case where proglen != oldproglen during the JIT passes.
Prior to the commit e0ee9c1215 ("x86: bpf_jit: fix two bugs in eBPF JIT
compiler") we would just break out of the loop, and using the image as the
JITed prog since it could only shrink in size anyway. After e0ee9c1215,
we would bail out to out_addrs label where we free addrs and jit_data but
not the image coming from bpf_jit_binary_alloc().
Fixes: e0ee9c1215 ("x86: bpf_jit: fix two bugs in eBPF JIT compiler")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 83b9dc1131 ]
When we dropped the custom Linux GPIO translation it resulted that the
IRQ numbers changed slightly as well. Normally this would be fine
because everyone is expected to use controller relative GPIO numbers and
ACPI GpioIo/GpioInt resources. However, there is a certain set of
Intel_Strago based Chromebooks where i8042 keyboard controller IRQ
number is hardcoded be 182 (this is corrected with newer coreboot but
the older ones still have the hardcoded Linux IRQ number). Because of
this hardcoded IRQ number keyboard on those systems accidentally broke
again.
Fix this by iteratively associating IRQ descriptors to the chip irqdomain
so that there are no gaps on those systems. Other systems are not
affected.
Fixes: 03c4749dd6 ("gpio / ACPI: Drop unnecessary ACPI GPIO to Linux GPIO translation")
Link: https://bugzilla.kernel.org/show_bug.cgi?id=199463
Reported-by: Sultan Alsawaf <sultanxda@gmail.com>
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 9411ac07cd ]
The GPIO chip is called davinci_gpio.0 in legacy mode. Fix it, so that
I2C can correctly lookup the recovery gpios.
Note that it is the gpio-davinci driver that sets the gpiochip label to
davinci_gpio.0.
Also, the I2C device uses an id of 1 on DM644x and DM355.
While at it, convert to using GPIO_TO_PIN() for referring to GPIO pin
numbers, like it is done in rest of the board support files.
Fixes: e535376537 ("i2c/ARM: davinci: Deep refactoring of I2C recovery")
Reviewed-by: David Lechner <david@lechnology.com>
Signed-off-by: Sekhar Nori <nsekhar@ti.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit f4b024271a ]
The vmw_pvscsi driver returns DID_ABORT for commands aborted internally
by the adapter, leading to the filesystem going read-only. Change the
result to DID_BUS_BUSY, causing the kernel to retry the command.
Signed-off-by: Jim Gill <jgill@vmware.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit a57ab96ef9 ]
We already have memcpy_toio(), but not memset_io(), so let's
add the obvious version to allow building an allmodconfig kernel
without errors like
drivers/gpu/drm/ttm/ttm_bo_util.c: In function 'ttm_bo_move_memcpy':
drivers/gpu/drm/ttm/ttm_bo_util.c:390:3: error: implicit declaration of function 'memset_io' [-Werror=implicit-function-declaration]
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Richard Kuo <rkuo@codeaurora.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 068bdb67ef ]
The automatic update mechanism will trigger an update if the
info block CRCs are different between maxtouch configuration
file (maxtouch.cfg) and chip.
The driver compared the CRCs without retrieving the chip CRC,
resulting always in a failure and firmware flashing action
triggered. Fix this issue by retrieving the chip info block
CRC before the check.
Note that this solution has the benefit that by reading the
information block and the object table into a contiguous region
of memory, we can verify the checksum at probe time. This means
we make sure that we are indeed talking to a chip that supports
object protocol correctly.
Using this patch on a kevin chromebook, the touchscreen and
touchpad drivers are able to match the CRC:
atmel_mxt_ts 3-004b: Family: 164 Variant: 14 Firmware V2.3.AA Objects: 40
atmel_mxt_ts 5-004a: Family: 164 Variant: 17 Firmware V2.0.AA Objects: 31
atmel_mxt_ts 3-004b: Resetting device
atmel_mxt_ts 5-004a: Resetting device
atmel_mxt_ts 3-004b: Config CRC 0x573E89: OK
atmel_mxt_ts 3-004b: Touchscreen size X4095Y2729
input: Atmel maXTouch Touchscreen as /devices/platform/ff130000.i2c/i2c-3/3-004b/input/input5
atmel_mxt_ts 5-004a: Config CRC 0x0AF6BA: OK
atmel_mxt_ts 5-004a: Touchscreen size X1920Y1080
input: Atmel maXTouch Touchpad as /devices/platform/ff140000.i2c/i2c-5/5-004a/input/input6
Signed-off-by: Nick Dyer <nick.dyer@shmanahar.org>
Acked-by: Benson Leung <bleung@chromium.org>
[Ezequiel: minor patch massage]
Signed-off-by: Ezequiel Garcia <ezequiel@collabora.com>
Tested-by: Sebastian Reichel <sebastian.reichel@collabora.co.uk>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 054f155721 ]
The ro_release_mr methods check whether mr->mr_list is empty.
Therefore, be sure to always use list_del_init when removing an MR
linked into a list using that field. Otherwise, when recovering from
transport failures or device removal, list corruption can result, or
MRs can get mapped or unmapped an odd number of times, resulting in
IOMMU-related failures.
In general this fix is appropriate back to v4.8. However, code
changes since then make it impossible to apply this patch directly
to stable kernels. The fix would have to be applied by hand or
reworked for kernels earlier than v4.16.
Backport guidance -- there are several cases:
- When creating an MR, initialize mr_list so that using list_empty
on an as-yet-unused MR is safe.
- When an MR is being handled by the remote invalidation path,
ensure that mr_list is reinitialized when it is removed from
rl_registered.
- When an MR is being handled by rpcrdma_destroy_mrs, it is removed
from mr_all, but it may still be on an rl_registered list. In
that case, the MR needs to be removed from that list before being
released.
- Other cases are covered by using list_del_init in rpcrdma_mr_pop.
Fixes: 9d6b040978 ('xprtrdma: Place registered MWs on a ... ')
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 95e59fc3c3 ]
The Audio has worked, but the mute pin has a weak pulldown which alows
some of the audio signal to pass very quietly. This patch fixes
that so the mute pin is actively driven high for mute or low for normal
operation.
Fixes: ab8dd3aed0 ("ARM: DTS: Add minimal Support for Logic
PD DM3730 SOM-LV")
Signed-off-by: Adam Ford <aford173@gmail.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 189822cbcb ]
The VAUX3 rail from the PMIC powers a clock driver which clocks
the WL127x. This corrects a bug which did not correctly associate
the vin-supply with the proper power rail.
This also fixes a typo in the pinmuxing to properly configure the
interrupt pin.
Fixes: ab8dd3aed0 ("ARM: DTS: Add minimal Support for Logic PD
DM3730 SOM-LV")
Signed-off-by: Adam Ford <aford173@gmail.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 33e9572483 ]
smp_processor_id() checks preemption if CONFIG_DEBUG_PREEMPT is enabled,
causing a warning dump during boot:
[ 5.042377] BUG: using smp_processor_id() in preemptible [00000000] code: swapper/0/1
[ 5.050281] caller is pwrdm_set_next_pwrst+0x48/0x88
[ 5.055330] CPU: 1 PID: 1 Comm: swapper/0 Not tainted 4.14.24-g57341df0b4 #1
Use the raw_smp_processor_id() for the trace instead, this value does
not need to be perfectly correct. The alternative of disabling preempt
is too heavy weight operation to be applied in PM hot path for just
tracing purposes.
Signed-off-by: Tero Kristo <t-kristo@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 5c054de228 ]
Since commit 09f3756bb9 ("dm9000: Return an ERR_PTR() in all
error conditions of dm9000_parse_dt()"), passing either non-NULL
platform data or device-tree for dm9000 driver to probe is
mandatory.
DM335 board was using none, so networking failed to initialize.
Fix it by passing non-NULL (but empty) platform data.
Fixes: 09f3756bb9 ("dm9000: Return an ERR_PTR() in all error conditions of dm9000_parse_dt()")
Signed-off-by: Sekhar Nori <nsekhar@ti.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit d45622c0ea ]
commit c4dc56be7e ("ARM: davinci: fix the GPIO lookup for omapl138-hawk")
fixed the GPIO chip name for look-up of MMC/SD CD and WP pins, but forgot
to change the GPIO numbers passed.
The GPIO numbers are not offsets from within a 32 GPIO bank. Fix the
GPIO numbers as well as remove the misleading comment.
Fixes: c4dc56be7e ("ARM: davinci: fix the GPIO lookup for omapl138-hawk")
Reviewed-by: David Lechner <david@lechnology.com>
Signed-off-by: Sekhar Nori <nsekhar@ti.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 67c6b6ff22 ]
The GPIO chip is called davinci_gpio.0 in legacy mode. Fix it, so that
mmc can correctly lookup the wp and cp gpios. Also fix the GPIO numbers
as they are not offsets within a bank.
Note that it is the gpio-davinci driver that sets the gpiochip label to
davinci_gpio.0.
Fixes: bdf0e8364f ("ARM: davinci: da850-evm: use gpio descriptor for mmc pins")
Reviewed-by: David Lechner <david@lechnology.com>
Signed-off-by: Sekhar Nori <nsekhar@ti.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 51e9f12163 ]
The GPIO chip is called davinci_gpio.0 in legacy mode. Fix it, so that
mmc can correctly lookup the wp and cp gpios. Also fix the GPIO numbers
as they are not offsets within a bank.
Note that it is the gpio-davinci driver that sets the gpiochip label to
davinci_gpio.0.
Fixes: b5e1438cf9 ("ARM: davinci: da830-evm: use gpio descriptor for mmc pins")
Reported-by: David Lechner <david@lechnology.com>
Reviewed-by: David Lechner <david@lechnology.com>
Signed-off-by: Sekhar Nori <nsekhar@ti.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit db82476f37 ]
Currently, the kernel protects access to the agent ID allocator on a per
port basis using a spinlock, so it is impossible for two apps/threads on
the same port to get the same TID, but it is entirely possible for two
threads on different ports to end up with the same TID.
As this can be confusing (regardless of it being legal according to the
IB Spec 1.3, C13-18.1.1, in section 13.4.6.4 - TransactionID usage),
and as the rdma-core user space API for /dev/umad devices implies unique
TIDs even across ports, make the TID an atomic type so that no two
allocations, regardless of port number, will be the same.
Signed-off-by: Håkon Bugge <haakon.bugge@oracle.com>
Reviewed-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Zhu Yanjun <yanjun.zhu@oracle.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 19b9ad6731 ]
The comment claims that this helper will try not to loose bits, but for
64bit long it looses the high bits before hashing 64bit long into 32bit
int. Use the helper hash_long() to do the right thing for 64bit long.
For 32bit long, there is no change.
All the callers of end_name_hash() either assign the result to
qstr->hash, which is u32 or return the result as an int value (e.g.
full_name_hash()). Change the helper return type to int to conform to
its users.
[ It took me a while to apply this, because my initial reaction to it
was - incorrectly - that it could make for slower code.
After having looked more at it, I take back all my complaints about
the patch, Amir was right and I was mis-reading things or just being
stupid.
I also don't worry too much about the possible performance impact of
this on 64-bit, since most architectures that actually care about
performance end up not using this very much (the dcache code is the
most performance-critical, but the word-at-a-time case uses its own
hashing anyway).
So this ends up being mostly used for filesystems that do their own
degraded hashing (usually because they want a case-insensitive
comparison function).
A _tiny_ worry remains, in that not everybody uses DCACHE_WORD_ACCESS,
and then this potentially makes things more expensive on 64-bit
architectures with slow or lacking multipliers even for the normal
case.
That said, realistically the only such architecture I can think of is
PA-RISC. Nobody really cares about performance on that, it's more of a
"look ma, I've got warts^W an odd machine" platform.
So the patch is fine, and all my initial worries were just misplaced
from not looking at this properly. - Linus ]
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 9fd4350ba8 ]
When skb is sent, it will pass the following functions in soft roce.
rxe_send [rdma_rxe]
ip_local_out
__ip_local_out
ip_output
ip_finish_output
ip_finish_output2
dev_queue_xmit
__dev_queue_xmit
dev_hard_start_xmit
In the above functions, if error occurs in the above functions or
iptables rules drop skb after ip_local_out, kfree_skb will be called.
So it is not necessary to call kfree_skb in soft roce module again.
Or else crash will occur.
The steps to reproduce:
server client
--------- ---------
|1.1.1.1|<----rxe-channel--->|1.1.1.2|
--------- ---------
On server: rping -s -a 1.1.1.1 -v -C 10000 -S 512
On client: rping -c -a 1.1.1.1 -v -C 10000 -S 512
The kernel configs CONFIG_DEBUG_KMEMLEAK and
CONFIG_DEBUG_OBJECTS are enabled on both server and client.
When rping runs, run the following command in server:
iptables -I OUTPUT -p udp --dport 4791 -j DROP
Without this patch, crash will occur.
CC: Srinivas Eeda <srinivas.eeda@oracle.com>
CC: Junxiao Bi <junxiao.bi@oracle.com>
Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com>
Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 2da36d44a9 ]
w/o RXE_START_MASK, the last_psn of IB_OPCODE_RC_SEND_ONLY_INV
will not be updated in update_wqe_psn, and the corresponding
wqe will not be acked in rxe_completer due to its last_psn is
zero. Finally, the other wqe will also not be able to be acked,
because the wqe of IB_OPCODE_RC_SEND_ONLY_INV with last_psn 0
is still there. This causes large amount of io timeout when
nvmeof is over rxe.
Add RXE_START_MASK for IB_OPCODE_RC_SEND_ONLY_INV to fix this.
Signed-off-by: Jianchao Wang <jianchao.w.wang@oracle.com>
Reviewed-by: Zhu Yanjun <yanjun.zhu@oracle.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit f96416cea7 ]
In the cases where iwpm_hash_bucket is NULL and where function
get_mapinfo_hash_bucket returns NULL then the map_info is never added
to hash_bucket_head and hence there is a leak of map_info. Fix this
by nullifying hash_bucket_head and if that is null we know that
that map_info was not added to hash_bucket_head and hence map_info
should be free'd.
Detected by CoverityScan, CID#1222481 ("Resource Leak")
Fixes: 30dc5e63d6 ("RDMA/core: Add support for iWARP Port Mapper user space service")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 2918c1a900 ]
There are few issues with validation of netdevice and listen id lookup
for IB (IPoIB) while processing incoming CM request as below.
1. While performing lookup of bind_list in cma_ps_find(), net namespace
of the netdevice can get deleted in cma_exit_net(), resulting in use
after free access of idr and/or net namespace structures.
This lookup occurs from the workqueue context (and not userspace
context where net namespace is always valid).
CPU0 CPU1
==== ====
bind_list = cma_ps_find();
move netdevice to new namespace
delete net namespace
cma_exit_net()
idr_destroy(idr);
[..]
cma_find_listener(bind_list, ..);
2. While netdevice is validated for IP address in given net namespace,
netdevice's net namespace and/or ifindex can change in
cma_get_net_dev() and cma_match_net_dev().
Above issues are overcome by using rcu lock along with netdevice
UP/DOWN state as described below.
When a net namespace is getting deleted, netdevice is closed and
shutdown before moving it back to init_net namespace.
change_net_namespace() synchronizes with any existing use of netdevice
before changing the netdev properties such as net or ifindex.
Once netdevice IFF_UP flags is cleared, such fields are not guaranteed
to be valid.
Therefore, rcu lock along with netdevice state check ensures that,
while route lookup and cm_id lookup is in progress, netdevice of
interest won't migrate to any other net namespace.
This ensures that associated net namespace of netdevice won't get
deleted while rcu lock is held for netdevice which is in IFF_UP state.
Fixes: fa20105e09 ("IB/cma: Add support for network namespaces")
Fixes: 4be74b42a6 ("IB/cma: Separate port allocation to network namespaces")
Fixes: f887f2ac87 ("IB/cma: Validate routing of incoming requests")
Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit f604db645a ]
Previously, if a method contained mandatory attributes in a namespace
that wasn't given by the user, these attributes weren't validated.
Fixing this by iterating over all specification namespaces.
Fixes: fac9658cab ("IB/core: Add new ioctl interface")
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit a468f2dbf9 ]
Currently, KVM flushes the TLB after a change to the APIC access page
address or the APIC mode when EPT mode is enabled. However, even in
shadow paging mode, a TLB flush is needed if VPIDs are being used, as
specified in the Intel SDM Section 29.4.5.
So replace vmx_flush_tlb_ept_only() with vmx_flush_tlb(), which will
flush if either EPT or VPIDs are in use.
Signed-off-by: Junaid Shahid <junaids@google.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit c55ca688ed ]
For very very old generation of the management FW Ethernet port
information table may theoretically not be available. This in
turn will cause the nfp_port structures to not be allocated.
Make sure we don't crash the kernel when there is no eth_tbl:
RIP: 0010:nfp_net_pci_probe+0xf2/0xb40 [nfp]
...
Call Trace:
nfp_pci_probe+0x6de/0xab0 [nfp]
local_pci_probe+0x47/0xa0
work_for_cpu_fn+0x1a/0x30
process_one_work+0x1de/0x3e0
Found while working with broken/development version of management FW.
Fixes: a5950182c0 ("nfp: map mac_stats and vf_cfg BARs")
Fixes: 93da7d9660 ("nfp: provide nfp_port to of nfp_net_get_mac_addr()")
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 7dbc73e612 ]
Commit 36a50a989e ("tipc: fix infinite loop when dumping link monitor
summary") intended to fix a problem with user tool looping when max
number of bearers are enabled.
Unfortunately, the wrong version of the commit was posted, so the
problem was not solved at all.
This commit adds the missing part.
Fixes: 36a50a989e ("tipc: fix infinite loop when dumping link monitor summary")
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 444261ca6f ]
Starting from commit 72f36be061 ("net/mlx5: Fix mlx5_get_uars_page to
return error code") the mlx5_get_uars_page() call returns error in case
of failure, but it was mistakenly overlooked in the merge commit.
Fixes: e7996a9a77 ("Merge tag v4.15 of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git")
Reported-by: Alaa Hleihel <alaa@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 2a01046120 ]
We found the I2C controller count register is unreliable sometimes,
that will cause I2C to lose data. Thus we can read the data count
from 'i2c_dev->count' instead of the I2C controller count register.
Signed-off-by: Baolin Wang <baolin.wang@linaro.org>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit da33aa03fa ]
Add one flag to indicate if the i2c controller has been in suspend state,
which can prevent i2c accesses after i2c controller is suspended following
system suspend.
Signed-off-by: Baolin Wang <baolin.wang@linaro.org>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit e6914365fd ]
For LD20, the bit 5 of the offset 0x200c turned out to be a USB3
reset. The hardware document says it is the GIO reset despite LD20
has no GIO bus, confusingly.
Also, fix confusing comments for PXs3.
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit b4331a6818 ]
A vti6 interface can carry IPv4 as well, so it makes no sense to
enforce a minimum MTU of IPV6_MIN_MTU.
If the user sets an MTU below IPV6_MIN_MTU, IPv6 will be
disabled on the interface, courtesy of addrconf_notify().
Reported-by: Xin Long <lucien.xin@gmail.com>
Fixes: b96f9afee4 ("ipv4/6: use core net MTU range checking")
Fixes: c6741fbed6 ("vti6: Properly adjust vti6 MTU from MTU of lower device")
Fixes: 53c81e95df ("ip6_vti: adjust vti mtu according to mtu of lower device")
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 815425567d ]
Here the variable cont is used as the saved_pointer for a call to
strtok_r(). It is safe to use the value uninitialized in this
context however and the later reference is only ever used if
the strtok_r is successful. But, 'gcc-5' at least doesn't have all
this knowledge so initialize cont to NULL. Additionally, do the
natural NULL check before accessing just for completness.
The warning is the following:
./bpf/tools/bpf/bpf_dbg.c: In function ‘cmd_load’:
./bpf/tools/bpf/bpf_dbg.c:1077:13: warning: ‘cont’ may be used uninitialized in this function [-Wmaybe-uninitialized]
} else if (matches(subcmd, "pcap") == 0) {
Fixes: fd981e3c32 "filter: bpf_dbg: add minimal bpf debugger"
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit b837913fc2 ]
Make kernel print the correct number of TLB entries on Intel Xeon Phi 7210
(and others)
Before:
[ 0.320005] Last level dTLB entries: 4KB 0, 2MB 0, 4MB 0, 1GB 0
After:
[ 0.320005] Last level dTLB entries: 4KB 256, 2MB 128, 4MB 128, 1GB 16
The entries do exist in the official Intel SMD but the type column there is
incorrect (states "Cache" where it should read "TLB"), but the entries for
the values 0x6B, 0x6C and 0x6D are correctly described as 'Data TLB'.
Signed-off-by: Jacek Tomaka <jacek.tomaka@poczta.fm>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20180423161425.24366-1-jacekt@dugeo.com
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit daa2e3bdbb ]
There is an issue(Errata Ref#226) that the SATA can not be
detected via SATA Port-MultiPlayer(PMP) with following
error log:
ata1.15: PMP product ID mismatch
ata1.15: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
ata1.15: Port Multiplier vendor mismatch '0x1b4b'!='0x0'
ata1.15: PMP revalidation failed (errno=-19)
After debugging, the reason is found that the value Port-x
FIS-based Switching Control(PxFBS@0x40) become wrong.
According to design, the bits[11:8, 0] of register PxFBS
are cleared when Port Command and Status (0x18) bit[0]
changes its value from 1 to 0, i.e. falling edge of Port
Command and Status bit[0] sends PULSE that resets PxFBS
bits[11:8; 0].
So it needs a mvebu SATA WA to save the port PxFBS register
before PxCMD ST write and restore it afterwards.
This patch implements the WA in a separate function of
ahci_mvebu_stop_engine to override ahci_stop_gngine.
Signed-off-by: Evan Wang <xswang@marvell.com>
Cc: Ofer Heifetz <oferh@marvell.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit fa89f53bd7 ]
Marvell armada37xx, armada7k and armada8k share the same
AHCI sata controller IP, and currently there is an issue
(Errata Ref#226)that the SATA can not be detected via SATA
Port-MultiPlayer(PMP). After debugging, the reason is
found that the value of Port-x FIS-based Switching Control
(PxFBS@0x40) became wrong.
According to design, the bits[11:8, 0] of register PxFBS
are cleared when Port Command and Status (0x18) bit[0]
changes its value from 1 to 0, i.e. falling edge of Port
Command and Status bit[0] sends PULSE that resets PxFBS
bits[11:8; 0].
So it needs save the port PxFBS register before PxCMD
ST write and restore the port PxFBS register afterwards
in ahci_stop_engine().
This commit allows drivers to override ahci_stop_engine
behavior for use by the Marvell AHCI driver(and potentially
other drivers in the future).
Signed-off-by: Evan Wang <xswang@marvell.com>
Cc: Ofer Heifetz <oferh@marvell.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 5e1ca5e23b ]
It's possible for userspace to control n. Sanitize n when using it as an
array index.
Note that while it appears that n must be bound to the interval [0,3]
due to the way it is extracted from addr, we cannot guarantee that
compiler transformations (and/or future refactoring) will ensure this is
the case, and given this is a slow path it's better to always perform
the masking.
Found by smatch.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Christoffer Dall <christoffer.dall@arm.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Cc: kvmarm@lists.cs.columbia.edu
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit bf0ddaba65 ]
When the blk-mq inflight implementation was added, /proc/diskstats was
converted to use it, but /sys/block/$dev/inflight was not. Fix it by
adding another helper to count in-flight requests by data direction.
Fixes: f299b7c7a9 ("blk-mq: provide internal in-flight variant")
Signed-off-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 1612a981b7 ]
Commit 2a5418a13f ("bpf: improve dead code sanitizing") replaced dead
code with a series of ja-1 instructions, for safety. That made JIT
compilation much more complex for some BPF programs. One instance of such
programs is, for example:
bool flag = false
...
/* A bunch of other code */
...
if (flag)
do_something()
In some cases llvm is not able to remove at compile time the code for
do_something(), so the generated BPF program ends up with a large amount
of dead instructions. In one specific real life example, there are two
series of ~500 and ~1000 dead instructions in the program. When the
verifier replaces them with a series of ja-1 instructions, it causes an
interesting behavior at JIT time.
During the first pass, since all the instructions are estimated at 64
bytes, the ja-1 instructions end up being translated as 5 bytes JMP
instructions (0xE9), since the jump offsets become increasingly large (>
127) as each instruction gets discovered to be 5 bytes instead of the
estimated 64.
Starting from the second pass, the first N instructions of the ja-1
sequence get translated into 2 bytes JMPs (0xEB) because the jump offsets
become <= 127 this time. In particular, N is defined as roughly 127 / (5
- 2) ~= 42. So, each further pass will make the subsequent N JMP
instructions shrink from 5 to 2 bytes, making the image shrink every time.
This means that in order to have the entire program converge, there need
to be, in the real example above, at least ~1000 / 42 ~= 24 passes just
for translating the dead code. If we add this number to the passes needed
to translate the other non dead code, it brings such program to 40+
passes, and JIT doesn't complete. Ultimately the userspace loader fails
because such BPF program was supposed to be part of a prog array owner
being JITed.
While it is certainly possible to try to refactor such programs to help
the compiler remove dead code, the behavior is not really intuitive and it
puts further burden on the BPF developer who is not expecting such
behavior. To make things worse, such programs are working just fine in all
the kernel releases prior to the ja-1 fix.
A possible approach to mitigate this behavior consists into noticing that
for ja-1 instructions we don't really need to rely on the estimated size
of the previous and current instructions, we know that a -1 BPF jump
offset can be safely translated into a 0xEB instruction with a jump offset
of -2.
Such fix brings the BPF program in the previous example to complete again
in ~9 passes.
Fixes: 2a5418a13f ("bpf: improve dead code sanitizing")
Signed-off-by: Gianluca Borello <g.borello@gmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit a230cd52b8 ]
The IBM/Lenovo Scrollpoint mice feature a trackpoint-like stick instead of a
scrolling wheel capable of 2-D (vertical+horizontal) scrolling. hid-generic
does only expose 1-D (vertical) scrolling functionality for these mice. This
patch adds support for horizontal scrolling for the IBM/Lenovo Scrollpoint mice
to hid-lenovo.
[jkosina@suse.cz: remove change versioning from git changelog]
Signed-off-by: Peter Ganzhorn <peter.ganzhorn@gmail.com>
Reviewed-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Signed-off-by: Peter De Wachter <pdewacht@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 292c34c102 ]
When counting uncore event with alias, core event is mistakenly
involved, for example:
perf stat --no-merge -e "unc_m_cas_count.all" -C0 sleep 1
Performance counter stats for 'CPU(s) 0':
0 unc_m_cas_count.all [uncore_imc_4]
0 unc_m_cas_count.all [uncore_imc_2]
0 unc_m_cas_count.all [uncore_imc_0]
153,640 unc_m_cas_count.all [cpu]
0 unc_m_cas_count.all [uncore_imc_5]
25,026 unc_m_cas_count.all [uncore_imc_3]
0 unc_m_cas_count.all [uncore_imc_1]
1.001447890 seconds time elapsed
The reason is that current implementation doesn't check PMU name of a
event when adding its alias into the alias list for core PMU. The
uncore event aliases are mistakenly added.
This bug was introduced in:
commit 14b22ae028 ("perf pmu: Add helper function is_pmu_core to
detect PMU CORE devices")
Checking the PMU name for all PMUs on X86 and other architectures except
ARM.
There is no behavior change for ARM.
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Agustin Vega-Frias <agustinv@codeaurora.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ganapatrao Kulkarni <ganapatrao.kulkarni@cavium.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Shaokun Zhang <zhangshaokun@hisilicon.com>
Cc: Will Deacon <will.deacon@arm.com>
Fixes: 14b22ae028 ("perf pmu: Add helper function is_pmu_core to detect PMU CORE devices")
Link: http://lkml.kernel.org/r/1524594014-79243-1-git-send-email-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 9478f1927e ]
Our arm64_skip_faulting_instruction() helper advances the userspace
singlestep state machine, but this is also called by the kernel BRK
handler, as used for WARN*().
Thus, if we happen to hit a WARN*() while the user singlestep state
machine is in the active-no-pending state, we'll advance to the
active-pending state without having executed a user instruction, and
will take a step exception earlier than expected when we return to
userspace.
Let's fix this by only advancing the state machine when skipping a user
instruction.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Andrey Konovalov <andreyknvl@google.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 85602bea29 ]
Debian toolcahin defaults to PIE, and I guess that will also be the case
of most distributions. This causes the following build failure:
AS arch/riscv/kernel/vdso/getcpu.o
AS arch/riscv/kernel/vdso/flush_icache.o
VDSOLD arch/riscv/kernel/vdso/vdso.so.dbg
OBJCOPY arch/riscv/kernel/vdso/vdso.so
AS arch/riscv/kernel/vdso/vdso.o
VDSOLD arch/riscv/kernel/vdso/vdso-dummy.o
LD arch/riscv/kernel/vdso/vdso-syms.o
riscv64-linux-gnu-ld: attempted static link of dynamic object `arch/riscv/kernel/vdso/vdso-dummy.o'
make[2]: *** [arch/riscv/kernel/vdso/Makefile:43: arch/riscv/kernel/vdso/vdso-syms.o] Error 1
make[1]: *** [scripts/Makefile.build:575: arch/riscv/kernel/vdso] Error 2
make: *** [Makefile:1018: arch/riscv/kernel] Error 2
While the root Makefile correctly passes "-fno-PIE" to build individual
object files, the RISC-V kernel also builds vdso-dummy.o as an
executable, which is therefore linked as PIE. Fix that by updating this
specific link rule to also include "-no-pie".
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 2707df9773 ]
When Qav mode is enabled, queue 0 should be kept on Stream Reservation
mode. From the i210 datasheet, section 8.12.19:
"Note: Queue0 QueueMode must be set to 1b when TransmitMode is set to
Qav." ("QueueMode 1b" represents the Stream Reservation mode)
The solution is to give queue 0 the all the credits it might need, so
it has priority over queue 1.
A situation where this can happen is when cbs is "installed" only on
queue 1, leaving queue 0 alone. For example:
$ tc qdisc replace dev enp2s0 handle 100: parent root mqprio num_tc 3 \
map 2 2 1 0 2 2 2 2 2 2 2 2 2 2 2 2 queues 1@0 1@1 2@2 hw 0
$ tc qdisc replace dev enp2s0 parent 100:2 cbs locredit -1470 \
hicredit 30 sendslope -980000 idleslope 20000 offload 1
Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit f4e5200fc0 ]
The property of the legacy mode for the eMMC PHY turned out to
be wrong. Some eMMC devices are unstable due to the set-up/hold
timing violation. Correct the delay value.
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 117e3b7fed ]
Dan Carpenter had pointed this out a while ago, but the code around
this had changed so wasn't causing any problems since that field
was not used in this error path.
Still, it is cleaner to always initialize this field, so changing
the error path to set it.
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
CC: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Steve French <smfrench@gmail.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 13b86f50ea ]
Starting with kernel 4.17 thermal_cooling_device_register() will call the
get_max_state() op during register.
Since we deref priv->priv in int3403_get_max_state() this means we must
set priv->priv before calling thermal_cooling_device_register().
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 1cf6cc74bb ]
Currently if a user requests clock counters for a node without a GPU
resource we will always return EINVAL.
Instead if no GPU resource is attached, fill the gpu_clock_counter
argument with zeroes so that we may proceed and return valid CPU
counters.
Signed-off-by: Andres Rodriguez <andres.rodriguez@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit a0a37862a4 ]
WDAT table on Lenovo Z50-70 is using RTC SRAM (ports 0x70 and 0x71) to
store state of the timer. This conflicts with Linux RTC driver
(rtc-cmos.c) who fails to reserve those ports for itself preventing RTC
from functioning. In addition the WDAT table seems not to be fully
functional because it does not reset the system when the watchdog times
out.
On this system iTCO_wdt works just fine so we simply prefer to use it
instead of WDAT. This makes RTC working again and also results working
watchdog via iTCO_wdt.
Reported-by: Peter Milley <pbmilley@gmail.com>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=199033
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 94a82284ad ]
Remove unused #address-cells and #size-cells from pinmux
node. This fixes W=1 warnings of the type:
arch/arm/boot/dts/da850-lcdk.dtb: Warning (avoid_unnecessary_addr_size): /soc@1c00000/pinmux@14120: unnecessary #address-cells/#size-cells without "ranges" or child "reg" property
Tested on DA850 LCDK by checking output of:
/sys/kernel/debug/pinctrl/1c14120.pinmux-pinctrl-single/pins
before and after the change.
Reviewed-by: David Lechner <david@lechnology.com>
Signed-off-by: Sekhar Nori <nsekhar@ti.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit b6a930fa88 ]
If WOL event happened once, the LED[2] interrupt pin will not be
cleared unless we read the CSISR register. If interrupts are in use,
the normal interrupt handling will clear the WOL event. Let's clear the
WOL event before enabling it if !phy_interrupt_is_valid().
Signed-off-by: Jingju Hou <Jingju.Hou@synaptics.com>
Signed-off-by: Jisheng Zhang <Jisheng.Zhang@synaptics.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 7fd6641de2 ]
Don't do this via custom code, instead now that we have support in the
arch hotplug/hotunplug code, rely on those routines to do the right
thing.
The existing flush doesn't work because it uses ppc64_caches.l1d.size
instead of ppc64_caches.l1d.line_size.
Fixes: 9d5171a8f2 ("powerpc/powernv: Enable removal of memory for in memory tracing")
Signed-off-by: Balbir Singh <bsingharora@gmail.com>
Reviewed-by: Rashmica Gupta <rashmica.g@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit cc6a0e315a ]
At least on one Dell system the PNP motherboard resources device
includes resources used by WDAT table. Since PNP gets initialized before
WDAT it results following error and no watchdog:
platform wdat_wdt: failed to claim resource 3: [io 0x046a-0x046c]
ACPI: watchdog: Device creation failed: -16
Now, the PNP system driver is already accustomed with the situation that
it cannot reserve all those motherboard resources because drivers using
those might have reserved them already. In addition putting WDAT table
resources under motherboard resources device makes sense in general.
Fix this by initializing WDAT right before PNP. This allows WDAT to
reserve all its resources and still keeps PNP system driver happy.
Reported-by: Shubhrata.Priyadarsh@dell.com
Reported-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Acked-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit db71bbbd11 ]
Submitting a cmd IO request (usually on the WRITE device, but for IDX
also on the READ device) is currently done with ccw_device_start()
and a manual timeout in the caller.
On timeout, the caller cleans up the related resources (eg. IO buffer).
But 1) the IO might still be active and utilize those resources, and
2) when the IO completes, qeth_irq() will attempt to clean up the
same resources again.
Instead of introducing additional resource locking, switch to
ccw_device_start_timeout() to ensure IO termination after timeout, and
let the IRQ handler alone deal with cleaning up after a request.
This also removes a stray write->irq_pending reset from
clear_ipacmd_list(). The routine doesn't terminate any pending IO on
the WRITE device, so this should be handled properly via IO timeout
in the IRQ handler.
Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 2f860691c2 ]
There is the following build error with CONFIG_TYPEC_UCSI=m, CONFIG_FTRACE=y
and CONFIG_TRACING=n:
ERROR: "__tracepoint_ucsi_command" [drivers/usb/typec/ucsi/typec_ucsi.ko] undefined!
ERROR: "__tracepoint_ucsi_register_port" [drivers/usb/typec/ucsi/typec_ucsi.ko] undefined!
ERROR: "__tracepoint_ucsi_notify" [drivers/usb/typec/ucsi/typec_ucsi.ko] undefined!
ERROR: "__tracepoint_ucsi_reset_ppm" [drivers/usb/typec/ucsi/typec_ucsi.ko] undefined!
ERROR: "__tracepoint_ucsi_run_command" [drivers/usb/typec/ucsi/typec_ucsi.ko] undefined!
ERROR: "__tracepoint_ucsi_ack" [drivers/usb/typec/ucsi/typec_ucsi.ko] undefined!
ERROR: "__tracepoint_ucsi_connector_change" [drivers/usb/typec/ucsi/typec_ucsi.ko] undefined!
This compination is quite hard to create because CONFIG_TRACING gets selected
only in rare cases without CONFIG_FTRACE.
The build failure is caused by conditionally compiling trace.c depending on
the wrong option CONFIG_FTRACE. Change this to depend on CONFIG_TRACING like
other users of tracepoints do.
Fixes: c1b0bc2dab ("usb: typec: Add support for UCSI interface")
Signed-off-by: Tobias Regnery <tobias.regnery@gmail.com>
Acked-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit a841aa83df ]
Chun-Yi reported a kernel warning message below:
WARNING: CPU: 0 PID: 0 at ../mm/early_ioremap.c:182 early_iounmap+0x4f/0x12c()
early_iounmap(ffffffffff200180, 00000118) [0] size not consistent 00000120
The problem is x86 kexec_file_load adds extra alignment to the efi
memmap: in bzImage64_load():
efi_map_sz = efi_get_runtime_map_size();
efi_map_sz = ALIGN(efi_map_sz, 16);
And __efi_memmap_init maps with the size including the alignment bytes
but efi_memmap_unmap use nr_maps * desc_size which does not include the
extra bytes.
The alignment in kexec code is only needed for the kexec buffer internal
use Actually kexec should pass exact size of the efi memmap to 2nd
kernel.
Link: http://lkml.kernel.org/r/20180417083600.GA1972@dhcp-128-65.nay.redhat.com
Signed-off-by: Dave Young <dyoung@redhat.com>
Reported-by: joeyli <jlee@suse.com>
Tested-by: Randy Wright <rwright@hpe.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 4bc83b3f27 ]
In the case when the phy_mask is bitwise anded with the phy_index bit is
zero the continue statement currently jumps to the next iteration of the
while loop and phy_index is never actually incremented, potentially
causing an infinite loop if phy_index is less than SCI_MAX_PHS. Fix this
by turning the while loop into a for loop.
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit f76cdd00ef ]
The read_persistent_clock() uses a timespec, which is not year 2038 safe
on 32bit systems. On parisc architecture, we have implemented generic
RTC drivers that can be used to compensate the system suspend time, but
the RTC time can not represent the nanosecond resolution, so this patch
just converts to read_persistent_clock64() with timespec64.
Signed-off-by: Baolin Wang <baolin.wang@linaro.org>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit a9e5b73288 ]
In do_mount() when the MS_* flags are being converted to MNT_* flags,
MS_RDONLY got accidentally convered to SB_RDONLY.
Undo this change.
Fixes: e462ec50cb ("VFS: Differentiate mount flags (MS_*) from internal superblock flags")
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 660625922b ]
AFS server records get removed from the net->fs_servers tree when
they're deleted, but not from the net->fs_addresses{4,6} lists, which
can lead to an oops in afs_find_server() when a server record has been
removed, for instance during rmmod.
Fix this by deleting the record from the by-address lists before posting
it for RCU destruction.
The reason this hasn't been noticed before is that the fileserver keeps
probing the local cache manager, thereby keeping the service record
alive, so the oops would only happen when a fileserver eventually gets
bored and stops pinging or if the module gets rmmod'd and a call comes
in from the fileserver during the window between the server records
being destroyed and the socket being closed.
The oops looks something like:
BUG: unable to handle kernel NULL pointer dereference at 000000000000001c
...
Workqueue: kafsd afs_process_async_call [kafs]
RIP: 0010:afs_find_server+0x271/0x36f [kafs]
...
Call Trace:
afs_deliver_cb_init_call_back_state3+0x1f2/0x21f [kafs]
afs_deliver_to_call+0x1ee/0x5e8 [kafs]
afs_process_async_call+0x5b/0xd0 [kafs]
process_one_work+0x2c2/0x504
worker_thread+0x1d4/0x2ac
kthread+0x11f/0x127
ret_from_fork+0x24/0x30
Fixes: d2ddc776a4 ("afs: Overhaul volume and server record caching and fileserver rotation")
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit f4ea89110d ]
When longer interface names are used, the action names exposed in
/proc/interrupts and /proc/irq/* maybe truncated. For example, when
using the predictable name algorithm in systemd on a HiSilicon D05,
I see:
ubuntu@d05-3:~$ grep enahisic2i0-tx /proc/interrupts | sed 's/.* //'
enahisic2i0-tx0
enahisic2i0-tx1
[...]
enahisic2i0-tx8
enahisic2i0-tx9
enahisic2i0-tx1
enahisic2i0-tx1
enahisic2i0-tx1
enahisic2i0-tx1
enahisic2i0-tx1
enahisic2i0-tx1
Increase the max ring name length to allow for an interface name
of IFNAMSIZE. After this change, I now see:
$ grep enahisic2i0-tx /proc/interrupts | sed 's/.* //'
enahisic2i0-tx0
enahisic2i0-tx1
enahisic2i0-tx2
[...]
enahisic2i0-tx8
enahisic2i0-tx9
enahisic2i0-tx10
enahisic2i0-tx11
enahisic2i0-tx12
enahisic2i0-tx13
enahisic2i0-tx14
enahisic2i0-tx15
Signed-off-by: dann frazier <dann.frazier@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 901932a3f9 ]
The initializing of q->root_blkg is currently outside of queue lock
and rcu, so the blkg may be destroied before the initializing, which
may cause dangling/null references. On the other side, the destroys
of blkg are protected by queue lock or rcu. Put the initializing
inside the queue lock and rcu to make it safer.
Signed-off-by: Jiang Biao <jiang.biao2@zte.com.cn>
Signed-off-by: Wen Yang <wen.yang99@zte.com.cn>
CC: Tejun Heo <tj@kernel.org>
CC: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit a4af89286f ]
The function dsi_get_cmd_fmt returns enum dsi_cmd_dst_format,
use the correct enum value also for MIPI_DSI_FMT_RGB666/_PACKED.
This has been discovered using clang:
drivers/gpu/drm/msm/dsi/dsi_host.c:743:35: warning: implicit conversion
from enumeration type 'enum dsi_vid_dst_format' to different
enumeration type 'enum dsi_cmd_dst_format' [-Wenum-conversion]
case MIPI_DSI_FMT_RGB666: return VID_DST_FORMAT_RGB666;
~~~~~~ ^~~~~~~~~~~~~~~~~~~~~
Signed-off-by: Stefan Agner <stefan@agner.ch>
Reviewed-by: Archit Taneja <architt@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 3976626ea3 ]
Commit 62e3a3e342 changed get_pages() to initialise
msm_gem_object::pages before trying to initialise msm_gem_object::sgt,
so that put_pages() would properly clean up pages in the failure
case.
However, this means that put_pages() now needs to check that
msm_gem_object::sgt is not null before trying to clean it up, and
this check was only applied to part of the cleanup code. Move
it all into the conditional block. (Strictly speaking we don't
need to make the kfree() conditional, but since we can't avoid
checking for null ourselves we may as well do so.)
Fixes: 62e3a3e342 ("drm/msm: fix leak in failed get_pages")
Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Reviewed-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 39f2ff0816 ]
Move these options inside the scope of the 'if' NF_TABLES and
NF_TABLES_IPV6 dependencies. This patch fixes:
net/ipv6/netfilter/nft_chain_nat_ipv6.o: In function `nft_nat_do_chain':
>> net/ipv6/netfilter/nft_chain_nat_ipv6.c:37: undefined reference to `nft_do_chain'
net/ipv6/netfilter/nft_chain_nat_ipv6.o: In function `nft_chain_nat_ipv6_exit':
>> net/ipv6/netfilter/nft_chain_nat_ipv6.c:94: undefined reference to `nft_unregister_chain_type'
net/ipv6/netfilter/nft_chain_nat_ipv6.o: In function `nft_chain_nat_ipv6_init':
>> net/ipv6/netfilter/nft_chain_nat_ipv6.c:87: undefined reference to `nft_register_chain_type'
that happens with:
CONFIG_NF_TABLES=m
CONFIG_NFT_CHAIN_NAT_IPV6=y
Fixes: 02c7b25e5f ("netfilter: nf_tables: build-in filter chain type")
Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit af17092810 ]
Instead of always multicasting responses, send a unicast netlink message
directed at the correct pid. This will be needed if we ever want to
support multiple userspace processes interacting with the kernel over
iSCSI netlink simultaneously. Limitations can currently be seen if you
attempt to run multiple iscsistart commands in parallel.
We've fixed up the userspace issues in iscsistart that prevented
multiple instances from running, so now attempts to speed up booting by
bringing up multiple iscsi sessions at once in the initramfs are just
running into misrouted responses that this fixes.
Signed-off-by: Chris Leech <cleech@redhat.com>
Reviewed-by: Lee Duncan <lduncan@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 36a50a989e ]
When configuring the number of used bearers to MAX_BEARER and issuing
command "tipc link monitor summary", the command enters infinite loop
in user space.
This issue happens because function tipc_nl_node_dump_monitor() returns
the wrong 'prev_bearer' value when all potential monitors have been
scanned.
The correct behavior is to always try to scan all monitors until either
the netlink message is full, in which case we return the bearer identity
of the affected monitor, or we continue through the whole bearer array
until we can return MAX_BEARERS. This solution also caters for the case
where there may be gaps in the bearer array.
Signed-off-by: Tung Nguyen <tung.q.nguyen@dektech.com.au>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 4b7b0d7b25 ]
The Khadas VIM2 board connects the dwc3 controller to an internal 4-port
USB hub which. Two of these ports are accessible directly soldered to
the board, while the other two are accessible through the 40-pin "GPIO"
header.
Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Signed-off-by: Kevin Hilman <khilman@baylibre.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit b83687f359 ]
The LibreTech CC ("Le Potato") board provides four USB connectors.
These are provided by a hub which is connected to the SoC's USB
controller.
Enable the SoC's USB controller to make the USB ports usable. Also turn
on the HDMI_5V regulator when powering on the PHY because (even though
it's not shown in the schematics) HDMI_5V also supplies the USB VBUS.
Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Signed-off-by: Kevin Hilman <khilman@baylibre.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 972cd12a02 ]
All S905D (GXL) and S912 (GXM) reference boards (namely these are
P230, P231, Q200 and Q201) provide USB connectors.
This enables the USB controller on these boards to make the USB ports
actually usable.
Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Signed-off-by: Kevin Hilman <khilman@baylibre.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit b9f07cb4f4 ]
All boards based on the P212 reference design (the P212 reference board
itself and the Khadas VIM) have USB connectors (in case of the Khadas
VIM the first port is exposed through the USB Type-C connector, the
second port is connected to a 4-port USB hub).
This enables the USB controller on these boards to make the USB ports
actually usable.
Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Signed-off-by: Kevin Hilman <khilman@baylibre.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 458baa95c8 ]
The USB configuration on GXM is slightly different than on GXL. The dwc3
controller's internal hub has three USB2 ports (instead of 2 on GXL)
along with a dedicated USB2 PHY for this port. However, it seems that
there are no pins on GXM which would allow connecting the third port to
a physical USB port.
Passing the third PHY is required though, because without it none of the
other USB ports is working (this seems to be a limitation of how the
internal USB hub works, if one PHY is disabled then no USB port works).
Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Signed-off-by: Kevin Hilman <khilman@baylibre.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 8aec5fc1d4 ]
This adds USB host support to the Meson GXL SoC. A dwc3 controller is
used for host-mode, while a dwc2 controller (not added in this patch
because I could not get it working) is used for device-mode only.
The dwc3 controller's internal roothub has two USB2 ports enabled but no
USB3 port. Each of the ports is supplied by a separate PHY. The USB pins
are connected to the SoC's USBHOST_A and USBOTG_B pins.
Due to the way the roothub works internally the USB PHYs are left
enabled. When the dwc3 controller is disabled the PHY is never powered on
so it does not draw any extra power. However, when the dwc3 host
controller is enabled then all PHYs also have to be enabled, otherwise
USB devices will not be detected (regardless of whether they are plugged
into an enabled port or not). This means that only the dwc3 controller
has to be enabled on boards with USB support (instead of requiring all
boards to enable the PHYs additionally with the chance of forgetting to
enable one and breaking all other ports with that as well).
This also adds the USB3 PHY which currently only does some basic
initialization. That however is required because without it high-speed
devices (like USB thumb drives) do not work on some devices (probably
because the bootloader does not configure the USB3 PHY registers).
Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Signed-off-by: Kevin Hilman <khilman@baylibre.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 946b81da11 ]
As described in the comment of blkcg_activate_policy(),
*Update of each blkg is protected by both queue and blkcg locks so
that holding either lock and testing blkcg_policy_enabled() is
always enough for dereferencing policy data.*
with queue lock held, there is no need to hold blkcg lock in
blkcg_deactivate_policy(). Similar case is in
blkcg_activate_policy(), which has removed holding of blkcg lock in
commit 4c55f4f9ad.
Signed-off-by: Jiang Biao <jiang.biao2@zte.com.cn>
Signed-off-by: Wen Yang <wen.yang99@zte.com.cn>
CC: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 49530e6411 ]
In case of xspi work in busy condition, may send bytes failed.
once something wrong, spi controller did't work any more
My test found this situation appear in both of read/write process.
so when TX FIFO is full, add one byte delay before send data;
Signed-off-by: sxauwsk <sxauwsk@163.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit bf9a41377d ]
When vgic_prune_ap_list() finds an interrupt that needs to be migrated
to a new VCPU, we should notify this VCPU of the pending interrupt,
since it requires immediate action.
Kick this VCPU once we have added the new IRQ to the list, but only
after dropping the locks.
Reported-by: Stefano Stabellini <sstabellini@kernel.org>
Reviewed-by: Christoffer Dall <christoffer.dall@arm.com>
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 5db8f8d109 ]
As documented in the devicetree bindings (pci/kirin-pcie.txt) and the
reset gpio name must be 'reset-gpios'. However, current driver
erroneously looks for a 'reset-gpio' resource which makes the driver
probe fail. Fix it.
Fixes: fc5165db24 ("PCI: kirin: Add HiSilicon Kirin SoC PCIe controller driver")
Signed-off-by: Loic Poulain <loic.poulain@linaro.org>
[lorenzo.pieralisi@arm.com: updated the commit log]
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Acked-by: Xiaowei Song <songxiaowei@hisilicon.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 9dfbf78e41 ]
If there is no d-cache-size property in the device tree, l1d_size could
be zero. We don't actually expect that to happen, it's only been seen
on mambo (simulator) in some configurations.
A zero-size l1d_size leads to the loop in the asm wrapping around to
2^64-1, and then walking off the end of the fallback area and
eventually causing a page fault which is fatal.
Just default to 64K which is correct on some CPUs, and sane enough to
not cause a crash on others.
Fixes: aa8a5e0062 ('powerpc/64s: Add support for RFI flush of L1-D cache')
Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
[mpe: Rewrite comment and change log]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 144345a4a8 ]
If CONFIG_RASPBERRYPI_FIRMWARE=n:
drivers/gpio/gpio-raspberrypi-exp.c: In function ‘rpi_exp_gpio_get_polarity’:
drivers/gpio/gpio-raspberrypi-exp.c:71: warning: ‘get.polarity’ is used uninitialized in this function
drivers/gpio/gpio-raspberrypi-exp.c: In function ‘rpi_exp_gpio_get_direction’:
drivers/gpio/gpio-raspberrypi-exp.c:150: warning: ‘get.direction’ is used uninitialized in this function
The dummy firmware interface functions return 0, which means success,
causing subsequent code to make use of the never initialized output
parameter.
Fix this by making the dummy functions return an error code (-ENOSYS)
instead.
Note that this assumes the firmware always fills in the requested data
in the CONFIG_RASPBERRYPI_FIRMWARE=y case.
Fixes: d45f1a563b ("staging: vc04_services: fix up rpi firmware functions")
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 0a12e80ce4 ]
Commit a09cd35658 ("ARM: bcm2835: add rpi power domain driver")
attempted to annotate the structure rpi_power_domain_packet with
__packed but introduced a typo and made it named __packet instead. Just
drop the annotation since the structure is naturally aligned already.
Fixes: a09cd35658 ("ARM: bcm2835: add rpi power domain driver")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit e86281e700 ]
Both ecryptfs_filldir() and ecryptfs_readlink_lower() use
ecryptfs_decode_and_decrypt_filename() to translate lower filenames to
upper filenames. The function correctly passes up lower filenames,
unchanged, when filename encryption isn't in use. However, it was also
passing up lower filenames when the filename wasn't encrypted or
when decryption failed. Since 88ae4ab980, eCryptfs refuses to lookup
lower plaintext names when filename encryption is enabled so this
resulted in a situation where userspace would see lower plaintext
filenames in calls to getdents(2) but then not be able to lookup those
filenames.
An example of this can be seen when enabling filename encryption on an
eCryptfs mount at the root directory of an Ext4 filesystem:
$ ls -1i /lower
12 ECRYPTFS_FNEK_ENCRYPTED.FWYZD8TcW.5FV-TKTEYOHsheiHX9a-w.NURCCYIMjI8pn5BDB9-h3fXwrE--
11 lost+found
$ ls -1i /upper
ls: cannot access '/upper/lost+found': No such file or directory
? lost+found
12 test
With this change, the lower lost+found dentry is ignored:
$ ls -1i /lower
12 ECRYPTFS_FNEK_ENCRYPTED.FWYZD8TcW.5FV-TKTEYOHsheiHX9a-w.NURCCYIMjI8pn5BDB9-h3fXwrE--
11 lost+found
$ ls -1i /upper
12 test
Additionally, some potentially noisy error/info messages in the related
code paths are turned into debug messages so that the logs can't be
easily filled.
Fixes: 88ae4ab980 ("ecryptfs_lookup(): try either only encrypted or plaintext name")
Reported-by: Guenter Roeck <linux@roeck-us.net>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit bc8a3ef194 ]
The size of these modules is 0x2000, not 0x3000. The extra 0x1000
after 0x2000 is for the interconnect target agent which is a separate
device.
Fixes: 7415b0b4c6 ("ARM: dts: omap4: add minimal l4 bus layout with
control module support")
Cc: Tero Kristo <t-kristo@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 4ad69b80e8 ]
CLK_MUX_ROUND_CLOSEST is part of the clk_mux documentation but clk_mux
directly calls __clk_mux_determine_rate(), which overrides the flag.
As result, if clk_mux is instantiated with CLK_MUX_ROUND_CLOSEST, the
flag will be ignored and the clock rounded down.
To solve this, this patch expose clk_mux_determine_rate_flags() in the
clk-provider API and uses it in the determine_rate() callback of clk_mux.
Fixes: 15a02c1f6d ("clk: Add __clk_mux_determine_rate_closest")
Signed-off-by: Jerome Brunet <jbrunet@baylibre.com>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 10b4640833 ]
The change fixes a bit field overflow which allows to write to higher
bits while calculating SPI transfer clock and setting BRPS and BRDV
bit fields, the problem is reproduced if 'parent_rate' to 'spi_hz'
ratio is greater than 1024, for instance
p->min_div = 2,
MSO rate = 33333333,
SPI device rate = 10000
results in
k = 5, i.e. BRDV = 0b100 or 1/32 prescaler output,
BRPS = 105,
TSCR value = 0x6804, thus MSSEL and MSIMM bit fields are non-zero.
Fixes: 65d5665bb2 ("spi: sh-msiof: Update calculation of frequency dividing")
Signed-off-by: Vladimir Zapolskiy <vladimir_zapolskiy@mentor.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 4f34a5130a ]
When specifying string type mount option (e.g., iocharset)
several times in a mount, current option parsing may
cause memory leak. Hence, call kfree for previous one
in this case. Meanwhile, check memory allocation result
for it.
Signed-off-by: Chengguang Xu <cgxu519@gmx.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 760dd0eeae ]
The module exit function of the smsgiucv module uses the incorrect CP
command to disable SMSG messages. The correct command is "SET SMSG OFF".
Use it.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit a1cc7034e3 ]
While a barrier is present in the writeX() functions before the register
write, a similar barrier is missing in the readX() functions after the
register read. This could allow memory accesses following readX() to
observe stale data.
Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
Reported-by: Arnd Bergmann <arnd@arndb.de>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Paul Burton <paul.burton@mips.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/19069/
[jhogan@kernel.org: Tidy commit message]
Signed-off-by: James Hogan <jhogan@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 92183a4289 ]
The ignore mask logic in send_to_group() does not match the logic
in fanotify_should_send_event(). In the latter, a vfsmount mark ignore
mask precedes an inode mark mask and in the former, it does not.
That difference may cause events to be sent to fanotify backend for no
reason. Fix the logic in send_to_group() to match that of
fanotify_should_send_event().
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit cf2cbadc20 ]
Introduce a second skb list for handling control messages and limit the
number of allowed messages. Some control messages are considered more
crucial than others, resulting in the need for a second skb list. By
splitting the list into a separate high and low priority list we can
ensure that messages on the high list get added to the head of the list
that gets processed, this however has no functional impact. Previously
there was no limit on the number of messages allowed on the queue, this
could result in the queue growing boundlessly and eventually the host
running out of memory.
Fixes: b985f870a5 ("nfp: process control messages in workqueue in flower app")
Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 5496295aef ]
We currently allow signals to interrupt the wait for management FW
commands. Exiting the wait should not cause trouble, the FW will
just finish executing the command in the background and new commands
will wait for the old one to finish.
However, this may not be what users expect (Ctrl-C not actually stopping
the command). Moreover some systems routinely request link information
with signals pending (Ubuntu 14.04 runs a landscape-sysinfo python tool
from MOTD) worrying users with errors like these:
nfp 0000:04:00.0: nfp_nsp: Error -512 waiting for code 0x0007 to start
nfp 0000:04:00.0: nfp: reading port table failed -512
Make the wait for management FW responses non-interruptible.
Fixes: 1a64821c6a ("nfp: add support for service processor access")
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit f6b7aeee8f ]
writeX() has strong ordering semantics with respect to memory updates.
In the absence of a write barrier or a compiler barrier, the compiler
can reorder register and memory update instructions. This breaks the
writeX() API.
Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Paul Burton <paul.burton@mips.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/18997/
[jhogan@kernel.org: Tidy commit message]
Signed-off-by: James Hogan <jhogan@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit f0f4cf5b30 ]
According to the sub-section titled 'VM-Execution Control Fields' in the
section titled 'Basic VM-Entry Checks' in Intel SDM vol. 3C, the following
vmentry check must be enforced:
If the 'virtualize APIC-accesses' VM-execution control is 1, the
APIC-access address must satisfy the following checks:
- Bits 11:0 of the address must be 0.
- The address should not set any bits beyond the processor's
physical-address width.
This patch adds the necessary check to conform to this rule. If the check
fails, we cause the L2 VMENTRY to fail which is what the associated unit
test (following patch) expects.
Reviewed-by: Mihai Carabas <mihai.carabas@oracle.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
Reviewed-by: Wanpeng Li <wanpengli@tencent.com>
Signed-off-by: Krish Sadhukhan <krish.sadhukhan@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 90619eb1dc ]
The split between ACPI and PCI platforms generated issues with randconfig:
with SND_SST_ATOM_HIFI2_PLATFORM_PCI=y and
SND_SST_ATOM_HIFI2_PLATFORM=m, we get this module link failure:
ERROR: "sst_context_init"
[sound/soc/intel/atom/sst/snd-intel-sst-acpi.ko] undefined!
ERROR: "sst_context_cleanup"
[sound/soc/intel/atom/sst/snd-intel-sst-acpi.ko] undefined!
ERROR: "sst_alloc_drv_context"
[sound/soc/intel/atom/sst/snd-intel-sst-acpi.ko] undefined!
ERROR: "intel_sst_pm" [sound/soc/intel/atom/sst/snd-intel-sst-acpi.ko]
undefined!
ERROR: "sst_configure_runtime_pm"
[sound/soc/intel/atom/sst/snd-intel-sst-acpi.ko] undefined!
To keep things simple, let's expose two configs for
SND_SST_ATOM_HIFI2_PLATFORM_PCI and SND_SST_ATOM_HIFI2_PLATFORM_ACPI,
which select a common SND_SST_ATOM_HIFI2_PLATFORM option. To avoid
breaking existing solutions with the semantics change,
SND_SST_ATOM_HIFI2_PLATFORM_ACPI uses "default ACPI" so that "make
oldnoconfig" and "make olddefconfig" still work as expected.
Also remove mentions of Medfield while we are at it since it was
removed recently.
Reported-by: Arnd Bergmann <arnd@arndb.de>
Fixes: 4772c16ede ("ASoC: Intel: Kconfig: Simplify-clarify ACPI/PCI
dependencies")
Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-By: Vinod Koul <vinod.koul@intel.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 506a7be93f ]
According to i.MX7ULP reference manual, TPM_SC_CPWMS can ONLY be written when
counter is disabled, TPM_SC_TOF is write-1-clear, TPM_C0SC_CHF is also
write-1-clear, correct these registers initialization flow;
Signed-off-by: Anson Huang <Anson.Huang@nxp.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 730f23b660 upstream.
In p8_aes_xts_init() we do a printk(KERN_INFO ...) to report the
fallback implementation we're using. However with a slow console this
can significantly affect the speed of crypto operations. So remove it.
Fixes: c07f5d3da6 ("crypto: vmx - Adding support for XTS")
Cc: stable@vger.kernel.org # v4.8+
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1411b5218a upstream.
In the vmx AES init routines we do a printk(KERN_INFO ...) to report
the fallback implementation we're using.
However with a slow console this can significantly affect the speed of
crypto operations. Using 'cryptsetup benchmark' the removal of the
printk() leads to a ~5x speedup for aes-cbc decryption.
So remove them.
Fixes: 8676590a15 ("crypto: vmx - Adding AES routines for VMX module")
Fixes: 8c755ace35 ("crypto: vmx - Adding CBC routines for VMX module")
Fixes: 4f7f60d312 ("crypto: vmx - Adding CTR routines for VMX module")
Fixes: cc333cd68d ("crypto: vmx - Adding GHASH routines for VMX module")
Cc: stable@vger.kernel.org # v4.1+
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c782a8c43e upstream.
After issuing a request an endless loop was used to read the
completion state from memory which is asynchronously updated
by the ZIP coprocessor.
Add an upper bound to the retry attempts to prevent a CPU getting stuck
forever in case of an error. Additionally, add a read memory barrier
and a small delay between the reading attempts.
Signed-off-by: Jan Glauber <jglauber@cavium.com>
Reviewed-by: Robert Richter <rrichter@cavium.com>
Cc: stable <stable@vger.kernel.org> # 4.14
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 37ff02acaa upstream.
Enabling virtual mapped kernel stacks breaks the thunderx_zip
driver. On compression or decompression the executing CPU hangs
in an endless loop. The reason for this is the usage of __pa
by the driver which does no longer work for an address that is
not part of the 1:1 mapping.
The zip driver allocates a result struct on the stack and needs
to tell the hardware the physical address within this struct
that is used to signal the completion of the request.
As the hardware gets the wrong address after the broken __pa
conversion it writes to an arbitrary address. The zip driver then
waits forever for the completion byte to contain a non-zero value.
Allocating the result struct from 1:1 mapped memory resolves this
bug.
Signed-off-by: Jan Glauber <jglauber@cavium.com>
Reviewed-by: Robert Richter <rrichter@cavium.com>
Cc: stable <stable@vger.kernel.org> # 4.14
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3a488aaec6 upstream.
There are two IV-related issues:
(1) crypto API does not guarantee to provide an IV buffer that is DMAable,
thus it's incorrect to DMA map it
(2) for in-place decryption, since ciphertext is overwritten with
plaintext, updated IV (req->info) will contain the last block of plaintext
(instead of the last block of ciphertext)
While these two issues could be fixed separately, it's straightforward
to fix both in the same time - by using the {ablkcipher,aead}_edesc
extended descriptor to store the IV that will be fed to the crypto engine;
this allows for fixing (2) by saving req->src[last_block] in req->info
directly, i.e. without allocating yet another temporary buffer.
A side effect of the fix is that it's no longer possible to have the IV
contiguous with req->src or req->dst.
Code checking for this case is removed.
Cc: <stable@vger.kernel.org> # 4.14+
Fixes: a68a193805 ("crypto: caam/qi - properly set IV after {en,de}crypt")
Link: http://lkml.kernel.org/r/20170113084620.GF22022@gondor.apana.org.au
Reported-by: Gilad Ben-Yossef <gilad@benyossef.com>
Signed-off-by: Horia Geantă <horia.geanta@nxp.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 115957bb3e upstream.
There are two IV-related issues:
(1) crypto API does not guarantee to provide an IV buffer that is DMAable,
thus it's incorrect to DMA map it
(2) for in-place decryption, since ciphertext is overwritten with
plaintext, updated req->info will contain the last block of plaintext
(instead of the last block of ciphertext)
While these two issues could be fixed separately, it's straightforward
to fix both in the same time - by allocating extra space in the
ablkcipher_edesc for the IV that will be fed to the crypto engine;
this allows for fixing (2) by saving req->src[last_block] in req->info
directly, i.e. without allocating another temporary buffer.
A side effect of the fix is that it's no longer possible to have the IV
and req->src contiguous. Code checking for this case is removed.
Cc: <stable@vger.kernel.org> # 4.13+
Fixes: 854b06f768 ("crypto: caam - properly set IV after {en,de}crypt")
Link: http://lkml.kernel.org/r/20170113084620.GF22022@gondor.apana.org.au
Reported-by: Gilad Ben-Yossef <gilad@benyossef.com>
Signed-off-by: Horia Geantă <horia.geanta@nxp.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3c9fa24ca7 upstream.
The functions that were used in the emulation of fxrstor, fxsave, sgdt and
sidt were originally meant for task switching, and as such they did not
check privilege levels. This is very bad when the same functions are used
in the emulation of unprivileged instructions. This is CVE-2018-10853.
The obvious fix is to add a new argument to ops->read_std and ops->write_std,
which decides whether the access is a "system" access or should use the
processor's CPL.
Fixes: 129a72a0d3 ("KVM: x86: Introduce segmented_write_std", 2017-01-12)
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4a7e625ce5 upstream.
Commit 9b96fbacda ("serial: PL011: clear pending interrupts")
clears the RX and receive timeout interrupts on pl011 startup, to
avoid a screaming-interrupt scenario that can occur when the
firmware or bootloader leaves these interrupts asserted.
This has been noted as an issue when running Linux on qemu [1].
Unfortunately, the above fix seems to lead to potential
misbehaviour if the RX FIFO interrupt is asserted _non_ spuriously
on driver startup, if the RX FIFO is also already full to the
trigger level.
Clearing the RX FIFO interrupt does not change the FIFO fill level.
In this scenario, because the interrupt is now clear and because
the FIFO is already full to the trigger level, no new assertion of
the RX FIFO interrupt can occur unless the FIFO is drained back
below the trigger level. This never occurs because the pl011
driver is waiting for an RX FIFO interrupt to tell it that there is
something to read, and does not read the FIFO at all until that
interrupt occurs.
Thus, simply clearing "spurious" interrupts on startup may be
misguided, since there is no way to be sure that the interrupts are
truly spurious, and things can go wrong if they are not.
This patch instead clears the interrupt condition by draining the
RX FIFO during UART startup, after clearing any potentially
spurious interrupt. This should ensure that an interrupt will
definitely be asserted if the RX FIFO subsequently becomes
sufficiently full.
The drain is done at the point of enabling interrupts only. This
means that it will occur any time the UART is newly opened through
the tty layer. It will not apply to polled-mode use of the UART by
kgdboc: since that scenario cannot use interrupts by design, this
should not matter. kgdboc will interact badly with "normal" use of
the UART in any case: this patch makes no attempt to paper over
such issues.
This patch does not attempt to address the case where the RX FIFO
fills faster than it can be drained: that is a pathological
hardware design problem that is beyond the scope of the driver to
work around. As a failsafe, the number of poll iterations for
draining the FIFO is limited to twice the FIFO size. This will
ensure that the kernel at least boots even if it is impossible to
drain the FIFO for some reason.
[1] [Qemu-devel] [Qemu-arm] [PATCH] pl011: do not put into fifo
before enabled the interruption
https://lists.gnu.org/archive/html/qemu-devel/2018-01/msg06446.html
Reported-by: Wei Xu <xuwei5@hisilicon.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: Peter Maydell <peter.maydell@linaro.org>
Fixes: 9b96fbacda ("serial: PL011: clear pending interrupts")
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Cc: stable <stable@vger.kernel.org>
Tested-by: Wei Xu <xuwei5@hisilicon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 13dc04d0e5 upstream.
I noticed that unused UARTs won't necessarily idle properly always
unless at least one byte tx transfer is done first.
After some debugging I narrowed down the problem to the scr register
dma configuration bits that need to be set before softreset for the
clocks to idle. Unless we do this, the module clkctrl idlest bits
may be set to 1 instead of 3 meaning the clock will never idle and
is blocking deeper idle states for the whole domain.
This might be related to the configuration done by the bootloader
or kexec booting where certain configurations cause the 8250 or
the clkctrl clock to jam in a way where setting of the scr bits
and reset is needed to clear it. I've tried diffing the 8250
registers for the various modes, but did not see anything specific.
So far I've only seen this on omap4 but I'm suspecting this might
also happen on the other clkctrl using SoCs considering they
already have a quirk enabled for UART_ERRATA_CLOCK_DISABLE.
Let's fix the issue by configuring scr before reset for basic dma
even if we don't use it. The scr register will be reset when we do
softreset few lines after, and we restore scr on resume. We should
do this for all the SoCs with UART_ERRATA_CLOCK_DISABLE quirk flag
set since the ones with UART_ERRATA_CLOCK_DISABLE are all based
using clkctrl similar to omap4.
Looks like both OMAP_UART_SCR_DMAMODE_1 | OMAP_UART_SCR_DMAMODE_CTL
bits are needed for the clkctrl to idle after a softreset.
And we need to add omap4 to also use the UART_ERRATA_CLOCK_DISABLE
for the related workaround to be enabled. This same compatible
value will also be used for omap5.
Fixes: cdb929e445 ("serial: 8250_omap: workaround errata around idling UART after using DMA")
Cc: Keerthy <j-keerthy@ti.com>
Cc: Matthijs van Duin <matthijsvanduin@gmail.com>
Cc: Sekhar Nori <nsekhar@ti.com>
Cc: Tero Kristo <t-kristo@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit aa2f80e752 upstream.
The best granularity of residue that DMA engine can report is in the BURST
units, so the serial driver must use MAXBURST = 1 and DMA_SLAVE_BUSWIDTH_1_BYTE
if it relies on exact number of bytes transferred by DMA engine.
Fixes: 62c37eedb7 ("serial: samsung: add dma reqest/release functions")
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Acked-by: Krzysztof Kozlowski <krzk@kernel.org>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9594b5be7e upstream.
I was puzzled while looking at /proc/interrupts and random things showed
up between reboots. This occurred more often but I realised it later. The
"correct" output should be:
|38: 11861 atmel-aic5 2 Level ttyS0
but I saw sometimes
|38: 6426 atmel-aic5 2 Level tty1
and accounted it wrongly as correct. This is use after free and the
former example randomly got the "old" pointer which pointed to the same
content. With SLAB_FREELIST_RANDOM and HARDENED I even got
|38: 7067 atmel-aic5 2 Level E=Started User Manager for UID 0
or other nonsense.
As it turns out the tty, pointer that is accessed in atmel_startup(), is
freed() before atmel_shutdown(). It seems to happen quite often that the
tty for ttyS0 is allocated and freed while ->shutdown is not invoked. I
don't do anything special - just a systemd boot :)
Use dev_name(&pdev->dev) as the IRQ name for request_irq(). This exists
as long as the driver is loaded so no use-after-free here.
Cc: stable@vger.kernel.org
Fixes: 761ed4a945 ("tty: serial_core: convert uart_close to use tty_port_close")
Acked-by: Richard Genoud <richard.genoud@gmail.com>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 003bc1dee2 upstream.
This patch fixes an issue that this driver cannot call phy_init()
if a gadget driver is alreadly loaded because usb_add_gadget_udc()
might call renesas_usb3_start() via .udc_start.
This patch also revises the typo (s/an optional/optional/).
Fixes: 279d4bc640 ("usb: gadget: udc: renesas_usb3: add support for generic phy")
Cc: <stable@vger.kernel.org> # v4.15+
Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Reviewed-by: Simon Horman <horms+renesas@verge.net.au>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d998844016 upstream.
This patch fixes an issue that this driver causes panic if a gadget
driver is already loaded because usb_add_gadget_udc() might call
renesas_usb3_start() via .udc_start, and then pm_runtime_get_sync()
in renesas_usb3_start() doesn't work correctly.
Note that the usb3_to_dev() macro should not be called at this timing
because the macro uses the gadget structure.
Fixes: cf06df3fae ("usb: gadget: udc: renesas_usb3: move pm_runtime_{en,dis}able()")
Cc: <stable@vger.kernel.org> # v4.15+
Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Reviewed-by: Simon Horman <horms+renesas@verge.net.au>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4a014a7339 upstream.
When printer_write() calls usb_ep_queue(), a udc driver (e.g.
renesas_usbhs driver) may call usb_gadget_giveback_request() in
the udc .queue ops immediately. Then, printer_write() calls
list_add(&req->list, &dev->tx_reqs_active) wrongly. After that,
if we do unbind the printer driver, WARN_ON() happens in
printer_func_unbind() because the list entry is not removed.
So, this patch moves list_add(&req->list, &dev->tx_reqs_active)
calling before usb_ep_queue().
Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Acked-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 05826ff135 upstream.
The USB Type-C PHY in Intel WhiskeyCove PMIC has build-in
USB Type-C state machine which we were relying on to
configure the CC lines correctly. This patch removes that
dependency and configures the CC line according to commands
from the port manager (tcpm.c) in wcove_set_cc().
This fixes an issue where USB devices attached to the USB
Type-C port do not get enumerated. When acting as
source/host, the HW FSM sometimes fails to configure the PHY
correctly.
Fixes: 3c4fb9f169 ("usb: typec: wcove: start using tcpm for USB PD support")
Cc: stable@vger.kernel.org
Signed-off-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0b4555e776 upstream.
Driver currently crashes due to NULL pointer deference
while updating PHY tune register if nvmem cell is NULL.
Since, fused value for Tune1/2 register is optional,
we'd rather bail out.
Fixes: ca04d9d3e1 ("phy: qcom-qusb2: New driver for QUSB2 PHY on Qcom chips")
Reviewed-by: Vivek Gautam <vivek.gautam@codeaurora.org>
Reviewed-by: Evan Green <evgreen@chromium.org>
Cc: stable <stable@vger.kernel.org> # 4.14+
Signed-off-by: Manu Gautam <mgautam@codeaurora.org>
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8c4e97ddfe upstream.
The ALWAYS_SYNC flag is currently honored by the usb-storage driver but not UAS
and is required to work around devices that become unstable upon being
queried for cache. This code is taken straight from:
drivers/usb/storage/scsiglue.c:284
Signed-off-by: Alexander Kappner <agk@godking.net>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Cc: stable <stable@vger.kernel.org>
Acked-by: Oliver Neukum <oneukum@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a0d6ec8809 upstream.
pdev_nr and rhport can be controlled by user-space, hence leading to
a potential exploitation of the Spectre variant 1 vulnerability.
This issue was detected with the help of Smatch:
drivers/usb/usbip/vhci_sysfs.c:238 detach_store() warn: potential spectre issue 'vhcis'
drivers/usb/usbip/vhci_sysfs.c:328 attach_store() warn: potential spectre issue 'vhcis'
drivers/usb/usbip/vhci_sysfs.c:338 attach_store() warn: potential spectre issue 'vhci->vhci_hcd_ss->vdev'
drivers/usb/usbip/vhci_sysfs.c:340 attach_store() warn: potential spectre issue 'vhci->vhci_hcd_hs->vdev'
Fix this by sanitizing pdev_nr and rhport before using them to index
vhcis and vhci->vhci_hcd_ss->vdev respectively.
Notice that given that speculation windows are large, the policy is
to kill the speculation on the first load and not worry if it can be
completed with a dependent load/store [1].
[1] https://marc.info/?l=linux-kernel&m=152449131114778&w=2
Cc: stable@vger.kernel.org
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Acked-by: Shuah Khan (Samsung OSG) <shuah@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit dbafc28955 upstream.
It's amazing that this driver ever worked, but now that x86 doesn't
allow USB data to be sent off of the stack, it really does not work at
all. Fix this up by properly allocating the data for the small
"commands" that get sent to the device off of the stack.
We do this for one command by having a whole urb just for ack messages,
as they can be submitted in interrupt context, so we can not use
usb_bulk_msg(). But the poweron command can sleep (and does), so use
usb_bulk_msg() for that transfer.
Reported-by: Carlos Manuel Santos <cmmpsantos@gmail.com>
Cc: Samuel Ortiz <sameo@linux.intel.com>
Cc: Stephen Hemminger <stephen@networkplumber.org>
Cc: stable <stable@vger.kernel.org>
Reviewed-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 45ad559a29 upstream.
Syzbot reported yet another warning with Ion:
WARNING: CPU: 0 PID: 1467 at drivers/staging/android/ion/ion.c:122
ion_buffer_destroy+0xd4/0x190 drivers/staging/android/ion/ion.c:122
Kernel panic - not syncing: panic_on_warn set ...
This is catching that a buffer was freed with an existing kernel mapping
still present. This can be easily be triggered from userspace by calling
DMA_BUF_SYNC_START without calling DMA_BUF_SYNC_END. Switch to a single
pr_warn_once to indicate the error without being disruptive.
Reported-by: syzbot+cd8bcd40cb049efa2770@syzkaller.appspotmail.com
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Laura Abbott <labbott@redhat.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ce14e868a5 upstream.
Int the next patch the emulator's .read_std and .write_std callbacks will
grow another argument, which is not needed in kvm_read_guest_virt and
kvm_write_guest_virt_system's callers. Since we have to make separate
functions, let's give the currently existing names a nicer interface, too.
Fixes: 129a72a0d3 ("KVM: x86: Introduce segmented_write_std", 2017-01-12)
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 727ba748e1 upstream.
VMX instructions executed inside a L1 VM will always trigger a VM exit
even when executed with cpl 3. This means we must perform the
privilege check in software.
Fixes: 70f3aac964ae("kvm: nVMX: Remove superfluous VMX instruction fault checks")
Cc: stable@vger.kernel.org
Signed-off-by: Felix Wilhelm <fwilhelm@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 79367a6574 upstream.
Wrap the common invocation of ctxt->ops->read_std and ctxt->ops->write_std, so
as to have a smaller patch when the functions grow another argument.
Fixes: 129a72a0d3 ("KVM: x86: Introduce segmented_write_std", 2017-01-12)
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a780a3ea62 upstream.
MSB of CR3 is a reserved bit if the PCIDE bit is not set in CR4.
It should be checked when PCIDE bit is not set, however commit
'd1cd3ce900441 ("KVM: MMU: check guest CR3 reserved bits based on
its physical address width")' removes the bit 63 checking
unconditionally. This patch fixes it by checking bit 63 of CR3
when PCIDE bit is not set in CR4.
Fixes: d1cd3ce900 (KVM: MMU: check guest CR3 reserved bits based on its physical address width)
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Liran Alon <liran.alon@oracle.com>
Cc: stable@vger.kernel.org
Reviewed-by: Junaid Shahid <junaids@google.com>
Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4b66af2d63 upstream.
Key extensions (struct sadb_key) include a user-specified number of key
bits. The kernel uses that number to determine how much key data to copy
out of the message in pfkey_msg2xfrm_state().
The length of the sadb_key message must be verified to be long enough,
even in the case of SADB_X_AALG_NULL. Furthermore, the sadb_key_len value
must be long enough to include both the key data and the struct sadb_key
itself.
Introduce a helper function verify_key_len(), and call it from
parse_exthdrs() where other exthdr types are similarly checked for
correctness.
Signed-off-by: Kevin Easton <kevin@guarana.org>
Reported-by: syzbot+5022a34ca5a3d49b84223653fab632dfb7b4cf37@syzkaller.appspotmail.com
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Cc: Zubin Mithra <zsm@chromium.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 327ea4adcf upstream.
Avoid that complaints similar to the following appear in the kernel log
if the number of zones is sufficiently large:
fio: page allocation failure: order:9, mode:0x140c0c0(GFP_KERNEL|__GFP_COMP|__GFP_ZERO), nodemask=(null)
Call Trace:
dump_stack+0x63/0x88
warn_alloc+0xf5/0x190
__alloc_pages_slowpath+0x8f0/0xb0d
__alloc_pages_nodemask+0x242/0x260
alloc_pages_current+0x6a/0xb0
kmalloc_order+0x18/0x50
kmalloc_order_trace+0x26/0xb0
__kmalloc+0x20e/0x220
blkdev_report_zones_ioctl+0xa5/0x1a0
blkdev_ioctl+0x1ba/0x930
block_ioctl+0x41/0x50
do_vfs_ioctl+0xaa/0x610
SyS_ioctl+0x79/0x90
do_syscall_64+0x79/0x1b0
entry_SYSCALL_64_after_hwframe+0x3d/0xa2
Fixes: 3ed05a987e ("blk-zoned: implement ioctls")
Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Cc: Shaun Tancheff <shaun.tancheff@seagate.com>
Cc: Damien Le Moal <damien.lemoal@hgst.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Martin K. Petersen <martin.petersen@oracle.com>
Cc: Hannes Reinecke <hare@suse.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4c826fed67 upstream.
-Tx request and data is copied to HW Q in 64B desc, check for
end of queue and adjust the current position to start from
beginning before passing the additional request info.
-key context copy should check key length only
-Few reverse christmas tree correction
Signed-off-by: Atul Gupta <atul.gupta@chelsio.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 76ef6b28ea upstream.
Since we have the ttm and gem vma managers using a subset
of the file address space for objects, and these start at
0x100000000 they will overflow the new mmap checks.
I've checked all the mmap routines I could see for any
bad behaviour but overall most people use GEM/TTM VMA
managers even the legacy drivers have a hashtable.
Reported-and-Tested-by: Arthur Marsh (amarsh04 on #radeon)
Fixes: be83bbf806 (mmap: introduce sane default mmap limits)
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c3635da2a3 upstream.
Before the guest finishes the device initialization, the device can be
removed anytime by the host, and after that the host won't respond to
the guest's request, so the guest should be prepared to handle this
case.
Add a polling mechanism to detect device presence.
Signed-off-by: Dexuan Cui <decui@microsoft.com>
[lorenzo.pieralisi@arm.com: edited commit log]
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Stephen Hemminger <sthemmin@microsoft.com>
Cc: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit f5a4941aa6 ]
After commit e2b3b35eb9 ("vhost_net: batch used ring update in rx"),
we tend to batch updating used heads. But it doesn't flush batched
heads before trying to do busy polling, this will cause vhost to wait
for guest TX which waits for the used RX. Fixing by flush batched
heads before busy loop.
1 byte TCP_RR performance recovers from 13107.83 to 50402.65.
Fixes: e2b3b35eb9 ("vhost_net: batch used ring update in rx")
Signed-off-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 3125642695 ]
The netsec network controller IP can drive 64 address bits for DMA, and
the DMA mask is set accordingly in the driver. However, the SynQuacer
SoC, which is the only silicon incorporating this IP at the moment,
integrates this IP in a manner that leaves address bits [63:40]
unconnected.
Up until now, this has not resulted in any problems, given that the DDR
controller doesn't decode those bits to begin with. However, recent
firmware updates for platforms incorporating this SoC allow the IOMMU
to be enabled, which does decode address bits [47:40], and allocates
top down from the IOVA space, producing DMA addresses that have bits
set that have been left unconnected.
Both the DT and ACPI (IORT) descriptions of the platform take this into
account, and only describe a DMA address space of 40 bits (using either
dma-ranges DT properties, or DMA address limits in IORT named component
nodes). However, even though our IOMMU and bus layers may take such
limitations into account by setting a narrower DMA mask when creating
the platform device, the netsec probe() entrypoint follows the common
practice of setting the DMA mask uncondionally, according to the
capabilities of the IP block itself rather than to its integration into
the chip.
It is currently unclear what the correct fix is here. We could hack around
it by only setting the DMA mask if it deviates from its default value of
DMA_BIT_MASK(32). However, this makes it impossible for the bus layer to
use DMA_BIT_MASK(32) as the bus limit, and so it appears that a more
comprehensive approach is required to take DMA limits imposed by the
SoC as a whole into account.
In the mean time, let's limit the DMA mask to 40 bits. Given that there
is currently only one SoC that incorporates this IP, this is a reasonable
approach that can be backported to -stable and buys us some time to come
up with a proper fix going forward.
Fixes: 533dd11a12 ("net: socionext: Add Synquacer NetSec driver")
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Jassi Brar <jaswinder.singh@linaro.org>
Cc: Masahisa Kojima <masahisa.kojima@linaro.org>
Cc: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Acked-by: Jassi Brar <jaswinder.singh@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 82612de1c9 ]
After commit f6cc9c054e, the following conf is broken (note that the
default loopback mtu is 65536, ie IP_MAX_MTU + 1):
$ ip tunnel add gre1 mode gre local 10.125.0.1 remote 10.125.0.2 dev lo
add tunnel "gre0" failed: Invalid argument
$ ip l a type dummy
$ ip l s dummy1 up
$ ip l s dummy1 mtu 65535
$ ip tunnel add gre1 mode gre local 10.125.0.1 remote 10.125.0.2 dev dummy1
add tunnel "gre0" failed: Invalid argument
dev_set_mtu() doesn't allow to set a mtu which is too large.
First, let's cap the mtu returned by ip_tunnel_bind_dev(). Second, remove
the magic value 0xFFF8 and use IP_MAX_MTU instead.
0xFFF8 seems to be there for ages, I don't know why this value was used.
With a recent kernel, it's also possible to set a mtu > IP_MAX_MTU:
$ ip l s dummy1 mtu 66000
After that patch, it's also possible to bind an ip tunnel on that kind of
interface.
CC: Petr Machata <petrm@mellanox.com>
CC: Ido Schimmel <idosch@mellanox.com>
Link: https://git.kernel.org/pub/scm/linux/kernel/git/davem/netdev-vger-cvs.git/commit/?id=e5afd356a411a
Fixes: f6cc9c054e ("ip_tunnel: Emit events for post-register MTU changes")
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 6890418bbb ]
After a linearized packet was redirected by XDP, we should not go for
the err path which will try to pop buffers for the next packet and
increase the drop counter. Fixing this by just drop the page refcnt
for the original page.
Fixes: 186b3c998c ("virtio-net: support XDP_REDIRECT")
Reported-by: David Ahern <dsahern@gmail.com>
Tested-by: David Ahern <dsahern@gmail.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit f8f4bef322 ]
When dealing with ingress rule on a netdev, if we did fine through the
conventional path, there's no need to continue into the egdev route,
and we can stop right there.
Not doing so may cause a 2nd rule to be added by the cls api layer
with the ingress being the egdev.
For example, under sriov switchdev scheme, a user rule of VFR A --> VFR B
will end up with two HW rules (1) VF A --> VF B and (2) uplink --> VF B
Fixes: 208c0f4b52 ('net: sched: use tc_setup_cb_call to call per-block callbacks')
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 5040cc990c ]
In the Broadcom Cygnus SoC, the brcm tag needs to be inserted
in between the mac address and the ether type (should use
'DSA_PROTO_TAG_BRCM') for the packets sent to the internal
b53 switch.
Since the Cygnus was added with the BCM58XX device id and the
BCM58XX uses 'DSA_PROTO_TAG_BRCM_PREPEND', the data path is
broken, due to the incorrect brcm tag location.
Add a new b53 device id (BCM583XX) for Cygnus family to fix the
issue. Add the new device id to the BCM58XX family as Cygnus
is similar to the BCM58XX in most other functionalities.
Fixes: 1160603960 ("net: dsa: b53: Support prepended Broadcom tags")
Signed-off-by: Arun Parameswaran <arun.parameswaran@broadcom.com>
Acked-by: Scott Branden <scott.branden@broadcom.com>
Reported-by: Clément Péron <peron.clem@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Tested-by: Clément Péron <peron.clem@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 850e088d5b ]
If we successfully linearize the packet, num_buf will be set to zero
which may confuse error handling path which assumes num_buf is at
least 1 and this can lead the code tries to pop the descriptor of next
buffer. Fixing this by checking num_buf against 1 before decreasing.
Fixes: 4941d472bf ("virtio-net: do not reset during XDP set")
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 902a545904 ]
When RXFCS feature is enabled, the HW do not strip the FCS data,
however it is not present in the checksum calculated by the HW.
Fix that by manually calculating the FCS checksum and adding it to the SKB
checksum field.
Add helper function to find the FCS data for all SKB forms (linear,
one fragment or more).
Fixes: 102722fc68 ("net/mlx5e: Add support for RXFCS feature flag")
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit d546b67cda ]
spin_lock/unlock was used instead of spin_un/lock_irq
in a procedure used in process space, on a spinlock
which can be grabbed in an interrupt.
This caused the stack trace below to be displayed (on kernel
4.17.0-rc1 compiled with Lock Debugging enabled):
[ 154.661474] WARNING: SOFTIRQ-safe -> SOFTIRQ-unsafe lock order detected
[ 154.668909] 4.17.0-rc1-rdma_rc_mlx+ #3 Tainted: G I
[ 154.675856] -----------------------------------------------------
[ 154.682706] modprobe/10159 [HC0[0]:SC0[0]:HE0:SE1] is trying to acquire:
[ 154.690254] 00000000f3b0e495 (&(&qp_table->lock)->rlock){+.+.}, at: mlx4_qp_remove+0x20/0x50 [mlx4_core]
[ 154.700927]
and this task is already holding:
[ 154.707461] 0000000094373b5d (&(&cq->lock)->rlock/1){....}, at: destroy_qp_common+0x111/0x560 [mlx4_ib]
[ 154.718028] which would create a new lock dependency:
[ 154.723705] (&(&cq->lock)->rlock/1){....} -> (&(&qp_table->lock)->rlock){+.+.}
[ 154.731922]
but this new dependency connects a SOFTIRQ-irq-safe lock:
[ 154.740798] (&(&cq->lock)->rlock){..-.}
[ 154.740800]
... which became SOFTIRQ-irq-safe at:
[ 154.752163] _raw_spin_lock_irqsave+0x3e/0x50
[ 154.757163] mlx4_ib_poll_cq+0x36/0x900 [mlx4_ib]
[ 154.762554] ipoib_tx_poll+0x4a/0xf0 [ib_ipoib]
...
to a SOFTIRQ-irq-unsafe lock:
[ 154.815603] (&(&qp_table->lock)->rlock){+.+.}
[ 154.815604]
... which became SOFTIRQ-irq-unsafe at:
[ 154.827718] ...
[ 154.827720] _raw_spin_lock+0x35/0x50
[ 154.833912] mlx4_qp_lookup+0x1e/0x50 [mlx4_core]
[ 154.839302] mlx4_flow_attach+0x3f/0x3d0 [mlx4_core]
Since mlx4_qp_lookup() is called only in process space, we can
simply replace the spin_un/lock calls with spin_un/lock_irq calls.
Fixes: 6dc06c08be ("net/mlx4: Fix the check in attaching steering rules")
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 3d62b2a0db ]
We need to drop refcnt to xdp_page if we see a gso packet. Otherwise
it will be leaked. Fixing this by moving the check of gso packet above
the linearizing logic. While at it, remove useless comment as well.
Cc: John Fastabend <john.fastabend@gmail.com>
Fixes: 72979a6c35 ("virtio_net: xdp, add slowpath case for non contiguous buffers")
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 5d458a13dd ]
We should not go for the error path after successfully transmitting a
XDP buffer after linearizing. Since the error path may try to pop and
drop next packet and increase the drop counters. Fixing this by simply
drop the refcnt of original page and go for xmit path.
Fixes: 72979a6c35 ("virtio_net: xdp, add slowpath case for non contiguous buffers")
Cc: John Fastabend <john.fastabend@gmail.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 664088f8d6 ]
This patch reorders the error cases in showing the XPS configuration so
that we hold off on memory allocation until after we have verified that we
can support XPS on a given ring.
Fixes: 184c449f91 ("net: Add support for XPS with QoS via traffic classes")
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 733a969a7e ]
We are currently doing auxiliary control register reads with the shadow
register value 0b111 (0x7) which incidentally is also the selector value
that should be present in bits [2:0]. Fix this by using the appropriate
selector mask which is defined (MII_BCM54XX_AUXCTL_SHDWSEL_MASK).
This does not have a functional impact yet because we always access the
MII_BCM54XX_AUXCTL_SHDWSEL_MISC (0x7) register in the current code.
This might change at some point though.
Fixes: 5b4e290051 ("net: phy: broadcom: add bcm54xx_auxctl_read")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 2f17becfbe ]
Use the right device to determine if redirect should be sent especially
when using vrf. Same as well as when sending the redirect.
Signed-off-by: Stephen Suryaputra <ssuryaextr@gmail.com>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 1b15ad683a ]
DaeRyong Jeong reports a race between vhost_dev_cleanup() and
vhost_process_iotlb_msg():
Thread interleaving:
CPU0 (vhost_process_iotlb_msg) CPU1 (vhost_dev_cleanup)
(In the case of both VHOST_IOTLB_UPDATE and
VHOST_IOTLB_INVALIDATE)
===== =====
vhost_umem_clean(dev->iotlb);
if (!dev->iotlb) {
ret = -EFAULT;
break;
}
dev->iotlb = NULL;
The reason is we don't synchronize between them, fixing by protecting
vhost_process_iotlb_msg() with dev mutex.
Reported-by: DaeRyong Jeong <threeearcat@gmail.com>
Fixes: 6b1e6cc785 ("vhost: new device IOTLB API")
Signed-off-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 25ea66544b ]
This code was introduced in 2011 around the same time that we made
netdev_features_t a u64 type. These days a u32 is not big enough to
hold all the potential features.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 1d88ba1ebb ]
syzbot reported a rcu_sched self-detected stall on CPU which is caused
by too small value set on rto_min with SCTP_RTOINFO sockopt. With this
value, hb_timer will get stuck there, as in its timer handler it starts
this timer again with this value, then goes to the timer handler again.
This problem is there since very beginning, and thanks to Eric for the
reproducer shared from a syzbot mail.
This patch fixes it by not allowing sctp_transport_timeout to return a
smaller value than HZ/5 for hb_timer, which is based on TCP's min rto.
Note that it doesn't fix this issue by limiting rto_min, as some users
are still using small rto and no proper value was found for it yet.
Reported-by: syzbot+3dcd59a1f907245f891f@syzkaller.appspotmail.com
Suggested-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit fdd13dd350 ]
ILT entry requires 12 bit right shifted physical address.
Existing mask for ILT entry of physical address i.e.
ILT_ENTRY_PHY_ADDR_MASK is not sufficient to handle 64bit
address because upper 8 bits of 64 bit address were getting
masked which resulted in completer abort error on
PCIe bus due to invalid address.
Fix that mask to handle 64bit physical address.
Fixes: fe56b9e6a8 ("qed: Add module with basic common support")
Signed-off-by: Shahed Shaikh <shahed.shaikh@cavium.com>
Signed-off-by: Ariel Elior <ariel.elior@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 9aad13b087 ]
Commit b84bbaf7a6 ("packet: in packet_snd start writing at link
layer allocation") ensures that packet_snd always starts writing
the link layer header in reserved headroom allocated for this
purpose.
This is needed because packets may be shorter than hard_header_len,
in which case the space up to hard_header_len may be zeroed. But
that necessary padding is not accounted for in skb->len.
The fix, however, is buggy. It calls skb_push, which grows skb->len
when moving skb->data back. But in this case packet length should not
change.
Instead, call skb_reserve, which moves both skb->data and skb->tail
back, without changing length.
Fixes: b84bbaf7a6 ("packet: in packet_snd start writing at link layer allocation")
Reported-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Willem de Bruijn <willemb@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 9f7c728332 ]
Testing Telit LM940 with ICMP packets > 14552 bytes revealed that
the modem needs FLAG_SEND_ZLP to properly work, otherwise the cdc
mbim data interface won't be anymore responsive.
Signed-off-by: Daniele Palmas <dnlplm@gmail.com>
Acked-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 79fb218d97 ]
On newer PHYs, we need to select the expansion register to write with
setting bits [11:8] to 0xf. This was done correctly by bcm7xxx.c prior
to being migrated to generic code under bcm-phy-lib.c which
unfortunately used the older implementation from the BCM54xx days.
Fix this by creating an inline stub: bcm_write_exp_sel() which adds the
correct value (MII_BCM54XX_EXP_SEL_ER) and update both the Cygnus PHY
and BCM7xxx PHY drivers which require setting these bits.
broadcom.c is unchanged because some PHYs even use a different selector
method, so let them specify it directly (e.g: SerDes secondary selector).
Fixes: a1cba5613e ("net: phy: Add Broadcom phy library for common interfaces")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 8005b09d99 ]
The current error handling code has an issue where it does:
if (priv->txchan)
cpdma_chan_destroy(priv->txchan);
The problem is that ->txchan is either valid or an error pointer (which
would lead to an Oops). I've changed it to use multiple error labels so
that the test can be removed.
Also there were some missing calls to netif_napi_del().
Fixes: 3ef0fdb234 ("net: davinci_emac: switch to new cpdma layer")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 75d4e704fa ]
Per discussion with David at netconf 2018, let's clarify
DaveM's position of handling stable backports in netdev-FAQ.
This is important for people relying on upstream -stable
releases.
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 3d609342cc ]
Commit d02ba2a611 ("l2tp: fix race in pppol2tp_release with session
object destroy") tried to fix a race condition where a PPPoL2TP socket
would disappear while the L2TP session was still using it. However, it
missed the root issue which is that an L2TP session may accept to be
reconnected if its associated socket has entered the release process.
The tentative fix makes the session hold the socket it is connected to.
That saves the kernel from crashing, but introduces refcount leakage,
preventing the socket from completing the release process. Once stalled,
everything the socket depends on can't be released anymore, including
the L2TP session and the l2tp_ppp module.
The root issue is that, when releasing a connected PPPoL2TP socket, the
session's ->sk pointer (RCU-protected) is reset to NULL and we have to
wait for a grace period before destroying the socket. The socket drops
the session in its ->sk_destruct callback function, so the session
will exist until the last reference on the socket is dropped.
Therefore, there is a time frame where pppol2tp_connect() may accept
reconnecting a session, as it only checks ->sk to figure out if the
session is connected. This time frame is shortened by the fact that
pppol2tp_release() calls l2tp_session_delete(), making the session
unreachable before resetting ->sk. However, pppol2tp_connect() may
grab the session before it gets unhashed by l2tp_session_delete(), but
it may test ->sk after the later got reset. The race is not so hard to
trigger and syzbot found a pretty reliable reproducer:
https://syzkaller.appspot.com/bug?id=418578d2a4389074524e04d641eacb091961b2cf
Before d02ba2a611, another race could let pppol2tp_release()
overwrite the ->__sk pointer of an L2TP session, thus tricking
pppol2tp_put_sk() into calling sock_put() on a socket that is different
than the one for which pppol2tp_release() was originally called. To get
there, we had to trigger the race described above, therefore having one
PPPoL2TP socket being released, while the session it is connected to is
reconnecting to a different PPPoL2TP socket. When releasing this new
socket fast enough, pppol2tp_release() overwrites the session's
->__sk pointer with the address of the new socket, before the first
pppol2tp_put_sk() call gets scheduled. Then the pppol2tp_put_sk() call
invoked by the original socket will sock_put() the new socket,
potentially dropping its last reference. When the second
pppol2tp_put_sk() finally runs, its socket has already been freed.
With d02ba2a611, the session takes a reference on both sockets.
Furthermore, the session's ->sk pointer is reset in the
pppol2tp_session_close() callback function rather than in
pppol2tp_release(). Therefore, ->__sk can't be overwritten and
pppol2tp_put_sk() is called only once (l2tp_session_delete() will only
run pppol2tp_session_close() once, to protect the session against
concurrent deletion requests). Now pppol2tp_put_sk() will properly
sock_put() the original socket, but the new socket will remain, as
l2tp_session_delete() prevented the release process from completing.
Here, we don't depend on the ->__sk race to trigger the bug. Getting
into the pppol2tp_connect() race is enough to leak the reference, no
matter when new socket is released.
So it all boils down to pppol2tp_connect() failing to realise that the
session has already been connected. This patch drops the unneeded extra
reference counting (mostly reverting d02ba2a611) and checks that
neither ->sk nor ->__sk is set before allowing a session to be
connected.
Fixes: d02ba2a611 ("l2tp: fix race in pppol2tp_release with session object destroy")
Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 6009d1fe6b ]
In divasmain.c, the function divas_write() firstly invokes the function
diva_xdi_open_adapter() to open the adapter that matches with the adapter
number provided by the user, and then invokes the function diva_xdi_write()
to perform the write operation using the matched adapter. The two functions
diva_xdi_open_adapter() and diva_xdi_write() are located in diva.c.
In diva_xdi_open_adapter(), the user command is copied to the object 'msg'
from the userspace pointer 'src' through the function pointer 'cp_fn',
which eventually calls copy_from_user() to do the copy. Then, the adapter
number 'msg.adapter' is used to find out a matched adapter from the
'adapter_queue'. A matched adapter will be returned if it is found.
Otherwise, NULL is returned to indicate the failure of the verification on
the adapter number.
As mentioned above, if a matched adapter is returned, the function
diva_xdi_write() is invoked to perform the write operation. In this
function, the user command is copied once again from the userspace pointer
'src', which is the same as the 'src' pointer in diva_xdi_open_adapter() as
both of them are from the 'buf' pointer in divas_write(). Similarly, the
copy is achieved through the function pointer 'cp_fn', which finally calls
copy_from_user(). After the successful copy, the corresponding command
processing handler of the matched adapter is invoked to perform the write
operation.
It is obvious that there are two copies here from userspace, one is in
diva_xdi_open_adapter(), and one is in diva_xdi_write(). Plus, both of
these two copies share the same source userspace pointer, i.e., the 'buf'
pointer in divas_write(). Given that a malicious userspace process can race
to change the content pointed by the 'buf' pointer, this can pose potential
security issues. For example, in the first copy, the user provides a valid
adapter number to pass the verification process and a valid adapter can be
found. Then the user can modify the adapter number to an invalid number.
This way, the user can bypass the verification process of the adapter
number and inject inconsistent data.
This patch reuses the data copied in
diva_xdi_open_adapter() and passes it to diva_xdi_write(). This way, the
above issues can be avoided.
Signed-off-by: Wenwen Wang <wang6495@umn.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit fa1be7e01e ]
Some of the code paths calculating flow hash for IPv6 use flowlabel member
of struct flowi6 which, despite its name, encodes both flow label and
traffic class. If traffic class changes within a TCP connection (as e.g.
ssh does), ECMP route can switch between path. It's also inconsistent with
other code paths where ip6_flowlabel() (returning only flow label) is used
to feed the key.
Use only flow label everywhere, including one place where hash key is set
using ip6_flowinfo().
Fixes: 51ebd31815 ("ipv6: add support of equal cost multipath (ECMP)")
Fixes: f70ea018da ("net: Add functions to get skb->hash based on flow structures")
Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 730c54d594 ]
A precondition check in ip_recv_error triggered on an otherwise benign
race. Remove the warning.
The warning triggers when passing an ipv6 socket to this ipv4 error
handling function. RaceFuzzer was able to trigger it due to a race
in setsockopt IPV6_ADDRFORM.
---
CPU0
do_ipv6_setsockopt
sk->sk_socket->ops = &inet_dgram_ops;
---
CPU1
sk->sk_prot->recvmsg
udp_recvmsg
ip_recv_error
WARN_ON_ONCE(sk->sk_family == AF_INET6);
---
CPU0
do_ipv6_setsockopt
sk->sk_family = PF_INET;
This socket option converts a v6 socket that is connected to a v4 peer
to an v4 socket. It updates the socket on the fly, changing fields in
sk as well as other structs. This is inherently non-atomic. It races
with the lockless udp_recvmsg path.
No other code makes an assumption that these fields are updated
atomically. It is benign here, too, as ip_recv_error cares only about
the protocol of the skbs enqueued on the error queue, for which
sk_family is not a precise predictor (thanks to another isue with
IPV6_ADDRFORM).
Link: http://lkml.kernel.org/r/20180518120826.GA19515@dragonet.kaist.ac.kr
Fixes: 7ce875e5ec ("ipv4: warn once on passing AF_INET6 socket to ip_recv_error")
Reported-by: DaeRyong Jeong <threeearcat@gmail.com>
Suggested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 848235edb5 ]
Currently, raw6_sk(sk)->ip6mr_table is set unconditionally during
ip6_mroute_setsockopt(MRT6_TABLE). A subsequent attempt at the same
setsockopt will fail with -ENOENT, since we haven't actually created
that table.
A similar fix for ipv4 was included in commit 5e1859fbcc ("ipv4: ipmr:
various fixes and cleanups").
Fixes: d1db275dd3 ("ipv6: ip6mr: support multiple tables")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 322eaa06d5 ]
In commit 624dbf55a3 ("driver/net: enic: Try DMA 64 first, then
failover to DMA") DMA mask was changed from 40 bits to 64 bits.
Hardware actually supports only 47 bits.
Fixes: 624dbf55a3 ("driver/net: enic: Try DMA 64 first, then failover to DMA")
Signed-off-by: Govindarajulu Varadarajan <gvaradar@cisco.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit dd612f18a4 ]
Nearby code that also tests port suggests that the P0 constant should be
used when port is zero.
The semantic match that finds this problem is as follows:
(http://coccinelle.lip6.fr/)
// <smpl>
@@
expression e,e1;
@@
* e ? e1 : e1
// </smpl>
Fixes: 6c3218c6f7 ("bnx2x: Adjust ETS to 578xx")
Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit d2c2725c2c ]
Check for 0xE00 (RECOVERABLE_ERR) along with ARMFW UE (0x0)
in be_detect_error() to know whether the error is valid error or not
Fixes: 673c96e5a ("be2net: Fix UE detection logic for BE3")
Signed-off-by: Suresh Reddy <suresh.reddy@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2ae89c7a82 upstream.
In file included from scripts/kconfig/zconf.tab.c:2485:
scripts/kconfig/confdata.c: In function ‘conf_write’:
scripts/kconfig/confdata.c:773:22: warning: ‘%s’ directive writing likely 7 or more bytes into a region of size between 1 and 4097 [-Wformat-overflow=]
sprintf(newname, "%s%s", dirname, basename);
^~
scripts/kconfig/confdata.c:773:19: note: assuming directive output of 7 bytes
sprintf(newname, "%s%s", dirname, basename);
^~~~~~
scripts/kconfig/confdata.c:773:2: note: ‘sprintf’ output 1 or more bytes (assuming 4104) into a destination of size 4097
sprintf(newname, "%s%s", dirname, basename);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
scripts/kconfig/confdata.c:776:23: warning: ‘.tmpconfig.’ directive writing 11 bytes into a region of size between 1 and 4097 [-Wformat-overflow=]
sprintf(tmpname, "%s.tmpconfig.%d", dirname, (int)getpid());
^~~~~~~~~~~
scripts/kconfig/confdata.c:776:3: note: ‘sprintf’ output between 13 and 4119 bytes into a destination of size 4097
sprintf(tmpname, "%s.tmpconfig.%d", dirname, (int)getpid());
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Increase the size of tmpname and newname to make GCC happy.
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2a79fd3908 upstream.
Some drivers, such as vxlan and wireguard, use the skb's dst in order to
determine things like PMTU. They therefore loose functionality when flow
offloading is enabled. So, we ensure the skb has it before xmit'ing it
in the offloading path.
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 423913ad4a upstream.
Commit be83bbf806 ("mmap: introduce sane default mmap limits") was
introduced to catch problems in various ad-hoc character device drivers
doing mmap and getting the size limits wrong. In the process, it used
"known good" limits for the normal cases of mapping regular files and
block device drivers.
It turns out that the "s_maxbytes" limit was less "known good" than I
thought. In particular, /proc doesn't set it, but exposes one regular
file to mmap: /proc/vmcore. As a result, that file got limited to the
default MAX_INT s_maxbytes value.
This went unnoticed for a while, because apparently the only thing that
needs it is the s390 kernel zfcpdump, but there might be other tools
that use this too.
Vasily suggested just changing s_maxbytes for all of /proc, which isn't
wrong, but makes me nervous at this stage. So instead, just make the
new mmap limit always be MAX_LFS_FILESIZE for regular files, which won't
affect anything else. It wasn't the regular file case I was worried
about.
I'd really prefer for maxsize to have been per-inode, but that is not
how things are today.
Fixes: be83bbf806 ("mmap: introduce sane default mmap limits")
Reported-by: Vasily Gorbik <gor@linux.ibm.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit be83bbf806 upstream.
The internal VM "mmap()" interfaces are based on the mmap target doing
everything using page indexes rather than byte offsets, because
traditionally (ie 32-bit) we had the situation that the byte offset
didn't fit in a register. So while the mmap virtual address was limited
by the word size of the architecture, the backing store was not.
So we're basically passing "pgoff" around as a page index, in order to
be able to describe backing store locations that are much bigger than
the word size (think files larger than 4GB etc).
But while this all makes a ton of sense conceptually, we've been dogged
by various drivers that don't really understand this, and internally
work with byte offsets, and then try to work with the page index by
turning it into a byte offset with "pgoff << PAGE_SHIFT".
Which obviously can overflow.
Adding the size of the mapping to it to get the byte offset of the end
of the backing store just exacerbates the problem, and if you then use
this overflow-prone value to check various limits of your device driver
mmap capability, you're just setting yourself up for problems.
The correct thing for drivers to do is to do their limit math in page
indices, the way the interface is designed. Because the generic mmap
code _does_ test that the index doesn't overflow, since that's what the
mmap code really cares about.
HOWEVER.
Finding and fixing various random drivers is a sisyphean task, so let's
just see if we can just make the core mmap() code do the limiting for
us. Realistically, the only "big" backing stores we need to care about
are regular files and block devices, both of which are known to do this
properly, and which have nice well-defined limits for how much data they
can access.
So let's special-case just those two known cases, and then limit other
random mmap users to a backing store that still fits in "unsigned long".
Realistically, that's not much of a limit at all on 64-bit, and on
32-bit architectures the only worry might be the GPU drivers, which can
have big physical address spaces.
To make it possible for drivers like that to say that they are 64-bit
clean, this patch does repurpose the "FMODE_UNSIGNED_OFFSET" bit in the
file flags to allow drivers to mark their file descriptors as safe in
the full 64-bit mmap address space.
[ The timing for doing this is less than optimal, and this should really
go in a merge window. But realistically, this needs wide testing more
than it needs anything else, and being main-line is the only way to do
that.
So the earlier the better, even if it's outside the proper development
cycle - Linus ]
Cc: Kees Cook <keescook@chromium.org>
Cc: Dan Carpenter <dan.carpenter@oracle.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Willy Tarreau <w@1wt.eu>
Cc: Dave Airlie <airlied@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4faa99965e upstream.
If io_destroy() gets to cancelling everything that can be cancelled and
gets to kiocb_cancel() calling the function driver has left in ->ki_cancel,
it becomes vulnerable to a race with IO completion. At that point req
is already taken off the list and aio_complete() does *NOT* spin until
we (in free_ioctx_users()) releases ->ctx_lock. As the result, it proceeds
to kiocb_free(), freing req just it gets passed to ->ki_cancel().
Fix is simple - remove from the list after the call of kiocb_cancel(). All
instances of ->ki_cancel() already have to cope with the being called with
iocb still on list - that's what happens in io_cancel(2).
Cc: stable@kernel.org
Fixes: 0460fef2a9 "aio: use cancellation list lazily"
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0ed2424b91 upstream.
Commit d5c435df4a ("intel_th: msu: Use the real device in case of IOMMU
domain allocation") changes dma buffer allocation to use the actual
underlying device, but forgets to change the deallocation path, which leads
to (if you've got CAP_SYS_RAWIO):
> # echo 0,0 > /sys/bus/intel_th/devices/0-msc0/nr_pages
> ------------[ cut here ]------------
> kernel BUG at ../linux/drivers/iommu/intel-iommu.c:3670!
> CPU: 3 PID: 231 Comm: sh Not tainted 4.17.0-rc1+ #2729
> RIP: 0010:intel_unmap+0x11e/0x130
...
> Call Trace:
> intel_free_coherent+0x3e/0x60
> msc_buffer_win_free+0x100/0x160 [intel_th_msu]
This patch fixes the buffer deallocation code to use the correct device.
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Fixes: d5c435df4a ("intel_th: msu: Use the real device in case of IOMMU domain allocation")
Reported-by: Baofeng Tian <baofeng.tian@intel.com>
CC: stable@vger.kernel.org # v4.14+
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 52a1923629 upstream.
This reverts commit fb47ada8dc.
In some situations when we set TXOP_BACKOFF, the probe frame is
not sent at all. What it worse then sending probe frame as part
of AMPDU and can degrade 11n performance to 11g rates.
Cc: stable@vger.kernel.org
Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2d077d4b59 upstream.
Swapping load on huge=always tmpfs (with khugepaged tuned up to be very
eager, but I'm not sure that is relevant) soon hung uninterruptibly,
waiting for page lock in shmem_getpage_gfp()'s find_lock_entry(), most
often when "cp -a" was trying to write to a smallish file. Debug showed
that the page in question was not locked, and page->mapping NULL by now,
but page->index consistent with having been in a huge page before.
Reproduced in minutes on a 4.15 kernel, even with 4.17's 605ca5ede7
("mm/huge_memory.c: reorder operations in __split_huge_page_tail()") added
in; but took hours to reproduce on a 4.17 kernel (no idea why).
The culprit proved to be the __ClearPageDirty() on tails beyond i_size in
__split_huge_page(): the non-atomic __bitoperation may have been safe when
4.8's baa355fd33 ("thp: file pages support for split_huge_page()")
introduced it, but liable to erase PageWaiters after 4.10's 6290602709
("mm: add PageWaiters indicating tasks are waiting for a page bit").
Link: http://lkml.kernel.org/r/alpine.LSU.2.11.1805291841070.3197@eggly.anvils
Fixes: 6290602709 ("mm: add PageWaiters indicating tasks are waiting for a page bit")
Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a840c93ca7 upstream.
When a GID entry is invalid EAGAIN is returned. This is an incorrect error
code, there is nothing that will make this GID entry valid again in
bounded time.
Some user space tools fail incorrectly if EAGAIN is returned here, and
this represents a small ABI change from earlier kernels.
The first patch in the Fixes list makes entries that were valid before
to become invalid, allowing this code to trigger, while the second patch
in the Fixes list introduced the wrong EAGAIN.
Therefore revert the return result to EINVAL which matches the historical
expectations of the ibv_query_gid_type() API of the libibverbs user space
library.
Cc: <stable@vger.kernel.org>
Fixes: 598ff6bae6 ("IB/core: Refactor GID modify code for RoCE")
Fixes: 03db3a2d81 ("IB/core: Add RoCE GID table management")
Reviewed-by: Daniel Jurgens <danielj@mellanox.com>
Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 806e30873f upstream.
Commit b5e2ced9bf ("stm class: Use vmalloc for the master map") caused
a build error on some arches as vmalloc.h was not explicitly included.
Fix that by adding it to the list of includes.
Fixes: b5e2ced9bf ("stm class: Use vmalloc for the master map")
Reported-by: kbuild test robot <lkp@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b5e2ced9bf upstream.
Fengguang is running into a warning from the buddy allocator:
> swapper/0: page allocation failure: order:9, mode:0x14040c0(GFP_KERNEL|__GFP_COMP), nodemask=(null)
> CPU: 1 PID: 1 Comm: swapper/0 Not tainted 4.17.0-rc1 #262
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
> Call Trace:
...
> __kmalloc+0x14b/0x180: ____cache_alloc at mm/slab.c:3127
> stm_register_device+0xf3/0x5c0: stm_register_device at drivers/hwtracing/stm/core.c:695
...
Which is basically a result of the stm class trying to allocate ~512kB
for the dummy_stm with its default parameters. There's no reason, however,
for it not to be vmalloc()ed instead, which is what this patch does.
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
CC: stable@vger.kernel.org # v4.4+
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c9ddf73476 upstream.
Since an SRP remote port is attached as a child to shost->shost_gendev
and as the only child, the translation from the shost pointer into an
rport pointer must happen by looking up the shost child that is an
rport. This patch fixes the following KASAN complaint:
BUG: KASAN: slab-out-of-bounds in srp_timed_out+0x57/0x110 [scsi_transport_srp]
Read of size 4 at addr ffff880035d3fcc0 by task kworker/1:0H/19
CPU: 1 PID: 19 Comm: kworker/1:0H Not tainted 4.16.0-rc3-dbg+ #1
Workqueue: kblockd blk_mq_timeout_work
Call Trace:
dump_stack+0x85/0xc7
print_address_description+0x65/0x270
kasan_report+0x231/0x350
srp_timed_out+0x57/0x110 [scsi_transport_srp]
scsi_times_out+0xc7/0x3f0 [scsi_mod]
blk_mq_terminate_expired+0xc2/0x140
bt_iter+0xbc/0xd0
blk_mq_queue_tag_busy_iter+0x1c7/0x350
blk_mq_timeout_work+0x325/0x3f0
process_one_work+0x441/0xa50
worker_thread+0x76/0x6c0
kthread+0x1b2/0x1d0
ret_from_fork+0x24/0x30
Fixes: e68ca75200 ("scsi_transport_srp: Reduce failover time")
Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Cc: Hannes Reinecke <hare@suse.com>
Cc: Johannes Thumshirn <jthumshirn@suse.de>
Cc: Jason Gunthorpe <jgg@mellanox.com>
Cc: Doug Ledford <dledford@redhat.com>
Cc: Laurence Oberman <loberman@redhat.com>
Cc: stable@vger.kernel.org
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 28e4213dd3 upstream.
Having PR_FP_MODE_FRE (i.e. Config5.FRE) set without PR_FP_MODE_FR (i.e.
Status.FR) is not supported as the lone purpose of Config5.FRE is to
emulate Status.FR=0 handling on FPU hardware that has Status.FR=1
hardwired[1][2]. Also we do not handle this case elsewhere, and assume
throughout our code that TIF_HYBRID_FPREGS and TIF_32BIT_FPREGS cannot
be set both at once for a task, leading to inconsistent behaviour if
this does happen.
Return unsuccessfully then from prctl(2) PR_SET_FP_MODE calls requesting
PR_FP_MODE_FRE to be set with PR_FP_MODE_FR clear. This corresponds to
modes allowed by `mips_set_personality_fp'.
References:
[1] "MIPS Architecture For Programmers, Vol. III: MIPS32 / microMIPS32
Privileged Resource Architecture", Imagination Technologies,
Document Number: MD00090, Revision 6.02, July 10, 2015, Table 9.69
"Config5 Register Field Descriptions", p. 262
[2] "MIPS Architecture For Programmers, Volume III: MIPS64 / microMIPS64
Privileged Resource Architecture", Imagination Technologies,
Document Number: MD00091, Revision 6.03, December 22, 2015, Table
9.72 "Config5 Register Field Descriptions", p. 288
Fixes: 9791554b45 ("MIPS,prctl: add PR_[GS]ET_FP_MODE prctl options for MIPS")
Signed-off-by: Maciej W. Rozycki <macro@mips.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: <stable@vger.kernel.org> # 4.0+
Patchwork: https://patchwork.linux-mips.org/patch/19327/
Signed-off-by: James Hogan <jhogan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 32795631e6 upstream.
While doing a global software reset, these bits are not cleared and let
some bootloader fail to initialise the GPHYs. The bootloader don't
expect the GPHYs in reset, as they aren't during power on.
The asserts were a workaround for a wrong syscon-reboot mask. With a
mask set which includes the GPHY resets, these resets aren't required
any more.
Fixes: 126534141b ("MIPS: lantiq: Add a GPHY driver which uses the RCU syscon-mfd")
Signed-off-by: Mathias Kresin <dev@kresin.me>
Acked-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Acked-by: Hauke Mehrtens <hauke@hauke-m.de>
Cc: John Crispin <john@phrozen.org>
Cc: linux-mips@linux-mips.org
Cc: <stable@vger.kernel.org> # 4.14+
Patchwork: https://patchwork.linux-mips.org/patch/19003/
[jhogan@kernel.org: Fix build warnings]
Signed-off-by: James Hogan <jhogan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 76974ef9d1 upstream.
We need to select the buffer code, otherwise we get build errors
with undefined functions on the trigger and buffer,
if we select just IIO and then AT91_SAMA5D2_ADC from menuconfig
This adds a Kconfig 'select' statement like other ADC
drivers have it already.
Fixes: 5e1a1da0f8 ("iio: adc: at91-sama5d2_adc: add hw trigger and buffer support")
Signed-off-by: Eugen Hristev <eugen.hristev@microchip.com>
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f0c8d1f6dc upstream.
When iterating through the channels, the index in the array is not the
scan index. Added an xlate function to translate to the proper index.
The result of the bug is that the channel array is indexed with a wrong index,
thus instead of the proper channel, we access invalid memory, which may
lead to invalid results and/or corruption.
This will be used also for devicetree channel xlate.
Fixes: 5e1a1da0f ("iio: adc: at91-sama5d2_adc: add hw trigger and buffer support")
Fixes: 073c66201 ("iio: adc: at91-sama5d2_adc: add support for DMA")
Signed-off-by: Eugen Hristev <eugen.hristev@microchip.com>
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7531cf59bf upstream.
When doing successive oversampling settings, it may fail to update filter
parameters silently:
- First time oversampling is being set, it will be successful, as fl->res
is 0 initially.
- Next attempts with various oversamp value may return 0 (success), but
keep previous filter parameters, due to 'res' never reaches above or
equal current 'fl->res'.
This is particularly true when setting sampling frequency (that relies on
oversamp). Typical failure without error:
- run 1st test @16kHz samp freq will succeed
- run new test @8kHz will succeed as well
- run new test @16kHz (again): sample rate will remain 8kHz without error
Fixes: e2e6771c64 ("IIO: ADC: add STM32 DFSDM sigma delta ADC support")
Signed-off-by: Fabrice Gasnier <fabrice.gasnier@st.com>
Acked-by: Arnaud Pouliquen <arnaud.pouliquen@st.com>
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3d13de4b02 upstream.
Currently, the following causes a kernel OOPS in memcpy:
echo 1073741825 > buffer/length
echo 1 > buffer/enable
Note that using 1073741824 instead of 1073741825 causes "write error:
Cannot allocate memory" but no OOPS.
This is because 1073741824 == 2^30 and 1073741825 == 2^30+1. Since kfifo
rounds up to the nearest power of 2, it will actually call kmalloc with
roundup_pow_of_two(length) * bytes_per_datum.
Using length == 1073741825 and bytes_per_datum == 2, we get:
kmalloc(roundup_pow_of_two(1073741825) * 2
or kmalloc(2147483648 * 2)
or kmalloc(4294967296)
or kmalloc(UINT_MAX + 1)
so this overflows to 0, causing kmalloc to return ZERO_SIZE_PTR and
subsequent memcpy to fail once the device is enabled.
Fix this by checking for overflow prior to allocating a kfifo. With this
check added, the above code returns -EINVAL when enabling the buffer,
rather than causing an OOPS.
Signed-off-by: Martin Kelly <mkelly@xevo.com>
cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c043ec1ca5 upstream.
Currently, we use int for buffer length and bytes_per_datum. However,
kfifo uses unsigned int for length and size_t for element size. We need
to make sure these matches or we will have bugs related to overflow (in
the range between INT_MAX and UINT_MAX for length, for example).
In addition, set_bytes_per_datum uses size_t while bytes_per_datum is an
int, which would cause bugs for large values of bytes_per_datum.
Change buffer length to use unsigned int and bytes_per_datum to use
size_t.
Signed-off-by: Martin Kelly <mkelly@xevo.com>
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6f92253024 upstream.
hid_sensor_set_power_work() powers the sensors back up after a resume
based on the user_requested_state atomic_t.
But hid_sensor_power_state() treats this as a boolean flag, leading to
the following problematic scenario:
1) Some app starts using the iio-sensor in buffered / triggered mode,
hid_sensor_data_rdy_trigger_set_state(true) gets called, setting
user_requested_state to 1.
2) Something directly accesses a _raw value through sysfs, leading
to a call to hid_sensor_power_state(true) followed by
hid_sensor_power_state(false) call, this sets user_requested_state
to 1 followed by setting it to 0.
3) Suspend/resume the machine, hid_sensor_set_power_work() now does
NOT power the sensor back up because user_requested_state (wrongly)
is 0. Which stops the app using the sensor in buffered mode from
receiving any new values.
This commit changes user_requested_state to a counter tracking how many
times hid_sensor_power_state(true) was called instead, fixing this.
Cc: Bastien Nocera <hadess@hadess.net>
Cc: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 490fba90a9 upstream.
This commit is a follow-up to changes made to ad_sigma_delta.h
in staging: iio: ad7192: implement IIO_CHAN_INFO_SAMP_FREQ
which broke ad7793 as it was not altered to match those changes.
This driver predates the availability of IIO_CHAN_INFO_SAMP_FREQ
attribute wherein usage has some advantages like it can be accessed by
in-kernel consumers as well as reduces the code size.
Therefore, use IIO_CHAN_INFO_SAMP_FREQ to implement the
sampling_frequency attribute instead of using IIO_DEV_ATTR_SAMP_FREQ()
macro.
Move code from the functions associated with IIO_DEV_ATTR_SAMP_FREQ()
into respective read and write hooks with the mask set to
IIO_CHAN_INFO_SAMP_FREQ.
Fixes: a13e831fca ("staging: iio: ad7192: implement IIO_CHAN_INFO_SAMP_FREQ")
Signed-off-by: Michael Nosthoff <committed@heine.so>
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit fb239c1209 upstream.
In _rtl92c_get_txpower_writeval_by_regulatory() the variable writeVal
is assigned to itself in an if ... else statement, apparently only to
document that the branch condition is handled and that a previously read
value should be returned unmodified. The self-assignment causes clang to
raise the following warning:
drivers/net/wireless/realtek/rtlwifi/rtl8192cu/rf.c:304:13:
error: explicitly assigning value of variable of type 'u32'
(aka 'unsigned int') to itself [-Werror,-Wself-assign]
writeVal = writeVal;
Delete the branch with the self-assignment.
Signed-off-by: Matthias Kaehlcke <mka@chromium.org>
Acked-by: Larry Finger <Larry.Finger@lwfinger.net>
Reviewed-by: Guenter Roeck <groeck@chromium.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 42b5122e82 upstream.
In several locations the driver uses AMD_CG_STATE_UNGATE (type enum
amd_clockgating_state) instead of AMD_PG_STATE_UNGATE (type enum
amd_powergating_stat) and vice versa. Both constants have the same
value, so this doesn't cause any problems, but we still want to pass
the correct type.
Fixing the mismatch resolves multiple warnings like this when building
with clang:
drivers/gpu/drm/amd/amdgpu/../powerplay/hwmgr/cz_clockpowergating.c:169:7:
error: implicit conversion from enumeration type 'enum
amd_powergating_state' to different enumeration type 'enum
amd_clockgating_state' [-Werror,-Wenum-conversion]
AMD_PG_STATE_UNGATE);
Reviewed-by: Guenter Roeck <groeck@chromium.org>
Signed-off-by: Matthias Kaehlcke <mka@chromium.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 814596495d upstream.
wiphy names were recently limited to 128 bytes by commit a7cfebcb75
("cfg80211: limit wiphy names to 128 bytes"). As it turns out though,
this isn't sufficient because dev_vprintk_emit() needs the syslog header
string "SUBSYSTEM=ieee80211\0DEVICE=+ieee80211:$devname" to fit into 128
bytes. This triggered the "device/subsystem name too long" WARN when
the device name was >= 90 bytes. As before, this was reproduced by
syzbot by sending an HWSIM_CMD_NEW_RADIO command to the MAC80211_HWSIM
generic netlink family.
Fix it by further limiting wiphy names to 64 bytes.
Reported-by: syzbot+e64565577af34b3768dc@syzkaller.appspotmail.com
Fixes: a7cfebcb75 ("cfg80211: limit wiphy names to 128 bytes")
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit efe3de79e0 upstream.
Call trace:
[<ffffff9203a8d7a8>] dump_backtrace+0x0/0x428
[<ffffff9203a8dbf8>] show_stack+0x28/0x38
[<ffffff920409bfb8>] dump_stack+0xd4/0x124
[<ffffff9203d187e8>] print_address_description+0x68/0x258
[<ffffff9203d18c00>] kasan_report.part.2+0x228/0x2f0
[<ffffff9203d1927c>] kasan_report+0x5c/0x70
[<ffffff9203d1776c>] check_memory_region+0x12c/0x1c0
[<ffffff9203d17cdc>] memcpy+0x34/0x68
[<ffffff9203d75348>] xattr_getsecurity+0xe0/0x160
[<ffffff9203d75490>] vfs_getxattr+0xc8/0x120
[<ffffff9203d75d68>] getxattr+0x100/0x2c8
[<ffffff9203d76fb4>] SyS_fgetxattr+0x64/0xa0
[<ffffff9203a83f70>] el0_svc_naked+0x24/0x28
If user get root access and calls security.selinux setxattr() with an
embedded NUL on a file and then if some process performs a getxattr()
on that file with a length greater than the actual length of the string,
it would result in a panic.
To fix this, add the actual length of the string to the security context
instead of the length passed by the userspace process.
Signed-off-by: Sachin Grover <sgrover@codeaurora.org>
Cc: stable@vger.kernel.org
Signed-off-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c97f414c54 upstream.
This value depands on the metadata support value, so reorder the
initialization to fit.
Fixes: b5be3b392 ("nvme: always unregister the integrity profile in __nvme_revalidate_disk")
Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2824f50332 upstream.
The snapshot trigger currently only affects the main ring buffer, even when
it is used by the instances. This can be confusing as the snapshot trigger
is listed in the instance.
> # cd /sys/kernel/tracing
> # mkdir instances/foo
> # echo snapshot > instances/foo/events/syscalls/sys_enter_fchownat/trigger
> # echo top buffer > trace_marker
> # echo foo buffer > instances/foo/trace_marker
> # touch /tmp/bar
> # chown rostedt /tmp/bar
> # cat instances/foo/snapshot
# tracer: nop
#
#
# * Snapshot is freed *
#
# Snapshot commands:
# echo 0 > snapshot : Clears and frees snapshot buffer
# echo 1 > snapshot : Allocates snapshot buffer, if not already allocated.
# Takes a snapshot of the main buffer.
# echo 2 > snapshot : Clears snapshot buffer (but does not allocate or free)
# (Doesn't have to be '2' works with any number that
# is not a '0' or '1')
> # cat snapshot
# tracer: nop
#
# _-----=> irqs-off
# / _----=> need-resched
# | / _---=> hardirq/softirq
# || / _--=> preempt-depth
# ||| / delay
# TASK-PID CPU# |||| TIMESTAMP FUNCTION
# | | | |||| | |
bash-1189 [000] .... 111.488323: tracing_mark_write: top buffer
Not only did the snapshot occur in the top level buffer, but the instance
snapshot buffer should have been allocated, and it is still free.
Cc: stable@vger.kernel.org
Fixes: 85f2b08268 ("tracing: Add basic event trigger framework")
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 86b389ff22 upstream.
If a instance has an event trigger enabled when it is freed, it could cause
an access of free memory. Here's the case that crashes:
# cd /sys/kernel/tracing
# mkdir instances/foo
# echo snapshot > instances/foo/events/initcall/initcall_start/trigger
# rmdir instances/foo
Would produce:
general protection fault: 0000 [#1] PREEMPT SMP PTI
Modules linked in: tun bridge ...
CPU: 5 PID: 6203 Comm: rmdir Tainted: G W 4.17.0-rc4-test+ #933
Hardware name: Hewlett-Packard HP Compaq Pro 6300 SFF/339A, BIOS K01 v03.03 07/14/2016
RIP: 0010:clear_event_triggers+0x3b/0x70
RSP: 0018:ffffc90003783de0 EFLAGS: 00010286
RAX: 0000000000000000 RBX: 6b6b6b6b6b6b6b2b RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff8800c7130ba0
RBP: ffffc90003783e00 R08: ffff8801131993f8 R09: 0000000100230016
R10: ffffc90003783d80 R11: 0000000000000000 R12: ffff8800c7130ba0
R13: ffff8800c7130bd8 R14: ffff8800cc093768 R15: 00000000ffffff9c
FS: 00007f6f4aa86700(0000) GS:ffff88011eb40000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f6f4a5aed60 CR3: 00000000cd552001 CR4: 00000000001606e0
Call Trace:
event_trace_del_tracer+0x2a/0xc5
instance_rmdir+0x15c/0x200
tracefs_syscall_rmdir+0x52/0x90
vfs_rmdir+0xdb/0x160
do_rmdir+0x16d/0x1c0
__x64_sys_rmdir+0x17/0x20
do_syscall_64+0x55/0x1a0
entry_SYSCALL_64_after_hwframe+0x49/0xbe
This was due to the call the clears out the triggers when an instance is
being deleted not removing the trigger from the link list.
Cc: stable@vger.kernel.org
Fixes: 85f2b08268 ("tracing: Add basic event trigger framework")
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 40f7090bb1 upstream.
New ICs (like the one on the Lenovo T480s) answer to
ETP_SMBUS_IAP_VERSION_CMD 4 bytes instead of 3. This corrupts the stack
as i2c_smbus_read_block_data() uses the values returned by the i2c
device to know how many data it need to return.
i2c_smbus_read_block_data() can read up to 32 bytes (I2C_SMBUS_BLOCK_MAX)
and there is no safeguard on how many bytes are provided in the return
value. Ensure we always have enough space for any future firmware.
Also 0-initialize the values to prevent any access to uninitialized memory.
Cc: <stable@vger.kernel.org> # v4.4.x, v4.9.x, v4.14.x, v4.15.x, v4.16.x
Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Acked-by: KT Liao <kt.liao@emc.com.tw>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ad8fb554f0 upstream.
This time, Lenovo decided to go with different pieces in its latest
series of Thinkpads.
For those we have been able to test:
- the T480 is using Synaptics with an IBM trackpoint
-> it behaves properly with or without intertouch, there is no point
not using RMI4
- the X1 Carbon 6th gen is using Synaptics with an IBM trackpoint
-> the touchpad doesn't behave properly under PS/2 so we have to
switch it to RMI4 if we do not want to have disappointed users
- the X280 is using Synaptics with an ALPS trackpoint
-> the recent fixes in the trackpoint handling fixed it so upstream
now works fine with or without RMI4, and there is no point not
using RMI4
- the T480s is using an Elan touchpad, so that's a different story
Cc: <stable@vger.kernel.org> # v4.14.x, v4.15.x, v4.16.x
Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Acked-by: KT Liao <kt.liao@emc.com.tw>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5717a09aea upstream.
Synaptics devices reported it has Intertouch support,
and it fails via PS/2 as following logs:
psmouse serio2: Failed to reset mouse on synaptics-pt/serio0
psmouse serio2: Failed to enable mouse on synaptics-pt/serio0
Set these new devices to use SMBus to fix this issue, then they report
SMBus version 3 is using, patch:
https://patchwork.kernel.org/patch/9989547/ enabled SMBus ver 3 and
makes synaptics devices work fine on SMBus mode.
Signed-off-by: Aaron Ma <aaron.ma@canonical.com>
Cc: stable@vger.kernel.org
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 15e2cffec3 upstream.
Lenovo use two different trackpoints in the fifth generation Thinkpad X1
Carbon. Both are accessible over SMBUS/RMI but the pnpIDs are missing.
This patch is for the Elantech trackpoint specifically which also
reports SMB version 3 so rmi_smbus needs to be updated in order to
handle it.
For the record, I was not the first one to come up with this patch as it
has been floating around the internet for a while now. However, I have
spent significant time with testing and my efforts to find the original
author of the patch have been unsuccessful.
Signed-off-by: Edvard Holst <edvard.holst@gmail.com>
Cc: stable@vger.kernel.org
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a27ba2607e upstream.
The struct xfs_agfl v5 header was originally introduced with
unexpected padding that caused the AGFL to operate with one less
slot than intended. The header has since been packed, but the fix
left an incompatibility for users who upgrade from an old kernel
with the unpacked header to a newer kernel with the packed header
while the AGFL happens to wrap around the end. The newer kernel
recognizes one extra slot at the physical end of the AGFL that the
previous kernel did not. The new kernel will eventually attempt to
allocate a block from that slot, which contains invalid data, and
cause a crash.
This condition can be detected by comparing the active range of the
AGFL to the count. While this detects a padding mismatch, it can
also trigger false positives for unrelated flcount corruption. Since
we cannot distinguish a size mismatch due to padding from unrelated
corruption, we can't trust the AGFL enough to simply repopulate the
empty slot.
Instead, avoid unnecessarily complex detection logic and and use a
solution that can handle any form of flcount corruption that slips
through read verifiers: distrust the entire AGFL and reset it to an
empty state. Any valid blocks within the AGFL are intentionally
leaked. This requires xfs_repair to rectify (which was already
necessary based on the state the AGFL was found in). The reset
mitigates the side effect of the padding mismatch problem from a
filesystem crash to a free space accounting inconsistency. The
generic approach also means that this patch can be safely backported
to kernels with or without a packed struct xfs_agfl.
Check the AGF for an invalid freelist count on initial read from
disk. If detected, set a flag on the xfs_perag to indicate that a
reset is required before the AGFL can be used. In the first
transaction that attempts to use a flagged AGFL, reset it to empty,
warn the user about the inconsistency and allow the freelist fixup
code to repopulate the AGFL with new blocks. The xfs_perag flag is
cleared to eliminate the need for repeated checks on each block
allocation operation.
This allows kernels that include the packing fix commit 96f859d52b
("libxfs: pack the agfl header structure so XFS_AGFL_SIZE is correct")
to handle older unpacked AGFL formats without a filesystem crash.
Suggested-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by Dave Chiluk <chiluk+linuxxfs@indeed.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a78ee256c3 upstream.
The AGFL size calculation is about to get more complex, so lets turn
the macro into a function first and remove the macro.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
[darrick: forward port to newer kernel, simplify the helper]
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6073a09210 upstream.
Use kasprintf instead of combination of kmalloc and sprintf. Also,
remove the local variables used for storing the string length as they
are not required now.
Signed-off-by: Himanshu Jha <himanshujha199640@gmail.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7dec80ccbe upstream.
With the following commit:
fd35c88b74 ("objtool: Support GCC 8 switch tables")
I added a "can't find switch jump table" warning, to stop covering up
silent failures if add_switch_table() can't find anything.
That warning found yet another bug in the objtool switch table detection
logic. For cases 1 and 2 (as described in the comments of
find_switch_table()), the find_symbol_containing() check doesn't adjust
the offset for RIP-relative switch jumps.
Incidentally, this bug was already fixed for case 3 with:
6f5ec2993b ("objtool: Detect RIP-relative switch table references")
However, that commit missed the fix for cases 1 and 2.
The different cases are now starting to look more and more alike. So
fix the bug by consolidating them into a single case, by checking the
original dynamic jump instruction in the case 3 loop.
This also simplifies the code and makes it more robust against future
switch table detection issues -- of which I'm sure there will be many...
Switch table detection has been the most fragile area of objtool, by
far. I long for the day when we'll have a GCC plugin for annotating
switch tables. Linus asked me to delay such a plugin due to the
flakiness of the plugin infrastructure in older versions of GCC, so this
rickety code is what we're stuck with for now. At least the code is now
a little simpler than it was.
Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/f400541613d45689086329432f3095119ffbc328.1526674218.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6f5ec2993b upstream.
Typically a switch table can be found by detecting a .rodata access
followed an indirect jump:
1969: 4a 8b 0c e5 00 00 00 mov 0x0(,%r12,8),%rcx
1970: 00
196d: R_X86_64_32S .rodata+0x438
1971: e9 00 00 00 00 jmpq 1976 <dispc_runtime_suspend+0xb6a>
1972: R_X86_64_PC32 __x86_indirect_thunk_rcx-0x4
Randy Dunlap reported a case (seen with GCC 4.8) where the .rodata
access uses RIP-relative addressing:
19bd: 48 8b 3d 00 00 00 00 mov 0x0(%rip),%rdi # 19c4 <dispc_runtime_suspend+0xbb8>
19c0: R_X86_64_PC32 .rodata+0x45c
19c4: e9 00 00 00 00 jmpq 19c9 <dispc_runtime_suspend+0xbbd>
19c5: R_X86_64_PC32 __x86_indirect_thunk_rdi-0x4
In this case the relocation addend needs to be adjusted accordingly in
order to find the location of the switch table.
The fix is for case 3 (as described in the comments), but also make the
existing case 1 & 2 checks more precise by only adjusting the addend for
R_X86_64_PC32 relocations.
This fixes the following warnings:
drivers/video/fbdev/omap2/omapfb/dss/dispc.o: warning: objtool: dispc_runtime_suspend()+0xbb8: sibling call from callable instruction with modified stack frame
drivers/video/fbdev/omap2/omapfb/dss/dispc.o: warning: objtool: dispc_runtime_resume()+0xcc5: sibling call from callable instruction with modified stack frame
Reported-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/b6098294fd67afb69af8c47c9883d7a68bf0f8ea.1526305958.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit fd35c88b74 upstream.
With GCC 8, some issues were found with the objtool switch table
detection.
1) In the .rodata section, immediately after the switch table, there can
be another object which contains a pointer to the function which had
the switch statement. In this case objtool wrongly considers the
function pointer to be part of the switch table. Fix it by:
a) making sure there are no pointers to the beginning of the
function; and
b) making sure there are no gaps in the switch table.
Only the former was needed, the latter adds additional protection for
future optimizations.
2) In find_switch_table(), case 1 and case 2 are missing the check to
ensure that the .rodata switch table data is anonymous, i.e. that it
isn't already associated with an ELF symbol. Fix it by adding the
same find_symbol_containing() check which is used for case 3.
This fixes the following warnings with GCC 8:
drivers/block/virtio_blk.o: warning: objtool: virtio_queue_rq()+0x0: stack state mismatch: cfa1=7+8 cfa2=7+72
net/ipv6/icmp.o: warning: objtool: icmpv6_rcv()+0x0: stack state mismatch: cfa1=7+8 cfa2=7+64
drivers/usb/core/quirks.o: warning: objtool: quirks_param_set()+0x0: stack state mismatch: cfa1=7+8 cfa2=7+48
drivers/mtd/nand/raw/nand_hynix.o: warning: objtool: hynix_nand_decode_id()+0x0: stack state mismatch: cfa1=7+8 cfa2=7+24
drivers/mtd/nand/raw/nand_samsung.o: warning: objtool: samsung_nand_decode_id()+0x0: stack state mismatch: cfa1=7+8 cfa2=7+32
drivers/gpu/drm/nouveau/nvkm/subdev/top/gk104.o: warning: objtool: gk104_top_oneinit()+0x0: stack state mismatch: cfa1=7+8 cfa2=7+64
Reported-by: Arnd Bergmann <arnd@arndb.de>
Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: David Laight <David.Laight@ACULAB.COM>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: damian <damian.tometzki@icloud.com>
Link: http://lkml.kernel.org/r/20180510224849.xwi34d6tzheb5wgw@treble
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 13810435b9 upstream.
GCC 8 moves a lot of unlikely code out of line to "cold" subfunctions in
.text.unlikely. Properly detect the new subfunctions and treat them as
extensions of the original functions.
This fixes a bunch of warnings like:
kernel/cgroup/cgroup.o: warning: objtool: parse_cgroup_root_flags()+0x33: sibling call from callable instruction with modified stack frame
kernel/cgroup/cgroup.o: warning: objtool: cgroup_addrm_files()+0x290: sibling call from callable instruction with modified stack frame
kernel/cgroup/cgroup.o: warning: objtool: cgroup_apply_control_enable()+0x25b: sibling call from callable instruction with modified stack frame
kernel/cgroup/cgroup.o: warning: objtool: rebind_subsystems()+0x325: sibling call from callable instruction with modified stack frame
Reported-and-tested-by: damian <damian.tometzki@icloud.com>
Reported-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: David Laight <David.Laight@ACULAB.COM>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/0965e7fcfc5f31a276f0c7f298ff770c19b68706.1525923412.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 91ba9f28a3 upstream.
SOU primary plane prepare_fb hook depends upon dmabuf_size to pin up BO
(and not call a new vmw_dmabuf_init) when a new fb size is same as
current fb. This was changed in a recent commit which is causing
page_flip to fail on VM with low display memory and multi-mon failure
when cycle monitors from secondary display.
Cc: <stable@vger.kernel.org> # 4.14, 4.16
Fixes: 20fb5a635a ("drm/vmwgfx: Unpin the screen object backup buffer when not used")
Signed-off-by: Deepak Rawat <drawat@vmware.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 9b3e420766 ]
The SPI version of this chip allows several devices to be present on the
same SPI bus via a local address. If this is in action and if the kernel
has debugfs, however, the code attempts to create duplicate entries for
the regmap's debugfs:
mcp23s08 spi1.1: Failed to create debugfs directory
This patch simply assigns a local name matching the device logical
address to the `struct regmap_config`.
No changes are needed for MCP23S18 because that device does not support
any logical addressing. Similarly, I2C devices do not need any action,
either, because they are already different in their I2C address.
A similar problem is present for the pinctrl debugfs instance, but that
one is not addressed by this patch.
Signed-off-by: Jan Kundrát <jan.kundrat@cesnet.cz>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 2bada7ac1f ]
The missing last digit of the CONFIG values is added. Looks like a typo
of some sort when comparing to the downstream dt. This fixes
intermittent behavior behaviour of the ethernet controllers.
Signed-off-by: Aapo Vienamo <aapo@tuxera.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 87f88732d2 ]
When operating the system headless headless, the domain is never
powered on, leaving the clocks disabled. The shutdown function then
tries to disable the already disabled clocks, resulting in errors.
Therefore call meson_gx_pwrc_vpu_power_off() only if domain is
powered on.
This patch fixes the described issue on my system (Odorid-C2).
Fixes: 339cd0ea08 "soc: amlogic: meson-gx-pwrc-vpu: fix power-off when powered by bootloader"
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Neil Armstrong <narmstrong@baylibre.com>
Signed-off-by: Kevin Hilman <khilman@baylibre.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 71df179363 ]
The cache pointer points to the actual memory used by the cache, as the
comparison here is looking for the type of the cache it should check
against cache_type.
Fixes: 1ea975cf1e ("regmap: Add a function to check if a regmap register is cached")
Signed-off-by: Charles Keepax <ckeepax@opensource.cirrus.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 7981190fb5 ]
The used part does contain an eeprom compatible with an Atmel 24c02
chip and it is from NXP, but it is not called 24c02. It's actually a
se97b chip. Adjust the compatible accordingly.
Fixes: 21dd0ece34 ("ARM: dts: at91: add devicetree for the Axentia TSE-850")
Signed-off-by: Peter Rosin <peda@axentia.se>
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 6b65933008 ]
The used part does contain an eeprom compatible with an Atmel 24c02
chip and it is from NXP, but it is not called 24c02. It's actually a
se97b chip. Adjust the compatible accordingly.
Fixes: 0e43238999 ("ARM: dts: at91: add devicetree for the Axentia Nattis with Natte power")
Signed-off-by: Peter Rosin <peda@axentia.se>
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 3b765c0b76 ]
drm_vblank_count() has an u32 type returning what is a 64-bit vblank count.
The effect of this is when drm_wait_vblank_ioctl() tries to widen the user
space requested vblank sequence using this clipped 32-bit count(when the
value is >= 2^32) as reference, the requested sequence remains a 32-bit
value and gets queued like that. However, the code that checks if the
requested sequence has passed compares this against the 64-bit vblank
count.
With drm_vblank_count() returning all bits of the vblank count, update
drm_crtc_accurate_vblank_count() so that drm_crtc_arm_vblank_event() queues
the correct sequence. Otherwise, this leads to prolonged waits for a vblank
sequence when the current count is >=2^32.
Finally, fix drm_wait_one_vblank() too.
v2: Commit message fix (Keith)
Squash commits (Rodrigo)
Fixes: 570e86963a ("drm: Widen vblank count to 64-bits [v3]")
Cc: Keith Packard <keithp@keithp.com>
Cc: Michel Dänzer <michel@daenzer.net>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Keith Packard <keithp@keithp.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180203051302.9974-1-dhinakaran.pandiyan@intel.com
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit e3ebaa4651 ]
Jin Yao reported memory corrupton in perf report with
branch info used for stack trace:
> Following command lines will cause perf crash.
> perf record -j call -g -a <application>
> perf report --branch-history
>
> *** Error in `perf': double free or corruption (!prev): 0x00000000104aa040 ***
> ======= Backtrace: =========
> /lib/x86_64-linux-gnu/libc.so.6(+0x77725)[0x7f6b37254725]
> /lib/x86_64-linux-gnu/libc.so.6(+0x7ff4a)[0x7f6b3725cf4a]
> /lib/x86_64-linux-gnu/libc.so.6(cfree+0x4c)[0x7f6b37260abc]
> perf[0x51b914]
> perf(hist_entry_iter__add+0x1e5)[0x51f305]
> perf[0x43cf01]
> perf[0x4fa3bf]
> perf[0x4fa923]
> perf[0x4fd396]
> perf[0x4f9614]
> perf(perf_session__process_events+0x89e)[0x4fc38e]
> perf(cmd_report+0x15d2)[0x43f202]
> perf[0x4a059f]
> perf(main+0x631)[0x427b71]
> /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf0)[0x7f6b371fd830]
> perf(_start+0x29)[0x427d89]
For the cumulative output, we allocate the he_cache array based on the
--max-stack option value and populate it with data from 'callchain_cursor'.
The --max-stack option value does not ensure now the limit for number of
callchain_cursor nodes, so the cumulative iter code will allocate smaller array
than it's actually needed and cause above corruption.
I think the --max-stack limit does not apply here anyway, because we add
callchain data as normal hist entries, while the --max-stack control the limit
of single entry callchain depth.
Using the callchain_cursor.nr as he_cache array count to fix this. Also
removing struct hist_entry_iter::max_stack, because there's no longer any use
for it.
We need more fixes to ensure that the branch stack code follows properly the
logic of --max-stack, which is not the case at the moment.
Original-patch-by: Jin Yao <yao.jin@linux.intel.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Reported-by: Jin Yao <yao.jin@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180216123619.GA9945@krava
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit ab6e9a9934 ]
The symbol search called by machine__find_kernel_symbol_by_name is using
internally arch__compare_symbol_names function to compare 2 symbol
names, because different archs have different ways of comparing symbols.
Mostly for skipping '.' prefixes and similar.
In test 1 when we try to find matching symbols in kallsyms and vmlinux,
by address and by symbol name. When either is found we compare the pair
symbol names by simple strcmp, which is not good enough for reasons
explained in previous paragraph.
On powerpc this can cause lockup, because even thought we found the
pair, the compared names are different and don't match simple strcmp.
Following code path is executed, that leads to lockup:
- we find the pair in kallsyms by sym->start
next_pair:
- we compare the names and it fails
- we find the pair by sym->name
- the pair addresses match so we call goto next_pair
because we assume the names match in this case
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Acked-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Fixes: 031b84c407 ("perf probe ppc: Enable matching against dot symbols automatically")
Link: http://lkml.kernel.org/r/20180215122635.24029-10-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit b40982e846 ]
When we use perf report interactive annotate view, we can see
the position of jump arrow is not correct. For example,
1. perf record -b ...
2. perf report
3. In interactive mode, select Annotate 'function'
Percent│ IPC Cycle
│ if (flag)
1.37 │0.4┌── 1 ↓ je 82
│ │ x += x / y + y / x;
0.00 │0.4│ 1310 movsd (%rsp),%xmm0
0.00 │0.4│ 565 movsd 0x8(%rsp),%xmm4
│0.4│ movsd 0x8(%rsp),%xmm1
│0.4│ movsd (%rsp),%xmm3
│0.4│ divsd %xmm4,%xmm0
0.00 │0.4│ 579 divsd %xmm3,%xmm1
│0.4│ movsd (%rsp),%xmm2
│0.4│ addsd %xmm1,%xmm0
│0.4│ addsd %xmm2,%xmm0
0.00 │0.4│ movsd %xmm0,(%rsp)
│ │ volatile double x = 1212121212, y = 121212;
│ │
│ │ s_randseed = time(0);
│ │ srand(s_randseed);
│ │
│ │ for (i = 0; i < 2000000000; i++) {
1.37 │0.4└─→ 82: sub $0x1,%ebx
28.21 │0.48 17 ↑ jne 38
The jump arrow in above example is not correct. It should add the
width of IPC and Cycle.
With this patch, the result is:
Percent│ IPC Cycle
│ if (flag)
1.37 │0.48 1 ┌──je 82
│ │ x += x / y + y / x;
0.00 │0.48 1310 │ movsd (%rsp),%xmm0
0.00 │0.48 565 │ movsd 0x8(%rsp),%xmm4
│0.48 │ movsd 0x8(%rsp),%xmm1
│0.48 │ movsd (%rsp),%xmm3
│0.48 │ divsd %xmm4,%xmm0
0.00 │0.48 579 │ divsd %xmm3,%xmm1
│0.48 │ movsd (%rsp),%xmm2
│0.48 │ addsd %xmm1,%xmm0
│0.48 │ addsd %xmm2,%xmm0
0.00 │0.48 │ movsd %xmm0,(%rsp)
│ │ volatile double x = 1212121212, y = 121212;
│ │
│ │ s_randseed = time(0);
│ │ srand(s_randseed);
│ │
│ │ for (i = 0; i < 2000000000; i++) {
1.37 │0.48 82:└─→sub $0x1,%ebx
28.21 │0.48 17 ↑ jne 38
Committer notes:
Please note that only from LBRv5 (according to Jiri) onwards, i.e. >=
Skylake is that we'll have the cycles counts in each branch record
entry, so to see the Cycles and IPC columns, and be able to test this
patch, one need a capable hardware.
While applying this I first tested it on a Broadwell class machine and
couldn't get those columns, will add code to the annotate browser to
warn the user about that, i.e. you have branch records, but no cycles,
use a more recent hardware to get the cycles and IPC columns.
Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1517223473-14750-1-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 0f19a038af ]
Using Fedora 27 and latest Linux kernel the test case
trace+probe_libc_inet_pton.sh fails again on s390. This time is the
inlining of functions which does not match. After an update of the
glibc (from 2.26-16 to 2.26-24) the output is different
The expected output is:
__inet_pton (/usr/lib64/libc-2.26.so)
gaih_inet (inlined)
....
The actual output is:
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.061/0.061/0.061/0.000 ms
0.000 probe_libc:inet_pton:(3ffb2140448))
__inet_pton (inlined)
gaih_inet.constprop.7 (/usr/lib64/libc-2.26.so)
...
Fix this by being less strict on 'inlined' verses library name and
accept both
Signed-off-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Link: http://lkml.kernel.org/r/20180214070303.55757-1-tmricht@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit bee3204ec3 ]
Currently the kdump kernel becomes very slow if 'noapic' is specified.
Normal kernel doesn't have this bug.
Kernel parameter 'noapic' is used to disable IO-APIC in system for
testing or special purpose. Here the root cause is that in kdump
kernel LAPIC is disabled since commit:
522e664644 ("x86/apic: Disable I/O APIC before shutdown of the local APIC")
In this case we need set up through-local-APIC on boot CPU in
setup_local_APIC().
In normal kernel the legacy irq mode is enabled by the BIOS. If
it is virtual wire mode, the local-APIC has been enabled and set as
through-local-APIC.
Though we fixed the regression introduced by commit 522e664644,
to further improve robustness set up the through-local-APIC mode
explicitly, do not rely on the default boot IRQ mode.
Signed-off-by: Baoquan He <bhe@redhat.com>
Reviewed-by: Eric W. Biederman <ebiederm@xmission.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: douly.fnst@cn.fujitsu.com
Cc: joro@8bytes.org
Cc: prarit@redhat.com
Cc: uobergfe@redhat.com
Link: http://lkml.kernel.org/r/20180214054656.3780-7-bhe@redhat.com
[ Rewrote the changelog. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 3b94a4007d ]
Brightness couldn't change when booting up in DC mode.
It was because "psr_enabled" flag was not set to true before
setting vsc packet revision, causing packet rev setup was skipped.
Now instead of checking the psr flag, it checks if the DPCD_REV >= 1.2
and set the vsc packet revision.
Signed-off-by: Tao <xtao@amd.com>
Reviewed-by: Tony Cheng <Tony.Cheng@amd.com>
Acked-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 3caa973b7a ]
When RCU stall warning triggers, it can print out a lot of messages
while holding spinlocks. If the console device is slow (e.g. an
actual or IPMI serial console), it may end up triggering NMI hard
lockup watchdog like the following.
[ Upstream commit 13138de014 ]
stmmac_mac_config_rx_queues_routing() incorrectly calls rx_queue_prio()
instead of rx_queue_routing().
This looks like a copy paste issue, since
stmmac_mac_config_rx_queues_prio() already calls rx_queue_prio(),
and both stmmac_mac_config_rx_queues_routing() and
stmmac_mac_config_rx_queues_prio() are very similar in structure.
Fixes: abe80fdc6e ("net: stmmac: RX queue routing configuration")
Signed-off-by: Niklas Cassel <niklas.cassel@axis.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 23138ead27 ]
If there is a memory allocation error when trying to change an audit
kernel feature value, the ignored allocation error will trigger a NULL
pointer dereference oops on subsequent use of that pointer. Return
instead.
Passes audit-testsuite.
See: https://github.com/linux-audit/audit-kernel/issues/76
Signed-off-by: Richard Guy Briggs <rgb@redhat.com>
[PM: not necessary (other funcs check for NULL), but a good practice]
Signed-off-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit dbdd0f58fd ]
There is a problem with PCMCIA system resume callbacks with respect
to suspend-to-idle in which the ->suspend_noirq() callback may be
invoked after the ->resume_noirq() one without resuming the system
entirely in some cases. This doesn't work for PCMCIA because of
the lack of symmetry between its system suspend and system resume
"noirq" callbacks.
The system resume handling in PCMCIA is split between
socket_early_resume() and socket_late_resume() which are called in
different phases of system resume and both need to run for
socket_suspend() (invoked by the system suspend "noirq" callback)
to work. Specifically, socket_suspend() returns an error when
called after socket_early_resume() without socket_late_resume(),
so if the suspend-to-idle core detects a spurious wakeup event and
attempts to put the system back to sleep, that is aborted by the
error coming from socket_suspend().
Avoid that by using a new socket state flag, SOCKET_IN_RESUME,
to indicate that socket_early_resume() has already run for the
socket in which case socket_suspend() will do minimum handling
and return 0.
This change has been tested on my venerable Toshiba Portege R500
(which is where the problem has been discovered in the first place),
but admittedly I have no PCMCIA cards to test along with the socket
itself.
Fixes: 33e4f80ee6 (ACPI / PM: Ignore spurious SCI wakeups from suspend-to-idle)
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
[linux@dominikbrodowski.net: follow same codepaths for both suspend variants; call ->suspend()]
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 79c81facdc ]
Since 517e7a1537 ("ASoC: bcm2835: move to use the clock framework")
the bcm2835-i2s requires a clock as DT property. Unfortunately
the necessary DT change has never been applied. While we are at it
also fix the first PCM register range to cover the PCM_GRAY register.
Fixes: 517e7a1537 ("ASoC: bcm2835: move to use the clock framework")
Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Tested-by: Matthias Reichl <hias@horus.com>
Signed-off-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit a94cf2a614 ]
It appears that the single port Ether controllers having TSU (like SH7734/
R8A7740) need the same kind of treating in sh_eth_tsu_init() as R7S72100
currently has -- they also don't have the TSU registers related e.g. to
passing the frames between ports. Add the 'sh_eth_cpu_data::dual_port'
flag and use it as a new criterion for taking a "short path" in the TSU
init sequence in order to avoid writing to the non-existent registers...
Fixes: f0e81fecd4 ("net: sh_eth: Add support SH7734")
Fixes: 73a0d90730 ("net: sh_eth: add support R8A7740")
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 6704a3abf4 ]
On hardware which supports timestamping all packets, the timestamps are
recorded in the packet buffer, and the driver no longer uses or reads
the registers. This makes the logic for checking and clearing Rx
timestamp hangs meaningless.
If we run the ixgbe_ptp_rx_hang() function in this case, then the driver
will continuously spam the log output with "Clearing Rx timestamp hang".
These messages are spurious, and confusing to end users.
The original code in commit a9763f3cb5 ("ixgbe: Update PTP to support
X550EM_x devices", 2015-12-03) did have a flag PTP_RX_TIMESTAMP_IN_REGISTER
which was intended to be used to avoid the Rx timestamp hang check,
however it did not actually check the flag before calling the function.
Do so now in order to stop the checks and prevent the spurious log
messages.
Fixes: a9763f3cb5 ("ixgbe: Update PTP to support X550EM_x devices", 2015-12-03)
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 116e5258e4 ]
Currently when UDF filesystem is recorded without uid / gid (ids are set
to -1), we will assign INVALID_[UG]ID to vfs inode unless user uses uid=
and gid= mount options. In such case filesystem could not be modified in
any way as VFS refuses to modify files with invalid ids (even by root).
This is confusing to users and not very useful default since such media
mode is generally used for removable media. Use overflow[ug]id instead
so that at least root can modify the filesystem.
Reported-by: Steve Kenton <skenton@ou.edu>
Reviewed-by: Pali Rohár <pali.rohar@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 120d75ecf0 ]
An issue in the code mapping the skb fragments into
scatter-gather frames was evidentiated by netperf
TCP_SENDFILE tests. The size was set wrong for all
fragments but the first, affecting the transmission
of any skb with more than one fragment.
Signed-off-by: Madalin Bucur <madalin.bucur@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit b24b6478e6 ]
Ideally the de-allocation of resources should happen in the exact
opposite order in which they were allocated. It helps maintain the code
in long term, even if nothing really breaks with incorrect ordering.
That wasn't followed in cpufreq_online() and it has some
inconsistencies. For example, the symlinks were created from within
the locked region while they are removed only after putting the locks.
Also ->exit() should have been called only after the symlinks are
removed and the lock is dropped, as that was the case when ->init()
was first called.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
[ rjw: Subject ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 15d2ee42a3 ]
A dma_wmb() is used to guarantee the ordering, with respect to
other writes, to cache coherent DMA memory.
There is a dma_wmb() in prepare_tx_desc()/prepare_tso_tx_desc() which
ensures that TDES0/1/2 is written before TDES3 (which contains the own
bit), for First Desc.
However, in the rare case that MSS changes, there will be a MSS
context descriptor in front of the regular DMA descriptors:
<MSS desc> <- DMA Next Descriptor
<First Desc>
<desc n>
<Last Desc>
Thus, for this special case, we need a dma_wmb()
after prepare_tso_tx_desc()/before writing the own bit to the MSS desc,
so that we flush the write to TDES3 for First Desc,
in order to ensure that the MSS descriptor is the last descriptor to
set the own bit.
Signed-off-by: Niklas Cassel <niklas.cassel@axis.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit a6b25da5e7 ]
According to Documentation/memory-barriers.txt, we need to use a
dma_rmb() after reading the status/own bit, to ensure that all
descriptor fields are read after reading the own bit.
This way, we ensure that the DMA engine is done with the DMA
descriptor before we read the other descriptor fields, e.g. reading
the tx hardware timestamp (if PTP is enabled).
Signed-off-by: Niklas Cassel <niklas.cassel@axis.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 53cc7721fd ]
Currently, buffers holding individual queue statistics are allocated
when the device is opened. If an ibmvnic interface is hotplugged or
initialized but never opened, an attempt to get statistics with
ethtool will result in a kernel panic.
Since the driver allocates a constant number, the maximum supported
queues, of buffers, these can be allocated during device probe and
freed when the device is hot-unplugged or the module is removed.
Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit dbf797655a ]
issue:
sometime GFX/MM ib test hit timeout under SRIOV env, root cause
is that engine doesn't come back soon enough so the current
IB test considered as timed out.
fix:
for SRIOV GFX IB test wait time need to be expanded a lot during
SRIOV runtimei mode since it couldn't really begin before GFX engine
come back.
for SRIOV MM IB test it always need more time since MM scheduling
is not go together with GFX engine, it is controled by h/w MM
scheduler so no matter runtime or exclusive mode MM IB test
always need more time.
v2:
use ring type instead of idx to judge
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 9f0178fb67 ]
otherwise there will be DMAR reading error comes out from CP since
GFX is still alive and CPC's WPTR_POLL is still enabled, which would
lead to DMAR read error.
fix:
we can hault CPG after hw_fini, but cannot halt CPC becaues KIQ
stil need to be alive to let RLCV invoke, but its WPTR_POLL could
be disabled.
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 215003b4ae ]
There's no reason to delay initialization of most of the driver (such as
mapping memory I/O, getting clocks or enabling runtime PM) to the
component master bind handler.
This additionally fixes a real PM issue caused enabling runtime PM in
the bind handler.
The bind handler performs the following sequence of PM operations:
pm_runtime_enable(dev);
pm_runtime_get_sync(dev);
... (access the hardware to read the device revision) ...
pm_runtime_put_sync(dev);
If a failure occurs at this point, the error path calls
pm_runtime_disable() to balance the pm_runtime_enable() call.
To understand the problem, it should be noted that the bind handler is
called when one of the component registers itself, which happens in the
component's probe handler. Furthermore, as the components are children
of the DSS, the device core calls pm_runtime_get_sync() on the DSS
platform device before calling the component's probe handler. This
increases the DSS power usage count but doesn't runtime resume the
device, as runtime PM is disabled at that point.
The bind handler is thus called with runtime PM disabled, with the
device runtime suspended, but with the power usage count larger than 0.
The pm_runtime_get_sync() call will thus further increase the power
usage count and runtime resume the device. The pm_runtime_put_sync()
handler will decrease the power usage count to a non-zero value and will
thus not suspend the device. Finally, the pm_runtime_disable() call will
disable runtime PM, preventing the pm_runtime_put() call in the device
core from runtime suspending the device. The DSS device is thus left
powered on.
To fix this, move the initialization code from the bind handler to the
probe handler.
Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Reviewed-by: Sebastian Reichel <sebastian.reichel@collabora.co.uk>
Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 48d163b1aa ]
When Linux is master of BAM, it can directly read registers to know number
of supported channels, however when its remotely controlled reading these
registers would trigger a crash if the BAM is not yet initialized or
powered up on the remote side.
This patch allows driver to read num-channels and num-ees from Device Tree
for remotely controlled BAM.
Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 9851bc77e6 ]
vfio-ccw only supports command mode for channel programs, not transport
mode. User space is supposed to already take care of that and pass us
command-mode ORBs only, but better make sure and return an error to
the caller instead of trying to process tcws as ccws.
Reviewed-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com>
Acked-by: Halil Pasic <pasic@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit b89405b610 ]
When dt_to_map_one_config() is called with a pinctrl_dev passed
in, it should only be using this if the node being looked up
is a hog. The code was always using the passed pinctrl_dev
without checking whether the dt node referred to it.
A pin controller can have pinctrl-n dependencies on other pin
controllers in these cases:
- the pin controller hardware is external, for example I2C, so
needs other pin controller(s) to be setup to communicate with
the hardware device.
- it is a child of a composite MFD so its of_node is shared with
the parent MFD and other children of that MFD. Any part of that
MFD could have dependencies on other pin controllers.
Because of this, dt_to_map_one_config() can't assume that if it
has a pinctrl_dev passed in then the node it looks up must be
a hog. It could be a reference to some other pin controller.
Signed-off-by: Richard Fitzgerald <rf@opensource.cirrus.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 3c829f47e3 ]
If devm_reset_control_get_exclusive() fails, asm9260_wdt_probe()
returns immediately. But clks has been already enabled at that point,
so it is required to disable them or to move the code around.
Found by Linux Driver Verification project (linuxtesting.org).
Signed-off-by: Alexey Khoroshilov <khoroshilov@ispras.ru>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 3e081628d5 ]
This patch fixes an issue that a race condition happens between a client
driver and the rcar-dmac driver:
- The rcar_dmac_isr_transfer_end() is called.
- The done list appears, and desc.running is the next active list.
- rcar_dmac_chan_get_residue() is called by a client driver before
rcar_dmac_isr_channel_thread() is called.
- The rcar_dmac_chan_get_residue() will not find any descriptors.
- And, the following WARNING happens:
WARN(1, "No descriptor for cookie!");
The sh-sci driver with HSCIF (921,600bps) on R-Car H3 can cause this
situation.
So, this patch checks the done lists in rcar_dmac_chan_get_residue()
and returns zero if the done lists has the argument cookie.
Tested-by: Nguyen Viet Dung <dung.nguyen.aj@renesas.com>
Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit aa0ab02ba9 ]
On the 8xx, the page size is set in the PMD entry and applies to
all pages of the page table pointed by the said PMD entry.
When an app has some regular pages allocated (e.g. see below) and tries
to mmap() a huge page at a hint address covered by the same PMD entry,
the kernel accepts the hint allthough the 8xx cannot handle different
page sizes in the same PMD entry.
10000000-10001000 r-xp 00000000 00:0f 2597 /root/malloc
10010000-10011000 rwxp 00000000 00:0f 2597 /root/malloc
mmap(0x10080000, 524288, PROT_READ|PROT_WRITE,
MAP_PRIVATE|MAP_ANONYMOUS|0x40000, -1, 0) = 0x10080000
This results the app remaining forever in do_page_fault()/hugetlb_fault()
and when interrupting that app, we get the following warning:
[162980.035629] WARNING: CPU: 0 PID: 2777 at arch/powerpc/mm/hugetlbpage.c:354 hugetlb_free_pgd_range+0xc8/0x1e4
[162980.035699] CPU: 0 PID: 2777 Comm: malloc Tainted: G W 4.14.6 #85
[162980.035744] task: c67e2c00 task.stack: c668e000
[162980.035783] NIP: c000fe18 LR: c00e1eec CTR: c00f90c0
[162980.035830] REGS: c668fc20 TRAP: 0700 Tainted: G W (4.14.6)
[162980.035854] MSR: 00029032 <EE,ME,IR,DR,RI> CR: 24044224 XER: 20000000
[162980.036003]
[162980.036003] GPR00: c00e1eec c668fcd0 c67e2c00 00000010 c6869410 10080000 00000000 77fb4000
[162980.036003] GPR08: ffff0001 0683c001 00000000 ffffff80 44028228 10018a34 00004008 418004fc
[162980.036003] GPR16: c668e000 00040100 c668e000 c06c0000 c668fe78 c668e000 c6835ba0 c668fd48
[162980.036003] GPR24: 00000000 73ffffff 74000000 00000001 77fb4000 100fffff 10100000 10100000
[162980.036743] NIP [c000fe18] hugetlb_free_pgd_range+0xc8/0x1e4
[162980.036839] LR [c00e1eec] free_pgtables+0x12c/0x150
[162980.036861] Call Trace:
[162980.036939] [c668fcd0] [c00f0774] unlink_anon_vmas+0x1c4/0x214 (unreliable)
[162980.037040] [c668fd10] [c00e1eec] free_pgtables+0x12c/0x150
[162980.037118] [c668fd40] [c00eabac] exit_mmap+0xe8/0x1b4
[162980.037210] [c668fda0] [c0019710] mmput.part.9+0x20/0xd8
[162980.037301] [c668fdb0] [c001ecb0] do_exit+0x1f0/0x93c
[162980.037386] [c668fe00] [c001f478] do_group_exit+0x40/0xcc
[162980.037479] [c668fe10] [c002a76c] get_signal+0x47c/0x614
[162980.037570] [c668fe70] [c0007840] do_signal+0x54/0x244
[162980.037654] [c668ff30] [c0007ae8] do_notify_resume+0x34/0x88
[162980.037744] [c668ff40] [c000dae8] do_user_signal+0x74/0xc4
[162980.037781] Instruction dump:
[162980.037821] 7fdff378 81370000 54a3463a 80890020 7d24182e 7c841a14 712a0004 4082ff94
[162980.038014] 2f890000 419e0010 712a0ff0 408200e0 <0fe00000> 54a9000a 7f984840 419d0094
[162980.038216] ---[ end trace c0ceeca8e7a5800a ]---
[162980.038754] BUG: non-zero nr_ptes on freeing mm: 1
[162985.363322] BUG: non-zero nr_ptes on freeing mm: -1
In order to fix this, this patch uses the address space "slices"
implemented for BOOK3S/64 and enhanced to support PPC32 by the
preceding patch.
This patch modifies the context.id on the 8xx to be in the range
[1:16] instead of [0:15] in order to identify context.id == 0 as
not initialised contexts as done on BOOK3S
This patch activates CONFIG_PPC_MM_SLICES when CONFIG_HUGETLB_PAGE is
selected for the 8xx
Alltough we could in theory have as many slices as PMD entries, the
current slices implementation limits the number of low slices to 16.
This limitation is not preventing us to fix the initial issue allthough
it is suboptimal. It will be cured in a subsequent patch.
Fixes: 4b91428699 ("powerpc/8xx: Implement support of hugepages")
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit db3a528db4 upstream.
In preparation for the following patch which will fix an issue on
the 8xx by re-using the 'slices', this patch enhances the
'slices' implementation to support 32 bits CPUs.
On PPC32, the address space is limited to 4Gbytes, hence only the low
slices will be used.
The high slices use bitmaps. As bitmap functions are not prepared to
handle bitmaps of size 0, this patch ensures that bitmap functions
are called only when SLICE_NUM_HIGH is not nul.
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a3286f05bc upstream.
In preparation for the following patch which will enhance 'slices'
for supporting PPC32 in order to fix an issue on hugepages on 8xx,
this patch takes out of page*.h all bits related to 'slices' and put
them into newly created slice.h header files.
While common parts go into asm/slice.h, subarch specific
parts go into respective books3s/64/slice.c and nohash/64/slice.c
'slices'
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 5a3a03905a ]
Commit 95d8b41c76 ("ARM: dts: keystone-k2e-clocks: Add missing unit
name to clock nodes that have regs") fixed the unit names on various
clock nodes but missed out adding the unit address separator on the
clkhyperlink0 clock node. Fix the same.
Fixes: 95d8b41c76 ("ARM: dts: keystone-k2e-clocks: Add missing unit name to clock nodes that have regs")
Signed-off-by: Suman Anna <s-anna@ti.com>
Signed-off-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 1330fc3532 ]
This patch fixes the below warnings with new glibc and gcc:
hv_vss_daemon.c:100:13: warning: In the GNU C Library, "major" is defined
by <sys/sysmacros.h>. For historical compatibility, it is currently
defined by <sys/types.h> as well, but we plan to remove this soon.
To use "major", include <sys/sysmacros.h> directly.
hv_fcopy_daemon.c:42:2: note: 'snprintf' output between 2 and 1040
bytes into a destination of size 260
Signed-off-by: Dexuan Cui <decui@microsoft.com>
Cc: Stephen Hemminger <sthemmin@microsoft.com>
Cc: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 4062119b9d ]
The sdma wptr polling memory is not fast enough, then the sdma
wptr register will be random, and not equal to sdma rptr, which
will cause sdma engine hang when load driver, so clean up the sdma
wptr directly to fix this issue.
v2:add comment above the code and correct coding style
Reviewed-by: Xiangliang Yu <Xiangliang.Yu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Emily Deng <Emily.Deng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 2e7a66a8b5 ]
The following happens when connection a DVI output driven
from the SiI9022 using a DVI-to-VGA adapter plug:
i2c i2c-0: sendbytes: NAK bailout.
i2c i2c-0: sendbytes: NAK bailout.
Then no picture. Apparently the I2C engine inside the SiI9022
is not smart enough to try to fall back to DDC I2C. Or the
vendor have not integrated the electronics properly. I don't
know which one it is.
After this, the I2C bus seems stalled and the first attempt to
read the status register fails, and the code returns with
negative return value, and the display fails to initialized.
Instead, retry status readout five times and continue even
if this fails.
Tested on the ARM Versatile Express with a DVI-to-VGA
connector, it now gives picture.
Introduce a helper struct device *dev variable to make
the code more readable.
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Liviu Dudau <Liviu.Dudau@arm.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20180305101702.13441-1-linus.walleij@linaro.org
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 2e2c177ca8 ]
In slave_update() of vmaster code ignores the error from the slave
get() callback and copies the values. It's not only about the missing
error code but also that this may potentially lead to a leak of
uninitialized variables when the slave get() don't clear them.
This patch fixes slave_update() not to copy the potentially
uninitialized values when an error is returned from the slave get()
callback, and to propagate the error value properly.
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 0a5169add9 ]
IRQ parameters for the SoC devices connected directly to I/O APIC lines
(without PCI IRQ routing) may be specified in the Device Tree.
Called from DT IRQ parser, irq_create_fwspec_mapping() calls
irq_domain_alloc_irqs() with a pointer to irq_fwspec structure as @arg.
But x86-specific DT IRQ allocation code casts @arg to of_phandle_args
structure pointer and crashes trying to read the IRQ parameters. The
function was not converted when the mapping descriptor was changed to
irq_fwspec in the generic irqdomain code.
Fixes: 11e4438ee3 ("irqdomain: Introduce a firmware-specific IRQ specifier structure")
Signed-off-by: Ivan Gorinov <ivan.gorinov@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Rob Herring <robh+dt@kernel.org>
Link: https://lkml.kernel.org/r/a234dee27ea60ce76141872da0d6bdb378b2a9ee.1520450752.git.ivan.gorinov@intel.com
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 174d1232eb ]
The chunk size of allocations in __gfs2_fallocate is calculated
incorrectly. The size can collapse, causing __gfs2_fallocate to
allocate one block at a time, which is very inefficient. This needs
fixing in two places:
In gfs2_quota_lock_check, always set ap->allowed to UINT_MAX to indicate
that there is no quota limit. This fixes callers that rely on
ap->allowed to be set even when quotas are off.
In __gfs2_fallocate, reset max_blks to UINT_MAX in each iteration of the
loop to make sure that allocation limits from one resource group won't
spill over into another resource group.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 90c29ed762 ]
hdr.len includes both the size of the header and the fragment, so using
this when stepping through the firmware causes us to skip 16 bytes every
chunk of 3072 bytes; causing only the first fragment to actually be
valid data.
Instead use fragment size steps through the firmware blob.
Fixes: ea7a1f275c ("soc: qcom: Introduce WCNSS_CTRL SMD client")
Reported-by: Will Newton <will.newton@gmail.com>
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Andy Gross <andy.gross@linaro.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit d31fc13fdc ]
There is a bug when reading event->count with large PEBS enabled.
Here is an example:
# ./read_count
0x71f0
0x122c0
0x1000000001c54
0x100000001257d
0x200000000bdc5
In fixed period mode, the auto-reload mechanism could be enabled for
PEBS events, but the calculation of event->count does not take the
auto-reload values into account.
Anyone who reads event->count will get the wrong result, e.g x86_pmu_read().
This bug was introduced with the auto-reload mechanism enabled since
commit:
851559e35f ("perf/x86/intel: Use the PEBS auto reload mechanism when possible")
Introduce intel_pmu_save_and_restart_reload() to calculate the
event->count only for auto-reload.
Since the counter increments a negative counter value and overflows on
the sign switch, giving the interval:
[-period, 0]
the difference between two consequtive reads is:
A) value2 - value1;
when no overflows have happened in between,
B) (0 - value1) + (value2 - (-period));
when one overflow happened in between,
C) (0 - value1) + (n - 1) * (period) + (value2 - (-period));
when @n overflows happened in between.
Here A) is the obvious difference, B) is the extension to the discrete
interval, where the first term is to the top of the interval and the
second term is from the bottom of the next interval and C) the extension
to multiple intervals, where the middle term is the whole intervals
covered.
The equation for all cases is:
value2 - value1 + n * period
Previously the event->count is updated right before the sample output.
But for case A, there is no PEBS record ready. It needs to be specially
handled.
Remove the auto-reload code from x86_perf_event_set_period() since
we'll not longer call that function in this case.
Based-on-code-from: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: acme@kernel.org
Fixes: 851559e35f ("perf/x86/intel: Use the PEBS auto reload mechanism when possible")
Link: http://lkml.kernel.org/r/1518474035-21006-2-git-send-email-kan.liang@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 2bbea6e117 ]
when mounting an ISO filesystem sometimes (very rarely)
the system hangs because of a race condition between two tasks.
PID: 6766 TASK: ffff88007b2a6dd0 CPU: 0 COMMAND: "mount"
#0 [ffff880078447ae0] __schedule at ffffffff8168d605
#1 [ffff880078447b48] schedule_preempt_disabled at ffffffff8168ed49
#2 [ffff880078447b58] __mutex_lock_slowpath at ffffffff8168c995
#3 [ffff880078447bb8] mutex_lock at ffffffff8168bdef
#4 [ffff880078447bd0] sr_block_ioctl at ffffffffa00b6818 [sr_mod]
#5 [ffff880078447c10] blkdev_ioctl at ffffffff812fea50
#6 [ffff880078447c70] ioctl_by_bdev at ffffffff8123a8b3
#7 [ffff880078447c90] isofs_fill_super at ffffffffa04fb1e1 [isofs]
#8 [ffff880078447da8] mount_bdev at ffffffff81202570
#9 [ffff880078447e18] isofs_mount at ffffffffa04f9828 [isofs]
#10 [ffff880078447e28] mount_fs at ffffffff81202d09
#11 [ffff880078447e70] vfs_kern_mount at ffffffff8121ea8f
#12 [ffff880078447ea8] do_mount at ffffffff81220fee
#13 [ffff880078447f28] sys_mount at ffffffff812218d6
#14 [ffff880078447f80] system_call_fastpath at ffffffff81698c49
RIP: 00007fd9ea914e9a RSP: 00007ffd5d9bf648 RFLAGS: 00010246
RAX: 00000000000000a5 RBX: ffffffff81698c49 RCX: 0000000000000010
RDX: 00007fd9ec2bc210 RSI: 00007fd9ec2bc290 RDI: 00007fd9ec2bcf30
RBP: 0000000000000000 R8: 0000000000000000 R9: 0000000000000010
R10: 00000000c0ed0001 R11: 0000000000000206 R12: 00007fd9ec2bc040
R13: 00007fd9eb6b2380 R14: 00007fd9ec2bc210 R15: 00007fd9ec2bcf30
ORIG_RAX: 00000000000000a5 CS: 0033 SS: 002b
This task was trying to mount the cdrom. It allocated and configured a
super_block struct and owned the write-lock for the super_block->s_umount
rwsem. While exclusively owning the s_umount lock, it called
sr_block_ioctl and waited to acquire the global sr_mutex lock.
PID: 6785 TASK: ffff880078720fb0 CPU: 0 COMMAND: "systemd-udevd"
#0 [ffff880078417898] __schedule at ffffffff8168d605
#1 [ffff880078417900] schedule at ffffffff8168dc59
#2 [ffff880078417910] rwsem_down_read_failed at ffffffff8168f605
#3 [ffff880078417980] call_rwsem_down_read_failed at ffffffff81328838
#4 [ffff8800784179d0] down_read at ffffffff8168cde0
#5 [ffff8800784179e8] get_super at ffffffff81201cc7
#6 [ffff880078417a10] __invalidate_device at ffffffff8123a8de
#7 [ffff880078417a40] flush_disk at ffffffff8123a94b
#8 [ffff880078417a88] check_disk_change at ffffffff8123ab50
#9 [ffff880078417ab0] cdrom_open at ffffffffa00a29e1 [cdrom]
#10 [ffff880078417b68] sr_block_open at ffffffffa00b6f9b [sr_mod]
#11 [ffff880078417b98] __blkdev_get at ffffffff8123ba86
#12 [ffff880078417bf0] blkdev_get at ffffffff8123bd65
#13 [ffff880078417c78] blkdev_open at ffffffff8123bf9b
#14 [ffff880078417c90] do_dentry_open at ffffffff811fc7f7
#15 [ffff880078417cd8] vfs_open at ffffffff811fc9cf
#16 [ffff880078417d00] do_last at ffffffff8120d53d
#17 [ffff880078417db0] path_openat at ffffffff8120e6b2
#18 [ffff880078417e48] do_filp_open at ffffffff8121082b
#19 [ffff880078417f18] do_sys_open at ffffffff811fdd33
#20 [ffff880078417f70] sys_open at ffffffff811fde4e
#21 [ffff880078417f80] system_call_fastpath at ffffffff81698c49
RIP: 00007f29438b0c20 RSP: 00007ffc76624b78 RFLAGS: 00010246
RAX: 0000000000000002 RBX: ffffffff81698c49 RCX: 0000000000000000
RDX: 00007f2944a5fa70 RSI: 00000000000a0800 RDI: 00007f2944a5fa70
RBP: 00007f2944a5f540 R8: 0000000000000000 R9: 0000000000000020
R10: 00007f2943614c40 R11: 0000000000000246 R12: ffffffff811fde4e
R13: ffff880078417f78 R14: 000000000000000c R15: 00007f2944a4b010
ORIG_RAX: 0000000000000002 CS: 0033 SS: 002b
This task tried to open the cdrom device, the sr_block_open function
acquired the global sr_mutex lock. The call to check_disk_change()
then saw an event flag indicating a possible media change and tried
to flush any cached data for the device.
As part of the flush, it tried to acquire the super_block->s_umount
lock associated with the cdrom device.
This was the same super_block as created and locked by the previous task.
The first task acquires the s_umount lock and then the sr_mutex_lock;
the second task acquires the sr_mutex_lock and then the s_umount lock.
This patch fixes the issue by moving check_disk_change() out of
cdrom_open() and let the caller take care of it.
Signed-off-by: Maurizio Lombardi <mlombard@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit ecb29abd4c ]
A negative page register value means that no page needs to be
selected. This is used by status register read operations and needs
to be accepted. The failure to do so so results in missed status
and limit registers.
Fixes: da8e48ab48 ("hwmon: (pmbus) Always call _pmbus_read_byte in core driver")
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit a46f8cd696 ]
A negative page register value means that no page needs to be
selected. This is used by status register evaluations and needs
to be accepted.
Fixes: da8e48ab48 ("hwmon: (pmbus) Always call _pmbus_read_byte in core driver")
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 9e5b127d6f ]
Mark reported his arm64 perf fuzzer runs sometimes splat like:
armv8pmu_read_counter+0x1e8/0x2d8
armpmu_event_update+0x8c/0x188
armpmu_read+0xc/0x18
perf_output_read+0x550/0x11e8
perf_event_read_event+0x1d0/0x248
perf_event_exit_task+0x468/0xbb8
do_exit+0x690/0x1310
do_group_exit+0xd0/0x2b0
get_signal+0x2e8/0x17a8
do_signal+0x144/0x4f8
do_notify_resume+0x148/0x1e8
work_pending+0x8/0x14
which asserts that we only call pmu::read() on ACTIVE events.
The above callchain does:
perf_event_exit_task()
perf_event_exit_task_context()
task_ctx_sched_out() // INACTIVE
perf_event_exit_event()
perf_event_set_state(EXIT) // EXIT
sync_child_event()
perf_event_read_event()
perf_output_read()
perf_output_read_group()
leader->pmu->read()
Which results in doing a pmu::read() on an !ACTIVE event.
I _think_ this is 'new' since we added attr.inherit_stat, which added
the perf_event_read_event() to the exit path, without that
perf_event_read_output() would only trigger from samples and for
@event to trigger a sample, it's leader _must_ be ACTIVE too.
Still, adding this check makes it consistent with the @sub case for
the siblings.
Reported-and-Tested-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 66ec32fc7c ]
max17042_get_status uses the core power_supply_am_i_supplied. That
function relies on DT properties to figure out the power supply
topology, and will error out without DT.
Fixes max17042 battery status being reported as "unknown".
Signed-off-by: Pierre Bourdon <delroth@google.com>
Signed-off-by: Andre Heider <a.heider@gmail.com>
Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.co.uk>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit bf617f7a92 ]
If noextent_cache mount option is on, we will never initialize extent tree
in inode, but still we're going to access it in f2fs_drop_extent_tree,
result in kernel panic as below:
BUG: unable to handle kernel NULL pointer dereference at 0000000000000038
IP: _raw_write_lock+0xc/0x30
Call Trace:
? f2fs_drop_extent_tree+0x41/0x70 [f2fs]
f2fs_fallocate+0x5a0/0xdd0 [f2fs]
? common_file_perm+0x47/0xc0
? apparmor_file_permission+0x1a/0x20
vfs_fallocate+0x15b/0x290
SyS_fallocate+0x44/0x70
do_syscall_64+0x6e/0x160
entry_SYSCALL64_slow_path+0x25/0x25
This patch fixes to check extent cache status before using in
f2fs_drop_extent_tree.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit cd36d7a17f ]
Once CP_TRIMMED_FLAG is set, after a reboot, we will never issue discard
before LBA becomes invalid again, fix it by clearing the flag in
checkpoint without CP_TRIMMED reason.
Fixes: 1f43e2ad7b ("f2fs: introduce CP_TRIMMED_FLAG to avoid unneeded discard")
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 17cd07ae95 ]
As Jayashree Mohan reported:
A simple workload to reproduce this would be :
1. create foo
2. Write (8K - 16K) // foo size = 16K now
3. fsync()
4. falloc zero_range , keep_size (4202496 - 4210688) // foo size must be 16K
5. fdatasync()
Crash now
On recovery, we see that the file size is 4210688 and not 16K, which
violates the semantics of keep_size flag. We have a test case to
reproduce this using CrashMonkey on 4.15 kernel. Try this out by
simply running :
./c_harness -f /dev/sda -d /dev/cow_ram0 -t f2fs -e 102400 -P -v
tests/generic_468_zero.so
The root cause is that we miss to set KEEP_SIZE bit correctly in zero_range
when zeroing block cross EOF with FALLOC_FL_KEEP_SIZE, let's fix this
missing case.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 94322ed8e8 ]
PSL9D doesn't have a data-cache that needs to be flushed before
resetting the card. However when cxl tries to flush data-cache on such
a card, it times-out as PSL_Control register never indicates flush
operation complete due to missing data-cache. This is usually
indicated in the kernel logs with this message:
"WARNING: cache flush timed out"
To fix this the patch checks PSL_Debug register CDC-Field(BIT:27)
which indicates the absence of a data-cache and sets a flag
'no_data_cache' in 'struct cxl_native' to indicate this. When
cxl_data_cache_flush() is called it checks the flag and if set bails
out early without requesting a data-cache flush operation to the PSL.
Signed-off-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com>
Acked-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Acked-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 46706d5917 ]
Previously, we attempt to flush the whole cp pack in a single bio,
however, when suddenly powering off at this time, we could get into
an extreme scenario that cp pack 1 page and cp pack 2 page are updated
and latest, but payload or current summaries are still partially
outdated. (see reliable write in the UFS specification)
This patch submits the whole cp pack except cp pack 2 page at first,
and then writes the cp pack 2 page with an extra independent
bio with pre-io barrier.
Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 2b74e2a9b3 ]
When sending TLB invalidates to the NPU we need to send extra flushes due
to a hardware issue. The original implementation would lock the all the
ATSD MMIO registers sequentially before unlocking and relocking each of
them sequentially to do the extra flush.
This introduced a deadlock as it is possible for one thread to hold one
ATSD register whilst waiting for another register to be freed while the
other thread is holding that register waiting for the one in the first
thread to be freed.
For example if there are two threads and two ATSD registers:
Thread A Thread B
----------------------
Acquire 1
Acquire 2
Release 1 Acquire 1
Wait 1 Wait 2
Both threads will be stuck waiting to acquire a register resulting in an
RCU stall warning or soft lockup.
This patch solves the deadlock by refactoring the code to ensure registers
are not released between flushes and to ensure all registers are either
acquired or released together and in order.
Fixes: bbd5ff50af ("powerpc/powernv/npu-dma: Add explicit flush when sending an ATSD")
Signed-off-by: Alistair Popple <alistair@popple.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit f5246862f8 ]
In commit 4f8b50bbbe ("irq_work, ppc: Fix up arch hooks") a new
function arch_irq_work_raise() was added without a prototype in header
irq_work.h.
Fix the following warning (treated as error in W=1):
arch/powerpc/kernel/time.c:523:6: error: no previous prototype for ‘arch_irq_work_raise’
Signed-off-by: Mathieu Malaterre <malat@debian.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit d2fc8db691 ]
Assert RESET_SYSTEM bit for any reset and set MODE field from reset
type.
The watchdog control register has a RESET_SYSTEM bit that is really
closer to activate a reset, and RESET_SYSTEM_MODE field that chooses
how much to reset.
Before this patch, a node without these optional property would do a
SOC reset, but a node with properties requesting a cpu or SOC reset
would do nothing and a node requesting a system reset would do a
SOC reset.
Fixes: b7f0b8ad25 ("drivers/watchdog: ASPEED reference dev tree properties for config")
Signed-off-by: Milton Miller <miltonm@us.ibm.com>
Signed-off-by: Eddie James <eajames@linux.vnet.ibm.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit a81abbb412 ]
RK3399 has rst_pulse_length in CONTROL_REG[4:2], determining the length
of pulse to issue for system reset. We shouldn't clobber this value,
because that might make the system reset ineffective. On RK3399, we're
seeing that a value of 000b (meaning 2 cycles) yields an unreliable
(partial?) reset, and so we only fully reset after the watchdog fires a
second time. If we retain the system default (010b, or 8 clock cycles),
then the watchdog reset is much more reliable.
Read-modify-write retains the system value and improves reset
reliability.
It seems we were intentionally clobbering the response mode previously,
to ensure we performed a system reset (we don't support an interrupt
notification), so retain that explicitly.
Signed-off-by: Brian Norris <briannorris@chromium.org>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 5775b843a6 ]
We leave PCI devices not bound to a driver in D0 during runtime suspend.
But they may have a parent which is bound and can be transitioned to
D3cold at runtime. Once the parent goes to D3cold, the unbound child
may go to D3cold as well. When the child goes to D3cold, its internal
state, including configuration of BARs, MSI, ASPM, MPS, etc., is lost.
One example are recent hybrid graphics laptops which cut power to the
discrete GPU when the root port above it goes to ACPI power state D3.
Users may provoke this by unbinding the GPU driver and allowing runtime
PM on the GPU via sysfs: The PM core will then treat the GPU as
"suspended", which in turn allows the root port to runtime suspend,
causing the power resources listed in its _PR3 object to be powered off.
The GPU's BARs will be uninitialized when a driver later probes it.
Another example are hybrid graphics laptops where the GPU itself (rather
than the root port) is capable of runtime suspending to D3cold. If the
GPU's integrated HDA controller is not bound and the GPU's driver
decides to runtime suspend to D3cold, the HDA controller's BARs will be
uninitialized when a driver later probes it.
Fix by saving and restoring config space over a runtime suspend cycle
even if the device is not bound.
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Peter Wu <peter@lekensteyn.nl> # Nvidia Optimus
Tested-by: Lukas Wunner <lukas@wunner.de> # MacBook Pro
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
[lukas: add commit message, bikeshed code comments for clarity]
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Link: https://patchwork.freedesktop.org/patch/msgid/92fb6e6ae2730915eb733c08e2f76c6a313e3860.1520068884.git.lukas@wunner.de
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit e676d81c89 ]
The case in which we handle a reset from the state where the device is
closed seems to be bugged for all types of reset. For most types of reset
we currently exit the reset routine correctly, but don't set the state to
indicate that we are back in the "closed" state. For some specific cases,
we don't exit the reset routine at all and resetting will cause a closed
device to be opened.
This patch fixes the problem by unconditionally checking the reset_state
and correctly setting the adapter state before returning.
Signed-off-by: John Allen <jallen@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 831c326fcd ]
Commit ad67b74d24 ("printk: hash addresses printed with %p") lets
printk specifier %p to hash all addresses before printing, this was
resulting in the high 32 bits of pcsr can only output zeros. So
module cannot completely print pc value and it's pointless for debugging
purpose.
This patch fixes this by using %px to print pcsr instead.
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 563c4ba3bd ]
ah_attr contains the port number to which cm_id is bound. However, while
searching for GID table for matching GID entry, the port number is
ignored.
This could cause the wrong GID to be used when the ah_attr is converted to
an AH.
Reviewed-by: Daniel Jurgens <danielj@mellanox.com>
Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit fca32340a5 ]
Executing command 'perf stat -T -- ls' dumps core on x86 and s390.
Here is the call back chain (done on x86):
# gdb ./perf
....
(gdb) r stat -T -- ls
...
Program received signal SIGSEGV, Segmentation fault.
0x00007ffff56d1963 in vasprintf () from /lib64/libc.so.6
(gdb) where
#0 0x00007ffff56d1963 in vasprintf () from /lib64/libc.so.6
#1 0x00007ffff56ae484 in asprintf () from /lib64/libc.so.6
#2 0x00000000004f1982 in __parse_events_add_pmu (parse_state=0x7fffffffd580,
list=0xbfb970, name=0xbf3ef0 "cpu",
head_config=0xbfb930, auto_merge_stats=false) at util/parse-events.c:1233
#3 0x00000000004f1c8e in parse_events_add_pmu (parse_state=0x7fffffffd580,
list=0xbfb970, name=0xbf3ef0 "cpu",
head_config=0xbfb930) at util/parse-events.c:1288
#4 0x0000000000537ce3 in parse_events_parse (_parse_state=0x7fffffffd580,
scanner=0xbf4210) at util/parse-events.y:234
#5 0x00000000004f2c7a in parse_events__scanner (str=0x6b66c0
"task-clock,{instructions,cycles,cpu/cycles-t/,cpu/tx-start/}",
parse_state=0x7fffffffd580, start_token=258) at util/parse-events.c:1673
#6 0x00000000004f2e23 in parse_events (evlist=0xbe9990, str=0x6b66c0
"task-clock,{instructions,cycles,cpu/cycles-t/,cpu/tx-start/}", err=0x0)
at util/parse-events.c:1713
#7 0x000000000044e137 in add_default_attributes () at builtin-stat.c:2281
#8 0x000000000044f7b5 in cmd_stat (argc=1, argv=0x7fffffffe3b0) at
builtin-stat.c:2828
#9 0x00000000004c8b0f in run_builtin (p=0xab01a0 <commands+288>, argc=4,
argv=0x7fffffffe3b0) at perf.c:297
#10 0x00000000004c8d7c in handle_internal_command (argc=4,
argv=0x7fffffffe3b0) at perf.c:349
#11 0x00000000004c8ece in run_argv (argcp=0x7fffffffe20c,
argv=0x7fffffffe200) at perf.c:393
#12 0x00000000004c929c in main (argc=4, argv=0x7fffffffe3b0) at perf.c:537
(gdb)
It turns out that a NULL pointer is referenced. Here are the
function calls:
...
cmd_stat()
+---> add_default_attributes()
+---> parse_events(evsel_list, transaction_attrs, NULL);
3rd parameter set to NULL
Function parse_events(xx, xx, struct parse_events_error *err) dives
into a bison generated scanner and creates
parser state information for it first:
struct parse_events_state parse_state = {
.list = LIST_HEAD_INIT(parse_state.list),
.idx = evlist->nr_entries,
.error = err, <--- NULL POINTER !!!
.evlist = evlist,
};
Now various functions inside the bison scanner are called to end up in
__parse_events_add_pmu(struct parse_events_state *parse_state, ..) with
first parameter being a pointer to above structure definition.
Now the PMU event name is not found (because being executed in a VM) and
this function tries to create an error message with
asprintf(&parse_state->error.str, ....)
which references a NULL pointer and dumps core.
Fix this by providing a pointer to the necessary error information
instead of NULL. Technically only the else part is needed to avoid the
core dump, just lets be safe...
Signed-off-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Link: http://lkml.kernel.org/r/20180308145735.64717-1-tmricht@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 0bcc3fb95b ]
Devices which use level-triggered interrupts under Windows 2016 with
Hyper-V role enabled don't work: Windows disables EOI broadcast in SPIV
unconditionally. Our in-kernel IOAPIC implementation emulates an old IOAPIC
version which has no EOI register so EOI never happens.
The issue was discovered and discussed a while ago:
https://www.spinics.net/lists/kvm/msg148098.html
While this is a guest OS bug (it should check that IOAPIC has the required
capabilities before disabling EOI broadcast) we can workaround it in KVM:
advertising DIRECTED_EOI with in-kernel IOAPIC makes little sense anyway.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 31184d8c6e ]
The errata FE-8471889 description has been updated. There is still a
timing violation for repeated start. But the errata now states that it
was only the case for the Standard mode (100 kHz), in Fast mode (400 kHz)
there is no issue.
This patch limit the errata fix to the Standard mode.
It has been tesed successfully on the clearfog (Aramda 388 based board).
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit d7cb44496a ]
Setting sge_uld_rxq_info to NULL in free_queues_uld().
We are referencing sge_uld_rxq_info in cxgb_up(). This
will fix a panic when interface is brought up after a
ULDq creation failure.
Fixes: 94cdb8bb99 (cxgb4: Add support for dynamic allocation
of resources for ULD)
Signed-off-by: Arjun Vynipadath <arjun@chelsio.com>
Signed-off-by: Casey Leedom <leedom@chelsio.com>
Signed-off-by: Ganesh Goudhar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 3fd47bfe55 ]
struct delayed_work writeback_rate_update in struct cache_dev is a delayed
worker to call function update_writeback_rate() in period (the interval is
defined by dc->writeback_rate_update_seconds).
When a metadate I/O error happens on cache device, bcache error handling
routine bch_cache_set_error() will call bch_cache_set_unregister() to
retire whole cache set. On the unregister code path, this delayed work is
stopped by calling cancel_delayed_work_sync(&dc->writeback_rate_update).
dc->writeback_rate_update is a special delayed work from others in bcache.
In its routine update_writeback_rate(), this delayed work is re-armed
itself. That means when cancel_delayed_work_sync() returns, this delayed
work can still be executed after several seconds defined by
dc->writeback_rate_update_seconds.
The problem is, after cancel_delayed_work_sync() returns, the cache set
unregister code path will continue and release memory of struct cache set.
Then the delayed work is scheduled to run, __update_writeback_rate()
will reference the already released cache_set memory, and trigger a NULL
pointer deference fault.
This patch introduces two more bcache device flags,
- BCACHE_DEV_WB_RUNNING
bit set: bcache device is in writeback mode and running, it is OK for
dc->writeback_rate_update to re-arm itself.
bit clear:bcache device is trying to stop dc->writeback_rate_update,
this delayed work should not re-arm itself and quit.
- BCACHE_DEV_RATE_DW_RUNNING
bit set: routine update_writeback_rate() is executing.
bit clear: routine update_writeback_rate() quits.
This patch also adds a function cancel_writeback_rate_update_dwork() to
wait for dc->writeback_rate_update quits before cancel it by calling
cancel_delayed_work_sync(). In order to avoid a deadlock by unexpected
quit dc->writeback_rate_update, after time_out seconds this function will
give up and continue to call cancel_delayed_work_sync().
And here I explain how this patch stops self re-armed delayed work properly
with the above stuffs.
update_writeback_rate() sets BCACHE_DEV_RATE_DW_RUNNING at its beginning
and clears BCACHE_DEV_RATE_DW_RUNNING at its end. Before calling
cancel_writeback_rate_update_dwork() clear flag BCACHE_DEV_WB_RUNNING.
Before calling cancel_delayed_work_sync() wait utill flag
BCACHE_DEV_RATE_DW_RUNNING is clear. So when calling
cancel_delayed_work_sync(), dc->writeback_rate_update must be already re-
armed, or quite by seeing BCACHE_DEV_WB_RUNNING cleared. In both cases
delayed work routine update_writeback_rate() won't be executed after
cancel_delayed_work_sync() returns.
Inside update_writeback_rate() before calling schedule_delayed_work(), flag
BCACHE_DEV_WB_RUNNING is checked before. If this flag is cleared, it means
someone is about to stop the delayed work. Because flag
BCACHE_DEV_RATE_DW_RUNNING is set already and cancel_delayed_work_sync()
has to wait for this flag to be cleared, we don't need to worry about race
condition here.
If update_writeback_rate() is scheduled to run after checking
BCACHE_DEV_RATE_DW_RUNNING and before calling cancel_delayed_work_sync()
in cancel_writeback_rate_update_dwork(), it is also safe. Because at this
moment BCACHE_DEV_WB_RUNNING is cleared with memory barrier. As I mentioned
previously, update_writeback_rate() will see BCACHE_DEV_WB_RUNNING is clear
and quit immediately.
Because there are more dependences inside update_writeback_rate() to struct
cache_set memory, dc->writeback_rate_update is not a simple self re-arm
delayed work. After trying many different methods (e.g. hold dc->count, or
use locks), this is the only way I can find which works to properly stop
dc->writeback_rate_update delayed work.
Changelog:
v3: change values of BCACHE_DEV_WB_RUNNING and BCACHE_DEV_RATE_DW_RUNNING
to bit index, for test_bit().
v2: Try to fix the race issue which is pointed out by Junhui.
v1: The initial version for review
Signed-off-by: Coly Li <colyli@suse.de>
Reviewed-by: Junhui Tang <tang.junhui@zte.com.cn>
Reviewed-by: Michael Lyle <mlyle@lyle.org>
Cc: Michael Lyle <mlyle@lyle.org>
Cc: Hannes Reinecke <hare@suse.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 804f3c6981 ]
When bcache metadata I/O fails, bcache will call bch_cache_set_error()
to retire the whole cache set. The expected behavior to retire a cache
set is to unregister the cache set, and unregister all backing device
attached to this cache set, then remove sysfs entries of the cache set
and all attached backing devices, finally release memory of structs
cache_set, cache, cached_dev and bcache_device.
In my testing when journal I/O failure triggered by disconnected cache
device, sometimes the cache set cannot be retired, and its sysfs
entry /sys/fs/bcache/<uuid> still exits and the backing device also
references it. This is not expected behavior.
When metadata I/O failes, the call senquence to retire whole cache set is,
bch_cache_set_error()
bch_cache_set_unregister()
bch_cache_set_stop()
__cache_set_unregister() <- called as callback by calling
clousre_queue(&c->caching)
cache_set_flush() <- called as a callback when refcount
of cache_set->caching is 0
cache_set_free() <- called as a callback when refcount
of catch_set->cl is 0
bch_cache_set_release() <- called as a callback when refcount
of catch_set->kobj is 0
I find if kernel thread bch_writeback_thread() quits while-loop when
kthread_should_stop() is true and searched_full_index is false, clousre
callback cache_set_flush() set by continue_at() will never be called. The
result is, bcache fails to retire whole cache set.
cache_set_flush() will be called when refcount of closure c->caching is 0,
and in function bcache_device_detach() refcount of closure c->caching is
released to 0 by clousre_put(). In metadata error code path, function
bcache_device_detach() is called by cached_dev_detach_finish(). This is a
callback routine being called when cached_dev->count is 0. This refcount
is decreased by cached_dev_put().
The above dependence indicates, cache_set_flush() will be called when
refcount of cache_set->cl is 0, and refcount of cache_set->cl to be 0
when refcount of cache_dev->count is 0.
The reason why sometimes cache_dev->count is not 0 (when metadata I/O fails
and bch_cache_set_error() called) is, in bch_writeback_thread(), refcount
of cache_dev is not decreased properly.
In bch_writeback_thread(), cached_dev_put() is called only when
searched_full_index is true and cached_dev->writeback_keys is empty, a.k.a
there is no dirty data on cache. In most of run time it is correct, but
when bch_writeback_thread() quits the while-loop while cache is still
dirty, current code forget to call cached_dev_put() before this kernel
thread exits. This is why sometimes cache_set_flush() is not executed and
cache set fails to be retired.
The reason to call cached_dev_put() in bch_writeback_rate() is, when the
cache device changes from clean to dirty, cached_dev_get() is called, to
make sure during writeback operatiions both backing and cache devices
won't be released.
Adding following code in bch_writeback_thread() does not work,
static int bch_writeback_thread(void *arg)
}
+ if (atomic_read(&dc->has_dirty))
+ cached_dev_put()
+
return 0;
}
because writeback kernel thread can be waken up and start via sysfs entry:
echo 1 > /sys/block/bcache<N>/bcache/writeback_running
It is difficult to check whether backing device is dirty without race and
extra lock. So the above modification will introduce potential refcount
underflow in some conditions.
The correct fix is, to take cached dev refcount when creating the kernel
thread, and put it before the kernel thread exits. Then bcache does not
need to take a cached dev refcount when cache turns from clean to dirty,
or to put a cached dev refcount when cache turns from ditry to clean. The
writeback kernel thread is alwasy safe to reference data structure from
cache set, cache and cached device (because a refcount of cache device is
taken for it already), and no matter the kernel thread is stopped by I/O
errors or system reboot, cached_dev->count can always be used correctly.
The patch is simple, but understanding how it works is quite complicated.
Changelog:
v2: set dc->writeback_thread to NULL in this patch, as suggested by Hannes.
v1: initial version for review.
Signed-off-by: Coly Li <colyli@suse.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Michael Lyle <mlyle@lyle.org>
Cc: Michael Lyle <mlyle@lyle.org>
Cc: Junhui Tang <tang.junhui@zte.com.cn>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 2e08e4d2ff ]
The Allwinner H6 main CCU uses the internal oscillator of the SoC, which
is different with old SoCs' main CCU.
Add device tree binding for the Allwinner H6 main CCU.
Signed-off-by: Icenowy Zheng <icenowy@aosc.io>
Signed-off-by: Maxime Ripard <maxime.ripard@bootlin.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit fadd94e05c ]
In patch "bcache: fix cached_dev->count usage for bch_cache_set_error()",
cached_dev_get() is called when creating dc->writeback_thread, and
cached_dev_put() is called when exiting dc->writeback_thread. This
modification works well unless people detach the bcache device manually by
'echo 1 > /sys/block/bcache<N>/bcache/detach'
Because this sysfs interface only calls bch_cached_dev_detach() which wakes
up dc->writeback_thread but does not stop it. The reason is, before patch
"bcache: fix cached_dev->count usage for bch_cache_set_error()", inside
bch_writeback_thread(), if cache is not dirty after writeback,
cached_dev_put() will be called here. And in cached_dev_make_request() when
a new write request makes cache from clean to dirty, cached_dev_get() will
be called there. Since we don't operate dc->count in these locations,
refcount d->count cannot be dropped after cache becomes clean, and
cached_dev_detach_finish() won't be called to detach bcache device.
This patch fixes the issue by checking whether BCACHE_DEV_DETACHING is
set inside bch_writeback_thread(). If this bit is set and cache is clean
(no existing writeback_keys), break the while-loop, call cached_dev_put()
and quit the writeback thread.
Please note if cache is still dirty, even BCACHE_DEV_DETACHING is set the
writeback thread should continue to perform writeback, this is the original
design of manually detach.
It is safe to do the following check without locking, let me explain why,
+ if (!test_bit(BCACHE_DEV_DETACHING, &dc->disk.flags) &&
+ (!atomic_read(&dc->has_dirty) || !dc->writeback_running)) {
If the kenrel thread does not sleep and continue to run due to conditions
are not updated in time on the running CPU core, it just consumes more CPU
cycles and has no hurt. This should-sleep-but-run is safe here. We just
focus on the should-run-but-sleep condition, which means the writeback
thread goes to sleep in mistake while it should continue to run.
1, First of all, no matter the writeback thread is hung or not,
kthread_stop() from cached_dev_detach_finish() will wake up it and
terminate by making kthread_should_stop() return true. And in normal
run time, bit on index BCACHE_DEV_DETACHING is always cleared, the
condition
!test_bit(BCACHE_DEV_DETACHING, &dc->disk.flags)
is always true and can be ignored as constant value.
2, If one of the following conditions is true, the writeback thread should
go to sleep,
"!atomic_read(&dc->has_dirty)" or "!dc->writeback_running)"
each of them independently controls the writeback thread should sleep or
not, let's analyse them one by one.
2.1 condition "!atomic_read(&dc->has_dirty)"
If dc->has_dirty is set from 0 to 1 on another CPU core, bcache will
call bch_writeback_queue() immediately or call bch_writeback_add() which
indirectly calls bch_writeback_queue() too. In bch_writeback_queue(),
wake_up_process(dc->writeback_thread) is called. It sets writeback
thread's task state to TASK_RUNNING and following an implicit memory
barrier, then tries to wake up the writeback thread.
In writeback thread, its task state is set to TASK_INTERRUPTIBLE before
doing the condition check. If other CPU core sets the TASK_RUNNING state
after writeback thread setting TASK_INTERRUPTIBLE, the writeback thread
will be scheduled to run very soon because its state is not
TASK_INTERRUPTIBLE. If other CPU core sets the TASK_RUNNING state before
writeback thread setting TASK_INTERRUPTIBLE, the implict memory barrier
of wake_up_process() will make sure modification of dc->has_dirty on
other CPU core is updated and observed on the CPU core of writeback
thread. Therefore the condition check will correctly be false, and
continue writeback code without sleeping.
2.2 condition "!dc->writeback_running)"
dc->writeback_running can be changed via sysfs file, every time it is
modified, a following bch_writeback_queue() is alwasy called. So the
change is always observed on the CPU core of writeback thread. If
dc->writeback_running is changed from 0 to 1 on other CPU core, this
condition check will observe the modification and allow writeback
thread to continue to run without sleeping.
Now we can see, even without a locking protection, multiple conditions
check is safe here, no deadlock or process hang up will happen.
I compose a separte patch because that patch "bcache: fix cached_dev->count
usage for bch_cache_set_error()" already gets a "Reviewed-by:" from Hannes
Reinecke. Also this fix is not trivial and good for a separate patch.
Signed-off-by: Coly Li <colyli@suse.de>
Reviewed-by: Michael Lyle <mlyle@lyle.org>
Cc: Hannes Reinecke <hare@suse.com>
Cc: Huijun Tang <tang.junhui@zte.com.cn>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 55496d3fe2 ]
The generic DMA API uses dev->dma_mask to check the DMA addressable
memory bitmask, and warns if no mask is set or even allocated.
Set z->dev.dma_coherent_mask on Zorro bus scan, and make z->dev.dma_mask
to point to z->dev.dma_coherent_mask so device drivers that need DMA have
everything set up to avoid warnings from dma_alloc_coherent(). Drivers can
still use dma_set_mask_and_coherent() to explicitly set their DMA bit mask.
Signed-off-by: Michael Schmitz <schmitzmic@gmail.com>
[geert: Handle Zorro II with 24-bit address space]
Acked-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 7672ed33c4 ]
Before commit f1b65df5a2 ("IB/mlx5: Add support for active_width and
active_speed in RoCE"), the mlx5_ib driver set the default active_width
and active_speed to IB_WIDTH_4X and IB_SPEED_QDR.
When the RoCE port is down, the RoCE port does not negotiate the active
width with the remote side, causing the active width to be zero. When
running userspace ibstat to view the port status, ibstat will panic as it
reads an invalid width from sys file.
This patch restores the original behavior.
Fixes: f1b65df5a2 ("IB/mlx5: Add support for active_width and active_speed in RoCE").
Signed-off-by: Honggang Li <honli@redhat.com>
Reviewed-by: Hal Rosenstock <hal@mellanox.com>
Reviewed-by: Noa Osherovich <noaos@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit d15d731155 ]
Currently fw_add_devm_name() returns 1 if the firmware cache
was already set. This makes it complicated for us to check for
correctness. It is actually non-fatal if the firmware cache
is already setup, so just return 0, and simplify the checkers.
fw_add_devm_name() adds device's name onto the devres for the
device so that prior to suspend we cache the firmware onto memory,
so that on resume the firmware is reliably available. We never
were checking for success for this call though, meaning in some
really rare cases we my have never setup the firmware cache for
a device, which could in turn make resume fail.
This is all theoretical, no known issues have been reported.
This small issue has been present way since the addition of the
devres firmware cache names on v3.7.
Fixes: f531f05ae9 ("firmware loader: store firmware name into devres list")
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 70ca608b2e ]
In MediaTek's IOMMU design, When a iommu translation fault occurs
(HW can NOT translate the destination address to a valid physical
address), the IOMMU HW output the dirty data into a special memory
to avoid corrupting the main memory, this is called "protect memory".
the register(0x114) for protect memory is a little different between
mt8173 and mt2712.
In the mt8173, bit[30:6] in the register represents [31:7] of the
physical address. In the 4GB mode, the register bit[31] should be 1.
While in the mt2712, the bits don't shift. bit[31:7] in the register
represents [31:7] in the physical address, and bit[1:0] in the
register represents bit[33:32] of the physical address if it has.
Fixes: e6dec92308 ("iommu/mediatek: Add mt2712 IOMMU support")
Reported-by: Honghui Zhang <honghui.zhang@mediatek.com>
Signed-off-by: Yong Wu <yong.wu@mediatek.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 20fb5a635a ]
We were relying on the pinned screen object backup buffer to be destroyed
when not used. But if we hold a copy of the atomic state, like when
hibernating, the backup buffer might not be destroyed since it's
refcounted by the atomic state. This causes us to hibernate with a
buffer pinned in VRAM.
Fix this by only having the buffer pinned when it is actually used by a
screen object.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 0d9366d67b ]
If mount is auto-probing for filesystem type, it will try various
filesystems in order, with the MS_SILENT flag set. We get
that flag as the silent arg to ext4_fill_super.
If we're probing (silent==1) then don't complain about feature
incompatibilities that are found if it looks like it's actually
a different valid extN type - failed probes should be silent
in this case.
If the on-disk features are unknown even to ext4, then complain.
Reported-by: Joakim Tjernlund <Joakim.Tjernlund@infinera.com>
Tested-by: Joakim Tjernlund <Joakim.Tjernlund@infinera.com>
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit bda7fab548 ]
The operstate update logic will leave an interface in the
default UNKNOWN operstate if the interface carrier state never changes
from the default carrier up state set at creation. This includes the
case of an explicit call to netif_carrier_on, as the carrier on to on
transition has no effect on operstate.
This affects virtio-net for the case that the virtio peer does
not support VIRTIO_NET_F_STATUS (the feature that provides carrier state
updates). Without this feature, the virtio specification states that
"the link should be assumed active," so, logically, the operstate should
be UP instead of UNKNOWN. This has impact on user space applications
that use the operstate to make availability decisions for the interface.
Resolve this by changing the virtio probe logic slightly to call
netif_carrier_off for both the "with" and "without" VIRTIO_NET_F_STATUS
cases, and then the existing call to netif_carrier_on for the "without"
case will cause an operstate transition.
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Jay Vosburgh <jay.vosburgh@canonical.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit bb491ce67a ]
When punching a hole or truncating an inode down to a given size, also
check if the truncate point / start of the hole is within the range we
have metadata for. Otherwise, we can end up freeing blocks that
shouldn't be freed, corrupting the inode, or crashing the machine when
trying to punch a hole into the void.
When growing an inode via truncate, we set the new size but we don't
allocate additional levels of indirect blocks and grow the inode height.
When shrinking that inode again, the new size may still point beyond the
end of the inode's metadata.
Fixes xfstest generic/476.
Debugged-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 6ffa340221 ]
Allow the device tree to specify a watchdog to fallover to
the alternate boot source.
The aspeeed watchdog can set a latch directing flash chip select 0 to
chip select 1, allowing boot from an alternate media if the watchdog
is not reset in time. On the ast2400 bank 1 also goes to flash bank 1,
while on the ast2500 the chip selects are swapped.
Also clear the secondary boot bit during the machine restart operation.
Otherwise, the system will switch to the alternate boot after every
reboot, which is not desired.
Signed-off-by: Milton Miller <miltonm@us.ibm.com>
Signed-off-by: Eddie James <eajames@linux.vnet.ibm.com>
Reviewed-by: Joel Stanley <joel@jms.id.au>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit fac37c628f ]
TPM_CRB driver provides TPM CRB 2.0 support. If it is built as a
module, the TPM chip is registered after IMA init. tpm_pcr_read() in
IMA fails and displays the following message even though eventually
there is a TPM chip on the system.
ima: No TPM chip found, activating TPM-bypass! (rc=-19)
Fix IMA Kconfig to select TPM_CRB so TPM_CRB driver is built in the kernel
and initializes before IMA.
Signed-off-by: Jiandi An <anjiandi@codeaurora.org>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 5c71dadbb4 ]
As defined in hyperv_net.h, the NVSP_STAT_SUCCESS is one not zero.
Some functions returns 0 when it actually means NVSP_STAT_SUCCESS.
This patch fixes them.
In netvsc_receive(), it puts the last RNDIS packet's receive status
for all packets in a vmxferpage which may contain multiple RNDIS
packets.
This patch puts NVSP_STAT_FAIL in the receive completion if one of
the packets in a vmxferpage fails.
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 843bd7db79 ]
When NetworkManager is enabled, there are chances that interface up
is called even before probe completes. This means we have not yet
allocated the FW sge queues, hence rest of ingress queue allocation
wont be proper. Fix this by calling setup_fw_sge_queues() before
register_netdev().
Fixes: 0fbc81b3ad ('chcr/cxgb4i/cxgbit/RDMA/cxgb4: Allocate resources dynamically for all cxgb4 ULD's')
Signed-off-by: Arjun Vynipadath <arjun@chelsio.com>
Signed-off-by: Casey Leedom <leedom@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit acf18c56fd ]
The replace target device can be missing when mounted with -o degraded,
but we wont allocate a missing btrfs_device to it. So check the device
before accessing.
BUG: unable to handle kernel NULL pointer dereference at 00000000000000b0
IP: btrfs_destroy_dev_replace_tgtdev+0x43/0xf0 [btrfs]
Call Trace:
btrfs_dev_replace_cancel+0x15f/0x180 [btrfs]
btrfs_ioctl+0x2216/0x2590 [btrfs]
do_vfs_ioctl+0x625/0x650
SyS_ioctl+0x4e/0x80
do_syscall_64+0x5d/0x160
entry_SYSCALL64_slow_path+0x25/0x25
This patch has been moved in front of patch "btrfs: log, when replace,
is canceled by the user" that could reproduce the crash if the system
reboots inside btrfs_dev_replace_start before the
btrfs_dev_replace_finishing call.
$ mkfs /dev/sda
$ mount /dev/sda mnt
$ btrfs replace start /dev/sda /dev/sdb
<insert reboot>
$ mount po degraded /dev/sdb mnt
<crash>
Signed-off-by: Anand Jain <anand.jain@oracle.com>
[ added reproducer description from mail ]
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 467c77d4cb ]
Yet another "incompatible" Samsung NVMe SSD 960 EVO and Asus motherboard
combination. 960 EVO device disappears from PCIe bus within few minutes
after boot-up when APST is in use and never gets back. Forcing
NVME_QUIRK_NO_APST is the only way to make this drive work with this
particular motherboard. NVME_QUIRK_NO_DEEPEST_PS doesn't work, upgrading
motherboard's BIOS didn't help either.
Since this is a desktop motherboard, the only drawback of not using APST
is increased device temperature.
Signed-off-by: Jarosław Janik <jaroslaw.janik@gmail.com>
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit b12740d316 ]
Another abort race: An io request is started, becomes active,
and is attempted to be started with the lldd. At the same time
the controller is stopped/torndown and an itterator is run to
abort the ios. As the io is active, it is added to the outstanding
aborted io count. However on the original io request thread, the
driver ends up rejecting the io due to the condition that induced
the controller teardown. The driver reject path didn't check whether
it was in the outstanding io count. This left the count outstanding
stopping controller teardown.
Correct by, in the driver reject case, setting the state to
inactive and checking whether it was in the outstanding io count.
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 8b2d93dd22 ]
When attempt to run worker (ath10k_sta_rc_update_wk) after the station object
(ieee80211_sta) delete will trigger the kernel panic.
This problem arise in AP + Mesh configuration, Where the current node AP VAP
and neighbor node mesh VAP MAC address are same. When the current mesh node
try to establish the mesh link with neighbor node, driver peer creation for
the neighbor mesh node fails due to duplication MAC address. Already the AP
VAP created with same MAC address.
It is caused by the following scenario steps.
Steps:
1. In above condition, ath10k driver sta_state callback (ath10k_sta_state)
fails to do the state change for a station from IEEE80211_STA_NOTEXIST
to IEEE80211_STA_NONE due to peer creation fails. Sta_state callback is
called from ieee80211_add_station() to handle the new station
(neighbor mesh node) request from the wpa_supplicant.
2. Concurrently ath10k receive the sta_rc_update callback notification from
the mesh_neighbour_update() to handle the beacon frames of the above
neighbor mesh node. since its atomic callback, ath10k driver queue the
work (ath10k_sta_rc_update_wk) to handle rc update.
3. Due to driver sta_state callback fails (step 1), mac80211 free the station
object.
4. When the worker (ath10k_sta_rc_update_wk) scheduled to run, it will access
the station object which is already deleted. so it will trigger kernel
panic.
Added the peer exist check in sta_rc_update callback before queue the work.
Kernel Panic log:
Unable to handle kernel NULL pointer dereference at virtual address 00000000
pgd = c0204000
[00000000] *pgd=00000000
Internal error: Oops: 17 [#1] PREEMPT SMP ARM
CPU: 1 PID: 1833 Comm: kworker/u4:2 Not tainted 3.14.77 #1
task: dcef0000 ti: d72b6000 task.ti: d72b6000
PC is at pwq_activate_delayed_work+0x10/0x40
LR is at pwq_activate_delayed_work+0xc/0x40
pc : [<c023f988>] lr : [<c023f984>] psr: 40000193
sp : d72b7f18 ip : 0000007a fp : d72b6000
r10: 00000000 r9 : dd404414 r8 : d8c31998
r7 : d72b6038 r6 : 00000004 r5 : d4907ec8 r4 : dcee1300
r3 : ffffffe0 r2 : 00000000 r1 : 00000001 r0 : 00000000
Flags: nZcv IRQs off FIQs on Mode SVC_32 ISA ARM Segment kernel
Control: 10c5787d Table: 595bc06a DAC: 00000015
...
Process kworker/u4:2 (pid: 1833, stack limit = 0xd72b6238)
Stack: (0xd72b7f18 to 0xd72b8000)
7f00: 00000001 dcee1300
7f20: 00000001 c02410dc d8c31980 dd404400 dd404400 c0242790 d8c31980 00000089
7f40: 00000000 d93e1340 00000000 d8c31980 c0242568 00000000 00000000 00000000
7f60: 00000000 c02474dc 00000000 00000000 000000f8 d8c31980 00000000 00000000
7f80: d72b7f80 d72b7f80 00000000 00000000 d72b7f90 d72b7f90 d72b7fac d93e1340
7fa0: c0247404 00000000 00000000 c0208d20 00000000 00000000 00000000 00000000
7fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
7fe0: 00000000 00000000 00000000 00000000 00000013 00000000 00000000 00000000
[<c023f988>] (pwq_activate_delayed_work) from [<c02410dc>] (pwq_dec_nr_in_flight+0x58/0xc4)
[<c02410dc>] (pwq_dec_nr_in_flight) from [<c0242790>] (worker_thread+0x228/0x360)
[<c0242790>] (worker_thread) from [<c02474dc>] (kthread+0xd8/0xec)
[<c02474dc>] (kthread) from [<c0208d20>] (ret_from_fork+0x14/0x34)
Code: e92d4038 e1a05000 ebffffbc[69210.619376] SMP: failed to stop secondary CPUs
Rebooting in 3 seconds..
Signed-off-by: Karthikeyan Periyasamy <periyasa@codeaurora.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 0c29ba1b43 ]
The call to rmnet_get_endpoint can potentially return NULL so check
for this to avoid any subsequent null pointer dereferences on a NULL
ep.
Detected by CoverityScan, CID#1465385 ("Dereference null return value")
Fixes: 23790ef120 ("net: qualcomm: rmnet: Allow to configure flags for existing devices")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit d66e53649c ]
clk_disable_unprepare() was added to one error path,
but there is another one. The patch makes sure clk is
disabled at the both of them.
Found by Linux Driver Verification project (linuxtesting.org).
Signed-off-by: Alexey Khoroshilov <khoroshilov@ispras.ru>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 957f6ba8ad ]
The system with CONFIG_UBSAN enabled on produces the following error
during driver initialization. The reason to it that max_reg_cmds can be
larger enough to cause to "1 << max_reg_cmds" overflow the unsigned long.
================================================================================
UBSAN: Undefined behaviour in drivers/net/ethernet/mellanox/mlx5/core/cmd.c:1805:42
signed integer overflow:
-2147483648 - 1 cannot be represented in type 'int'
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.15.0-rc2-00032-g06cda2358d9b-dirty #724
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.7.5-0-ge51488c-20140602_164612-nilsson.home.kraxel.org 04/01/2014
Call Trace:
dump_stack+0xe9/0x18f
? dma_virt_alloc+0x81/0x81
ubsan_epilogue+0xe/0x4e
handle_overflow+0x187/0x20c
mlx5_cmd_init+0x73a/0x12b0
mlx5_load_one+0x1c3d/0x1d30
init_one+0xd02/0xf10
pci_device_probe+0x26c/0x3b0
driver_probe_device+0x622/0xb40
__driver_attach+0x175/0x1b0
bus_for_each_dev+0xef/0x190
bus_add_driver+0x2db/0x490
driver_register+0x16b/0x1e0
__pci_register_driver+0x177/0x1b0
init+0x6d/0x92
do_one_initcall+0x15b/0x270
kernel_init_freeable+0x2d8/0x3d0
kernel_init+0x14/0x190
ret_from_fork+0x24/0x30
================================================================================
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit f0ee70a042 ]
When we suspend and resume, we need to clear and re-enable the interrupt
scheme. This was previously not done while holding the RTNL lock, which
could be problematic, because we are actually destroying and re-creating
queues.
Hold the RTNL lock for the entire sequence of preparing for reset, and
when resuming. This additionally protects the flags related to interrupt
scheme under RTNL lock so that their modification is properly threaded.
This is part of a larger effort to remove the need for cmpxchg64 in
i40e_set_priv_flags().
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 88893cf787 ]
Some tests cause the kernel to print things to the kernel log
buffer (ie. printk), in particular oops and warnings etc. However when
running all the tests in succession it's not always obvious which
test(s) caused the kernel to print something.
We can narrow it down by printing which test directory we're running
in to /dev/kmsg, if it's writable.
Example output:
[ 170.149149] kselftest: Running tests in powerpc
[ 305.300132] kworker/dying (71) used greatest stack depth: 7776 bytes
left
[ 808.915456] kselftest: Running tests in pstore
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 6c59f64b7e ]
Fixes a segfault occurring when e.g. <TAB> is pressed multiple times in the
ncurses tmon application. The segfault is caused by incrementing
cur_thermal_record in the main function without checking if it's value reached
NR_THERMAL_RECORD immediately. Since the boundary check only occurred in
update_thermal_data a race condition existed, which lead to an attempted read
beyond the last element of the trec array.
The fix was implemented by moving the cur_thermal_record incrementation to the
update_thermal_data function using a temporary variable on which the boundary
condition is checked before updating cur_thread_record, so that the variable is
never incremented beyond the trec array's boundary.
It seems the segfault does not occur on every machine: On a HP EliteBook G4 the
segfault happens, while it does not happen on a Thinkpad T540p.
Signed-off-by: Frank Asseg <frank.asseg@objecthunter.net>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit e1ebd0e5b9 ]
Current code in power_pmu_disable() does not clear the sampling
registers like Sampling Instruction Address Register (SIAR) and
Sampling Data Address Register (SDAR) after disabling the PMU. Since
these are userspace readable and could contain kernel addresses, add
code to explicitly clear the content of these registers.
Also add a "context synchronizing instruction" to enforce no further
updates to these registers as suggested by Power ISA v3.0B. From
section 9.4, on page 1108:
"If an mtspr instruction is executed that changes the value of a
Performance Monitor register other than SIAR, SDAR, and SIER, the
change is not guaranteed to have taken effect until after a
subsequent context synchronizing instruction has been executed (see
Chapter 11. "Synchronization Requirements for Context Alterations"
on page 1133)."
Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
[mpe: Massage change log and add ISA reference]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit bb19af8160 ]
The current Branch History Rolling Buffer (BHRB) code does not check
for any privilege levels before updating the data from BHRB. This
could leak kernel addresses to userspace even when profiling only with
userspace privileges. Add proper checks to prevent it.
Acked-by: Balbir Singh <bsingharora@gmail.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 6958b02743 ]
Fix a theoretical NULL pointer dereferencing in mt76x2_tx routine that
can occurs for injected frames in a monitor vif since vif pointer could
be NULL for that interfaces
Fixes: 2340523646 ("mt76: fix transmission of encrypted mgmt frames")
Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
Acked-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 415eb2a1aa ]
pwmX_mode is defined in the ABI as 0=DC mode, 1=pwm mode. The chip
register bit is set to 1 for DC mode. This got mixed up, and writing
1 into pwmX_mode resulted in DC mode enabled. Fix it up by using
the ABI definition throughout the driver for consistency.
Fixes: 77eb5b3703 ("hwmon: (nct6775) Add support for pwm, pwm_mode, ... ")
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit b845f66f78 ]
Carlo Pisani noticed that his C3600 workstation behaved unstable during heavy
I/O on the PCI bus with a VIA VT6421 IDE/SATA PCI card.
To avoid such instability, this patch switches the LBA PCI bus from Hard Fail
mode into Soft Fail mode. In this mode the bus will return -1UL for timed out
MMIO transactions, which is exactly how the x86 (and most other architectures)
PCI busses behave.
This patch is based on a proposal by Grant Grundler and Kyle McMartin 10
years ago:
https://www.spinics.net/lists/linux-parisc/msg01027.html
Cc: Carlo Pisani <carlojpisani@gmail.com>
Cc: Kyle McMartin <kyle@mcmartin.ca>
Reviewed-by: Grant Grundler <grantgrundler@gmail.com>
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit bfc647d52e ]
Driver callback for handling TX timeout should access some internal
resources (SQ, CQ) in order to decide if the tx timeout work should be
scheduled. These resources might be unavailable if channels are closed
in parallel (ifdown for example).
The state lock is the mechanism to protect from such races.
Move all TX timeout logic to be in the work under a state lock.
In addition, Move the work from the global WQ to mlx5e WQ to make sure
this work is flushed when device is detached..
Also, move the mlx5e_tx_timeout_work code to be next to the TX timeout
NDO for better code locality.
Fixes: 3947ca1859 ("net/mlx5e: Implement ndo_tx_timeout callback")
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 9a233bb802 ]
Sometimes iwl_mvm_disable_txq() may be called with mac80211_queue ==
IEEE80211_INVAL_HW_QUEUE, and this would cause us to use BIT(0xFF)
which is way too large for the u16 we used to store it in
hw_queue_to_mac820211. If this happens the following UBSAN warning
will be generated:
[ 167.185167] UBSAN: Undefined behaviour in drivers/net/wireless/intel/iwlwifi/mvm/utils.c:838:5
[ 167.185171] shift exponent 255 is too large for 64-bit type 'long unsigned int'
Fix that by checking that it is not IEEE80211_INVAL_HW_QUEUE and,
while at it, add a warning if the queue number is larger than
IEEE80211_MAX_QUEUES.
Fixes: 34e10860ae ("iwlwifi: mvm: remove references to queue_info in new TX path")
Reported-by: Paul Menzel <pmenzel+linux-wireless@molgen.mpg.de>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit f61e64310b ]
As of commit 205e1b7f51 ("dma-mapping: warn when there is no
coherent_dma_mask") the Freescale FEC driver is issuing the following
warning on driver initialization on ColdFire systems:
WARNING: CPU: 0 PID: 1 at ./include/linux/dma-mapping.h:516 0x40159e20
Modules linked in:
CPU: 0 PID: 1 Comm: swapper Not tainted 4.16.0-rc7-dirty #4
Stack from 41833dd8:
41833dd8 40259c53 40025534 40279e26 00000003 00000000 4004e514 41827000
400255de 40244e42 00000204 40159e20 00000009 00000000 00000000 4024531d
40159e20 40244e42 00000204 00000000 00000000 00000000 00000007 00000000
00000000 40279e26 4028d040 40226576 4003ae88 40279e26 418273f6 41833ef8
7fffffff 418273f2 41867028 4003c9a2 4180ac6c 00000004 41833f8c 4013e71c
40279e1c 40279e26 40226c16 4013ced2 40279e26 40279e58 4028d040 00000000
Call Trace:
[<40025534>] 0x40025534
[<4004e514>] 0x4004e514
[<400255de>] 0x400255de
[<40159e20>] 0x40159e20
[<40159e20>] 0x40159e20
It is not fatal, the driver and the system continue to function normally.
As per the warning the coherent_dma_mask is not set on this device.
There is nothing special about the DMA memory coherency on this hardware
so we can just set the mask to 32bits in the platform data for the FEC
ethernet devices.
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 9ad5770871 ]
Since commit 8edc514b01 ("intel_th: Make SOURCE devices children of the
root device") the hub is not the parent of SOURCE devices any more, so the
new helper function should be used for that instead of always using the
parent. The intel_th_set_output() path, however, still uses the old
logic, leading to the hub driver structure being aliased with something
else, like struct pci_driver or struct acpi_driver, and an incorrect call
to an address inferred from that, potentially resulting in a crash.
Fixes: 8edc514b01 ("intel_th: Make SOURCE devices children of the root device")
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 39ffe39545 ]
find_dev_data() does not check whether the return value alloc_dev_data()
is NULL. This was okay once because the pointer was returned once as-is.
Since commit df3f7a6e8e ("iommu/amd: Use is_attach_deferred
call-back") the pointer may be used within find_dev_data() so a NULL
check is required.
Cc: Baoquan He <bhe@redhat.com>
Fixes: df3f7a6e8e ("iommu/amd: Use is_attach_deferred call-back")
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 8ebee73b57 ]
This patch fixes regression caused by 0c317a02ca
("cfg80211: support virtual interfaces with different beacon intervals"),
with this change cfg80211 expects the driver to advertize
'beacon_int_min_gcd' to support different beacon intervals in multivap
scenario. This support is added for, QCA988X/QCA99X0/QCA9984/QCA4019.
Verifed AP + mesh bring up on QCA9984 with beacon interval 100msec and
1000msec respectively.
Frimware: firmware-5.bin_10.4-3.5.3-00053
Fixes: 0c317a02ca ("cfg80211: support virtual interfaces with different beacon intervals")
Signed-off-by: Anilkumar Kolli <akolli@codeaurora.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 86674a97f5 ]
In ca8210_test_int_user_write() a user can request the transfer of a
frame with a length field (command.length) that is longer than the
actual buffer provided (len). In this scenario the driver will copy
the buffer contents into the uninitialised command[] buffer, then
transfer <data.length> bytes over the SPI even though only <len> bytes
had been populated, potentially leaking sensitive kernel memory.
Also the first 6 bytes of the command buffer must be initialised in case
a malformed, short packet is written and the uninitialised bytes are
read in ca8210_test_check_upstream.
Reported-by: Domen Puncer Kugler <domen.puncer@samsung.com>
Signed-off-by: Harry Morris <h.morris@cascoda.com>
Tested-by: Harry Morris <h.morris@cascoda.com>
Signed-off-by: Stefan Schmidt <stefan@osg.samsung.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 0834d627fb ]
In mpic_physmask() we loop over all CPUs up to 32, then get the hard
SMP processor id of that CPU.
Currently that's possibly walking off the end of the paca array, but
in a future patch we will change the paca array to be an array of
pointers, and in that case we will get a NULL for missing CPUs and
oops. eg:
Unable to handle kernel paging request for data at address 0x88888888888888b8
Faulting instruction address: 0xc00000000004e380
Oops: Kernel access of bad area, sig: 11 [#1]
...
NIP .mpic_set_affinity+0x60/0x1a0
LR .irq_do_set_affinity+0x48/0x100
Fix it by checking the CPU is possible, this also fixes the code if
there are gaps in the CPU numbering which probably never happens on
mpic systems but who knows.
Debugged-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit e283655b5a ]
We should zero an array using sizeof instead of number of elements.
Fixes the following compiler (GCC 7.3.0) warnings:
drivers/macintosh/rack-meter.c: In function 'rackmeter_do_pause':
drivers/macintosh/rack-meter.c:157:2: warning: 'memset' used with length equal to number of elements without multiplication by element size [-Wmemset-elt-size]
drivers/macintosh/rack-meter.c:158:2: warning: 'memset' used with length equal to number of elements without multiplication by element size [-Wmemset-elt-size]
Fixes: 4f7bef7a9f ("drivers: macintosh: rack-meter: fix bogus memsets")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Aaro Koskinen <aaro.koskinen@iki.fi>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit c37a3c9477 ]
If acpi_id is == nr_acpi_bits, then we access one element beyond the end
of the acpi_psd[] array or we set one bit beyond the end of the bit map
when we do __set_bit(acpi_id, acpi_id_present);
Fixes: 59a5680291 ("xen/acpi-processor: C and P-state driver that uploads said data to hypervisor.")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 57b0c9d49b ]
If a call-level abort is received for the previous call to complete on a
connection channel, then that abort is queued for the connection processor
to handle. Unfortunately, the connection processor then assumes without
checking that the abort is connection-level (ie. callNumber is 0) and
distributes it over all active calls on that connection, thereby
incorrectly aborting them.
Fix this by discarding aborts aimed at a completed call.
Further, discard all packets aimed at a call that's complete if there's
currently an active call on a channel, since the DATA packets associated
with the new call automatically terminate the old call.
Fixes: 18bfeba50d ("rxrpc: Perform terminal call ACK/ABORT retransmission from conn processor")
Reported-by: Marc Dionne <marc.dionne@auristor.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 03877bf6a3 ]
rxrpc calls have a ring of packets that are awaiting ACK or retransmission
and a parallel ring of annotations that tracks the state of those packets.
If the initial transmission of a packet on the underlying UDP socket fails
then the packet annotation is marked for resend - but the setting of this
mark accidentally erases the last-packet mark also stored in the same
annotation slot. If this happens, a call won't switch out of the Tx phase
when all the packets have been transmitted.
Fix this by retaining the last-packet mark and only altering the packet
state.
Fixes: 248f219cb8 ("rxrpc: Rewrite the data and ack handling code")
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 59299aa102 ]
Commit a158bdd3 ("rxrpc: Fix call timeouts") reworked the time calculation
for the next resend event. For this calculation, "oldest" will be before
"now", so ktime_sub(oldest, now) will yield a negative value. When passed
to nsecs_to_jiffies which expects an unsigned value, the end result will be
a very large value, and a resend event scheduled far into the future. This
could cause calls to stall if some packets were lost.
Fix by ordering the arguments to ktime_sub correctly.
Fixes: a158bdd324 ("rxrpc: Fix call timeouts")
Signed-off-by: Marc Dionne <marc.dionne@auristor.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 4d31778aa2 ]
When multiple pending snapshots referring to the same source subvolume
are executed, enabled quota will cause root item corruption, where root
items are using old bytenr (no backref in extent tree).
This can be triggered by fstests btrfs/152.
The cause is when source subvolume is still dirty, extra commit
(simplied transaction commit) of qgroup_account_snapshot() can skip
dirty roots not recorded in current transaction, making root item of
source subvolume not updated.
Fix it by forcing recording source subvolume in current transaction
before qgroup sub-transaction commit.
Reported-by: Justin Maggard <jmaggard@netgear.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 8434ec46c6 ]
When logging an inode, at tree-log.c:copy_items(), if we call
btrfs_next_leaf() at the loop which checks for the need to log holes, we
need to make sure copy_items() returns the value 1 to its caller and
not 0 (on success). This is because the path the caller passed was
released and is now different from what is was before, and the caller
expects a return value of 0 to mean both success and that the path
has not changed, while a return value of 1 means both success and
signals the caller that it can not reuse the path, it has to perform
another tree search.
Even though this is a case that should not be triggered on normal
circumstances or very rare at least, its consequences can be very
unpredictable (especially when replaying a log tree).
Fixes: 16e7549f04 ("Btrfs: incompatible format change to remove hole extents")
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 3c0efdf03b ]
The extent tree of the test fs is like the following:
BTRFS info (device (null)): leaf 16327509003777336587 total ptrs 1 free space 3919
item 0 key (4096 168 4096) itemoff 3944 itemsize 51
extent refs 1 gen 1 flags 2
tree block key (68719476736 0 0) level 1
^^^^^^^
ref#0: tree block backref root 5
And it's using an empty tree for fs tree, so there is no way that its
level can be 1.
For REAL (created by mkfs) fs tree backref with no skinny metadata, the
result should look like:
item 3 key (30408704 EXTENT_ITEM 4096) itemoff 3845 itemsize 51
refs 1 gen 4 flags TREE_BLOCK
tree block key (256 INODE_ITEM 0) level 0
^^^^^^^
tree block backref root 5
Fix the level to 0, so it won't break later tree level checker.
Fixes: faa2dbf004 ("Btrfs: add sanity tests for new qgroup accounting code")
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit d40b6768e4 ]
system_reset_exception does most of its own crash handling now,
invoking the debugger or crash dumps if they are registered. If not,
then it goes through to die() to print stack traces, and then is
supposed to panic (according to comments).
However after die() prints oopses, it does its own handling which
doesn't allow system_reset_exception to panic (e.g., it may just
kill the current process). This patch causes sreset exceptions to
return from die after it prints messages but before acting.
This also stops die from invoking the debugger on 0x100 crashes.
system_reset_exception similarly calls the debugger. It had been
thought this was harmless (because if the debugger was disabled,
neither call would fire, and if it was enabled the first call
would return). However in some cases like xmon 'X' command, the
debugger returns 0, which currently causes it to be entered
again (first in system_reset_exception, then in die), which is
confusing.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 16a1c0646e ]
All the members: base, idm_base and nicpm_base should be annotated with
__iomem since they are pointers to register space. This fixes a bunch of
sparse reported warnings.
Fixes: f6a95a2495 ("net: ethernet: bgmac: Add platform device support")
Fixes: dd5c5d037f ("net: ethernet: bgmac: add NS2 support")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 60d6e6f0b9 ]
bgmac_dma_tx_ring_free() assigns the ctl1 word which is a litle endian
32-bit word without using proper accessors, fix this, and because a
length cannot be negative, use unsigned int while at it.
Fixes: 9cde94506e ("bgmac: implement scatter/gather support")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 0e5b09b165 ]
We're freeing "value_name" which is NULL, so that's a no-op, but we
intended to free "location_name" instead. And then we don't free the
names in token_location_attrs[0] and token_value_attrs[0].
Fixes: 33b9ca1e53 ("platform/x86: dell-smbios: Add a sysfs interface for SMBIOS tokens")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 0123f4d76c ]
Current implementations map locking operations using .rl and .aq
annotations. However, this mapping is unsound w.r.t. the kernel
memory consistency model (LKMM) [1]:
Referring to the "unlock-lock-read-ordering" test reported below,
Daniel wrote:
"I think an RCpc interpretation of .aq and .rl would in fact
allow the two normal loads in P1 to be reordered [...]
The intuition would be that the amoswap.w.aq can forward from
the amoswap.w.rl while that's still in the store buffer, and
then the lw x3,0(x4) can also perform while the amoswap.w.rl
is still in the store buffer, all before the l1 x1,0(x2)
executes. That's not forbidden unless the amoswaps are RCsc,
unless I'm missing something.
Likewise even if the unlock()/lock() is between two stores.
A control dependency might originate from the load part of
the amoswap.w.aq, but there still would have to be something
to ensure that this load part in fact performs after the store
part of the amoswap.w.rl performs globally, and that's not
automatic under RCpc."
Simulation of the RISC-V memory consistency model confirmed this
expectation.
In order to "synchronize" LKMM and RISC-V's implementation, this
commit strengthens the implementations of the locking operations
by replacing .rl and .aq with the use of ("lightweigth") fences,
resp., "fence rw, w" and "fence r , rw".
C unlock-lock-read-ordering
{}
/* s initially owned by P1 */
P0(int *x, int *y)
{
WRITE_ONCE(*x, 1);
smp_wmb();
WRITE_ONCE(*y, 1);
}
P1(int *x, int *y, spinlock_t *s)
{
int r0;
int r1;
r0 = READ_ONCE(*y);
spin_unlock(s);
spin_lock(s);
r1 = READ_ONCE(*x);
}
exists (1:r0=1 /\ 1:r1=0)
[1] https://marc.info/?l=linux-kernel&m=151930201102853&w=2https://groups.google.com/a/groups.riscv.org/forum/#!topic/isa-dev/hKywNHBkAXMhttps://marc.info/?l=linux-kernel&m=151633436614259&w=2
Signed-off-by: Andrea Parri <parri.andrea@gmail.com>
Cc: Palmer Dabbelt <palmer@sifive.com>
Cc: Albert Ou <albert@sifive.com>
Cc: Daniel Lustig <dlustig@nvidia.com>
Cc: Alan Stern <stern@rowland.harvard.edu>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Jade Alglave <j.alglave@ucl.ac.uk>
Cc: Luc Maranget <luc.maranget@inria.fr>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Akira Yokosawa <akiyks@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: linux-riscv@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit d13864b68e ]
This avoids a lot of -Wunused warnings such as:
====================
kernel/debug/debug_core.c: In function ‘kgdb_cpu_enter’:
./arch/sparc/include/asm/cmpxchg_64.h:55:22: warning: value computed is not used [-Wunused-value]
#define xchg(ptr,x) ((__typeof__(*(ptr)))__xchg((unsigned long)(x),(ptr),sizeof(*(ptr))))
./arch/sparc/include/asm/atomic_64.h:86:30: note: in expansion of macro ‘xchg’
#define atomic_xchg(v, new) (xchg(&((v)->counter), new))
^~~~
kernel/debug/debug_core.c:508:4: note: in expansion of macro ‘atomic_xchg’
atomic_xchg(&kgdb_active, cpu);
^~~~~~~~~~~
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 73dcc666d6 ]
If serial console wake-up is enabled ("echo enabled >
/sys/.../ttySC0/power/wakeup"), and any serial input is received while
the system is suspended, serial port input no longer works after system
resume.
Note that:
1) The system can still be woken up using the serial console,
2) Serial port input keeps working if the system is woken up in some
other way (e.g. Wake-on-LAN or gpio-keys), and no serial input was
received while suspended.
To fix this, replace SET_LATE_SYSTEM_SLEEP_PM_OPS() by
SET_NOIRQ_SYSTEM_SLEEP_PM_OPS(), as the callbacks installed by the
former happen too early resp. late in the suspend resp. resume process.
Reported-by: RVC test team via Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Fixes: 1131b0a4af ("dmaengine: rcar-dmac: Make DMAC reinit during system resume explicit")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 2c98425720 ]
If the fscache asynchronous write operation elects to discard a page that's
pending storage to the cache because the page would be over the store limit
then it needs to wake the page as someone may be waiting on completion of
the write.
The problem is that the store limit may be updated by a different
asynchronous operation - and so may miss the write - and that the store
limit may not even get updated until later by the netfs.
Fix the kernel hang by making fscache_write_op() mark as written any pages
that are over the limit.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 92571a1aae ]
When using wicked with a lan78xx device attached to the system, we
end up with ethtool commands issued on the device before an ifup
got issued. That lead to the following crash:
Unable to handle kernel NULL pointer dereference at virtual address 0000039c
pgd = ffff800035b30000
[0000039c] *pgd=0000000000000000
Internal error: Oops: 96000004 [#1] SMP
Modules linked in: [...]
Supported: Yes
CPU: 3 PID: 638 Comm: wickedd Tainted: G E 4.12.14-0-default #1
Hardware name: raspberrypi rpi/rpi, BIOS 2018.03-rc2 02/21/2018
task: ffff800035e74180 task.stack: ffff800036718000
PC is at phy_ethtool_ksettings_get+0x20/0x98
LR is at lan78xx_get_link_ksettings+0x44/0x60 [lan78xx]
pc : [<ffff0000086f7f30>] lr : [<ffff000000dcca84>] pstate: 20000005
sp : ffff80003671bb20
x29: ffff80003671bb20 x28: ffff800035e74180
x27: ffff000008912000 x26: 000000000000001d
x25: 0000000000000124 x24: ffff000008f74d00
x23: 0000004000114809 x22: 0000000000000000
x21: ffff80003671bbd0 x20: 0000000000000000
x19: ffff80003671bbd0 x18: 000000000000040d
x17: 0000000000000001 x16: 0000000000000000
x15: 0000000000000000 x14: ffffffffffffffff
x13: 0000000000000000 x12: 0000000000000020
x11: 0101010101010101 x10: fefefefefefefeff
x9 : 7f7f7f7f7f7f7f7f x8 : fefefeff31677364
x7 : 0000000080808080 x6 : ffff80003671bc9c
x5 : ffff80003671b9f8 x4 : ffff80002c296190
x3 : 0000000000000000 x2 : 0000000000000000
x1 : ffff80003671bbd0 x0 : ffff80003671bc00
Process wickedd (pid: 638, stack limit = 0xffff800036718000)
Call trace:
Exception stack(0xffff80003671b9e0 to 0xffff80003671bb20)
b9e0: ffff80003671bc00 ffff80003671bbd0 0000000000000000 0000000000000000
ba00: ffff80002c296190 ffff80003671b9f8 ffff80003671bc9c 0000000080808080
ba20: fefefeff31677364 7f7f7f7f7f7f7f7f fefefefefefefeff 0101010101010101
ba40: 0000000000000020 0000000000000000 ffffffffffffffff 0000000000000000
ba60: 0000000000000000 0000000000000001 000000000000040d ffff80003671bbd0
ba80: 0000000000000000 ffff80003671bbd0 0000000000000000 0000004000114809
baa0: ffff000008f74d00 0000000000000124 000000000000001d ffff000008912000
bac0: ffff800035e74180 ffff80003671bb20 ffff000000dcca84 ffff80003671bb20
bae0: ffff0000086f7f30 0000000020000005 ffff80002c296000 ffff800035223900
bb00: 0000ffffffffffff 0000000000000000 ffff80003671bb20 ffff0000086f7f30
[<ffff0000086f7f30>] phy_ethtool_ksettings_get+0x20/0x98
[<ffff000000dcca84>] lan78xx_get_link_ksettings+0x44/0x60 [lan78xx]
[<ffff0000087cbc40>] ethtool_get_settings+0x68/0x210
[<ffff0000087cc0d4>] dev_ethtool+0x214/0x2180
[<ffff0000087e5008>] dev_ioctl+0x400/0x630
[<ffff00000879dd00>] sock_do_ioctl+0x70/0x88
[<ffff00000879f5f8>] sock_ioctl+0x208/0x368
[<ffff0000082cde10>] do_vfs_ioctl+0xb0/0x848
[<ffff0000082ce634>] SyS_ioctl+0x8c/0xa8
Exception stack(0xffff80003671bec0 to 0xffff80003671c000)
bec0: 0000000000000009 0000000000008946 0000fffff4e841d0 0000aa0032687465
bee0: 0000aaaafa2319d4 0000fffff4e841d4 0000000032687465 0000000032687465
bf00: 000000000000001d 7f7fff7f7f7f7f7f 72606b622e71ff4c 7f7f7f7f7f7f7f7f
bf20: 0101010101010101 0000000000000020 ffffffffffffffff 0000ffff7f510c68
bf40: 0000ffff7f6a9d18 0000ffff7f44ce30 000000000000040d 0000ffff7f6f98f0
bf60: 0000fffff4e842c0 0000000000000001 0000aaaafa2c2e00 0000ffff7f6ab000
bf80: 0000fffff4e842c0 0000ffff7f62a000 0000aaaafa2b9f20 0000aaaafa2c2e00
bfa0: 0000fffff4e84818 0000fffff4e841a0 0000ffff7f5ad0cc 0000fffff4e841a0
bfc0: 0000ffff7f44ce3c 0000000080000000 0000000000000009 000000000000001d
bfe0: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
The culprit is quite simple: The driver tries to access the phy left and right,
but only actually has a working reference to it when the device is up.
The fix thus is quite simple too: Get a reference to the phy on probe already
and keep it even when the device is going down.
With this patch applied, I can successfully run wicked on my system and bring
the interface up and down as many times as I want, without getting NULL pointer
dereferences in between.
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit add5ff7a21 ]
Exit to userspace with KVM_INTERNAL_ERROR_EMULATION if we encounter
an exception in Protected Mode while emulating guest due to invalid
guest state. Unlike Big RM, KVM doesn't support emulating exceptions
in PM, i.e. PM exceptions are always injected via the VMCS. Because
we will never do VMRESUME due to emulation_required, the exception is
never realized and we'll keep emulating the faulting instruction over
and over until we receive a signal.
Exit to userspace iff there is a pending exception, i.e. don't exit
simply on a requested event. The purpose of this check and exit is to
aid in debugging a guest that is in all likelihood already doomed.
Invalid guest state in PM is extremely limited in normal operation,
e.g. it generally only occurs for a few instructions early in BIOS,
and any exception at this time is all but guaranteed to be fatal.
Non-vectored interrupts, e.g. INIT, SIPI and SMI, can be cleanly
handled/emulated, while checking for vectored interrupts, e.g. INTR
and NMI, without hitting false positives would add a fair amount of
complexity for almost no benefit (getting hit by lightning seems
more likely than encountering this specific scenario).
Add a WARN_ON_ONCE to vmx_queue_exception() if we try to inject an
exception via the VMCS and emulation_required is true.
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 162ee5a8ab ]
Linus reported the following boot warning:
WARNING: CPU: 0 PID: 0 at arch/x86/include/asm/tlbflush.h:134 load_new_mm_cr3+0x114/0x170
[...]
Call Trace:
switch_mm_irqs_off+0x267/0x590
switch_mm+0xe/0x20
efi_switch_mm+0x3e/0x50
efi_enter_virtual_mode+0x43f/0x4da
start_kernel+0x3bf/0x458
secondary_startup_64+0xa5/0xb0
... after merging:
03781e4089: x86/efi: Use efi_switch_mm() rather than manually twiddling with %cr3
When the platform supports PCID and if CONFIG_DEBUG_VM=y is enabled,
build_cr3_noflush() (called via switch_mm()) does a sanity check to see
if X86_FEATURE_PCID is set.
Presently, build_cr3_noflush() uses "this_cpu_has(X86_FEATURE_PCID)" to
perform the check but this_cpu_has() works only after SMP is initialized
(i.e. per cpu cpu_info's should be populated) and this happens to be very
late in the boot process (during rest_init()).
As efi_runtime_services() are called during (early) kernel boot time
and run time, modify build_cr3_noflush() to use boot_cpu_has() all the
time. As suggested by Dave Hansen, this should be OK because all CPU's have
same capabilities on x86.
With this change the warning is fixed.
( Dave also suggested that we put a warning in this_cpu_has() if it's used
early in the boot process. This is still work in progress as it affects
MCE. )
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Lee Chun-Yi <jlee@suse.com>
Cc: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Shankar <ravi.v.shankar@intel.com>
Cc: Ricardo Neri <ricardo.neri@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: linux-efi@vger.kernel.org
Link: http://lkml.kernel.org/r/1522870459-7432-1-git-send-email-sai.praneeth.prakhya@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit d29a20645d ]
While running rt-tests' pi_stress program I got the following splat:
rq->clock_update_flags < RQCF_ACT_SKIP
WARNING: CPU: 27 PID: 0 at kernel/sched/sched.h:960 assert_clock_updated.isra.38.part.39+0x13/0x20
[...]
<IRQ>
enqueue_top_rt_rq+0xf4/0x150
? cpufreq_dbs_governor_start+0x170/0x170
sched_rt_rq_enqueue+0x65/0x80
sched_rt_period_timer+0x156/0x360
? sched_rt_rq_enqueue+0x80/0x80
__hrtimer_run_queues+0xfa/0x260
hrtimer_interrupt+0xcb/0x220
smp_apic_timer_interrupt+0x62/0x120
apic_timer_interrupt+0xf/0x20
</IRQ>
[...]
do_idle+0x183/0x1e0
cpu_startup_entry+0x5f/0x70
start_secondary+0x192/0x1d0
secondary_startup_64+0xa5/0xb0
We can get rid of it be the "traditional" means of adding an
update_rq_clock() call after acquiring the rq->lock in
do_sched_rt_period_timer().
The case for the RT task throttling (which this workload also hits)
can be ignored in that the skip_update call is actually bogus and
quite the contrary (the request bits are removed/reverted).
By setting RQCF_UPDATED we really don't care if the skip is happening
or not and will therefore make the assert_clock_updated() check happy.
Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Reviewed-by: Matt Fleming <matt@codeblueprint.co.uk>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: dave@stgolabs.net
Cc: linux-kernel@vger.kernel.org
Cc: rostedt@goodmis.org
Link: http://lkml.kernel.org/r/20180402164954.16255-1-dave@stgolabs.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit c1b25a17d2 ]
POWER8 restores AMOR when waking from deep sleep, but POWER9 does not,
because it does not go through the subcore restore.
Have POWER9 restore it in core restore.
Fixes: ee97b6b99f ("powerpc/mm/radix: Setup AMOR in HV mode to allow key 0")
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit bb34f24c7d ]
We should not handle migrate lockres if we are already in
'DLM_CTXT_IN_SHUTDOWN', as that will cause lockres remains after leaving
dlm domain. At last other nodes will get stuck into infinite loop when
requsting lock from us.
The problem is caused by concurrency umount between nodes. Before
receiveing N1's DLM_BEGIN_EXIT_DOMAIN_MSG, N2 has picked up N1 as the
migrate target. So N2 will continue sending lockres to N1 even though
N1 has left domain.
N1 N2 (owner)
touch file
access the file,
and get pr lock
begin leave domain and
pick up N1 as new owner
begin leave domain and
migrate all lockres done
begin migrate lockres to N1
end leave domain, but
the lockres left
unexpectedly, because
migrate task has passed
[piaojun@huawei.com: v3]
Link: http://lkml.kernel.org/r/5A9CBD19.5020107@huawei.com
Link: http://lkml.kernel.org/r/5A99F028.2090902@huawei.com
Signed-off-by: Jun Piao <piaojun@huawei.com>
Reviewed-by: Yiwen Jiang <jiangyiwen@huawei.com>
Reviewed-by: Joseph Qi <jiangqi903@gmail.com>
Reviewed-by: Changwei Ge <ge.changwei@h3c.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 1e1c50a929 ]
do_chunk_alloc implements a loop checking whether there is a pending
chunk allocation and if so causes the caller do loop. Generally this
loop is executed only once, however testing with btrfs/072 on a single
core vm machines uncovered an extreme case where the system could loop
indefinitely. This is due to a missing cond_resched when loop which
doesn't give a chance to the previous chunk allocator finish its job.
The fix is to simply add the missing cond_resched.
Fixes: 6d74119f1a ("Btrfs: avoid taking the chunk_mutex in do_chunk_alloc")
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 80c0b4210a ]
0, 1 and <0 can be returned by btrfs_next_leaf(), and when <0 is
returned, path->nodes[0] could be NULL, log_dir_items lacks such a
check for <0 and we may run into a null pointer dereference panic.
Fixes: e02119d5a7 ("Btrfs: Add a write ahead tree log to optimize synchronous operations")
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Liu Bo <bo.liu@linux.alibaba.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit b98def7ca6 ]
If errors were returned by btrfs_next_leaf(), replay_dir_deletes needs
to bail out, otherwise @ret would be forced to be 0 after 'break;' and
the caller won't be aware of it.
Fixes: e02119d5a7 ("Btrfs: Add a write ahead tree log to optimize synchronous operations")
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Liu Bo <bo.liu@linux.alibaba.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit e92bb4dd96 ]
When page_mapping() is called and the mapping is dereferenced in
page_evicatable() through shrink_active_list(), it is possible for the
inode to be truncated and the embedded address space to be freed at the
same time. This may lead to the following race.
CPU1 CPU2
truncate(inode) shrink_active_list()
... page_evictable(page)
truncate_inode_page(mapping, page);
delete_from_page_cache(page)
spin_lock_irqsave(&mapping->tree_lock, flags);
__delete_from_page_cache(page, NULL)
page_cache_tree_delete(..)
... mapping = page_mapping(page);
page->mapping = NULL;
...
spin_unlock_irqrestore(&mapping->tree_lock, flags);
page_cache_free_page(mapping, page)
put_page(page)
if (put_page_testzero(page)) -> false
- inode now has no pages and can be freed including embedded address_space
mapping_unevictable(mapping)
test_bit(AS_UNEVICTABLE, &mapping->flags);
- we've dereferenced mapping which is potentially already free.
Similar race exists between swap cache freeing and page_evicatable()
too.
The address_space in inode and swap cache will be freed after a RCU
grace period. So the races are fixed via enclosing the page_mapping()
and address_space usage in rcu_read_lock/unlock(). Some comments are
added in code to make it clear what is protected by the RCU read lock.
Link: http://lkml.kernel.org/r/20180212081227.1940-1-ying.huang@intel.com
Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Minchan Kim <minchan@kernel.org>
Cc: "Huang, Ying" <ying.huang@intel.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 77da2ba064 ]
This patch fixes a corner case for KSM. When two pages belong or
belonged to the same transparent hugepage, and they should be merged,
KSM fails to split the page, and therefore no merging happens.
This bug can be reproduced by:
* making sure ksm is running (in case disabling ksmtuned)
* enabling transparent hugepages
* allocating a THP-aligned 1-THP-sized buffer
e.g. on amd64: posix_memalign(&p, 1<<21, 1<<21)
* filling it with the same values
e.g. memset(p, 42, 1<<21)
* performing madvise to make it mergeable
e.g. madvise(p, 1<<21, MADV_MERGEABLE)
* waiting for KSM to perform a few scans
The expected outcome is that the all the pages get merged (1 shared and
the rest sharing); the actual outcome is that no pages get merged (1
unshared and the rest volatile)
The reason of this behaviour is that we increase the reference count
once for both pages we want to merge, but if they belong to the same
hugepage (or compound page), the reference counter used in both cases is
the one of the head of the compound page. This means that
split_huge_page will find a value of the reference counter too high and
will fail.
This patch solves this problem by testing if the two pages to merge
belong to the same hugepage when attempting to merge them. If so, the
hugepage is split safely. This means that the hugepage is not split if
not necessary.
Link: http://lkml.kernel.org/r/1521548069-24758-1-git-send-email-imbrenda@linux.vnet.ibm.com
Signed-off-by: Claudio Imbrenda <imbrenda@linux.vnet.ibm.com>
Co-authored-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 0211e12dd0 ]
When the allocation of node_to_possible_cpumask fails, then
irq_create_affinity_masks() returns with a pointer to the empty affinity
masks array, which will cause malfunction.
Reorder the allocations so the masks array allocation comes last and every
failure path returns NULL.
Fixes: 9a0ef98e18 ("genirq/affinity: Assign vectors to all present CPUs")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 41f714672f ]
The counter that tracks used TX descriptors pending completion
needs to be zeroed as part of a device reset. This change fixes
a bug causing transmit queues to be stopped unnecessarily and in
some cases a transmit queue stall and timeout reset. If the counter
is not reset, the remaining descriptors will not be "removed",
effectively reducing queue capacity. If the queue is over half full,
it will cause the queue to stall if stopped.
Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 76327a35ca ]
The datasheet specifies a 3uS pause after performing a software
reset. The default implementation of genphy_soft_reset() does not
provide this, so implement soft_reset with the needed pause.
Signed-off-by: Esben Haabendal <eha@deif.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 7854e499f3 ]
The clang API calls used by perf have changed in recent releases and
builds succeed with libclang-3.9 only. This introduces compatibility
with libclang-4.0 and above.
Without this patch, we will see the following compilation errors with
libclang-4.0+:
util/c++/clang.cpp: In function ‘clang::CompilerInvocation* perf::createCompilerInvocation(llvm::opt::ArgStringList, llvm::StringRef&, clang::DiagnosticsEngine&)’:
util/c++/clang.cpp:62:33: error: ‘IK_C’ was not declared in this scope
Opts.Inputs.emplace_back(Path, IK_C);
^~~~
util/c++/clang.cpp: In function ‘std::unique_ptr<llvm::Module> perf::getModuleFromSource(llvm::opt::ArgStringList, llvm::StringRef, llvm::IntrusiveRefCntPtr<clang::vfs::FileSystem>)’:
util/c++/clang.cpp:75:26: error: no matching function for call to ‘clang::CompilerInstance::setInvocation(clang::CompilerInvocation*)’
Clang.setInvocation(&*CI);
^
In file included from util/c++/clang.cpp:14:0:
/usr/include/clang/Frontend/CompilerInstance.h:231:8: note: candidate: void clang::CompilerInstance::setInvocation(std::shared_ptr<clang::CompilerInvocation>)
void setInvocation(std::shared_ptr<CompilerInvocation> Value);
^~~~~~~~~~~~~
Committer testing:
Tested on Fedora 27 after installing the clang-devel and llvm-devel
packages, versions:
# rpm -qa | egrep llvm\|clang
llvm-5.0.1-6.fc27.x86_64
clang-libs-5.0.1-5.fc27.x86_64
clang-5.0.1-5.fc27.x86_64
clang-tools-extra-5.0.1-5.fc27.x86_64
llvm-libs-5.0.1-6.fc27.x86_64
llvm-devel-5.0.1-6.fc27.x86_64
clang-devel-5.0.1-5.fc27.x86_64
#
Make sure you don't have some older version lying around in /usr/local,
etc, then:
$ make LIBCLANGLLVM=1 -C tools/perf install-bin
And in the end perf will be linked agains these libraries:
# ldd ~/bin/perf | egrep -i llvm\|clang
libclangAST.so.5 => /lib64/libclangAST.so.5 (0x00007f8bb2eb4000)
libclangBasic.so.5 => /lib64/libclangBasic.so.5 (0x00007f8bb29e3000)
libclangCodeGen.so.5 => /lib64/libclangCodeGen.so.5 (0x00007f8bb23f7000)
libclangDriver.so.5 => /lib64/libclangDriver.so.5 (0x00007f8bb2060000)
libclangFrontend.so.5 => /lib64/libclangFrontend.so.5 (0x00007f8bb1d06000)
libclangLex.so.5 => /lib64/libclangLex.so.5 (0x00007f8bb1a3e000)
libclangTooling.so.5 => /lib64/libclangTooling.so.5 (0x00007f8bb17d4000)
libclangEdit.so.5 => /lib64/libclangEdit.so.5 (0x00007f8bb15c5000)
libclangSema.so.5 => /lib64/libclangSema.so.5 (0x00007f8bb0cc9000)
libclangAnalysis.so.5 => /lib64/libclangAnalysis.so.5 (0x00007f8bb0a23000)
libclangParse.so.5 => /lib64/libclangParse.so.5 (0x00007f8bb0725000)
libclangSerialization.so.5 => /lib64/libclangSerialization.so.5 (0x00007f8bb039a000)
libLLVM-5.0.so => /lib64/libLLVM-5.0.so (0x00007f8bace98000)
libclangASTMatchers.so.5 => /lib64/../lib64/libclangASTMatchers.so.5 (0x00007f8bab735000)
libclangFormat.so.5 => /lib64/../lib64/libclangFormat.so.5 (0x00007f8bab4b2000)
libclangRewrite.so.5 => /lib64/../lib64/libclangRewrite.so.5 (0x00007f8bab2a1000)
libclangToolingCore.so.5 => /lib64/../lib64/libclangToolingCore.so.5 (0x00007f8bab08e000)
#
Signed-off-by: Sandipan Das <sandipan@linux.vnet.ibm.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Fixes: 00b86691c7 ("perf clang: Add builtin clang support ant test case")
Link: http://lkml.kernel.org/r/20180404180419.19056-2-sandipan@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 709b973c84 ]
The function get_user() can sleep while trying to fetch instruction
from user address space and causes the following warning from the
scheduler.
BUG: sleeping function called from invalid context
Though interrupts get enabled back but it happens bit later after
get_user() is called. This change moves enabling these interrupts
earlier covering the function get_user(). While at this, lets check
for kernel mode and crash as this interrupt should not have been
triggered from the kernel context.
Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 8913315e94 ]
When multiple CPUs are related in one cpufreq policy, the first online
CPU will be chosen by default to handle cpufreq operations. Let's take
cpu0 and cpu1 as an example.
When cpu0 is offline, policy->cpu will be shifted to cpu1. cpu1's perf
capabilities should be initialized. Otherwise, perf capabilities are 0s
and speed change can not take effect.
This patch copies perf capabilities of the first online CPU to other
shared CPUs when policy shared type is CPUFREQ_SHARED_TYPE_ANY.
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Shunyong Yang <shunyong.yang@hxt-semitech.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 8c81dd46ef ]
Forcing the log to disk after reading the agf is wrong, we might be
calling xfs_log_force with XFS_LOG_SYNC with a metadata lock held.
This can cause a deadlock when racing a fstrim with a filesystem
shutdown.
The deadlock has been identified due a miscalculation bug in device-mapper
dm-thin, which returns lack of space to its users earlier than the device itself
really runs out of space, changing the device-mapper volume into an error state.
The problem happened while filling the filesystem with a single file,
triggering the bug in device-mapper, consequently causing an IO error
and shutting down the filesystem.
If such file is removed, and fstrim executed before the XFS finishes the
shut down process, the fstrim process will end up holding the buffer
lock, and going to sleep on the cil wait queue.
At this point, the shut down process will try to wake up all the threads
waiting on the cil wait queue, but for this, it will try to hold the
same buffer log already held my the fstrim, locking up the filesystem.
Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit a06ad633a3 ]
Calling swapon() on a zero length swap file on SSD can lead to a
divide-by-zero.
Although creating such files isn't possible with mkswap and they woud be
considered invalid, it would be better for the swapon code to be more
robust and handle this condition gracefully (return -EINVAL).
Especially since the fix is small and straightforward.
To help with wear leveling on SSD, the swapon syscall calculates a
random position in the swap file using modulo p->highest_bit, which is
set to maxpages - 1 in read_swap_header.
If the swap file is zero length, read_swap_header sets maxpages=1 and
last_page=0, resulting in p->highest_bit=0 and we divide-by-zero when we
modulo p->highest_bit in swapon syscall.
This can be prevented by having read_swap_header return zero if
last_page is zero.
Link: http://lkml.kernel.org/r/5AC747C1020000A7001FA82C@prv-mh.provo.novell.com
Signed-off-by: Thomas Abraham <tabraham@suse.com>
Reported-by: <Mark.Landis@Teradata.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit bb06ec3145 ]
The nvmf_check_if_ready() checks that were added are very simplistic.
As such, the routine allows a lot of cases to fail ios during windows
of reset or re-connection. In cases where there are not multi-path
options present, the error goes back to the callee - the filesystem
or application. Not good.
The common routine was rewritten and calling syntax slightly expanded
so that per-transport is_ready routines don't need to be present.
The transports now call the routine directly. The routine is now a
fabrics routine rather than an inline function.
The routine now looks at controller state to decide the action to
take. Some states mandate io failure. Others define the condition where
a command can be accepted. When the decision is unclear, a generic
queue-or-reject check is made to look for failfast or multipath ios and
only fails the io if it is so marked. Otherwise, the io will be queued
and wait for the controller state to resolve.
Admin commands issued via ioctl share a live admin queue with commands
from the transport for controller init. The ioctls could be intermixed
with the initialization commands. It's possible for the ioctl cmd to
be issued prior to the controller being enabled. To block this, the
ioctl admin commands need to be distinguished from admin commands used
for controller init. Added a USERCMD nvme_req(req)->rq_flags bit to
reflect this division and set it on ioctls requests. As the
nvmf_check_if_ready() routine is called prior to nvme_setup_cmd(),
ensure that commands allocated by the ioctl path (actually anything
in core.c) preps the nvme_req(req) before starting the io. This will
preserve the USERCMD flag during execution and/or retry.
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.e>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 479ca3bf91 ]
The driver currently uses src port field (along with other fields) in the
decap tunnel key, while looking up and adding tunnel nodes. This leads to
redundant cfa_decap_filter_alloc() requests to the FW and flow-miss in the
flow engine. Fix this by ignoring the src port field in decap tunnel nodes.
Fixes: f484f6782e ("bnxt_en: add hwrm FW cmds for cfa_encap_record and decap_filter")
Signed-off-by: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 471d557afe ]
Currently if we allocate extents beyond an inode's i_size (through the
fallocate system call) and then fsync the file, we log the extents but
after a power failure we replay them and then immediately drop them.
This behaviour happens since about 2009, commit c71bf099ab ("Btrfs:
Avoid orphan inodes cleanup while replaying log"), because it marks
the inode as an orphan instead of dropping any extents beyond i_size
before replaying logged extents, so after the log replay, and while
the mount operation is still ongoing, we find the inode marked as an
orphan and then perform a truncation (drop extents beyond the inode's
i_size). Because the processing of orphan inodes is still done
right after replaying the log and before the mount operation finishes,
the intention of that commit does not make any sense (at least as
of today). However reverting that behaviour is not enough, because
we can not simply discard all extents beyond i_size and then replay
logged extents, because we risk dropping extents beyond i_size created
in past transactions, for example:
add prealloc extent beyond i_size
fsync - clears the flag BTRFS_INODE_NEEDS_FULL_SYNC from the inode
transaction commit
add another prealloc extent beyond i_size
fsync - triggers the fast fsync path
power failure
In that scenario, we would drop the first extent and then replay the
second one. To fix this just make sure that all prealloc extents
beyond i_size are logged, and if we find too many (which is far from
a common case), fallback to a full transaction commit (like we do when
logging regular extents in the fast fsync path).
Trivial reproducer:
$ mkfs.btrfs -f /dev/sdb
$ mount /dev/sdb /mnt
$ xfs_io -f -c "pwrite -S 0xab 0 256K" /mnt/foo
$ sync
$ xfs_io -c "falloc -k 256K 1M" /mnt/foo
$ xfs_io -c "fsync" /mnt/foo
<power failure>
# mount to replay log
$ mount /dev/sdb /mnt
# at this point the file only has one extent, at offset 0, size 256K
A test case for fstests follows soon, covering multiple scenarios that
involve adding prealloc extents with previous shrinking truncates and
without such truncates.
Fixes: c71bf099ab ("Btrfs: Avoid orphan inodes cleanup while replaying log")
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit af72273381 ]
Currently if some fatal errors occur, like all IO get -EIO, resources
would be cleaned up when
a) transaction is being committed or
b) BTRFS_FS_STATE_ERROR is set
However, in some rare cases, resources may be left alone after transaction
gets aborted and umount may run into some ASSERT(), e.g.
ASSERT(list_empty(&block_group->dirty_list));
For case a), in btrfs_commit_transaciton(), there're several places at the
beginning where we just call btrfs_end_transaction() without cleaning up
resources. For case b), it is possible that the trans handle doesn't have
any dirty stuff, then only trans hanlde is marked as aborted while
BTRFS_FS_STATE_ERROR is not set, so resources remain in memory.
This makes btrfs also check BTRFS_FS_STATE_TRANS_ABORTED to make sure that
all resources won't stay in memory after umount.
Signed-off-by: Liu Bo <bo.liu@linux.alibaba.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 74c6c71530 ]
NVMe over Fabrics 1.0 Section 5.2 "Discovery Controller Properties and
Command Support" Figure 31 "Discovery Controller – Admin Commands"
explicitly listst all commands but "Get Log Page" and "Identify" as
reserved, but NetApp report the Linux host is sending Keep Alive
commands to the discovery controller, which is a violation of the
Spec.
We're already checking for discovery controllers when configuring the
keep alive timeout but when creating a discovery controller we're not
hard wiring the keep alive timeout to 0 and thus remain on
NVME_DEFAULT_KATO for the discovery controller.
This can be easily remproduced when issuing a direct connect to the
discovery susbsystem using:
'nvme connect [...] --nqn=nqn.2014-08.org.nvmexpress.discovery'
Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
Fixes: 07bfcd09a2 ("nvme-fabrics: add a generic NVMe over Fabrics library")
Reported-by: Martin George <marting@netapp.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 90fe6f8ff0 ]
The test which ensures that the DMI type 1 structure is long enough
to hold the UUID is off by one. It would fail if the structure is
exactly 24 bytes long, while that's sufficient to hold the UUID.
I don't expect this bug to cause problem in practice because all
implementations I have seen had length 8, 25 or 27 bytes, in line
with the SMBIOS specifications. But let's fix it still.
Signed-off-by: Jean Delvare <jdelvare@suse.de>
Fixes: a814c3597a ("firmware: dmi_scan: Check DMI structure length")
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 96a598996f ]
When responding to a debug trap (breakpoint) in userspace, the
kernel's trap handler raised SIGTRAP but returned from the trap via a
code path that ignored pending signals, resulting in an infinite loop
re-executing the trapping instruction.
Signed-off-by: Rich Felker <dalias@libc.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d8f2f498d9 upstream.
Since 4.10, commit 8003c9ae20 (KVM: LAPIC: add APIC Timer
periodic/oneshot mode VMX preemption timer support), guests using
periodic LAPIC timers (such as FreeBSD 8.4) would see their timers
drift significantly over time.
Differences in the underlying clocks and numerical errors means the
periods of the two timers (hv and sw) are not the same. This
difference will accumulate with every expiry resulting in a large
error between the hv and sw timer.
This means the sw timer may be running slow when compared to the hv
timer. When the timer is switched from hv to sw, the now active sw
timer will expire late. The guest VCPU is reentered and it switches to
using the hv timer. This timer catches up, injecting multiple IRQs
into the guest (of which the guest only sees one as it does not get to
run until the hv timer has caught up) and thus the guest's timer rate
is low (and becomes increasing slower over time as the sw timer lags
further and further behind).
I believe a similar problem would occur if the hv timer is the slower
one, but I have not observed this.
Fix this by synchronizing the deadlines for both timers to the same
time source on every tick. This prevents the errors from accumulating.
Fixes: 8003c9ae20
Cc: Wanpeng Li <wanpeng.li@hotmail.com>
Signed-off-by: David Vrabel <david.vrabel@nutanix.com>
Cc: stable@vger.kernel.org
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Wanpeng Li <wanpengli@tencent.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1eaafe91a0 upstream.
If there is a possibility that a VM may migrate to a Skylake host,
then the hypervisor should report IA32_ARCH_CAPABILITIES.RSBA[bit 2]
as being set (future work, of course). This implies that
CPUID.(EAX=7,ECX=0):EDX.ARCH_CAPABILITIES[bit 29] should be
set. Therefore, kvm should report this CPUID bit as being supported
whether or not the host supports it. Userspace is still free to clear
the bit if it chooses.
For more information on RSBA, see Intel's white paper, "Retpoline: A
Branch Target Injection Mitigation" (Document Number 337131-001),
currently available at https://bugzilla.kernel.org/show_bug.cgi?id=199511.
Since the IA32_ARCH_CAPABILITIES MSR is emulated in kvm, there is no
dependency on hardware support for this feature.
Signed-off-by: Jim Mattson <jmattson@google.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Fixes: 28c1c9fabf ("KVM/VMX: Emulate MSR_IA32_ARCH_CAPABILITIES")
Cc: stable@vger.kernel.org
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c4d2188206 upstream.
The CPUID bits of OSXSAVE (function=0x1) and OSPKE (func=0x7, leaf=0x0)
allows user apps to detect if OS has set CR4.OSXSAVE or CR4.PKE. KVM is
supposed to update these CPUID bits when CR4 is updated. Current KVM
code doesn't handle some special cases when updates come from emulator.
Here is one example:
Step 1: guest boots
Step 2: guest OS enables XSAVE ==> CR4.OSXSAVE=1 and CPUID.OSXSAVE=1
Step 3: guest hot reboot ==> QEMU reset CR4 to 0, but CPUID.OSXAVE==1
Step 4: guest os checks CPUID.OSXAVE, detects 1, then executes xgetbv
Step 4 above will cause an #UD and guest crash because guest OS hasn't
turned on OSXAVE yet. This patch solves the problem by comparing the the
old_cr4 with cr4. If the related bits have been changed,
kvm_update_cpuid() needs to be called.
Signed-off-by: Wei Huang <wei@redhat.com>
Reviewed-by: Bandan Das <bsd@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c62ec4610c upstream.
Commit 08810a4119 (PM / core: Add NEVER_SKIP and SMART_PREPARE
driver flags) inadvertently prevented the power.direct_complete flag
from being set for devices without PM callbacks and with disabled
runtime PM which also prevents power.direct_complete from being set
for their parents. That led to problems including a resume crash on
HP ZBook 14u.
Restore the previous behavior by causing power.direct_complete to be
set for those devices again, but do that in a more direct way to
avoid overlooking that case in the future.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=199693
Fixes: 08810a4119 (PM / core: Add NEVER_SKIP and SMART_PREPARE driver flags)
Reported-by: Thomas Martitz <kugel@rockbox.org>
Tested-by: Thomas Martitz <kugel@rockbox.org>
Cc: 4.15+ <stable@vger.kernel.org> # 4.15+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Reviewed-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0f901dcbc3 upstream.
KASAN uses different routines to map shadow for hot added memory and
memory obtained in boot process. Attempt to offline memory onlined by
normal boot process leads to this:
Trying to vfree() nonexistent vm area (000000005d3b34b9)
WARNING: CPU: 2 PID: 13215 at mm/vmalloc.c:1525 __vunmap+0x147/0x190
Call Trace:
kasan_mem_notifier+0xad/0xb9
notifier_call_chain+0x166/0x260
__blocking_notifier_call_chain+0xdb/0x140
__offline_pages+0x96a/0xb10
memory_subsys_offline+0x76/0xc0
device_offline+0xb8/0x120
store_mem_state+0xfa/0x120
kernfs_fop_write+0x1d5/0x320
__vfs_write+0xd4/0x530
vfs_write+0x105/0x340
SyS_write+0xb0/0x140
Obviously we can't call vfree() to free memory that wasn't allocated via
vmalloc(). Use find_vm_area() to see if we can call vfree().
Unfortunately it's a bit tricky to properly unmap and free shadow
allocated during boot, so we'll have to keep it. If memory will come
online again that shadow will be reused.
Matthew asked: how can you call vfree() on something that isn't a
vmalloc address?
vfree() is able to free any address returned by
__vmalloc_node_range(). And __vmalloc_node_range() gives you any
address you ask. It doesn't have to be an address in [VMALLOC_START,
VMALLOC_END] range.
That's also how the module_alloc()/module_memfree() works on
architectures that have designated area for modules.
[aryabinin@virtuozzo.com: improve comments]
Link: http://lkml.kernel.org/r/dabee6ab-3a7a-51cd-3b86-5468718e0390@virtuozzo.com
[akpm@linux-foundation.org: fix typos, reflow comment]
Link: http://lkml.kernel.org/r/20180201163349.8700-1-aryabinin@virtuozzo.com
Fixes: fa69b5989b ("mm/kasan: add support for memory hotplug")
Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Reported-by: Paul Menzel <pmenzel+linux-kasan-dev@molgen.mpg.de>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a73ab244f0 upstream.
Patch series "ipc/shm: shmat() fixes around nil-page".
These patches fix two issues reported[1] a while back by Joe and Andrea
around how shmat(2) behaves with nil-page.
The first reverts a commit that it was incorrectly thought that mapping
nil-page (address=0) was a no no with MAP_FIXED. This is not the case,
with the exception of SHM_REMAP; which is address in the second patch.
I chose two patches because it is easier to backport and it explicitly
reverts bogus behaviour. Both patches ought to be in -stable and ltp
testcases need updated (the added testcase around the cve can be
modified to just test for SHM_RND|SHM_REMAP).
[1] lkml.kernel.org/r/20180430172152.nfa564pvgpk3ut7p@linux-n805
This patch (of 2):
Commit 95e91b831f ("ipc/shm: Fix shmat mmap nil-page protection")
worked on the idea that we should not be mapping as root addr=0 and
MAP_FIXED. However, it was reported that this scenario is in fact
valid, thus making the patch both bogus and breaks userspace as well.
For example X11's libint10.so relies on shmat(1, SHM_RND) for lowmem
initialization[1].
[1] https://cgit.freedesktop.org/xorg/xserver/tree/hw/xfree86/os-support/linux/int10/linux.c#n347
Link: http://lkml.kernel.org/r/20180503203243.15045-2-dave@stgolabs.net
Fixes: 95e91b831f ("ipc/shm: Fix shmat mmap nil-page protection")
Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Reported-by: Joe Lawrence <joe.lawrence@redhat.com>
Reported-by: Andrea Arcangeli <aarcange@redhat.com>
Cc: Manfred Spraul <manfred@colorfullife.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7a4deea1aa upstream.
If the radix tree underlying the IDR happens to be full and we attempt
to remove an id which is larger than any id in the IDR, we will call
__radix_tree_delete() with an uninitialised 'slot' pointer, at which
point anything could happen. This was easiest to hit with a single
entry at id 0 and attempting to remove a non-0 id, but it could have
happened with 64 entries and attempting to remove an id >= 64.
Roman said:
The syzcaller test boils down to opening /dev/kvm, creating an
eventfd, and calling a couple of KVM ioctls. None of this requires
superuser. And the result is dereferencing an uninitialized pointer
which is likely a crash. The specific path caught by syzbot is via
KVM_HYPERV_EVENTD ioctl which is new in 4.17. But I guess there are
other user-triggerable paths, so cc:stable is probably justified.
Matthew added:
We have around 250 calls to idr_remove() in the kernel today. Many of
them pass an ID which is embedded in the object they're removing, so
they're safe. Picking a few likely candidates:
drivers/firewire/core-cdev.c looks unsafe; the ID comes from an ioctl.
drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c is similar
drivers/atm/nicstar.c could be taken down by a handcrafted packet
Link: http://lkml.kernel.org/r/20180518175025.GD6361@bombadil.infradead.org
Fixes: 0a835c4f09 ("Reimplement IDR and IDA using the radix tree")
Reported-by: <syzbot+35666cba7f0a337e2e79@syzkaller.appspotmail.com>
Debugged-by: Roman Kagan <rkagan@virtuozzo.com>
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 361de091a4 upstream.
Used buffer wasn't big enough to hold whole strings. Example output of
this function is:
[ 0.180892] bcma: bus0: core 0x0800, irq: 2(S)* 3 4 5 6 D I
[ 0.180948] bcma: bus0: core 0x0812, irq: 2(S) 3* 4 5 6 D I
[ 0.180998] bcma: bus0: core 0x082d, irq: 2(S) 3 4* 5 6 D I
[ 0.181046] bcma: bus0: core 0x082c, irq: 2(S) 3 4 5 6 D I*
which means we need to store up to 24 chars.
Fixes: 758f7e0606 ("bcma: Use bcma_debug and not pr_cont in MIPS driver")
Signed-off-by: Rafał Miłecki <rafal@milecki.pl>
Cc: stable@vger.kernel.org # v4.15+
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f7068114d4 upstream.
We're casting the CDROM layer request_sense to the SCSI sense
buffer, but the former is 64 bytes and the latter is 96 bytes.
As we generally allocate these on the stack, we end up blowing
up the stack.
Fix this by wrapping the scsi_execute() call with a properly
sized sense buffer, and copying back the bits for the CDROM
layer.
Cc: stable@vger.kernel.org
Reported-by: Piotr Gabriel Kosinski <pg.kosinski@gmail.com>
Reported-by: Daniel Shapira <daniel@twistlock.com>
Tested-by: Kees Cook <keescook@chromium.org>
Fixes: 82ed4db499 ("block: split scsi_request out of struct request")
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8e907ed488 upstream.
User-space may invoke ibv_reg_mr and ibv_dereg_mr in different threads.
If ibv_dereg_mr is called after the thread which invoked ibv_reg_mr has
exited, get_pid_task will return NULL and ib_umem_release will not
decrease mm->pinned_vm.
Instead of using threads to locate the mm, use the overall tgid from the
ib_ucontext struct instead. This matches the behavior of ODP and
disassociate in handling the mm of the process that called ibv_reg_mr.
Cc: <stable@vger.kernel.org>
Fixes: 87773dd56d ("IB: ib_umem_release() should decrement mm->pinned_vm from ib_umem_get")
Signed-off-by: Lidong Chen <lidongchen@tencent.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f9e76ca377 upstream.
A pio send egress error can occur when the PSM library attempts to
to send a bad packet. That issue is still being investigated.
The pio error interrupt handler then attempts to progress the recovery
of the errored pio send context.
Code inspection reveals that the handling lacks the necessary locking
if that recovery interleaves with a PSM close of the "context" object
contains the pio send context.
The lack of the locking can cause the recovery to access the already
freed pio send context object and incorrectly deduce that the pio
send context is actually a kernel pio send context as shown by the
NULL deref stack below:
[<ffffffff8143d78c>] _dev_info+0x6c/0x90
[<ffffffffc0613230>] sc_restart+0x70/0x1f0 [hfi1]
[<ffffffff816ab124>] ? __schedule+0x424/0x9b0
[<ffffffffc06133c5>] sc_halted+0x15/0x20 [hfi1]
[<ffffffff810aa3ba>] process_one_work+0x17a/0x440
[<ffffffff810ab086>] worker_thread+0x126/0x3c0
[<ffffffff810aaf60>] ? manage_workers.isra.24+0x2a0/0x2a0
[<ffffffff810b252f>] kthread+0xcf/0xe0
[<ffffffff810b2460>] ? insert_kthread_work+0x40/0x40
[<ffffffff816b8798>] ret_from_fork+0x58/0x90
[<ffffffff810b2460>] ? insert_kthread_work+0x40/0x40
This is the best case scenario and other scenarios can corrupt the
already freed memory.
Fix by adding the necessary locking in the pio send context error
handler.
Cc: <stable@vger.kernel.org> # 4.9.x
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit faf37c44a1 upstream.
Clear the PCR (Processor Compatibility Register) on boot to ensure we
are not running in a compatibility mode.
We've seen this cause problems when a crash (and kdump) occurs while
running compat mode guests. The kdump kernel then runs with the PCR
set and causes problems. The symptom in the kdump kernel (also seen in
petitboot after fast-reboot) is early userspace programs taking
sigills on newer instructions (seen in libc).
Signed-off-by: Michael Neuling <mikey@neuling.org>
Cc: stable@vger.kernel.org
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 255845fc43 upstream.
Otherwise modules that use these arithmetic operations will fail to
link. We accomplish this with the usual EXPORT_SYMBOL, which on most
architectures goes in the .S file but the ARM64 maintainers prefer that
insead it goes into arm64ksyms.
While we're at it, we also fix this up to use SPDX, and I personally
choose to relicense this as GPL2||BSD so that these symbols don't need
to be export_symbol_gpl, so all modules can use the routines, since
these are important general purpose compiler-generated function calls.
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Reported-by: PaX Team <pageexec@freemail.hu>
Cc: stable@vger.kernel.org
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 32c3fa7cdf upstream.
For LSE atomics that read and write a register operand, we need to
ensure that these operands are annotated as "early clobber" if the
register is written before all of the input operands have been consumed.
Failure to do so can result in the compiler allocating the same register
to both operands, leading to splats such as:
Unable to handle kernel paging request at virtual address 11111122222221
[...]
x1 : 1111111122222222 x0 : 1111111122222221
Process swapper/0 (pid: 1, stack limit = 0x000000008209f908)
Call trace:
test_atomic64+0x1360/0x155c
where x0 has been allocated as both the value to be stored and also the
atomic_t pointer.
This patch adds the missing clobbers.
Cc: <stable@vger.kernel.org>
Cc: Dave Martin <dave.martin@arm.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Reported-by: Mark Salter <msalter@redhat.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 938ae7259c upstream.
Depending on whether the kernel is compiled with frame-pointer or not,
the temporary memory location used for the bp parameter in these macros
is referenced relative to the stack pointer or the frame pointer.
Hence we can never reference that parameter when we've modified either
the stack pointer or the frame pointer, because then the compiler would
generate an incorrect stack reference.
Fix this by pushing the temporary memory parameter on a known location on
the stack before modifying the stack- and frame pointers.
Cc: <stable@vger.kernel.org>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4855c92dbb upstream.
When run raidconfig from Dom0 we found that the Xen DMA heap is reduced,
but Dom Heap is increased by the same size. Tracing raidconfig we found
that the related ioctl() in megaraid_sas will call dma_alloc_coherent()
to apply memory. If the memory allocated by Dom0 is not in the DMA area,
it will exchange memory with Xen to meet the requiment. Later drivers
call dma_free_coherent() to free the memory, on xen_swiotlb_free_coherent()
the check condition (dev_addr + size - 1 <= dma_mask) is always false,
it prevents calling xen_destroy_contiguous_region() to return the memory
to the Xen DMA heap.
This issue introduced by commit 6810df88dc "xen-swiotlb: When doing
coherent alloc/dealloc check before swizzling the MFNs.".
Signed-off-by: Joe Jin <joe.jin@oracle.com>
Tested-by: John Sobecki <john.sobecki@oracle.com>
Reviewed-by: Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: stable@vger.kernel.org
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 136d769e0b upstream.
While whitelisting Micron M500DC drives, the tweaked blacklist entry
enabled queued TRIM from M500IT variants also. But these do not support
queued TRIM. And while using those SSDs with the latest kernel we have
seen errors and even the partition table getting corrupted.
Some part from the dmesg:
[ 6.727384] ata1.00: ATA-9: Micron_M500IT_MTFDDAK060MBD, MU01, max UDMA/133
[ 6.727390] ata1.00: 117231408 sectors, multi 16: LBA48 NCQ (depth 31/32), AA
[ 6.741026] ata1.00: supports DRM functions and may not be fully accessible
[ 6.759887] ata1.00: configured for UDMA/133
[ 6.762256] scsi 0:0:0:0: Direct-Access ATA Micron_M500IT_MT MU01 PQ: 0 ANSI: 5
and then for the error:
[ 120.860334] ata1.00: exception Emask 0x1 SAct 0x7ffc0007 SErr 0x0 action 0x6 frozen
[ 120.860338] ata1.00: irq_stat 0x40000008
[ 120.860342] ata1.00: failed command: SEND FPDMA QUEUED
[ 120.860351] ata1.00: cmd 64/01:00:00:00:00/00:00:00:00:00/a0 tag 0 ncq dma 512 out
res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x5 (timeout)
[ 120.860353] ata1.00: status: { DRDY }
[ 120.860543] ata1: hard resetting link
[ 121.166128] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[ 121.166376] ata1.00: supports DRM functions and may not be fully accessible
[ 121.186238] ata1.00: supports DRM functions and may not be fully accessible
[ 121.204445] ata1.00: configured for UDMA/133
[ 121.204454] ata1.00: device reported invalid CHS sector 0
[ 121.204541] sd 0:0:0:0: [sda] tag#18 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x08
[ 121.204546] sd 0:0:0:0: [sda] tag#18 Sense Key : 0x5 [current]
[ 121.204550] sd 0:0:0:0: [sda] tag#18 ASC=0x21 ASCQ=0x4
[ 121.204555] sd 0:0:0:0: [sda] tag#18 CDB: opcode=0x93 93 08 00 00 00 00 00 04 28 80 00 00 00 30 00 00
[ 121.204559] print_req_error: I/O error, dev sda, sector 272512
After few reboots with these errors, and the SSD is corrupted.
After blacklisting it, the errors are not seen and the SSD does not get
corrupted any more.
Fixes: 243918be63 ("libata: Do not blacklist Micron M500DC")
Cc: Martin K. Petersen <martin.petersen@oracle.com>
Cc: stable@vger.kernel.org
Signed-off-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5f651b8704 upstream.
When the host controller accepts only 32bit writes, the value of the
16bit TRANSFER_MODE register, that has the same 32bit address as the
16bit COMMAND register, needs to be saved and it will be written
in a 32bit write together with the command as this will trigger the
host to send the command on the SD interface.
When sending the tuning command, TRANSFER_MODE is written and then
sdhci_set_transfer_mode reads it back to clear AUTO_CMD12 bit and
write it again resulting in wrong value to be written because the
initial write value was saved in a shadow and the read-back returned
a wrong value, from the register.
Fix sdhci_iproc_readw to return the saved value of TRANSFER_MODE
when a saved value exist.
Same fix for read of BLOCK_SIZE and BLOCK_COUNT registers, that are
saved for a different reason, although a scenario that will cause the
mentioned problem on this registers is not probable.
Fixes: b580c52d58 ("mmc: sdhci-iproc: add IPROC SDHCI driver")
Signed-off-by: Corneliu Doban <corneliu.doban@broadcom.com>
Signed-off-by: Scott Branden <scott.branden@broadcom.com>
Cc: stable@vger.kernel.org # v4.1+
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b25b750df9 upstream.
In commit 97548575be ("mmc: block: Convert RPMB to a character device") a
new function `mmc_rpmb_ioctl` was added. The final return is simply
returning a value of `0` instead of propagating the correct return code.
Discovered during a compilation with W=1, silence the following gcc warning
drivers/mmc/core/block.c:2470:6: warning: variable ‘ret’ set but not used
[-Wunused-but-set-variable]
Signed-off-by: Mathieu Malaterre <malat@debian.org>
Reviewed-by: Shawn Lin <shawn.lin@rock-chips.com>
Fixes: 97548575be ("mmc: block: Convert RPMB to a character device")
Cc: stable@vger.kernel.org # v4.15+
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1e2e547a93 upstream.
For anything NFS-exported we do _not_ want to unlock new inode
before it has grown an alias; original set of fixes got the
ordering right, but missed the nasty complication in case of
lockdep being enabled - unlock_new_inode() does
lockdep_annotate_inode_mutex_key(inode)
which can only be done before anyone gets a chance to touch
->i_mutex. Unfortunately, flipping the order and doing
unlock_new_inode() before d_instantiate() opens a window when
mkdir can race with open-by-fhandle on a guessed fhandle, leading
to multiple aliases for a directory inode and all the breakage
that follows from that.
Correct solution: a new primitive (d_instantiate_new())
combining these two in the right order - lockdep annotate, then
d_instantiate(), then the rest of unlock_new_inode(). All
combinations of d_instantiate() with unlock_new_inode() should
be converted to that.
Cc: stable@kernel.org # 2.6.29 and later
Tested-by: Mike Marshall <hubcap@omnibond.com>
Reviewed-by: Andreas Dilger <adilger@dilger.ca>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3ae1809725 upstream.
Commit f65e0d2998 ("ALSA: timer: Call notifier in the same spinlock")
combined the start/continue and stop/pause functions, and in doing so
changed the event code for the pause case to SNDRV_TIMER_EVENT_CONTINUE.
Change it back to SNDRV_TIMER_EVENT_PAUSE.
Fixes: f65e0d2998 ("ALSA: timer: Call notifier in the same spinlock")
Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Cc: stable@vger.kernel.org
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d50147381a upstream.
Jun Wu at Facebook reported that an internal service was seeing a return
value of 1 from ftruncate() on Btrfs in some cases. This is coming from
the NEED_TRUNCATE_BLOCK return value from btrfs_truncate_inode_items().
btrfs_truncate() uses two variables for error handling, ret and err.
When btrfs_truncate_inode_items() returns non-zero, we set err to the
return value. However, NEED_TRUNCATE_BLOCK is not an error. Make sure we
only set err if ret is an error (i.e., negative).
To reproduce the issue: mount a filesystem with -o compress-force=zstd
and the following program will encounter return value of 1 from
ftruncate:
int main(void) {
char buf[256] = { 0 };
int ret;
int fd;
fd = open("test", O_CREAT | O_WRONLY | O_TRUNC, 0666);
if (fd == -1) {
perror("open");
return EXIT_FAILURE;
}
if (write(fd, buf, sizeof(buf)) != sizeof(buf)) {
perror("write");
close(fd);
return EXIT_FAILURE;
}
if (fsync(fd) == -1) {
perror("fsync");
close(fd);
return EXIT_FAILURE;
}
ret = ftruncate(fd, 128);
if (ret) {
printf("ftruncate() returned %d\n", ret);
close(fd);
return EXIT_FAILURE;
}
close(fd);
return EXIT_SUCCESS;
}
Fixes: ddfae63cc8 ("btrfs: move btrfs_truncate_block out of trans handle")
CC: stable@vger.kernel.org # 4.15+
Reported-by: Jun Wu <quark@fb.com>
Signed-off-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit baf10564fb upstream.
kill_ioctx() used to have an explicit RCU delay between removing the
reference from ->ioctx_table and percpu_ref_kill() dropping the refcount.
At some point that delay had been removed, on the theory that
percpu_ref_kill() itself contained an RCU delay. Unfortunately, that was
the wrong kind of RCU delay and it didn't care about rcu_read_lock() used
by lookup_ioctx(). As the result, we could get ctx freed right under
lookup_ioctx(). Tejun has fixed that in a6d7cff472 ("fs/aio: Add explicit
RCU grace period when freeing kioctx"); however, that fix is not enough.
Suppose io_destroy() from one thread races with e.g. io_setup() from another;
CPU1 removes the reference from current->mm->ioctx_table[...] just as CPU2
has picked it (under rcu_read_lock()). Then CPU1 proceeds to drop the
refcount, getting it to 0 and triggering a call of free_ioctx_users(),
which proceeds to drop the secondary refcount and once that reaches zero
calls free_ioctx_reqs(). That does
INIT_RCU_WORK(&ctx->free_rwork, free_ioctx);
queue_rcu_work(system_wq, &ctx->free_rwork);
and schedules freeing the whole thing after RCU delay.
In the meanwhile CPU2 has gotten around to percpu_ref_get(), bumping the
refcount from 0 to 1 and returned the reference to io_setup().
Tejun's fix (that queue_rcu_work() in there) guarantees that ctx won't get
freed until after percpu_ref_get(). Sure, we'd increment the counter before
ctx can be freed. Now we are out of rcu_read_lock() and there's nothing to
stop freeing of the whole thing. Unfortunately, CPU2 assumes that since it
has grabbed the reference, ctx is *NOT* going away until it gets around to
dropping that reference.
The fix is obvious - use percpu_ref_tryget_live() and treat failure as miss.
It's not costlier than what we currently do in normal case, it's safe to
call since freeing *is* delayed and it closes the race window - either
lookup_ioctx() comes before percpu_ref_kill() (in which case ctx->users
won't reach 0 until the caller of lookup_ioctx() drops it) or lookup_ioctx()
fails, ctx->users is unaffected and caller of lookup_ioctx() doesn't see
the object in question at all.
Cc: stable@kernel.org
Fixes: a6d7cff472 "fs/aio: Add explicit RCU grace period when freeing kioctx"
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 79f546a696 upstream.
We recently had an oops reported on a 4.14 kernel in
xfs_reclaim_inodes_count() where sb->s_fs_info pointed to garbage
and so the m_perag_tree lookup walked into lala land. It produces
an oops down this path during the failed mount:
radix_tree_gang_lookup_tag+0xc4/0x130
xfs_perag_get_tag+0x37/0xf0
xfs_reclaim_inodes_count+0x32/0x40
xfs_fs_nr_cached_objects+0x11/0x20
super_cache_count+0x35/0xc0
shrink_slab.part.66+0xb1/0x370
shrink_node+0x7e/0x1a0
try_to_free_pages+0x199/0x470
__alloc_pages_slowpath+0x3a1/0xd20
__alloc_pages_nodemask+0x1c3/0x200
cache_grow_begin+0x20b/0x2e0
fallback_alloc+0x160/0x200
kmem_cache_alloc+0x111/0x4e0
The problem is that the superblock shrinker is running before the
filesystem structures it depends on have been fully set up. i.e.
the shrinker is registered in sget(), before ->fill_super() has been
called, and the shrinker can call into the filesystem before
fill_super() does it's setup work. Essentially we are exposed to
both use-after-free and use-before-initialisation bugs here.
To fix this, add a check for the SB_BORN flag in super_cache_count.
In general, this flag is not set until ->fs_mount() completes
successfully, so we know that it is set after the filesystem
setup has completed. This matches the trylock_super() behaviour
which will not let super_cache_scan() run if SB_BORN is not set, and
hence will not allow the superblock shrinker from entering the
filesystem while it is being set up or after it has failed setup
and is being torn down.
Cc: stable@kernel.org
Signed-Off-By: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b127125d9d upstream.
"VFS: don't keep disconnected dentries on d_anon" had a non-trivial
side-effect - d_unhashed() now returns true for those dentries,
making d_find_alias() skip them altogether. For most of its callers
that's fine - we really want a connected alias there. However,
there is a codepath where we relied upon picking such aliases
if nothing else could be found - selinux delayed initialization
of contexts for inodes on already mounted filesystems used to
rely upon that.
Cc: stable@kernel.org # f1ee616214 "VFS: don't keep disconnected dentries on d_anon"
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 30da870ce4 upstream.
we unlock the directory hash too early - if we are looking at secondary
link and primary (in another directory) gets removed just as we unlock,
we could have the old primary moved in place of the secondary, leaving
us to look into freed entry (and leaving our dentry with ->d_fsdata
pointing to a freed entry).
Cc: stable@vger.kernel.org # 2.4.4+
Acked-by: David Sterba <dsterba@suse.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9a3a92ccfe upstream.
Check the TIF_32BIT_FPREGS task setting of the tracee rather than the
tracer in determining the layout of floating-point general registers in
the floating-point context, correcting access to odd-numbered registers
for o32 tracees where the setting disagrees between the two processes.
Fixes: 597ce1723e ("MIPS: Support for 64-bit FP with O32 binaries")
Signed-off-by: Maciej W. Rozycki <macro@mips.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: <stable@vger.kernel.org> # 3.14+
Signed-off-by: James Hogan <jhogan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 71e909c0cd upstream.
Correct commit 7aeb753b53 ("MIPS: Implement task_user_regset_view.")
and expose the FIR register using the unused 4 bytes at the end of the
NT_PRFPREG regset. Without that register included clients cannot use
the PTRACE_GETREGSET request to retrieve the complete FPU register set
and have to resort to one of the older interfaces, either PTRACE_PEEKUSR
or PTRACE_GETFPREGS, to retrieve the missing piece of data. Also the
register is irreversibly missing from core dumps.
This register is architecturally hardwired and read-only so the write
path does not matter. Ignore data supplied on writes then.
Fixes: 7aeb753b53 ("MIPS: Implement task_user_regset_view.")
Signed-off-by: James Hogan <jhogan@kernel.org>
Signed-off-by: Maciej W. Rozycki <macro@mips.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: <stable@vger.kernel.org> # 3.13+
Patchwork: https://patchwork.linux-mips.org/patch/19273/
Signed-off-by: James Hogan <jhogan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c60128ce97 upstream.
The debug definitions were missing for MACH_JZ4770, resulting in a build
failure when DEBUG_ZBOOT was set.
Since the UART addresses are the same across all Ingenic SoCs, we just
use a #ifdef CONFIG_MACH_INGENIC instead of checking for individual
Ingenic SoCs.
Additionally, I added a #define for the UART0 address in-code and
dropped the <asm/mach-jz4740/base.h> include, for the reason that this
include file is slowly being phased out as the whole platform is being
moved to devicetree.
Fixes: 9be5f3e92e ("MIPS: ingenic: Initial JZ4770 support")
Signed-off-by: Paul Cercueil <paul@crapouillou.net>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: <stable@vger.kernel.org> # 4.16
Patchwork: https://patchwork.linux-mips.org/patch/18957/
Signed-off-by: James Hogan <jhogan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 55a2aa08b3 upstream.
When DMA will be performed to a MIPS32 1004K CPS, the L1-cache for the
range needs to be flushed and invalidated first.
The code currently takes one of two approaches.
1/ If the range is less than the size of the dcache, then HIT type
requests flush/invalidate cache lines for the particular addresses.
HIT-type requests a globalised by the CPS so this is safe on SMP.
2/ If the range is larger than the size of dcache, then INDEX type
requests flush/invalidate the whole cache. INDEX type requests affect
the local cache only. CPS does not propagate them in any way. So this
invalidation is not safe on SMP CPS systems.
Data corruption due to '2' can quite easily be demonstrated by
repeatedly "echo 3 > /proc/sys/vm/drop_caches" and then sha1sum a file
that is several times the size of available memory. Dropping caches
means that large contiguous extents (large than dcache) are more likely.
This was not a problem before Linux-4.8 because option 2 was never used
if CONFIG_MIPS_CPS was defined. The commit which removed that apparently
didn't appreciate the full consequence of the change.
We could, in theory, globalize the INDEX based flush by sending an IPI
to other cores. These cache invalidation routines can be called with
interrupts disabled and synchronous IPI require interrupts to be
enabled. Asynchronous IPI may not trigger writeback soon enough. So we
cannot use IPI in practice.
We can already test if IPI would be needed for an INDEX operation with
r4k_op_needs_ipi(R4K_INDEX). If this is true then we mustn't try the
INDEX approach as we cannot use IPI. If this is false (e.g. when there
is only one core and hence one L1 cache) then it is safe to use the
INDEX approach without IPI.
This patch avoids options 2 if r4k_op_needs_ipi(R4K_INDEX), and so
eliminates the corruption.
Fixes: c00ab4896e ("MIPS: Remove cpu_has_safe_index_cacheops")
Signed-off-by: NeilBrown <neil@brown.name>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Paul Burton <paul.burton@mips.com>
Cc: linux-mips@linux-mips.org
Cc: <stable@vger.kernel.org> # 4.8+
Patchwork: https://patchwork.linux-mips.org/patch/19259/
Signed-off-by: James Hogan <jhogan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit bcdd559268 ]
The probe function is not allowed to fail after registering the RTC because
the following may happen:
CPU0: CPU1:
sys_load_module()
do_init_module()
do_one_initcall()
cmos_do_probe()
rtc_device_register()
__register_chrdev()
cdev->owner = struct module*
open("/dev/rtc0")
rtc_device_unregister()
module_put()
free_module()
module_free(mod->module_core)
/* struct module *module is now
freed */
chrdev_open()
spin_lock(cdev_lock)
cdev_get()
try_module_get()
module_is_live()
/* dereferences already
freed struct module* */
Switch to devm_rtc_allocate_device/rtc_register_device to register the rtc
as late as possible.
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 347876ad47 ]
The shifting of buf[5] by 24 bits to the left will be promoted to
a 32 bit signed int and then sign-extended to an unsigned long. If
the top bit of buf[5] is set then all then all the upper bits sec
end up as also being set because of the sign-extension. Fix this by
casting buf[5] to an unsigned long before the shift.
Detected by CoverityScan, CID#1465292 ("Unintended sign extension")
Fixes: 0e1492330c ("rtc: add rtc-tx4939 driver")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 201fac95e7 ]
The probe function is not allowed to fail after registering the RTC because
the following may happen:
CPU0: CPU1:
sys_load_module()
do_init_module()
do_one_initcall()
cmos_do_probe()
rtc_device_register()
__register_chrdev()
cdev->owner = struct module*
open("/dev/rtc0")
rtc_device_unregister()
module_put()
free_module()
module_free(mod->module_core)
/* struct module *module is now
freed */
chrdev_open()
spin_lock(cdev_lock)
cdev_get()
try_module_get()
module_is_live()
/* dereferences already
freed struct module* */
Switch to devm_rtc_allocate_device/rtc_register_device to register the rtc
as late as possible.
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit b3a5ac42ab ]
On 32bit platforms, time_t is still a signed 32bit long. If it is
overflowed, userspace and the kernel cant agree on the current system time.
This causes multiple issues, in particular with systemd:
https://github.com/systemd/systemd/issues/1143
A good workaround is to simply avoid using hctosys which is something I
greatly encourage as the time is better set by userspace.
However, many distribution enable it and use systemd which is rendering the
system unusable in case the RTC holds a date after 2038 (and more so after
2106). Many drivers have workaround for this case and they should be
eliminated so there is only one place left to fix when userspace is able to
cope with dates after the 31bit overflow.
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 1485991c02 ]
commit 179a502f8c ("rtc: snvs: add Freescale rtc-snvs driver") introduces
the SNVS RTC driver with a function snvs_rtc_enable().
snvs_rtc_enable() can return an error on the enable path however this
driver does not currently trap that failure on the probe() path and
consequently if enabling the RTC fails we encounter a later error spinning
forever in rtc_write_sync_lp().
[ 36.093481] [<c010d630>] (__irq_svc) from [<c0c2e9ec>] (_raw_spin_unlock_irqrestore+0x34/0x44)
[ 36.102122] [<c0c2e9ec>] (_raw_spin_unlock_irqrestore) from [<c072e32c>] (regmap_read+0x4c/0x5c)
[ 36.110938] [<c072e32c>] (regmap_read) from [<c085d0f4>] (rtc_write_sync_lp+0x6c/0x98)
[ 36.118881] [<c085d0f4>] (rtc_write_sync_lp) from [<c085d160>] (snvs_rtc_alarm_irq_enable+0x40/0x4c)
[ 36.128041] [<c085d160>] (snvs_rtc_alarm_irq_enable) from [<c08567b4>] (rtc_timer_do_work+0xd8/0x1a8)
[ 36.137291] [<c08567b4>] (rtc_timer_do_work) from [<c01441b8>] (process_one_work+0x28c/0x76c)
[ 36.145840] [<c01441b8>] (process_one_work) from [<c01446cc>] (worker_thread+0x34/0x58c)
[ 36.153961] [<c01446cc>] (worker_thread) from [<c014aee4>] (kthread+0x138/0x150)
[ 36.161388] [<c014aee4>] (kthread) from [<c0107e14>] (ret_from_fork+0x14/0x20)
[ 36.168635] rcu_sched kthread starved for 2602 jiffies! g496 c495 f0x2 RCU_GP_WAIT_FQS(3) ->state=0x0 ->cpu=0
[ 36.178564] rcu_sched R running task 0 8 2 0x00000000
[ 36.185664] [<c0c288b0>] (__schedule) from [<c0c29134>] (schedule+0x3c/0xa0)
[ 36.192739] [<c0c29134>] (schedule) from [<c0c2db80>] (schedule_timeout+0x78/0x4e0)
[ 36.200422] [<c0c2db80>] (schedule_timeout) from [<c01a7ab0>] (rcu_gp_kthread+0x648/0x1864)
[ 36.208800] [<c01a7ab0>] (rcu_gp_kthread) from [<c014aee4>] (kthread+0x138/0x150)
[ 36.216309] [<c014aee4>] (kthread) from [<c0107e14>] (ret_from_fork+0x14/0x20)
This patch fixes by parsing the result of rtc_write_sync_lp() and
propagating both in the probe and elsewhere. If the RTC doesn't start we
don't proceed loading the driver and don't get into this loop mess later
on.
Fixes: 179a502f8c ("rtc: snvs: add Freescale rtc-snvs driver")
Signed-off-by: Bryan O'Donoghue <pure.logic@nexus-software.ie>
Acked-by: Shawn Guo <shawn.guo@linaro.org>
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 0e254963b6 ]
Most register accesses in the altera driver honor port->regshift by
using altera_uart_writel(). There are a few accesses however that were
missed when the driver was converted to use port->regshift and some
others were added later in commit 4d9d7d896d ("serial: altera_uart:
add earlycon support").
Fixes: 2780ad42f5 ("tty: serial: altera_uart: Use port->regshift to store bus shift")
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Acked-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 2e9fe53910 ]
Currently, data in RX FIFO is read based on UART_LSR register state even
if RDI and RLSI interrupts are disabled in UART_IER register.
This is because when IRQ handler is called due to TX FIFO empty event,
RX FIFO is serviced based on UART_LSR register status instead of
UART_IIR status. This defeats the purpose of disabling UART RX
FIFO interrupts during throttling(see, omap_8250_throttle()) as IRQ
handler continues to drain UART RX FIFO resulting in overflow of buffer
at tty layer.
Fix this by making sure that driver drains UART RX FIFO only when
UART_IIR_RDI is set along with UART_LSR_BI or UART_LSR_DR bits.
Signed-off-by: Vignesh R <vigneshr@ti.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit f9f5786987 ]
The arc_uart_ports[] array is indexed using a value derived from the
"serialN" alias in DT, which may lead to an out-of-bounds access.
Fix this by adding a range check.
Note that the array size is defined by a Kconfig symbol
(CONFIG_SERIAL_ARC_NR_PORTS), so this can even be triggered using a
legitimate DTB.
Fixes: ea28fd56fc ("serial/arc-uart: switch to devicetree based probing")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit ffab87fdec ]
The lpuart_ports[] array is indexed using a value derived from the
"serialN" alias in DT, which may lead to an out-of-bounds access.
Fix this by adding a range check.
Fixes: c9e2e946fb ("tty: serial: add Freescale lpuart driver support")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 5673444821 ]
The imx_ports[] array is indexed using a value derived from the
"serialN" alias in DT, or from platform data, which may lead to an
out-of-bounds access.
Fix this by adding a range check.
Fixes: ff05967a07 ("serial/imx: add of_alias_get_id() reference back")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit dd345a31bf ]
The auart_port[] array is indexed using a value derived from the
"serialN" alias in DT, or from platform data, which may lead to an
out-of-bounds access.
Fix this by adding a range check.
Fixes: 1ea6607d4c ("serial: mxs-auart: Allow device tree probing")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 49ee23b718 ]
The s3c24xx_serial_ports[] array is indexed using a value derived from
the "serialN" alias in DT, or from an incrementing probe index, which
may lead to an out-of-bounds access.
Fix this by adding a range check.
Note that the array size is defined by a Kconfig symbol
(CONFIG_SERIAL_SAMSUNG_UARTS), so this can even be triggered using
a legitimate DTB or legitimate board code.
Fixes: 13a9f6c64f ("serial: samsung: Consider DT alias when probing ports")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 090fa4b0dc ]
The sci_ports[] array is indexed using a value derived from the
"serialN" alias in DT, which may lead to an out-of-bounds access.
Fix this by adding a range check.
Note that the array size is defined by a Kconfig symbol
(CONFIG_SERIAL_SH_SCI_NR_UARTS), so this can even be triggered using a
legitimate DTB.
Fixes: 97ed9790c5 ("serial: sh-sci: Remove unused platform data capabilities field")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit e7d75e18d0 ]
The cdns_uart_port[] array is indexed using a value derived from the
"serialN" alias in DT, which may lead to an out-of-bounds access.
Fix this by adding a range check.
Fixes: 928e926349 ("tty: xuartps: Initialize ports according to aliases")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Michal Simek <michal.simek@xilinx.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit c685af1108 ]
Fixes missing characters on kernel console at low baud rates (i.e.9600).
The driver should poll TX_RDY or TX_FIFO_EMP instead of TX_EMP to ensure
that the transmitter holding register (THR) is ready to receive a new byte.
TX_EMP tells us when it is possible to send a break sequence via
SND_BRK_SEQ. While this also indicates that both the THR and the TSR are
empty, it does not guarantee that a new byte can be written just yet.
Fixes: 30530791a7 ("serial: mvebu-uart: initial support for Armada-3700 serial port")
Reviewed-by: Miquel Raynal <miquel.raynal@bootlin.com>
Acked-by: Gregory CLEMENT <gregory.clement@bootlin.com>
Signed-off-by: Gabriel Matni <gabriel.matni@exfo.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 67300abdbe ]
Currently an out of range dev->nr is detected by just reporting the
issue and later on an out-of-bounds read on array card occurs because
of this. Fix this by checking the upper range of dev->nr with the size
of array card (removes the hard coded size), move this check earlier
and also exit with the error -ENOSYS to avoid the later out-of-bounds
array read.
Detected by CoverityScan, CID#711191 ("Out-of-bounds-read")
Fixes: commit 02b20b0b4c ("V4L/DVB (12730): Add conexant cx25821 driver")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
[hans.verkuil@cisco.com: %ld -> %zd]
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 65243386f4 ]
The vivid driver has two custom controls that change the behavior of RDS.
Depending on the control setting the V4L2_CAP_READWRITE capability is toggled.
However, after an earlier commit the capability was no longer set correctly.
This is now fixed.
Fixes: 9765a32cd8 ("vivid: set device_caps in video_device")
Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit d13a0139d7 ]
Fixes vb2_vmalloc_get_userptr() to ioremap correct area.
Since the current code does ioremap the page address, if the offset > 0,
it does not do ioremap the last page and results in kernel panic.
This fixes to pass the size + offset to ioremap so that ioremap
can map correct area. Also, this uses __pfn_to_phys() to get the physical
address of given PFN.
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Reported-by: Takao Orito <orito.takao@socionext.com>
Reported-by: Fumihiro ATSUMI <atsumi@infinitegra.co.jp>
Reviewed-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 9f564184e6 ]
The ADV748x handles interlaced media using V4L2_FIELD_ALTERNATE field
types. The correct specification for the height on the mbus is the
image height, in this instance, the field height.
The AFE component already correctly adjusts the height on the mbus, but
the HDMI component got left behind.
Adjust the mbus height to correctly describe the image height of the
fields when processing interlaced video for HDMI pipelines.
Fixes: 3e89586a64 ("media: i2c: adv748x: add adv748x driver")
Reviewed-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Signed-off-by: Kieran Bingham <kieran.bingham+renesas@ideasonboard.com>
Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 94448e21cf ]
Both lgdt33606a_release and lgdt3306a_remove kfree state, but _release is
called first, then _remove operates on states members before kfree'ing it.
This can lead to random oops/GPF/etc on USB disconnect.
Signed-off-by: Brad Love <brad@nextdimension.cc>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit a398e04363 ]
While experimenting with older compiler versions, I ran
into a warning that no longer shows up on gcc-4.8 or newer:
drivers/media/platform/s3c-camif/camif-capture.c: In function '__camif_subdev_try_format':
drivers/media/platform/s3c-camif/camif-capture.c:1265:25: error: array subscript is below array bounds
This is an off-by-one bug, leading to an access before the start of the
array, while newer compilers silently assume this undefined behavior
cannot happen and leave the loop at index 0 if no other entry matches.
As Sylvester explains, we actually need to ensure that the
value is within the range, so this reworks the loop to be
easier to parse correctly, and an additional check to fall
back on the first format value for any unexpected input.
I found an existing gcc bug for it and added a reduced version
of the function there.
Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69249#c3
Fixes: babde1c243 ("[media] V4L: Add driver for S3C24XX/S3C64XX SoC series camera interface")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Acked-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 5ceade1d97 ]
Currently clk_freq is ignored entirely, because the cx235840 driver
configures the xtal at the chip defaults. This is an issue if a
board is produced with a non-default frequency crystal. If clk_freq
is not zero the cx25840 will attempt to use the setting provided,
or fall back to defaults otherwise.
Signed-off-by: Brad Love <brad@nextdimension.cc>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 779c79d4b8 ]
Hauppauge produced a revision of ImpactVCBe using an 888,
with a 25MHz crystal, instead of using the default third
overtone 50Mhz crystal. This overrides that frequency so
that the cx25840 is properly configured. Without the proper
crystal setup the cx25840 cannot load the firmware or
decode video.
Signed-off-by: Brad Love <brad@nextdimension.cc>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 6b71aceceb ]
The fixed_pll also has a fractional part. On axg s400 board, without
this parameter, the calculated rate is off by ~8Mhz (0,4%). The fixed_pll
being the root of the peripheral clock tree, this error is propagated to
the rest of the clocks
Adding the definition of the parameter fixes the problem
Fixes: 78b4af312f ("clk: meson-axg: add clock controller drivers")
Signed-off-by: Jerome Brunet <jbrunet@baylibre.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit a8321e7887 ]
Rates declared in PLL rate tables should match exactly rates calculated
from PLL coefficients. If that is not the case, rate of the PLL's child clock
might be set not as expected. For instance, if in the PLL rates table we have
a 393216000 Hz entry and the real value as returned by the PLL's recalc_rate
callback is 393216003, after setting PLL's clk rate to 393216000 clk_get_rate
will return 393216003. If we now attempt to set rate of a PLL's child divider
clock to 393216000/2 its rate will be 131072001, rather than 196608000.
That is, the divider will be set to 3 instead of 2, because 393216003/2 is
greater than 196608000.
To fix this issue declared rates are changed to exactly match rates generated
by the PLL, as calculated from the P, M, S, K coefficients.
In this patch an erroneous P value for 74176002 output frequency is also
corrected.
Signed-off-by: Andrzej Hajda <a.hajda@samsung.com>
Acked-by: Chanwoo Choi <cw00.choi@samsung.com>
Acked-by: Tomasz Figa <tomasz.figa@gmail.com>
Signed-off-by: Sylwester Nawrocki <s.nawrocki@samsung.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 2ac051eeab ]
Rates declared in PLL rate tables should match exactly rates calculated
from PLL coefficients. If that is not the case, rate of the PLL's child clock
might be set not as expected. For instance, if in the PLL rates table we have
a 393216000 Hz entry and the real value as returned by the PLL's recalc_rate
callback is 393216003, after setting PLL's clk rate to 393216000 clk_get_rate
will return 393216003. If we now attempt to set rate of a PLL's child divider
clock to 393216000/2 its rate will be 131072001, rather than 196608000.
That is, the divider will be set to 3 instead of 2, because 393216003/2 is
greater than 196608000.
To fix this issue declared rates are changed to exactly match rates generated
by the PLL, as calculated from the P, M, S, K coefficients.
Signed-off-by: Andrzej Hajda <a.hajda@samsung.com>
Acked-by: Chanwoo Choi <cw00.choi@samsung.com>
Acked-by: Tomasz Figa <tomasz.figa@gmail.com>
Signed-off-by: Sylwester Nawrocki <s.nawrocki@samsung.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit ab0447845c ]
Rates declared in PLL rate tables should match exactly rates calculated from
the PLL coefficients. If that is not the case, rate of the PLL's child clock
might be set not as expected. For instance, if in the PLL rates table we have
a 393216000 Hz entry and the real value as returned by the PLL's recalc_rate
callback is 393216003, after setting PLL's clk rate to 393216000 clk_get_rate
will return 393216003. If we now attempt to set rate of a PLL's child divider
clock to 393216000/2 its rate will be 131072001, rather than 196608000.
That is, the divider will be set to 3 instead of 2, because 393216003/2 is
greater than 196608000.
To fix this issue declared rates are changed to exactly match rates generated
by the PLL, as calculated from the P, M, S, K coefficients.
Signed-off-by: Andrzej Hajda <a.hajda@samsung.com>
Acked-by: Tomasz Figa <tomasz.figa@gmail.com>
Acked-by: Chanwoo Choi <cw00.choi@samsung.com>
Signed-off-by: Sylwester Nawrocki <s.nawrocki@samsung.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit cdb68fbd4e ]
Rates declared in PLL rate tables should match exactly rates calculated from
the PLL coefficients. If that is not the case, rate of the PLL's child clock
might be set not as expected. For instance, if in the PLL rates table we have
a 393216000 Hz entry and the real value as returned by the PLL's recalc_rate
callback is 393216003, after setting PLL's clk rate to 393216000 clk_get_rate
will return 393216003. If we now attempt to set rate of a PLL's child divider
clock to 393216000/2 its rate will be 131072001, rather than 196608000.
That is, the divider will be set to 3 instead of 2, because 393216003/2 is
greater than 196608000.
To fix this issue declared rates are changed to exactly match rates generated
by the PLL, as calculated from the P, M, S, K coefficients.
Signed-off-by: Andrzej Hajda <a.hajda@samsung.com>
Acked-by: Tomasz Figa <tomasz.figa@gmail.com>
Acked-by: Chanwoo Choi <cw00.choi@samsung.com>
Signed-off-by: Sylwester Nawrocki <s.nawrocki@samsung.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 7e4db0c283 ]
Rates declared in PLL rate tables should match exactly rates calculated from
the PLL coefficients. If that is not the case, rate of the PLL's child clock
might be set not as expected. For instance, if in the PLL rates table we have
a 393216000 Hz entry and the real value as returned by the PLL's recalc_rate
callback is 393216003, after setting PLL's clk rate to 393216000 clk_get_rate
will return 393216003. If we now attempt to set rate of a PLL's child divider
clock to 393216000/2 its rate will be 131072001, rather than 196608000.
That is, the divider will be set to 3 instead of 2, because 393216003/2 is
greater than 196608000.
To fix this issue declared rates are changed to exactly match rates generated
by the PLL, as calculated from the P, M, S, K coefficients.
Signed-off-by: Andrzej Hajda <a.hajda@samsung.com>
Acked-by: Tomasz Figa <tomasz.figa@gmail.com>
Acked-by: Chanwoo Choi <cw00.choi@samsung.com>
Signed-off-by: Sylwester Nawrocki <s.nawrocki@samsung.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 179db533c0 ]
Rates declared in PLL rate tables should match exactly rates calculated from
the PLL coefficients. If that is not the case, rate of the PLL's child clock
might be set not as expected. For instance, if in the PLL rates table we have
a 393216000 Hz entry and the real value as returned by the PLL's recalc_rate
callback is 393216003, after setting PLL's clk rate to 393216000 clk_get_rate
will return 393216003. If we now attempt to set rate of a PLL's child divider
clock to 393216000/2 its rate will be 131072001, rather than 196608000.
That is, the divider will be set to 3 instead of 2, because 393216003/2 is
greater than 196608000.
To fix this issue declared rates are changed to exactly match rates generated
by the PLL, as calculated from the P, M, S, K coefficients.
Signed-off-by: Andrzej Hajda <a.hajda@samsung.com>
Acked-by: Tomasz Figa <tomasz.figa@gmail.com>
Acked-by: Chanwoo Choi <cw00.choi@samsung.com>
Signed-off-by: Sylwester Nawrocki <s.nawrocki@samsung.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 4bf59902b5 ]
The MMC sample and drv clock for rockchip platforms are derived from
the bus clock output to the MMC/SDIO card. So it should never happens
that the clk rate is zero given it should inherits the clock rate from
its parent. If something goes wrong and makes the clock rate to be zero,
the calculation would be wrong but may still make the mmc tuning process
work luckily. However it makes people harder to debug when the following
data transfer is unstable.
Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com>
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit c35b518f9b ]
Turns out latest upstream U-Boot does not configure/enable pll_u which
leaves it at some default rate of 500 kHz:
root@apalis-t30:~# cat /sys/kernel/debug/clk/clk_summary | grep pll_u
pll_u 3 3 0 500000 0
Of course this won't quite work leading to the following messages:
[ 6.559593] usb 2-1: new full-speed USB device number 2 using tegra-
ehci
[ 11.759173] usb 2-1: device descriptor read/64, error -110
[ 27.119453] usb 2-1: device descriptor read/64, error -110
[ 27.389217] usb 2-1: new full-speed USB device number 3 using tegra-
ehci
[ 32.559454] usb 2-1: device descriptor read/64, error -110
[ 47.929777] usb 2-1: device descriptor read/64, error -110
[ 48.049658] usb usb2-port1: attempt power cycle
[ 48.759475] usb 2-1: new full-speed USB device number 4 using tegra-
ehci
[ 59.349457] usb 2-1: device not accepting address 4, error -110
[ 59.509449] usb 2-1: new full-speed USB device number 5 using tegra-
ehci
[ 70.069457] usb 2-1: device not accepting address 5, error -110
[ 70.079721] usb usb2-port1: unable to enumerate USB device
Fix this by actually allowing the rate also being set from within
the Linux kernel.
Signed-off-by: Marcel Ziswiler <marcel.ziswiler@toradex.com>
Tested-by: Jon Hunter <jonathanh@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit df934cbcbf ]
The symbol is in the __initconst section but not marked init, which
caused a warning when building with LTO.
This makes it 'const' as was obviously intended.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Fixes: c80dfd9bf5 ("clk: hisilicon: add CRG driver for Hi3516CV300 SoC")
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 1f9c63e8de ]
It's found that the clock phase output from clk_summary is
wrong compared to the actual phase reading from the register.
cat /sys/kernel/debug/clk/clk_summary | grep sdio_sample
sdio_sample 0 1 0 50000000 0 -22
It exposes an issue that clk core, clk_core_get_phase, always
returns the cached core->phase which should be either updated
by calling clk_set_phase or directly from the first place the
clk was registered.
When registering the clk, the core->phase geting from ->get_phase()
may return negative value indicating error. This is quite common
since the clk's phase may be highly related to its parent chain,
but it was temporarily orphan when registered, since its parent
chains hadn't be ready at that time, so the clk drivers decide to
return error in this case. However, if no clk_set_phase is called or
maybe the ->set_phase() isn't even implemented, the core->phase would
never be updated. This is wrong, and we should try to update it when
all its parent chains are settled down, like the way of updating clock
rate for that. But it's not deserved to complicate the code now and
just update it anyway when calling clk_core_get_phase, which would be
much simple and enough.
Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com>
Acked-by: Jerome Brunet <jbrunet@baylibre.com>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 4b0556a441 ]
commit c420c1e4db22 ("clk: rockchip: Prevent calculating mmc phase
if clock rate is zero") catches one gremlin again for clk-rk3228.c
that the parent of SDMMC phase clock should be sclk_sdmmc0, but not
sclk_sdmmc. However, the naming of the sdmmc clocks varies in the
manual with the card clock having the 0 while the hclk is named
without appended 0. So standardize one one format to prevent
confusion, as there also is only one (non-sdio) mmc controller on
the soc.
Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com>
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 647d04f8e0 ]
If the RCLK mux clock configuration is specified in DT and no set_sysclk()
callback is used in the sound card driver the sclk_srcrate field will remain
set to 0, leading to an incorrect PSR divider setting.
To fix this the frequency value is retrieved from the CLK_I2S_RCLK_SRC clock,
so the actual RCLK mux selection is taken into account.
Signed-off-by: Sylwester Nawrocki <s.nawrocki@samsung.com>
Acked-by: Krzysztof Kozlowski <krzk@kernel.org>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit bde8b3887a ]
This patch adds the change required to create the TLV data
for dapm widget kcontrols from topology. This also fixes the following
TLV read error shown in amixer while showing the card control contents.
"amixer: Control hw:1 element TLV read error: No such device or address"
Signed-off-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 1d22c337dc ]
In case of sample rates lower than 44100 currently there is too low MCLK
frequency set for the CODEC. Playback fails with following errors:
$ speaker-test -c2 -t sine -f 1500 -l2 -r 32000
Sine wave rate is 1500.0000Hz
Rate set to 32000Hz (requested 32000Hz)
Buffer size range from 128 to 131072
Period size range from 64 to 65536
Using max buffer size 131072
Periods = 4
Unable to set hw params for playback: Invalid argument
Setting of hwparams failed: Invalid argument
[ 497.883700] max98090 1-0010: Invalid master clock frequency
To fix this the I2S root clock's frequency is increased, depending
on sampling rate.
Signed-off-by: Sylwester Nawrocki <s.nawrocki@samsung.com>
Acked-by: Krzysztof Kozlowski <krzk@kernel.org>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 04673e38f5 ]
The driver controls when the hardware sends completions that communicate
consumption of elements from the WQ. This is done by setting a WQEC bit
on a WQE.
The current driver sets it on every Nth WQE posting. However, the driver
isn't clearing the bit if the WQE is reused. Thus, if the queue depth
isn't evenly divisible by N, with enough time, it can be set on every
element, creating a lot of overhead and risking CQ full conditions.
Correct by clearing the bit when not setting it on an Nth element.
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 91455b8509 ]
A stress test repeatedly resetting the adapter while performing io would
eventually report I/O failures and missing nvme namespaces.
The driver was setting the nvmefc_fcp_req->private pointer to NULL
during the IO completion routine before upcalling done(). If the
transport was also running an abort for that IO, the driver would fail
the abort with message 6140. Failing the abort is not allowed by the
nvme-fc transport, as it mandates that the io must be returned back to
the transport. As that does not happen, the transport controller delete
has an outstanding reference and can't complete teardown.
The NULL-ing of the private pointer should be done only when the io is
considered complete. It's complete when the adapter returns the exchange
with the "exchange busy" flag clear.
Move the NULL'ing of the structure to the done case. This leaves the io
contexts set while it is busy and until the subsequent XRI_ABORTED
completion which returns the exchange is received.
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 161df4f099 ]
During link bounce testing in a point-to-point topology, the host may
enter a soft lockup on the lpfc_worker thread:
Call Trace:
lpfc_work_done+0x1f3/0x1390 [lpfc]
lpfc_do_work+0x16f/0x180 [lpfc]
kthread+0xc7/0xe0
ret_from_fork+0x3f/0x70
The driver was simultaneously setting a combination of flags that caused
lpfc_do_work()to effectively spin between slow path work and new event
data, causing the lockup.
Ensure in the typical wq completions, that new event data flags are set
if the slow path flag is running. The slow path will eventually
reschedule the wq handling.
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 815a9c4376 ]
In a test that is doing large numbers of cable swaps on the target, the
nvme controllers wouldn't reconnect.
During the cable swaps, the targets n_port_id would change. This
information was passed to the nvme-fc transport, in the new remoteport
registration. However, the nvme-fc transport didn't update the n_port_id
value in the remoteport struct when it reused an existing structure.
Later, when a new association was attempted on the remoteport, the
driver's NVME LS routine would use the stale n_port_id from the
remoteport struct to address the LS. As the device is no longer at that
address, the LS would go into never never land.
Separately, the nvme-fc transport will be corrected to update the
n_port_id value on a re-registration.
However, for now, there's no reason to use the transports values. The
private pointer points to the drivers node structure and the node
structure is up to date. Therefore, revise the LS routine to use the
drivers data structures for the LS. Augmented the debug message for
better debugging in the future.
Also removed a duplicate if check that seems to have slipped in.
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 1875ede02e ]
The SCSI PRE-FETCH (10 or 16) command is present both on hard disks
and some SSDs. It is useful when the address of the next block(s) to
be read is known but it is not following the LBA of the current READ
(so read-ahead won't help). It returns two "good" SCSI Status values.
If the requested blocks have fitted (or will most likely fit (when
the IMMED bit is set)) into the disk's cache, it returns CONDITION
MET. If it didn't (or will not) fit then it returns GOOD status.
The goal of this patch is to stop the SCSI subsystem treating the
CONDITION MET SCSI status as an error. The current state makes the
PRE-FETCH command effectively unusable via pass-throughs.
Signed-off-by: Douglas Gilbert <dgilbert@interlog.com>
Reviewed-by: Bart Van Assche <bart.vanassche@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit b7007dbccd ]
When exiting a transformation, the cra_exit() helper is called in each
driver providing one. The Inside Secure SafeXcel driver has one, which
is responsible of freeing some areas and of sending one invalidation
request to the crypto engine, to invalidate the context that was used
during the transformation.
We could see in some setups (when lots of transformations were being
used with a short lifetime, and hence lots of cra_exit() calls) NULL
pointer dereferences and other weird issues. All these issues were
coming from accessing the tfm context.
The issue is the invalidation request completion is checked using a
wait_for_completion_interruptible() call in both the cipher and hash
cra_exit() helpers. In some cases this was interrupted while the
invalidation request wasn't processed yet. And then cra_exit() returned,
and its caller was freeing the tfm instance. Only then the request was
being handled by the SafeXcel driver, which lead to the said issues.
This patch fixes this by using wait_for_completion() calls in these
specific cases.
Fixes: 1b44c5a60c ("crypto: inside-secure - add SafeXcel EIP197 crypto engine driver")
Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit e1d24c0bb7 ]
This patch fixes the Inside Secure SafeXcel driver not to overwrite the
interrupt threshold value. In certain cases the value of this register,
which controls when to fire an interrupt, was overwritten. This lead to
packet not being processed or acked as the driver never was aware of
their completion.
This patch fixes this behaviour by not setting the threshold when
requests are being processed by the engine.
Fixes: dc7e28a328 ("crypto: inside-secure - dequeue all requests at once")
Suggested-by: Ofer Heifetz <oferh@marvell.com>
Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 666a9c70b0 ]
This patch fixes the cache length computation as cache_len could end up
being a negative value. The check between the queued size and the
block size is updated to reflect the caching mechanism which can cache
up to a full block size (included!).
Fixes: 809778e02c ("crypto: inside-secure - fix hash when length is a multiple of a block")
Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 95831ceafc ]
This patch adds a check in the SafeXcel dequeue function, to avoid
processing request further if no hardware command was issued. This can
happen in certain cases where the ->send() function caches all the data
that would have been send.
Fixes: 809778e02c ("crypto: inside-secure - fix hash when length is a multiple of a block")
Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 79eb382b5e ]
I don't why we need take a single write lock and disable interrupts
while setting up debugfs. This is what what happens when we try anyway:
|ccp 0000:03:00.2: enabling device (0000 -> 0002)
|BUG: sleeping function called from invalid context at kernel/locking/rwsem.c:69
|in_atomic(): 1, irqs_disabled(): 1, pid: 3, name: kworker/0:0
|irq event stamp: 17150
|hardirqs last enabled at (17149): [<0000000097a18c49>] restore_regs_and_return_to_kernel+0x0/0x23
|hardirqs last disabled at (17150): [<000000000773b3a9>] _raw_write_lock_irqsave+0x1b/0x50
|softirqs last enabled at (17148): [<0000000064d56155>] __do_softirq+0x3b8/0x4c1
|softirqs last disabled at (17125): [<0000000092633c18>] irq_exit+0xb1/0xc0
|CPU: 0 PID: 3 Comm: kworker/0:0 Not tainted 4.16.0-rc2+ #30
|Workqueue: events work_for_cpu_fn
|Call Trace:
| dump_stack+0x7d/0xb6
| ___might_sleep+0x1eb/0x250
| down_write+0x17/0x60
| start_creating+0x4c/0xe0
| debugfs_create_dir+0x9/0x100
| ccp5_debugfs_setup+0x191/0x1b0
| ccp5_init+0x8a7/0x8c0
| ccp_dev_init+0xb8/0xe0
| sp_init+0x6c/0x90
| sp_pci_probe+0x26e/0x590
| local_pci_probe+0x3f/0x90
| work_for_cpu_fn+0x11/0x20
| process_one_work+0x1ff/0x650
| worker_thread+0x1d4/0x3a0
| kthread+0xfe/0x130
| ret_from_fork+0x27/0x50
If any locking is required, a simple mutex will do it.
Cc: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Acked-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 4dc5475ae0 ]
This patch updates the safexcel_hmac_init_pad() function to also wait
for completion when the digest return code is -EBUSY, as it would mean
the request is in the backlog to be processed later.
Fixes: 1b44c5a60c ("crypto: inside-secure - add SafeXcel EIP197 crypto engine driver")
Suggested-by: Ofer Heifetz <oferh@marvell.com>
Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit b869648c06 ]
This patches moves the digest information from the transformation
context to the request context. This fixes cases where HMAC init
functions were called and override the digest value for a short period
of time, as the HMAC init functions call the SHA init one which reset
the value. This lead to a small percentage of HMAC being incorrectly
computed under heavy load.
Fixes: 1b44c5a60c ("crypto: inside-secure - add SafeXcel EIP197 crypto engine driver")
Suggested-by: Ofer Heifetz <oferh@marvell.com>
Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
[Ofer here did all the work, from seeing the issue to understanding the
root cause. I only made the patch.]
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 17556cdbe6 ]
Commit 8f18c8a48b ("staging: lustre: lmv: separate master object
with master stripe") changed how lmo_root inodes were managed,
particularly when LMV_HASH_FLAG_MIGRATION is not set.
Previously lsm_md_oinfo[0].lmo_root was always a borrowed
inode reference and didn't need to by iput().
Since the change, that special case only applies when
LMV_HASH_FLAG_MIGRATION is set
In the upstream (lustre-release) version of this patch [Commit
60e07b972114 ("LU-4690 lod: separate master object with master
stripe")] the for loop in the lmv_unpack_md() was changed to count
from 0 and to ignore entry 0 if LMV_HASH_FLAG_MIGRATION is set.
In the patch that got applied to Linux, that change was missing,
so lsm_md_oinfo[0].lmo_root is never iput().
This results in a "VFS: Busy inodes" warning at unmount.
Fixes: 8f18c8a48b ("staging: lustre: lmv: separate master object with master stripe")
Signed-off-by: NeilBrown <neilb@suse.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit dc13498ab4 ]
The case statement in get_ap_information() should not use literal integers
to parse information element IDs when these values are provided by name
in 'enum ieee80211_eid' in the header 'linux/ieee80211.h'.
Signed-off-by: Quytelda Kahja <quytelda@tamalin.org>
Reviewed-by: Tobin C. Harding <me@tobin.cc>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit e1a7418529 ]
Currently the allocation of priv->oldaddr is not null checked which will
lead to subsequent errors when accessing priv->oldaddr. Fix this with
a null pointer check and a return of -ENOMEM on allocation failure.
Detected with Coccinelle:
drivers/staging/rtl8192u/r8192U_core.c:1708:2-15: alloc with no test,
possible model on line 1723
Fixes: 8fc8598e61 ("Staging: Added Realtek rtl8192u driver to staging")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 75c583ab97 ]
The DPAA2 Ethernet driver incorrectly assumes virtual addresses
are always 64b long, which causes compiler errors when building
for a 32b platform.
Fix this by using explicit casts to uintptr_t where necessary.
Signed-off-by: Ioana Radulescu <ruxandra.radulescu@nxp.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 2fab9faf9b ]
The lustre-release patch commit bdc5bb52c554 ("LU-4933 osc:
Automatically increase the max_dirty_mb") changed
- if (cli->cl_dirty + PAGE_CACHE_SIZE <= cli->cl_dirty_max &&
+ if (cli->cl_dirty_pages < cli->cl_dirty_max_pages &&
When this patch landed in Linux a couple of years later, it landed as
- if (cli->cl_dirty + PAGE_SIZE <= cli->cl_dirty_max &&
+ if (cli->cl_dirty_pages <= cli->cl_dirty_max_pages &&
which is clearly different ('<=' vs '<'), and allows cl_dirty_pages to
increase beyond cl_dirty_max_pages - which causes a latter assertion
to fails.
Fixes: 3147b26840 ("staging: lustre: osc: Automatically increase the max_dirty_mb")
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 6a9bbe53db ]
Use netdev_alloc_frag() instead of kmalloc to allocate space for
the S/G table of egress multi-buffer frames.
This fixes a bug where an unaligned pointer received from the
allocator would be overwritten with the 64B aligned value,
leading to a wrong address being later passed to kfree.
Signed-off-by: Ioana Radulescu <ruxandra.radulescu@nxp.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a7cfebcb75 upstream.
There's currently no limit on wiphy names, other than netlink
message size and memory limitations, but that causes issues when,
for example, the wiphy name is used in a uevent, e.g. in rfkill
where we use the same name for the rfkill instance, and then the
buffer there is "only" 2k for the environment variables.
This was reported by syzkaller, which used a 4k name.
Limit the name to something reasonable, I randomly picked 128.
Reported-by: syzbot+230d9e642a85d3fec29c@syzkaller.appspotmail.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit bdac616db9 upstream.
Commit 2d1d4c1e59 made loop_get_status() drop lo_ctx_mutex before
returning, but the loop_get_status_old(), loop_get_status64(), and
loop_get_status_compat() wrappers don't call loop_get_status() if the
passed argument is NULL. The callers expect that the lock is dropped, so
make sure we drop it in that case, too.
Reported-by: syzbot+31e8daa8b3fc129e75f2@syzkaller.appspotmail.com
Fixes: 2d1d4c1e59 ("loop: don't call into filesystem while holding lo_ctl_mutex")
Signed-off-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2d1d4c1e59 upstream.
We hit an issue where a loop device on NFS was stuck in
loop_get_status() doing vfs_getattr() after the NFS server died, which
caused a pile-up of uninterruptible processes waiting on lo_ctl_mutex.
There's no reason to hold this lock while we wait on the filesystem;
let's drop it so that other processes can do their thing. We need to
grab a reference on lo_backing_file while we use it, and we can get rid
of the check on lo_device, which has been unnecessary since commit
a34c0ae9ebd6 ("[PATCH] loop: remove the bio remapping capability") in
the linux-history tree.
Signed-off-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 0ee78c1014 ]
xhci driver displays the supported xHC USB revision in a message during
driver load:
"Host supports USB 3.1 Enhanced SuperSpeed"
Get the USB minor revision number from the xhci protocol capability.
This will show the correct supported revisions for new USB 3.2 and later
hosts
Don't rely on the SBRN (serial bus revision number) register, it's often
showing 0x30 (USB3.0) for hosts that support USB 3.1
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit c7c7e8d780 ]
Hauppauge em28xx bulk devices exhibit continuity errors and corrupted
packets, when run in VMWare virtual machines. Unknown if other
manufacturers bulk models exhibit the same issue. KVM/Qemu is unaffected.
According to documentation the maximum packet multiplier for em28xx in bulk
transfer mode is 256 * 188 bytes. This changes the size of bulk transfers
to maximum supported value and have a bonus beneficial alignment.
Before:
After:
This sets up USB to expect just as many bytes as the em28xx is set to emit.
Successful usage under load afterwards natively and in both VMWare
and KVM/Qemu virtual machines.
Signed-off-by: Brad Love <brad@nextdimension.cc>
Reviewed-by: Michael Ira Krufky <mkrufky@linuxtv.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 835d66173a ]
When used as an i2c device there is a module usage count mismatch on
removal, preventing the driver from being used thereafter. dvb_attach
increments the usage count so it is properly balanced on removal.
On disconnect of Hauppauge SoloHD/DualHD before:
lsmod | grep lgdt3306a
lgdt3306a 28672 -1
i2c_mux 16384 1 lgdt3306a
On disconnect of Hauppauge SoloHD/DualHD after:
lsmod | grep lgdt3306a
lgdt3306a 28672 0
i2c_mux 16384 1 lgdt3306a
Signed-off-by: Brad Love <brad@nextdimension.cc>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 5d6ae4f0da ]
When handling an OS descriptor request, one of the first operations is
to zero out the request buffer using the wLength from the setup packet.
There is no bounds checking, so a wLength > 4096 would clobber memory
adjacent to the request buffer. Fix this by taking the min of wLength
and the request buffer length prior to the memset. While at it, define
the buffer length in a header file so that magic numbers don't appear
throughout the code.
When returning data to the host, the data length should be the min of
the wLength and the valid data we have to return. Currently we are
returning wLength, thus requests for a wLength greater than the amount
of data in the OS descriptor buffer would return invalid (albeit zero'd)
data following the valid descriptor data. Fix this by counting the
number of bytes when constructing the data and using this when
determining the length of the request.
Signed-off-by: Chris Dickens <christopher.a.dickens@gmail.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 4058ebf33c ]
When using a AIO read() operation on the function FS gadget driver a URB is
submitted asynchronously and on URB completion the received data is copied
to the userspace buffer associated with the read operation.
This is done from a kernel worker thread invoking copy_to_user() (through
copy_to_iter()). And while the user space process memory is made available
to the kernel thread using use_mm(), some architecture require in addition
to this that the operation runs with USER_DS set. Otherwise the userspace
memory access will fail.
For example on ARM64 with Privileged Access Never (PAN) and User Access
Override (UAO) enabled the following crash occurs.
Internal error: Accessing user space memory with fs=KERNEL_DS: 9600004f [#1] SMP
Modules linked in:
CPU: 2 PID: 1636 Comm: kworker/2:1 Not tainted 4.9.0-04081-g8ab2dfb-dirty #487
Hardware name: ZynqMP ZCU102 Rev1.0 (DT)
Workqueue: events ffs_user_copy_worker
task: ffffffc87afc8080 task.stack: ffffffc87a00c000
PC is at __arch_copy_to_user+0x190/0x220
LR is at copy_to_iter+0x78/0x3c8
[...]
[<ffffff800847b790>] __arch_copy_to_user+0x190/0x220
[<ffffff80086f25d8>] ffs_user_copy_worker+0x70/0x130
[<ffffff80080b8c64>] process_one_work+0x1dc/0x460
[<ffffff80080b8f38>] worker_thread+0x50/0x4b0
[<ffffff80080bf5a0>] kthread+0xd8/0xf0
[<ffffff8008083680>] ret_from_fork+0x10/0x50
Address this by placing a set_fs(USER_DS) before of the copy operation
and revert it again once the copy operation has finished.
This patch is analogous to commit d7ffde35e3 ("vhost: use USER_DS in
vhost_worker thread") which addresses the same underlying issue.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 946ef68ad4 ]
Some UDC drivers (like the DWC3) expect that the response to a setup()
request is queued from within the setup function itself so that it is
available as soon as setup() has completed.
Upon receiving a setup request the function fs driver creates an event that
is made available to userspace. And only once userspace has acknowledged
that event the response to the setup request is queued.
So it violates the requirement of those UDC drivers and random failures can
be observed. This is basically a race condition and if userspace is able to
read the event and queue the response fast enough all is good. But if it is
not, for example because other processes are currently scheduled to run,
the USB host that sent the setup request will observe an error.
To avoid this the gadget framework provides the USB_GADGET_DELAYED_STATUS
return code. If a setup() callback returns this value the UDC driver is
aware that response is not yet available and can uses the appropriate
methods to handle this case.
Since in the case of function fs the response will never be available when
the setup() function returns make sure that this status code is used.
This fixed random occasional failures that were previously observed on a
DWC3 based system under high system load.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 92a8dd2646 ]
Added missing GUSBCFG programming in host mode, which fixes
transaction errors issue on HiKey and Altera Cyclone V boards.
These field even if was programmed in device mode (in function
dwc2_hsotg_core_init_disconnected()) will be resetting to POR values
after core soft reset applied.
So, each time when switching to host mode required to set this field
to correct value.
Acked-by: John Youn <johnyoun@synopsys.com>
Signed-off-by: Minas Harutyunyan <hminas@synopsys.com>
Signed-off-by: Grigor Tovmasyan <tovmasya@synopsys.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit a400efe455 ]
set udev->slot_id to zero when disabling and freeing the xhci slot.
Prevents usb core from calling xhci with a stale slot id.
xHC controller may be reset during resume to recover from some error.
All slots are unusable as they are disabled and freed.
xhci driver starts slot enumeration again from 1 in the order they are
enabled. In the worst case a stale udev->slot_id for one device matches
a newly enabled slot_id for a different device, causing us to
perform a action on the wrong device.
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 71426535f4 ]
Add native DSD support quirk for Luxman DA-06 DAC, by adding the
PID/VID 1852:5065.
Rename "is_marantz_denon_dac()" function to "is_itf_usb_dsd_2alts_dac()"
to cover broader device family sharing the same USB audio
implementation(*).
For the same reason, rename "is_teac_dsd_dac()" function to
"is_itf_usb_dsd_3alts_dac()".
(*)
These devices have the same USB controller "ITF-USB DSD", supplied by
INTERFACE Co., Ltd.
"ITF-USB DSD" USB controller has two patterns,
Pattern 1. (2 altsets version)
- Altset 0: for control
- Altset 1: for stream (S32)
- Altset 2: for stream (S32, DSD_U32)
Pattern 2. (3 altsets version)
- Altset 0: for control
- Altset 1: for stream (S16)
- Altset 2: for stream (S32)
- Altset 3: for stream (S32, DSD_U32)
"is_itf_usb_dsd_2alts_dac()" returns true, if the DAC has "Pattern 1"
USB controller, and "is_itf_usb_dsd_3alts_dac()" returns true, if
"Pattern2".
Signed-off-by: Nobutaka Okabe <nob77413@gmail.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit fa89adba19 upstream.
zfcp_erp_adapter_reopen() schedules blocking of all of the adapter's
rports via zfcp_scsi_schedule_rports_block() and enqueues a reopen
adapter ERP action via zfcp_erp_action_enqueue(). Both are separately
processed asynchronously and concurrently.
Blocking of rports is done in a kworker by zfcp_scsi_rport_work(). It
calls zfcp_scsi_rport_block(), which then traces a DBF REC "scpdely" via
zfcp_dbf_rec_trig(). zfcp_dbf_rec_trig() acquires the DBF REC spin lock
and then iterates with list_for_each() over the adapter's ERP ready list
without holding the ERP lock. This opens a race window in which the
current list entry can be moved to another list, causing list_for_each()
to iterate forever on the wrong list, as the erp_ready_head is never
encountered as terminal condition.
Meanwhile the ERP action can be processed in the ERP thread by
zfcp_erp_thread(). It calls zfcp_erp_strategy(), which acquires the ERP
lock and then calls zfcp_erp_action_to_running() to move the ERP action
from the ready to the running list. zfcp_erp_action_to_running() can
move the ERP action using list_move() just during the aforementioned
race window. It then traces a REC RUN "erator1" via zfcp_dbf_rec_run().
zfcp_dbf_rec_run() tries to acquire the DBF REC spin lock. If this is
held by the infinitely looping kworker, it effectively spins forever.
Example Sequence Diagram:
Process ERP Thread rport_work
------------------- ------------------- -------------------
zfcp_erp_adapter_reopen()
zfcp_erp_adapter_block()
zfcp_scsi_schedule_rports_block()
lock ERP zfcp_scsi_rport_work()
zfcp_erp_action_enqueue(ZFCP_ERP_ACTION_REOPEN_ADAPTER)
list_add_tail() on ready !(rport_task==RPORT_ADD)
wake_up() ERP thread zfcp_scsi_rport_block()
zfcp_dbf_rec_trig() zfcp_erp_strategy() zfcp_dbf_rec_trig()
unlock ERP lock DBF REC
zfcp_erp_wait() lock ERP
| zfcp_erp_action_to_running()
| list_for_each() ready
| list_move() current entry
| ready to running
| zfcp_dbf_rec_run() endless loop over running
| zfcp_dbf_rec_run_lvl()
| lock DBF REC spins forever
Any adapter recovery can trigger this, such as setting the device offline
or reboot.
V4.9 commit 4eeaa4f3f1 ("zfcp: close window with unblocked rport
during rport gone") introduced additional tracing of (un)blocking of
rports. It missed that the adapter->erp_lock must be held when calling
zfcp_dbf_rec_trig().
This fix uses the approach formerly introduced by commit aa0fec6239
("[SCSI] zfcp: Fix sparse warning by providing new entry in dbf") that got
later removed by commit ae0904f60f ("[SCSI] zfcp: Redesign of the debug
tracing for recovery actions.").
Introduce zfcp_dbf_rec_trig_lock(), a wrapper for zfcp_dbf_rec_trig() that
acquires and releases the adapter->erp_lock for read.
Reported-by: Sebastian Ott <sebott@linux.ibm.com>
Signed-off-by: Jens Remus <jremus@linux.ibm.com>
Fixes: 4eeaa4f3f1 ("zfcp: close window with unblocked rport during rport gone")
Cc: <stable@vger.kernel.org> # 2.6.32+
Reviewed-by: Benjamin Block <bblock@linux.vnet.ibm.com>
Signed-off-by: Steffen Maier <maier@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit de5cb6eb51 ]
The BPF JIT need safe guarding against spectre v2 in the sk_load_xxx
assembler stubs and the indirect branches generated by the JIT itself
need to be converted to expolines.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 6deaa3bbca ]
The BPF JIT uses a 'b <disp>(%r<x>)' instruction in the definition
of the sk_load_word and sk_load_half functions.
Add support for branch-on-condition instructions contained in the
thunk code of an expoline.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 4253b0e062 ]
The nospec-branch.c file is compiled without the gcc options to
generate expoline thunks. The return branch of the sysfs show
functions cpu_show_spectre_v1 and cpu_show_spectre_v2 is an indirect
branch as well. These need to be compiled with expolines.
Move the sysfs functions for spectre reporting to a separate file
and loose an '.' for one of the messages.
Cc: stable@vger.kernel.org # 4.16
Fixes: d424986f1d ("s390: add sysfs attributes for spectre")
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 23a4d7fd34 ]
The return from the ftrace_stub, _mcount, ftrace_caller and
return_to_handler functions is done with "br %r14" and "br %r1".
These are indirect branches as well and need to use execute
trampolines for CONFIG_EXPOLINE=y.
The ftrace_caller function is a special case as it returns to the
start of a function and may only use %r0 and %r1. For a pre z10
machine the standard execute trampoline uses a LARL + EX to do
this, but this requires *two* registers in the range %r1..%r15.
To get around this the 'br %r1' located in the lowcore is used,
then the EX instruction does not need an address register.
But the lowcore trick may only be used for pre z14 machines,
with noexec=on the mapping for the first page may not contain
instructions. The solution for that is an ALTERNATIVE in the
expoline THUNK generated by 'GEN_BR_THUNK %r1' to switch to
EXRL, this relies on the fact that a machine that supports
noexec=on has EXRL as well.
Cc: stable@vger.kernel.org # 4.16
Fixes: f19fbd5ed6 ("s390: introduce execute-trampolines for branches")
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 97489e0663 ]
The return from the memmove, memset, memcpy, __memset16, __memset32 and
__memset64 functions are done with "br %r14". These are indirect branches
as well and need to use execute trampolines for CONFIG_EXPOLINE=y.
Cc: stable@vger.kernel.org # 4.16
Fixes: f19fbd5ed6 ("s390: introduce execute-trampolines for branches")
Reviewed-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 467a3bf219 ]
The return from the crc32_le_vgfm_16/crc32c_le_vgfm_16 and the
crc32_be_vgfm_16 functions are done with "br %r14". These are indirect
branches as well and need to use execute trampolines for CONFIG_EXPOLINE=y.
Cc: stable@vger.kernel.org # 4.16
Fixes: f19fbd5ed6 ("s390: introduce execute-trampolines for branches")
Reviewed-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 6dd85fbb87 ]
To be able to use the expoline branches in different assembler
files move the associated macros from entry.S to a new header
nospec-insn.h.
While we are at it make the macros a bit nicer to use.
Cc: stable@vger.kernel.org # 4.16
Fixes: f19fbd5ed6 ("s390: introduce execute-trampolines for branches")
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 6cf09958f3 ]
The main linker script vmlinux.lds.S for the kernel image merges
the expoline code patch tables into two section ".nospec_call_table"
and ".nospec_return_table". This is *not* done for the modules,
there the sections retain their original names as generated by gcc:
".s390_indirect_call", ".s390_return_mem" and ".s390_return_reg".
The module_finalize code has to check for the compiler generated
section names, otherwise no code patching is done. This slows down
the module code in case of "spectre_v2=off".
Cc: stable@vger.kernel.org # 4.16
Fixes: f19fbd5ed6 ("s390: introduce execute-trampolines for branches")
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 6a3d1e81a4 ]
With CONFIG_EXPOLINE_AUTO=y the call of spectre_v2_auto_early() via
early_initcall is done *after* the early_param functions. This
overwrites any settings done with the nobp/no_spectre_v2/spectre_v2
parameters. The code patching for the kernel is done after the
evaluation of the early parameters but before the early_initcall
is done. The end result is a kernel image that is patched correctly
but the kernel modules are not.
Make sure that the nospec auto detection function is called before the
early parameters are evaluated and before the code patching is done.
Fixes: 6e179d6412 ("s390: add automatic detection of the spectre defense")
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit fba9eb7946 ]
Add a header with macros usable in assembler files to emit alternative
code sequences. It works analog to the alternatives for inline assmeblies
in C files, with the same restrictions and capabilities.
The syntax is
ALTERNATIVE "<default instructions sequence>", \
"<alternative instructions sequence>", \
"<features-bit>"
and
ALTERNATIVE_2 "<default instructions sequence>", \
"<alternative instructions sqeuence #1>", \
"<feature-bit #1>",
"<alternative instructions sqeuence #2>", \
"<feature-bit #2>"
Reviewed-by: Vasily Gorbik <gor@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit d424986f1d ]
Set CONFIG_GENERIC_CPU_VULNERABILITIES and provide the two functions
cpu_show_spectre_v1 and cpu_show_spectre_v2 to report the spectre
mitigations.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit bc03559971 ]
Add a boot message if either of the spectre defenses is active.
The message is
"Spectre V2 mitigation: execute trampolines."
or "Spectre V2 mitigation: limited branch prediction."
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 6e179d6412 ]
Automatically decide between nobp vs. expolines if the spectre_v2=auto
kernel parameter is specified or CONFIG_EXPOLINE_AUTO=y is set.
The decision made at boot time due to CONFIG_EXPOLINE_AUTO=y being set
can be overruled with the nobp, nospec and spectre_v2 kernel parameters.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit b2e2f43a01 ]
Keep the code for the nobp parameter handling with the code for
expolines. Both are related to the spectre v2 mitigation.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a048a07d7f upstream.
On some CPUs we can prevent a vulnerability related to store-to-load
forwarding by preventing store forwarding between privilege domains,
by inserting a barrier in kernel entry and exit paths.
This is known to be the case on at least Power7, Power8 and Power9
powerpc CPUs.
Barriers must be inserted generally before the first load after moving
to a higher privilege, and after the last store before moving to a
lower privilege, HV and PR privilege transitions must be protected.
Barriers are added as patch sections, with all kernel/hypervisor entry
points patched, and the exit points to lower privilge levels patched
similarly to the RFI flush patching.
Firmware advertisement is not implemented yet, so CPU flush types
are hard coded.
Thanks to Michal Suchánek for bug fixes and review.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com>
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Michal Suchánek <msuchanek@suse.de>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e7347a8683 upstream.
This moves the definition of the default security feature flags
(i.e., enabled by default) closer to the security feature flags.
This can be used to restore current flags to the default flags.
Signed-off-by: Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0f9bdfe3c7 upstream.
The H_CPU_BEHAV_* flags should be checked for in the 'behaviour' field
of 'struct h_cpu_char_result' -- 'character' is for H_CPU_CHAR_*
flags.
Found by playing around with QEMU's implementation of the hypercall:
H_CPU_CHAR=0xf000000000000000
H_CPU_BEHAV=0x0000000000000000
This clears H_CPU_BEHAV_FAVOUR_SECURITY and H_CPU_BEHAV_L1D_FLUSH_PR
so pseries_setup_rfi_flush() disables 'rfi_flush'; and it also
clears H_CPU_CHAR_L1D_THREAD_PRIV flag. So there is no RFI flush
mitigation at all for cpu_show_meltdown() to report; but currently
it does:
Original kernel:
# cat /sys/devices/system/cpu/vulnerabilities/meltdown
Mitigation: RFI Flush
Patched kernel:
# cat /sys/devices/system/cpu/vulnerabilities/meltdown
Not affected
H_CPU_CHAR=0x0000000000000000
H_CPU_BEHAV=0xf000000000000000
This sets H_CPU_BEHAV_BNDS_CHK_SPEC_BAR so cpu_show_spectre_v1() should
report vulnerable; but currently it doesn't:
Original kernel:
# cat /sys/devices/system/cpu/vulnerabilities/spectre_v1
Not affected
Patched kernel:
# cat /sys/devices/system/cpu/vulnerabilities/spectre_v1
Vulnerable
Brown-paper-bag-by: Michael Ellerman <mpe@ellerman.id.au>
Fixes: f636c14790 ("powerpc/pseries: Set or clear security feature flags")
Signed-off-by: Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d6fbe1c55c upstream.
Add a definition for cpu_show_spectre_v2() to override the generic
version. This has several permuations, though in practice some may not
occur we cater for any combination.
The most verbose is:
Mitigation: Indirect branch serialisation (kernel only), Indirect
branch cache disabled, ori31 speculation barrier enabled
We don't treat the ori31 speculation barrier as a mitigation on its
own, because it has to be *used* by code in order to be a mitigation
and we don't know if userspace is doing that. So if that's all we see
we say:
Vulnerable, ori31 speculation barrier enabled
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 56986016cb upstream.
Add a definition for cpu_show_spectre_v1() to override the generic
version. Currently this just prints "Not affected" or "Vulnerable"
based on the firmware flag.
Although the kernel does have array_index_nospec() in a few places, we
haven't yet audited all the powerpc code to see where it's necessary,
so for now we don't list that as a mitigation.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2e4a16161f upstream.
Now that we have the security flags we can simplify the code in
pseries_setup_rfi_flush() because the security flags have pessimistic
defaults.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 37c0bdd00d upstream.
Now that we have the security flags we can significantly simplify the
code in pnv_setup_rfi_flush(), because we can use the flags instead of
checking device tree properties and because the security flags have
pessimistic defaults.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ff348355e9 upstream.
Now that we have the security feature flags we can make the
information displayed in the "meltdown" file more informative.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8ad3304156 upstream.
This landed in setup_64.c for no good reason other than we had nowhere
else to put it. Now that we have a security-related file, that is a
better place for it so move it.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 77addf6e95 upstream.
Now that we have feature flags for security related things, set or
clear them based on what we see in the device tree provided by
firmware.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f636c14790 upstream.
Now that we have feature flags for security related things, set or
clear them based on what we receive from the hypercall.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c4bc36628d upstream.
Add some additional values which have been defined for the
H_GET_CPU_CHARACTERISTICS hypercall.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9a868f6343 upstream.
This commit adds security feature flags to reflect the settings we
receive from firmware regarding Spectre/Meltdown mitigations.
The feature names reflect the names we are given by firmware on bare
metal machines. See the hostboot source for details.
Arguably these could be firmware features, but that then requires them
to be read early in boot so they're available prior to asm feature
patching, but we don't actually want to use them for patching. We may
also want to dynamically update them in future, which would be
incompatible with the way firmware features work (at the moment at
least). So for now just make them separate flags.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 84749a58b6 upstream.
This ensures the fallback flush area is always allocated on pseries,
so in case a LPAR is migrated from a patched to an unpatched system,
it is possible to enable the fallback flush in the target system.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5aa1437d2d upstream.
open file, unlink it, then use ioctl(2) to make it immutable or
append only. Now close it and watch the blocks *not* freed...
Immutable/append-only checks belong in ->setattr().
Note: the bug is old and backport to anything prior to 737f2e93b9
("ext2: convert to use the new truncate convention") will need
these checks lifted into ext2_setattr().
Cc: stable@kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 00ad691ab1 ]
Never directly free @dev after calling device_register(), even
if it returned an error. Always use put_device() to give up the
reference initialized.
Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 490068deae ]
Stress on qedi/qedr load unload lead to list_del corruption.
This is due to ll2 connection terminate freeing resources without
verifying that no more ll2 processing will occur.
This patch unregisters the ll2 status block before terminating
the connection to assure this race does not occur.
Fixes: 1d6cff4fca ("qed: Add iSCSI out of order packet handling")
Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com>
Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit ffd2c0d127 ]
The ll2 flows of flushing the txq/rxq need to be synchronized with the
regular fp processing. Caused list corruption during load/unload stress
tests.
Fixes: 0a7fb11c23 ("qed: Add Light L2 support")
Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com>
Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit b80d0b93b9 ]
Currently ip6gre and ip6erspan share single metadata mode device,
using 'collect_md_tun'. Thus, when doing:
ip link add dev ip6gre11 type ip6gretap external
ip link add dev ip6erspan12 type ip6erspan external
RTNETLINK answers: File exists
simply fails due to the 2nd tries to create the same collect_md_tun.
The patch fixes it by adding a separate collect md tunnel device
for the ip6erspan, 'collect_md_tun_erspan'. As a result, a couple
of places need to refactor/split up in order to distinguish ip6gre
and ip6erspan.
First, move the collect_md check at ip6gre_tunnel_{unlink,link} and
create separate function {ip6gre,ip6ersapn}_tunnel_{link_md,unlink_md}.
Then before link/unlink, make sure the link_md/unlink_md is called.
Finally, a separate ndo_uninit is created for ip6erspan. Tested it
using the samples/bpf/test_tunnel_bpf.sh.
Fixes: ef7baf5e08 ("ip6_gre: add ip6 erspan collect_md mode")
Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 2d665034f2 ]
Even though ip6erspan_tap_init() sets up hlen and tun_hlen according to
what ERSPAN needs, it goes ahead to call ip6gre_tnl_link_config() which
overwrites these settings with GRE-specific ones.
Similarly for changelink callbacks, which are handled by
ip6gre_changelink() calls ip6gre_tnl_change() calls
ip6gre_tnl_link_config() as well.
The difference ends up being 12 vs. 20 bytes, and this is generally not
a problem, because a 12-byte request likely ends up allocating more and
the extra 8 bytes are thus available. However correct it is not.
So replace the newlink and changelink callbacks with an ERSPAN-specific
ones, reusing the newly-introduced _common() functions.
Fixes: 5a963eb61b ("ip6_gre: Add ERSPAN native tunnel support")
Signed-off-by: Petr Machata <petrm@mellanox.com>
Acked-by: William Tu <u9012063@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit c8632fc30b ]
Extract from ip6gre_changelink() a reusable function
ip6gre_changelink_common(). This will allow introduction of
ERSPAN-specific _changelink() function with not a lot of code
duplication.
Fixes: 5a963eb61b ("ip6_gre: Add ERSPAN native tunnel support")
Signed-off-by: Petr Machata <petrm@mellanox.com>
Acked-by: William Tu <u9012063@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 7fa38a7c85 ]
Extract from ip6gre_newlink() a reusable function
ip6gre_newlink_common(). The ip6gre_tnl_link_config() call needs to be
made customizable for ERSPAN, thus reorder it with calls to
ip6_tnl_change_mtu() and dev_hold(), and extract the whole tail to the
caller, ip6gre_newlink(). Thus enable an ERSPAN-specific _newlink()
function without a lot of duplicity.
Fixes: 5a963eb61b ("ip6_gre: Add ERSPAN native tunnel support")
Signed-off-by: Petr Machata <petrm@mellanox.com>
Acked-by: William Tu <u9012063@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit a6465350ef ]
Split a reusable function ip6gre_tnl_copy_tnl_parm() from
ip6gre_tnl_change(). This will allow ERSPAN-specific code to
reuse the common parts while customizing the behavior for ERSPAN.
Fixes: 5a963eb61b ("ip6_gre: Add ERSPAN native tunnel support")
Signed-off-by: Petr Machata <petrm@mellanox.com>
Acked-by: William Tu <u9012063@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit a483373ead ]
The function ip6gre_tnl_link_config() is used for setting up
configuration of both ip6gretap and ip6erspan tunnels. Split the
function into the common part and the route-lookup part. The latter then
takes the calculated header length as an argument. This split will allow
the patches down the line to sneak in a custom header length computation
for the ERSPAN tunnel.
Fixes: 5a963eb61b ("ip6_gre: Add ERSPAN native tunnel support")
Signed-off-by: Petr Machata <petrm@mellanox.com>
Acked-by: William Tu <u9012063@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit f3002c1374 ]
The gen bits must be read first from (resp. written last to) DMA memory.
The proper way to enforce this on Linux is to call dma_rmb() (resp.
dma_wmb()).
Signed-off-by: Regis Duchesne <hpreg@vmware.com>
Acked-by: Ronak Doshi <doshir@vmware.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 61aeecea40 ]
The DMA mask must be set before, not after, the first DMA map operation, or
the first DMA map operation could in theory fail on some systems.
Fixes: b0eb57cb97 ("VMXNET3: Add support for virtual IOMMU")
Signed-off-by: Regis Duchesne <hpreg@vmware.com>
Acked-by: Ronak Doshi <doshir@vmware.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit d775f26b29 ]
Correct the indirect register offsets in collecting TX rate limit info
in UP CIM logs.
Also, T5 doesn't support these indirect register offsets, so remove
them from collection logic.
Fixes: be6e36d916 ("cxgb4: collect TX rate limit info in UP CIM logs")
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 55c82617c3 ]
This driver supports EISA devices in addition to PCI devices, and relied
on the legacy behavior of the pci_dma* shims to pass on a NULL pointer
to the DMA API, and the DMA API being able to handle that. When the
NULL forwarding broke the EISA support got broken. Fix this by converting
to the DMA API instead of the legacy PCI shims.
Fixes: 4167b2ad ("PCI: Remove NULL device handling from PCI DMA API")
Reported-by: tedheadster <tedheadster@gmail.com>
Tested-by: tedheadster <tedheadster@gmail.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 1942adf642 ]
It was possible to delete only one half of an IPv6, which would leave
the second half still programmed and possibly in use. Instead of
checking for the unused bitmap, we need to check the unique bitmap, and
refuse any deletion that does not match that criteria. We also need to
move that check from bcm_sf2_cfp_rule_del_one() into its caller:
bcm_sf2_cfp_rule_del() otherwise we would not be able to delete second
halves anymore that would not pass the first test.
Fixes: ba0696c22e ("net: dsa: bcm_sf2: Add support for IPv6 CFP rules")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 6c05561c54 ]
We had several issues that would make the programming of IPv6 rules both
inconsistent and error prone:
- the chain ID that we would be asking the hardware to put in the
packet's Broadcom tag would be off by one, it would return one of the
two indexes, but not the one user-space specified
- when an user specified a particular location to insert a CFP rule at,
we would not be returning the same index, which would be confusing if
nothing else
- finally, like IPv4, it would be possible to overflow the last entry by
re-programming it
Fix this by swapping the usage of rule_index[0] and rule_index[1] where
relevant in order to return a consistent and correct user-space
experience.
Fixes: ba0696c22e ("net: dsa: bcm_sf2: Add support for IPv6 CFP rules")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 5447d78623 ]
Even if commit 1d27732f41 ("net: dsa: setup and teardown ports") indicated
that registering a devlink instance for unused ports is not a problem, and this
is true, this can be confusing nonetheless, so let's not do it.
Fixes: 1d27732f41 ("net: dsa: setup and teardown ports")
Reported-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 43a5e00f38 ]
When we let the kernel pick up a rule location with RX_CLS_LOC_ANY, we
would be able to overwrite the last rules because of a number of issues.
The IPv4 code path would not be checking that rule_index is within
bounds, and it would also only be allowed to pick up rules from range
0..126 instead of the full 0..127 range. This would lead us to allow
overwriting the last rule when we let the kernel pick-up the location.
Fixes: 3306145866 ("net: dsa: bcm_sf2: Move IPv4 CFP processing to specific functions")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 849a742c59 ]
Earlier code of doing bitwise AND with field width bits was wrong.
Instead, simplify code to calculate ntuple_mask based on supplied
fields and then compare with mask configured in hw - which is the
correct and simpler way to validate ntuple mask.
Fixes: 3eb8b62d5a ("cxgb4: add support to create hash-filters via tc-flower offload")
Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 7063efd33b ]
After commit b196d88aba ("tun: fix use after free for ptr_ring") we
need clean up tx ring during release(). But unfortunately, it tries to
do the cleanup blindly after socket were destroyed which will lead
another use-after-free. Fix this by doing the cleanup before dropping
the last reference of the socket in __tun_detach().
Reported-by: Andrei Vagin <avagin@virtuozzo.com>
Acked-by: Andrei Vagin <avagin@virtuozzo.com>
Fixes: b196d88aba ("tun: fix use after free for ptr_ring")
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit b196d88aba ]
We used to initialize ptr_ring during TUNSETIFF, this is because its
size depends on the tx_queue_len of netdevice. And we try to clean it
up when socket were detached from netdevice. A race were spotted when
trying to do uninit during a read which will lead a use after free for
pointer ring. Solving this by always initialize a zero size ptr_ring
in open() and do resizing during TUNSETIFF, and then we can safely do
cleanup during close(). With this, there's no need for the workaround
that was introduced by commit 4df0bfc799 ("tun: fix a memory leak
for tfile->tx_array").
Reported-by: syzbot+e8b902c3c3fadf0a9dba@syzkaller.appspotmail.com
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Cong Wang <xiyou.wangcong@gmail.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Fixes: 1576d98605 ("tun: switch to use skb array for tx")
Signed-off-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit b84bbaf7a6 ]
Packet sockets allow construction of packets shorter than
dev->hard_header_len to accommodate protocols with variable length
link layer headers. These packets are padded to dev->hard_header_len,
because some device drivers interpret that as a minimum packet size.
packet_snd reserves dev->hard_header_len bytes on allocation.
SOCK_DGRAM sockets call skb_push in dev_hard_header() to ensure that
link layer headers are stored in the reserved range. SOCK_RAW sockets
do the same in tpacket_snd, but not in packet_snd.
Syzbot was able to send a zero byte packet to a device with massive
116B link layer header, causing padding to cross over into skb_shinfo.
Fix this by writing from the start of the llheader reserved range also
in the case of packet_snd/SOCK_RAW.
Update skb_set_network_header to the new offset. This also corrects
it for SOCK_DGRAM, where it incorrectly double counted reserve due to
the skb_push in dev_hard_header.
Fixes: 9ed988cd59 ("packet: validate variable length ll headers")
Reported-by: syzbot+71d74a5406d02057d559@syzkaller.appspotmail.com
Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 113f99c335 ]
Device features may change during transmission. In particular with
corking, a device may toggle scatter-gather in between allocating
and writing to an skb.
Do not unconditionally assume that !NETIF_F_SG at write time implies
that the same held at alloc time and thus the skb has sufficient
tailroom.
This issue predates git history.
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Reported-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit d49baa7e12 ]
It's possible to crash the kernel in several different ways by sending
messages to the SMC_PNETID generic netlink family that are missing the
expected attributes:
- Missing SMC_PNETID_NAME => null pointer dereference when comparing
names.
- Missing SMC_PNETID_ETHNAME => null pointer dereference accessing
smc_pnetentry::ndev.
- Missing SMC_PNETID_IBNAME => null pointer dereference accessing
smc_pnetentry::smcibdev.
- Missing SMC_PNETID_IBPORT => out of bounds array access to
smc_ib_device::pattr[-1].
Fix it by validating that all expected attributes are present and that
SMC_PNETID_IBPORT is nonzero.
Reported-by: syzbot+5cd61039dc9b8bfa6e47@syzkaller.appspotmail.com
Fixes: 6812baabf2 ("smc: establish pnet table management")
Cc: <stable@vger.kernel.org> # v4.11+
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 5a4931ae01 ]
Similarly to what was done with commit a52956dfc5 ("net sched actions:
fix refcnt leak in skbmod"), fix the error path of tcf_vlan_init() to avoid
refcnt leaks when wrong value of TCA_VLAN_PUSH_VLAN_PROTOCOL is given.
Fixes: 5026c9b1ba ("net sched: vlan action fix late binding")
CC: Roman Mashak <mrv@mojatatu.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 57f6f99fda ]
Avoid exiting the function with a lingering sysfs file (if the first
call to device_create_file() fails while the second succeeds), and avoid
calling devlink_port_unregister() twice.
In other words, either mlx4_init_port_info() succeeds and returns zero, or
it fails, returns non-zero, and requires no cleanup.
Fixes: 096335b3f9 ("mlx4_core: Allow dynamic MTU configuration for IB ports")
Signed-off-by: Tarick Bedeir <tarick@google.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 6358d49ac2 ]
While removing queues from the XPS map, the individual CPU ID
alone was used to index the CPUs map, this should be changed to also
factor in the traffic class mapping for the CPU-to-queue lookup.
Fixes: 184c449f91 ("net: Add support for XPS with QoS via traffic classes")
Signed-off-by: Amritha Nambiar <amritha.nambiar@intel.com>
Acked-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 240da953fc upstream
The "336996 Speculative Execution Side Channel Mitigations" from
May defines this as SSB_NO, hence lets sync-up.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit bc226f07dc upstream
Expose the new virtualized architectural mechanism, VIRT_SSBD, for using
speculative store bypass disable (SSBD) under SVM. This will allow guests
to use SSBD on hardware that uses non-architectural mechanisms for enabling
SSBD.
[ tglx: Folded the migration fixup from Paolo Bonzini ]
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 47c61b3955 upstream
Add the necessary logic for supporting the emulated VIRT_SPEC_CTRL MSR to
x86_virt_spec_ctrl(). If either X86_FEATURE_LS_CFG_SSBD or
X86_FEATURE_VIRT_SPEC_CTRL is set then use the new guest_virt_spec_ctrl
argument to check whether the state must be modified on the host. The
update reuses speculative_store_bypass_update() so the ZEN-specific sibling
coordination can be reused.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit be6fcb5478 upstream
x86_spec_ctrL_mask is intended to mask out bits from a MSR_SPEC_CTRL value
which are not to be modified. However the implementation is not really used
and the bitmask was inverted to make a check easier, which was removed in
"x86/bugs: Remove x86_spec_ctrl_set()"
Aside of that it is missing the STIBP bit if it is supported by the
platform, so if the mask would be used in x86_virt_spec_ctrl() then it
would prevent a guest from setting STIBP.
Add the STIBP bit if supported and use the mask in x86_virt_spec_ctrl() to
sanitize the value which is supplied by the guest.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4b59bdb569 upstream
x86_spec_ctrl_set() is only used in bugs.c and the extra mask checks there
provide no real value as both call sites can just write x86_spec_ctrl_base
to MSR_SPEC_CTRL. x86_spec_ctrl_base is valid and does not need any extra
masking or checking.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit fa8ac49882 upstream
x86_spec_ctrl_base is the system wide default value for the SPEC_CTRL MSR.
x86_spec_ctrl_get_default() returns x86_spec_ctrl_base and was intended to
prevent modification to that variable. Though the variable is read only
after init and globaly visible already.
Remove the function and export the variable instead.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit cc69b34989 upstream
Function bodies are very similar and are going to grow more almost
identical code. Add a bool arg to determine whether SPEC_CTRL is being set
for the guest or restored to the host.
No functional changes.
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0270be3e34 upstream
The upcoming support for the virtual SPEC_CTRL MSR on AMD needs to reuse
speculative_store_bypass_update() to avoid code duplication. Add an
argument for supplying a thread info (TIF) value and create a wrapper
speculative_store_bypass_update_current() which is used at the existing
call site.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 11fb068349 upstream
Some AMD processors only support a non-architectural means of enabling
speculative store bypass disable (SSBD). To allow a simplified view of
this to a guest, an architectural definition has been created through a new
CPUID bit, 0x80000008_EBX[25], and a new MSR, 0xc001011f. With this, a
hypervisor can virtualize the existence of this definition and provide an
architectural method for using SSBD to a guest.
Add the new CPUID feature, the new MSR and update the existing SSBD
support to use this MSR when present.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ccbcd26744 upstream
AMD is proposing a VIRT_SPEC_CTRL MSR to handle the Speculative Store
Bypass Disable via MSR_AMD64_LS_CFG so that guests do not have to care
about the bit position of the SSBD bit and thus facilitate migration.
Also, the sibling coordination on Family 17H CPUs can only be done on
the host.
Extend x86_spec_ctrl_set_guest() and x86_spec_ctrl_restore_host() with an
extra argument for the VIRT_SPEC_CTRL MSR.
Hand in 0 from VMX and in SVM add a new virt_spec_ctrl member to the CPU
data structure which is going to be used in later patches for the actual
implementation.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1f50ddb4f4 upstream
The AMD64_LS_CFG MSR is a per core MSR on Family 17H CPUs. That means when
hyperthreading is enabled the SSBD bit toggle needs to take both cores into
account. Otherwise the following situation can happen:
CPU0 CPU1
disable SSB
disable SSB
enable SSB <- Enables it for the Core, i.e. for CPU0 as well
So after the SSB enable on CPU1 the task on CPU0 runs with SSB enabled
again.
On Intel the SSBD control is per core as well, but the synchronization
logic is implemented behind the per thread SPEC_CTRL MSR. It works like
this:
CORE_SPEC_CTRL = THREAD0_SPEC_CTRL | THREAD1_SPEC_CTRL
i.e. if one of the threads enables a mitigation then this affects both and
the mitigation is only disabled in the core when both threads disabled it.
Add the necessary synchronization logic for AMD family 17H. Unfortunately
that requires a spinlock to serialize the access to the MSR, but the locks
are only shared between siblings.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 52817587e7 upstream
The SSBD enumeration is similarly to the other bits magically shared
between Intel and AMD though the mechanisms are different.
Make X86_FEATURE_SSBD synthetic and set it depending on the vendor specific
features or family dependent setup.
Change the Intel bit to X86_FEATURE_SPEC_CTRL_SSBD to denote that SSBD is
controlled via MSR_SPEC_CTRL and fix up the usage sites.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7eb8956a7f upstream
The availability of the SPEC_CTRL MSR is enumerated by a CPUID bit on
Intel and implied by IBRS or STIBP support on AMD. That's just confusing
and in case an AMD CPU has IBRS not supported because the underlying
problem has been fixed but has another bit valid in the SPEC_CTRL MSR,
the thing falls apart.
Add a synthetic feature bit X86_FEATURE_MSR_SPEC_CTRL to denote the
availability on both Intel and AMD.
While at it replace the boot_cpu_has() checks with static_cpu_has() where
possible. This prevents late microcode loading from exposing SPEC_CTRL, but
late loading is already very limited as it does not reevaluate the
mitigation options and other bits and pieces. Having static_cpu_has() is
the simplest and least fragile solution.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 15e6c22fd8 upstream
svm_vcpu_run() invokes x86_spec_ctrl_restore_host() after VMEXIT, but
before the host GS is restored. x86_spec_ctrl_restore_host() uses 'current'
to determine the host SSBD state of the thread. 'current' is GS based, but
host GS is not yet restored and the access causes a triple fault.
Move the call after the host GS restore.
Fixes: 885f82bfbc x86/process: Allow runtime control of Speculative Store Bypass
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d66d8ff3d2 upstream
__ssb_select_mitigation() returns one of the members of enum ssb_mitigation,
not ssb_mitigation_cmd; fix the prototype to reflect that.
Fixes: 24f7fc83b9 ("x86/bugs: Provide boot parameters for the spec_store_bypass_disable mitigation")
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9f65fb2937 upstream
Intel collateral will reference the SSB mitigation bit in IA32_SPEC_CTL[2]
as SSBD (Speculative Store Bypass Disable).
Hence changing it.
It is unclear yet what the MSR_IA32_ARCH_CAPABILITIES (0x10a) Bit(4) name
is going to be. Following the rename it would be SSBD_NO but that rolls out
to Speculative Store Bypass Disable No.
Also fixed the missing space in X86_FEATURE_AMD_SSBD.
[ tglx: Fixup x86_amd_rds_enable() and rds_tif_to_amd_ls_cfg() as well ]
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f21b53b20c upstream
Unless explicitly opted out of, anything running under seccomp will have
SSB mitigations enabled. Choosing the "prctl" mode will disable this.
[ tglx: Adjusted it to the new arch_seccomp_spec_mitigate() mechanism ]
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8bf37d8c06 upstream
The migitation control is simpler to implement in architecture code as it
avoids the extra function call to check the mode. Aside of that having an
explicit seccomp enabled mode in the architecture mitigations would require
even more workarounds.
Move it into architecture code and provide a weak function in the seccomp
code. Remove the 'which' argument as this allows the architecture to decide
which mitigations are relevant for seccomp.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 00a02d0c50 upstream
If a seccomp user is not interested in Speculative Store Bypass mitigation
by default, it can set the new SECCOMP_FILTER_FLAG_SPEC_ALLOW flag when
adding filters.
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b849a812f7 upstream
Use PR_SPEC_FORCE_DISABLE in seccomp() because seccomp does not allow to
widen restrictions.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 356e4bfff2 upstream
For certain use cases it is desired to enforce mitigations so they cannot
be undone afterwards. That's important for loader stubs which want to
prevent a child from disabling the mitigation again. Will also be used for
seccomp(). The extra state preserving of the prctl state for SSB is a
preparatory step for EBPF dymanic speculation control.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5c3070890d upstream
When speculation flaw mitigations are opt-in (via prctl), using seccomp
will automatically opt-in to these protections, since using seccomp
indicates at least some level of sandboxing is desired.
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7bbf1373e2 upstream
Adjust arch_prctl_get/set_spec_ctrl() to operate on tasks other than
current.
This is needed both for /proc/$pid/status queries and for seccomp (since
thread-syncing can trigger seccomp in non-current threads).
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a73ec77ee1 upstream
Add prctl based control for Speculative Store Bypass mitigation and make it
the default mitigation for Intel and AMD.
Andi Kleen provided the following rationale (slightly redacted):
There are multiple levels of impact of Speculative Store Bypass:
1) JITed sandbox.
It cannot invoke system calls, but can do PRIME+PROBE and may have call
interfaces to other code
2) Native code process.
No protection inside the process at this level.
3) Kernel.
4) Between processes.
The prctl tries to protect against case (1) doing attacks.
If the untrusted code can do random system calls then control is already
lost in a much worse way. So there needs to be system call protection in
some way (using a JIT not allowing them or seccomp). Or rather if the
process can subvert its environment somehow to do the prctl it can already
execute arbitrary code, which is much worse than SSB.
To put it differently, the point of the prctl is to not allow JITed code
to read data it shouldn't read from its JITed sandbox. If it already has
escaped its sandbox then it can already read everything it wants in its
address space, and do much worse.
The ability to control Speculative Store Bypass allows to enable the
protection selectively without affecting overall system performance.
Based on an initial patch from Tim Chen. Completely rewritten.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 885f82bfbc upstream
The Speculative Store Bypass vulnerability can be mitigated with the
Reduced Data Speculation (RDS) feature. To allow finer grained control of
this eventually expensive mitigation a per task mitigation control is
required.
Add a new TIF_RDS flag and put it into the group of TIF flags which are
evaluated for mismatch in switch_to(). If these bits differ in the previous
and the next task, then the slow path function __switch_to_xtra() is
invoked. Implement the TIF_RDS dependent mitigation control in the slow
path.
If the prctl for controlling Speculative Store Bypass is disabled or no
task uses the prctl then there is no overhead in the switch_to() fast
path.
Update the KVM related speculation control functions to take TID_RDS into
account as well.
Based on a patch from Tim Chen. Completely rewritten.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b617cfc858 upstream
Add two new prctls to control aspects of speculation related vulnerabilites
and their mitigations to provide finer grained control over performance
impacting mitigations.
PR_GET_SPECULATION_CTRL returns the state of the speculation misfeature
which is selected with arg2 of prctl(2). The return value uses bit 0-2 with
the following meaning:
Bit Define Description
0 PR_SPEC_PRCTL Mitigation can be controlled per task by
PR_SET_SPECULATION_CTRL
1 PR_SPEC_ENABLE The speculation feature is enabled, mitigation is
disabled
2 PR_SPEC_DISABLE The speculation feature is disabled, mitigation is
enabled
If all bits are 0 the CPU is not affected by the speculation misfeature.
If PR_SPEC_PRCTL is set, then the per task control of the mitigation is
available. If not set, prctl(PR_SET_SPECULATION_CTRL) for the speculation
misfeature will fail.
PR_SET_SPECULATION_CTRL allows to control the speculation misfeature, which
is selected by arg2 of prctl(2) per task. arg3 is used to hand in the
control value, i.e. either PR_SPEC_ENABLE or PR_SPEC_DISABLE.
The common return values are:
EINVAL prctl is not implemented by the architecture or the unused prctl()
arguments are not 0
ENODEV arg2 is selecting a not supported speculation misfeature
PR_SET_SPECULATION_CTRL has these additional return values:
ERANGE arg3 is incorrect, i.e. it's not either PR_SPEC_ENABLE or PR_SPEC_DISABLE
ENXIO prctl control of the selected speculation misfeature is disabled
The first supported controlable speculation misfeature is
PR_SPEC_STORE_BYPASS. Add the define so this can be shared between
architectures.
Based on an initial patch from Tim Chen and mostly rewritten.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 28a2775217 upstream
Having everything in nospec-branch.h creates a hell of dependencies when
adding the prctl based switching mechanism. Move everything which is not
required in nospec-branch.h to spec-ctrl.h and fix up the includes in the
relevant files.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit da39556f66 upstream
Expose the CPUID.7.EDX[31] bit to the guest, and also guard against various
combinations of SPEC_CTRL MSR values.
The handling of the MSR (to take into account the host value of SPEC_CTRL
Bit(2)) is taken care of in patch:
KVM/SVM/VMX/x86/spectre_v2: Support the combination of guest and host IBRS
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 764f3c2158 upstream
AMD does not need the Speculative Store Bypass mitigation to be enabled.
The parameters for this are already available and can be done via MSR
C001_1020. Each family uses a different bit in that MSR for this.
[ tglx: Expose the bit mask via a variable and move the actual MSR fiddling
into the bugs code as that's the right thing to do and also required
to prepare for dynamic enable/disable ]
Suggested-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1115a859f3 upstream
Intel and AMD SPEC_CTRL (0x48) MSR semantics may differ in the
future (or in fact use different MSRs for the same functionality).
As such a run-time mechanism is required to whitelist the appropriate MSR
values.
[ tglx: Made the variable __ro_after_init ]
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 772439717d upstream
Intel CPUs expose methods to:
- Detect whether RDS capability is available via CPUID.7.0.EDX[31],
- The SPEC_CTRL MSR(0x48), bit 2 set to enable RDS.
- MSR_IA32_ARCH_CAPABILITIES, Bit(4) no need to enable RRS.
With that in mind if spec_store_bypass_disable=[auto,on] is selected set at
boot-time the SPEC_CTRL MSR to enable RDS if the platform requires it.
Note that this does not fix the KVM case where the SPEC_CTRL is exposed to
guests which can muck with it, see patch titled :
KVM/SVM/VMX/x86/spectre_v2: Support the combination of guest and host IBRS.
And for the firmware (IBRS to be set), see patch titled:
x86/spectre_v2: Read SPEC_CTRL MSR during boot and re-use reserved bits
[ tglx: Distangled it from the intel implementation and kept the call order ]
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 24f7fc83b9 upstream
Contemporary high performance processors use a common industry-wide
optimization known as "Speculative Store Bypass" in which loads from
addresses to which a recent store has occurred may (speculatively) see an
older value. Intel refers to this feature as "Memory Disambiguation" which
is part of their "Smart Memory Access" capability.
Memory Disambiguation can expose a cache side-channel attack against such
speculatively read values. An attacker can create exploit code that allows
them to read memory outside of a sandbox environment (for example,
malicious JavaScript in a web page), or to perform more complex attacks
against code running within the same privilege level, e.g. via the stack.
As a first step to mitigate against such attacks, provide two boot command
line control knobs:
nospec_store_bypass_disable
spec_store_bypass_disable=[off,auto,on]
By default affected x86 processors will power on with Speculative
Store Bypass enabled. Hence the provided kernel parameters are written
from the point of view of whether to enable a mitigation or not.
The parameters are as follows:
- auto - Kernel detects whether your CPU model contains an implementation
of Speculative Store Bypass and picks the most appropriate
mitigation.
- on - disable Speculative Store Bypass
- off - enable Speculative Store Bypass
[ tglx: Reordered the checks so that the whole evaluation is not done
when the CPU does not support RDS ]
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0cc5fa00b0 upstream
Add the CPU feature bit CPUID.7.0.EDX[31] which indicates whether the CPU
supports Reduced Data Speculation.
[ tglx: Split it out from a later patch ]
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c456442cd3 upstream
Add the sysfs file for the new vulerability. It does not do much except
show the words 'Vulnerable' for recent x86 cores.
Intel cores prior to family 6 are known not to be vulnerable, and so are
some Atoms and some Xeon Phi.
It assumes that older Cyrix, Centaur, etc. cores are immune.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5cf6875487 upstream
A guest may modify the SPEC_CTRL MSR from the value used by the
kernel. Since the kernel doesn't use IBRS, this means a value of zero is
what is needed in the host.
But the 336996-Speculative-Execution-Side-Channel-Mitigations.pdf refers to
the other bits as reserved so the kernel should respect the boot time
SPEC_CTRL value and use that.
This allows to deal with future extensions to the SPEC_CTRL interface if
any at all.
Note: This uses wrmsrl() instead of native_wrmsl(). I does not make any
difference as paravirt will over-write the callq *0xfff.. with the wrmsrl
assembler code.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1b86883ccb upstream
The 336996-Speculative-Execution-Side-Channel-Mitigations.pdf refers to all
the other bits as reserved. The Intel SDM glossary defines reserved as
implementation specific - aka unknown.
As such at bootup this must be taken it into account and proper masking for
the bits in use applied.
A copy of this document is available at
https://bugzilla.kernel.org/show_bug.cgi?id=199511
[ tglx: Made x86_spec_ctrl_base __ro_after_init ]
Suggested-by: Jon Masters <jcm@redhat.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1aa7a5735a upstream
The macro is not type safe and I did look for why that "g" constraint for
the asm doesn't work: it's because the asm is more fundamentally wrong.
It does
movl %[val], %%eax
but "val" isn't a 32-bit value, so then gcc will pass it in a register,
and generate code like
movl %rsi, %eax
and gas will complain about a nonsensical 'mov' instruction (it's moving a
64-bit register to a 32-bit one).
Passing it through memory will just hide the real bug - gcc still thinks
the memory location is 64-bit, but the "movl" will only load the first 32
bits and it all happens to work because x86 is little-endian.
Convert it to a type safe inline function with a little trick which hands
the feature into the ALTERNATIVE macro.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 02a3307aa9 upstream.
If a btree block, aka. extent buffer, is not available in the extent
buffer cache, it'll be read out from the disk instead, i.e.
btrfs_search_slot()
read_block_for_search() # hold parent and its lock, go to read child
btrfs_release_path()
read_tree_block() # read child
Unfortunately, the parent lock got released before reading child, so
commit 5bdd3536cb ("Btrfs: Fix block generation verification race") had
used 0 as parent transid to read the child block. It forces
read_tree_block() not to check if parent transid is different with the
generation id of the child that it reads out from disk.
A simple PoC is included in btrfs/124,
0. A two-disk raid1 btrfs,
1. Right after mkfs.btrfs, block A is allocated to be device tree's root.
2. Mount this filesystem and put it in use, after a while, device tree's
root got COW but block A hasn't been allocated/overwritten yet.
3. Umount it and reload the btrfs module to remove both disks from the
global @fs_devices list.
4. mount -odegraded dev1 and write some data, so now block A is allocated
to be a leaf in checksum tree. Note that only dev1 has the latest
metadata of this filesystem.
5. Umount it and mount it again normally (with both disks), since raid1
can pick up one disk by the writer task's pid, if btrfs_search_slot()
needs to read block A, dev2 which does NOT have the latest metadata
might be read for block A, then we got a stale block A.
6. As parent transid is not checked, block A is marked as uptodate and
put into the extent buffer cache, so the future search won't bother
to read disk again, which means it'll make changes on this stale
one and make it dirty and flush it onto disk.
To avoid the problem, parent transid needs to be passed to
read_tree_block().
In order to get a valid parent transid, we need to hold the parent's
lock until finishing reading child.
This patch needs to be slightly adapted for stable kernels, the
&first_key parameter added to read_tree_block() is from 4.16+
(581c176041). The fix is to replace 0 by 'gen'.
Fixes: 5bdd3536cb ("Btrfs: Fix block generation verification race")
CC: stable@vger.kernel.org # 4.4+
Signed-off-by: Liu Bo <bo.liu@linux.alibaba.com>
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: Qu Wenruo <wqu@suse.com>
[ update changelog ]
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit fe816d0f1d upstream.
When a transaction is aborted btrfs_cleanup_transaction is called to
cleanup all the various in-flight bits and pieces which migth be
active. One of those is delalloc inodes - inodes which have dirty
pages which haven't been persisted yet. Currently the process of
freeing such delalloc inodes in exceptional circumstances such as
transaction abort boiled down to calling btrfs_invalidate_inodes whose
sole job is to invalidate the dentries for all inodes related to a
root. This is in fact wrong and insufficient since such delalloc inodes
will likely have pending pages or ordered-extents and will be linked to
the sb->s_inode_list. This means that unmounting a btrfs instance with
an aborted transaction could potentially lead inodes/their pages
visible to the system long after their superblock has been freed. This
in turn leads to a "use-after-free" situation once page shrink is
triggered. This situation could be simulated by running generic/019
which would cause such inodes to be left hanging, followed by
generic/176 which causes memory pressure and page eviction which lead
to touching the freed super block instance. This situation is
additionally detected by the unmount code of VFS with the following
message:
"VFS: Busy inodes after unmount of Self-destruct in 5 seconds. Have a nice day..."
Additionally btrfs hits WARN_ON(!RB_EMPTY_ROOT(&root->inode_tree));
in free_fs_root for the same reason.
This patch aims to rectify the sitaution by doing the following:
1. Change btrfs_destroy_delalloc_inodes so that it calls
invalidate_inode_pages2 for every inode on the delalloc list, this
ensures that all the pages of the inode are released. This function
boils down to calling btrfs_releasepage. During test I observed cases
where inodes on the delalloc list were having an i_count of 0, so this
necessitates using igrab to be sure we are working on a non-freed inode.
2. Since calling btrfs_releasepage might queue delayed iputs move the
call out to btrfs_cleanup_transaction in btrfs_error_commit_super before
calling run_delayed_iputs for the last time. This is necessary to ensure
that delayed iputs are run.
Note: this patch is tagged for 4.14 stable but the fix applies to older
versions too but needs to be backported manually due to conflicts.
CC: stable@vger.kernel.org # 4.14.x: 2b87733134: btrfs: Split btrfs_del_delalloc_inode into 2 functions
CC: stable@vger.kernel.org # 4.14.x
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
[ add comment to igrab ]
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 02ee654d3a upstream.
We set the BTRFS_BALANCE_RESUME flag in the btrfs_recover_balance()
only, which isn't called during the remount. So when resuming from
the paused balance we hit the bug:
kernel: kernel BUG at fs/btrfs/volumes.c:3890!
::
kernel: balance_kthread+0x51/0x60 [btrfs]
kernel: kthread+0x111/0x130
::
kernel: RIP: btrfs_balance+0x12e1/0x1570 [btrfs] RSP: ffffba7d0090bde8
Reproducer:
On a mounted filesystem:
btrfs balance start --full-balance /btrfs
btrfs balance pause /btrfs
mount -o remount,ro /dev/sdb /btrfs
mount -o remount,rw /dev/sdb /btrfs
To fix this set the BTRFS_BALANCE_RESUME flag in
btrfs_resume_balance_async().
CC: stable@vger.kernel.org # 4.4+
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1a63c198dd upstream.
Incompat flag of LZO/ZSTD compression should be set at:
1. mount time (-o compress/compress-force)
2. when defrag is done
3. when property is set
Currently 3. is missing and this commit adds this.
This could lead to a filesystem that uses ZSTD but is not marked as
such. If a kernel without a ZSTD support encounteres a ZSTD compressed
extent, it will handle that but this could be confusing to the user.
Typically the filesystem is mounted with the ZSTD option, but the
discrepancy can arise when a filesystem is never mounted with ZSTD and
then the property on some file is set (and some new extents are
written). A simple mount with -o compress=zstd will fix that up on an
unpatched kernel.
Same goes for LZO, but this has been around for a very long time
(2.6.37) so it's unlikely that a pre-LZO kernel would be used.
Fixes: 5c1aab1dd5 ("btrfs: Add zstd support")
CC: stable@vger.kernel.org # 4.14+
Signed-off-by: Tomohiro Misono <misono.tomohiro@jp.fujitsu.com>
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.com>
[ add user visible impact ]
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6f2f0b394b upstream.
[BUG]
btrfs incremental send BUG happens when creating a snapshot of snapshot
that is being used by send.
[REASON]
The problem can happen if while we are doing a send one of the snapshots
used (parent or send) is snapshotted, because snapshoting implies COWing
the root of the source subvolume/snapshot.
1. When doing an incremental send, the send process will get the commit
roots from the parent and send snapshots, and add references to them
through extent_buffer_get().
2. When a snapshot/subvolume is snapshotted, its root node is COWed
(transaction.c:create_pending_snapshot()).
3. COWing releases the space used by the node immediately, through:
__btrfs_cow_block()
--btrfs_free_tree_block()
----btrfs_add_free_space(bytenr of node)
4. Because send doesn't hold a transaction open, it's possible that
the transaction used to create the snapshot commits, switches the
commit root and the old space used by the previous root node gets
assigned to some other node allocation. Allocation of a new node will
use the existing extent buffer found in memory, which we previously
got a reference through extent_buffer_get(), and allow the extent
buffer's content (pages) to be modified:
btrfs_alloc_tree_block
--btrfs_reserve_extent
----find_free_extent (get bytenr of old node)
--btrfs_init_new_buffer (use bytenr of old node)
----btrfs_find_create_tree_block
------alloc_extent_buffer
--------find_extent_buffer (get old node)
5. So send can access invalid memory content and have unpredictable
behaviour.
[FIX]
So we fix the problem by copying the commit roots of the send and
parent snapshots and use those copies.
CallTrace looks like this:
------------[ cut here ]------------
kernel BUG at fs/btrfs/ctree.c:1861!
invalid opcode: 0000 [#1] SMP
CPU: 6 PID: 24235 Comm: btrfs Tainted: P O 3.10.105 #23721
ffff88046652d680 ti: ffff88041b720000 task.ti: ffff88041b720000
RIP: 0010:[<ffffffffa08dd0e8>] read_node_slot+0x108/0x110 [btrfs]
RSP: 0018:ffff88041b723b68 EFLAGS: 00010246
RAX: ffff88043ca6b000 RBX: ffff88041b723c50 RCX: ffff880000000000
RDX: 000000000000004c RSI: ffff880314b133f8 RDI: ffff880458b24000
RBP: 0000000000000000 R08: 0000000000000001 R09: ffff88041b723c66
R10: 0000000000000001 R11: 0000000000001000 R12: ffff8803f3e48890
R13: ffff8803f3e48880 R14: ffff880466351800 R15: 0000000000000001
FS: 00007f8c321dc8c0(0000) GS:ffff88047fcc0000(0000)
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
R2: 00007efd1006d000 CR3: 0000000213a24000 CR4: 00000000003407e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Stack:
ffff88041b723c50 ffff8803f3e48880 ffff8803f3e48890 ffff8803f3e48880
ffff880466351800 0000000000000001 ffffffffa08dd9d7 ffff88041b723c50
ffff8803f3e48880 ffff88041b723c66 ffffffffa08dde85 a9ff88042d2c4400
Call Trace:
[<ffffffffa08dd9d7>] ? tree_move_down.isra.33+0x27/0x50 [btrfs]
[<ffffffffa08dde85>] ? tree_advance+0xb5/0xc0 [btrfs]
[<ffffffffa08e83d4>] ? btrfs_compare_trees+0x2d4/0x760 [btrfs]
[<ffffffffa0982050>] ? finish_inode_if_needed+0x870/0x870 [btrfs]
[<ffffffffa09841ea>] ? btrfs_ioctl_send+0xeda/0x1050 [btrfs]
[<ffffffffa094bd3d>] ? btrfs_ioctl+0x1e3d/0x33f0 [btrfs]
[<ffffffff81111133>] ? handle_pte_fault+0x373/0x990
[<ffffffff8153a096>] ? atomic_notifier_call_chain+0x16/0x20
[<ffffffff81063256>] ? set_task_cpu+0xb6/0x1d0
[<ffffffff811122c3>] ? handle_mm_fault+0x143/0x2a0
[<ffffffff81539cc0>] ? __do_page_fault+0x1d0/0x500
[<ffffffff81062f07>] ? check_preempt_curr+0x57/0x90
[<ffffffff8115075a>] ? do_vfs_ioctl+0x4aa/0x990
[<ffffffff81034f83>] ? do_fork+0x113/0x3b0
[<ffffffff812dd7d7>] ? trace_hardirqs_off_thunk+0x3a/0x6c
[<ffffffff81150cc8>] ? SyS_ioctl+0x88/0xa0
[<ffffffff8153e422>] ? system_call_fastpath+0x16/0x1b
---[ end trace 29576629ee80b2e1 ]---
Fixes: 7069830a9e ("Btrfs: add btrfs_compare_trees function")
CC: stable@vger.kernel.org # 3.6+
Signed-off-by: Robbie Ko <robbieko@synology.com>
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9a8fca62aa upstream.
If a file has xattrs, we fsync it, to ensure we clear the flags
BTRFS_INODE_NEEDS_FULL_SYNC and BTRFS_INODE_COPY_EVERYTHING from its
inode, the current transaction commits and then we fsync it (without
either of those bits being set in its inode), we end up not logging
all its xattrs. This results in deleting all xattrs when replying the
log after a power failure.
Trivial reproducer
$ mkfs.btrfs -f /dev/sdb
$ mount /dev/sdb /mnt
$ touch /mnt/foobar
$ setfattr -n user.xa -v qwerty /mnt/foobar
$ xfs_io -c "fsync" /mnt/foobar
$ sync
$ xfs_io -c "pwrite -S 0xab 0 64K" /mnt/foobar
$ xfs_io -c "fsync" /mnt/foobar
<power failure>
$ mount /dev/sdb /mnt
$ getfattr --absolute-names --dump /mnt/foobar
<empty output>
$
So fix this by making sure all xattrs are logged if we log a file's inode
item and neither the flags BTRFS_INODE_NEEDS_FULL_SYNC nor
BTRFS_INODE_COPY_EVERYTHING were set in the inode.
Fixes: 36283bf777 ("Btrfs: fix fsync xattr loss in the fast fsync path")
Cc: <stable@vger.kernel.org> # 4.2+
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0d73c3f8e7 upstream.
Since do_undefinstr() uses get_user to get the undefined
instruction, it can be called before kprobes processes
recursive check. This can cause an infinit recursive
exception.
Prohibit probing on get_user functions.
Fixes: 24ba613c9d ("ARM kprobes: core code")
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: stable@vger.kernel.org
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 70948c05fd upstream.
Prohibit probing on optimized_callback() because
it is called from kprobes itself. If we put a kprobes
on it, that will cause a recursive call loop.
Mark it NOKPROBE_SYMBOL.
Fixes: 0dc016dbd8 ("ARM: kprobes: enable OPTPROBES for ARM 32")
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: stable@vger.kernel.org
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit acf4602001 upstream.
The x86 mmap() code selects the mmap base for an allocation depending on
the bitness of the syscall. For 64bit sycalls it select mm->mmap_base and
for 32bit mm->mmap_compat_base.
exec() calls mmap() which in turn uses in_compat_syscall() to check whether
the mapping is for a 32bit or a 64bit task. The decision is made on the
following criteria:
ia32 child->thread.status & TS_COMPAT
x32 child->pt_regs.orig_ax & __X32_SYSCALL_BIT
ia64 !ia32 && !x32
__set_personality_x32() was dropping TS_COMPAT flag, but
set_personality_64bit() has kept compat syscall flag making
in_compat_syscall() return true during the first exec() syscall.
Which in result has user-visible effects, mentioned by Alexey:
1) It breaks ASAN
$ gcc -fsanitize=address wrap.c -o wrap-asan
$ ./wrap32 ./wrap-asan true
==1217==Shadow memory range interleaves with an existing memory mapping. ASan cannot proceed correctly. ABORTING.
==1217==ASan shadow was supposed to be located in the [0x00007fff7000-0x10007fff7fff] range.
==1217==Process memory map follows:
0x000000400000-0x000000401000 /home/izbyshev/test/gcc/asan-exec-from-32bit/wrap-asan
0x000000600000-0x000000601000 /home/izbyshev/test/gcc/asan-exec-from-32bit/wrap-asan
0x000000601000-0x000000602000 /home/izbyshev/test/gcc/asan-exec-from-32bit/wrap-asan
0x0000f7dbd000-0x0000f7de2000 /lib64/ld-2.27.so
0x0000f7fe2000-0x0000f7fe3000 /lib64/ld-2.27.so
0x0000f7fe3000-0x0000f7fe4000 /lib64/ld-2.27.so
0x0000f7fe4000-0x0000f7fe5000
0x7fed9abff000-0x7fed9af54000
0x7fed9af54000-0x7fed9af6b000 /lib64/libgcc_s.so.1
[snip]
2) It doesn't seem to be great for security if an attacker always knows
that ld.so is going to be mapped into the first 4GB in this case
(the same thing happens for PIEs as well).
The testcase:
$ cat wrap.c
int main(int argc, char *argv[]) {
execvp(argv[1], &argv[1]);
return 127;
}
$ gcc wrap.c -o wrap
$ LD_SHOW_AUXV=1 ./wrap ./wrap true |& grep AT_BASE
AT_BASE: 0x7f63b8309000
AT_BASE: 0x7faec143c000
AT_BASE: 0x7fbdb25fa000
$ gcc -m32 wrap.c -o wrap32
$ LD_SHOW_AUXV=1 ./wrap32 ./wrap true |& grep AT_BASE
AT_BASE: 0xf7eff000
AT_BASE: 0xf7cee000
AT_BASE: 0x7f8b9774e000
Fixes: 1b028f784e ("x86/mm: Introduce mmap_compat_base() for 32-bit mmap()")
Fixes: ada26481df ("x86/mm: Make in_compat_syscall() work during exec")
Reported-by: Alexey Izbyshev <izbyshev@ispras.ru>
Bisected-by: Alexander Monakov <amonakov@ispras.ru>
Investigated-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Dmitry Safonov <dima@arista.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: Alexander Monakov <amonakov@ispras.ru>
Cc: Dmitry Safonov <0x7f454c46@gmail.com>
Cc: stable@vger.kernel.org
Cc: linux-mm@kvack.org
Cc: Andy Lutomirski <luto@kernel.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Link: https://lkml.kernel.org/r/20180517233510.24996-1-dima@arista.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit fed71f7d98 upstream.
Rick bisected a regression on large systems which use the x2apic cluster
mode for interrupt delivery to the commit wich reworked the cluster
management.
The problem is caused by a missing initialization of the clusterid field
in the shared cluster data structures. So all structures end up with
cluster ID 0 which only allows sharing between all CPUs which belong to
cluster 0. All other CPUs with a cluster ID > 0 cannot share the data
structure because they cannot find existing data with their cluster
ID. This causes malfunction with IPIs because IPIs are sent to the wrong
cluster and the caller waits for ever that the target CPU handles the IPI.
Add the missing initialization when a upcoming CPU is the first in a
cluster so that the later booting CPUs can find the data and share it for
proper operation.
Fixes: 023a611748 ("x86/apic/x2apic: Simplify cluster management")
Reported-by: Rick Warner <rick@microway.com>
Bisected-by: Rick Warner <rick@microway.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Rick Warner <rick@microway.com>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/alpine.DEB.2.21.1805171418210.1947@nanos.tec.linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit eb0146daef upstream.
Prohibit kprobes on do_undefinstr because kprobes on
arm is implemented by undefined instruction. This means
if we probe do_undefinstr(), it can cause infinit
recursive exception.
Fixes: 24ba613c9d ("ARM kprobes: core code")
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: stable@vger.kernel.org
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2fa9d1cfaf upstream.
mm_pkey_is_allocated() treats pkey 0 as unallocated. That is
inconsistent with the manpages, and also inconsistent with
mm->context.pkey_allocation_map. Stop special casing it and only
disallow values that are actually bad (< 0).
The end-user visible effect of this is that you can now use
mprotect_pkey() to set pkey=0.
This is a bit nicer than what Ram proposed[1] because it is simpler
and removes special-casing for pkey 0. On the other hand, it does
allow applications to pkey_free() pkey-0, but that's just a silly
thing to do, so we are not going to protect against it.
The scenario that could happen is similar to what happens if you free
any other pkey that is in use: it might get reallocated later and used
to protect some other data. The most likely scenario is that pkey-0
comes back from pkey_alloc(), an access-disable or write-disable bit
is set in PKRU for it, and the next stack access will SIGSEGV. It's
not horribly different from if you mprotect()'d your stack or heap to
be unreadable or unwritable, which is generally very foolish, but also
not explicitly prevented by the kernel.
1. http://lkml.kernel.org/r/1522112702-27853-1-git-send-email-linuxram@us.ibm.com
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>p
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michael Ellermen <mpe@ellerman.id.au>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ram Pai <linuxram@us.ibm.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-mm@kvack.org
Cc: stable@vger.kernel.org
Fixes: 58ab9a088d ("x86/pkeys: Check against max pkey to avoid overflows")
Link: http://lkml.kernel.org/r/20180509171358.47FD785E@viggo.jf.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0a0b152083 upstream.
I got a bug report that the following code (roughly) was
causing a SIGSEGV:
mprotect(ptr, size, PROT_EXEC);
mprotect(ptr, size, PROT_NONE);
mprotect(ptr, size, PROT_READ);
*ptr = 100;
The problem is hit when the mprotect(PROT_EXEC)
is implicitly assigned a protection key to the VMA, and made
that key ACCESS_DENY|WRITE_DENY. The PROT_NONE mprotect()
failed to remove the protection key, and the PROT_NONE->
PROT_READ left the PTE usable, but the pkey still in place
and left the memory inaccessible.
To fix this, we ensure that we always "override" the pkee
at mprotect() if the VMA does not have execute-only
permissions, but the VMA has the execute-only pkey.
We had a check for PROT_READ/WRITE, but it did not work
for PROT_NONE. This entirely removes the PROT_* checks,
which ensures that PROT_NONE now works.
Reported-by: Shakeel Butt <shakeelb@google.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michael Ellermen <mpe@ellerman.id.au>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ram Pai <linuxram@us.ibm.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-mm@kvack.org
Cc: stable@vger.kernel.org
Fixes: 62b5f7d013 ("mm/core, x86/mm/pkeys: Add execute-only protection keys support")
Link: http://lkml.kernel.org/r/20180509171351.084C5A71@viggo.jf.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1c1a2ee1b5 upstream.
Commit 539d39eb27 ("bcache: fix wrong return value in bch_debug_init()")
returns the return value of debugfs_create_dir() to bcache_init(). When
CONFIG_DEBUG_FS=n, bch_debug_init() always returns 1 and makes
bcache_init() failedi.
This patch makes bch_debug_init() always returns 0 if CONFIG_DEBUG_FS=n,
so bcache can continue to work for the kernels which don't have debugfs
enanbled.
Changelog:
v4: Add Acked-by from Kent Overstreet.
v3: Use IS_ENABLED(CONFIG_DEBUG_FS) to replace #ifdef DEBUG_FS.
v2: Remove a warning information
v1: Initial version.
Fixes: Commit 539d39eb27 ("bcache: fix wrong return value in bch_debug_init()")
Cc: stable@vger.kernel.org
Signed-off-by: Coly Li <colyli@suse.de>
Reported-by: Massimo B. <massimo.b@gmx.net>
Reported-by: Kai Krakow <kai@kaishome.de>
Tested-by: Kai Krakow <kai@kaishome.de>
Acked-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Kai Krakow <kai@kaishome.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2e68adcd2f upstream.
Calling qdio_release_memory() on error is just plain wrong. It frees
the main qdio_irq struct, when following code still uses it.
Also, no other error path in qdio_establish() does this. So trust
callers to clean up via qdio_free() if some step of the QDIO
initialization fails.
Fixes: 779e6e1c72 ("[S390] qdio: new qdio driver.")
Cc: <stable@vger.kernel.org> #v2.6.27+
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4bbaf2584b upstream.
Correct a trinity finding for the perf_event_open() system call with
a perf event attribute structure that uses a frequency but has the
sampling frequency set to zero. This causes a FP divide exception during
the sample rate initialization for the hardware sampling facility.
Fixes: 8c069ff4bd ("s390/perf: add support for the CPU-Measurement Sampling Facility")
Cc: stable@vger.kernel.org # 3.14+
Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Hendrik Brueckner <brueckner@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e521813468 upstream.
Ever since CQ/QAOB support was added, calling qdio_free() straight after
qdio_alloc() results in qdio_release_memory() accessing uninitialized
memory (ie. q->u.out.use_cq and q->u.out.aobs). Followed by a
kmem_cache_free() on the random AOB addresses.
For older kernels that don't have 6e30c549f6, the same applies if
qdio_establish() fails in the DEV_STATE_ONLINE check.
While initializing q->u.out.use_cq would be enough to fix this
particular bug, the more future-proof change is to just zero-alloc the
whole struct.
Fixes: 104ea556ee ("qdio: support asynchronous delivery of storage blocks")
Cc: <stable@vger.kernel.org> #v3.2+
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ab1e8d8960 upstream.
It is unsafe to do virtual to physical translations before mm_init() is
called if struct page is needed in order to determine the memory section
number (see SECTION_IN_PAGE_FLAGS). This is because only in mm_init()
we initialize struct pages for all the allocated memory when deferred
struct pages are used.
My recent fix in commit c9e97a1997 ("mm: initialize pages on demand
during boot") exposed this problem, because it greatly reduced number of
pages that are initialized before mm_init(), but the problem existed
even before my fix, as Fengguang Wu found.
Below is a more detailed explanation of the problem.
We initialize struct pages in four places:
1. Early in boot a small set of struct pages is initialized to fill the
first section, and lower zones.
2. During mm_init() we initialize "struct pages" for all the memory that
is allocated, i.e reserved in memblock.
3. Using on-demand logic when pages are allocated after mm_init call
(when memblock is finished)
4. After smp_init() when the rest free deferred pages are initialized.
The problem occurs if we try to do va to phys translation of a memory
between steps 1 and 2. Because we have not yet initialized struct pages
for all the reserved pages, it is inherently unsafe to do va to phys if
the translation itself requires access of "struct page" as in case of
this combination: CONFIG_SPARSE && !CONFIG_SPARSE_VMEMMAP
The following path exposes the problem:
start_kernel()
trap_init()
setup_cpu_entry_areas()
setup_cpu_entry_area(cpu)
get_cpu_gdt_paddr(cpu)
per_cpu_ptr_to_phys(addr)
pcpu_addr_to_page(addr)
virt_to_page(addr)
pfn_to_page(__pa(addr) >> PAGE_SHIFT)
We disable this path by not allowing NEED_PER_CPU_KM with deferred
struct pages feature.
The problems are discussed in these threads:
http://lkml.kernel.org/r/20180418135300.inazvpxjxowogyge@wfg-t540p.sh.intel.comhttp://lkml.kernel.org/r/20180419013128.iurzouiqxvcnpbvz@wfg-t540p.sh.intel.comhttp://lkml.kernel.org/r/20180426202619.2768-1-pasha.tatashin@oracle.com
Link: http://lkml.kernel.org/r/20180515175124.1770-1-pasha.tatashin@oracle.com
Fixes: 3a80a7fa79 ("mm: meminit: initialise a subset of struct pages if CONFIG_DEFERRED_STRUCT_PAGE_INIT is set")
Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Steven Sistare <steven.sistare@oracle.com>
Cc: Daniel Jordan <daniel.m.jordan@oracle.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Fengguang Wu <fengguang.wu@intel.com>
Cc: Dennis Zhou <dennisszhou@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9f418224e8 upstream.
Fix a race in the multi-order iteration code which causes the kernel to
hit a GP fault. This was first seen with a production v4.15 based
kernel (4.15.6-300.fc27.x86_64) utilizing a DAX workload which used
order 9 PMD DAX entries.
The race has to do with how we tear down multi-order sibling entries
when we are removing an item from the tree. Remember for example that
an order 2 entry looks like this:
struct radix_tree_node.slots[] = [entry][sibling][sibling][sibling]
where 'entry' is in some slot in the struct radix_tree_node, and the
three slots following 'entry' contain sibling pointers which point back
to 'entry.'
When we delete 'entry' from the tree, we call :
radix_tree_delete()
radix_tree_delete_item()
__radix_tree_delete()
replace_slot()
replace_slot() first removes the siblings in order from the first to the
last, then at then replaces 'entry' with NULL. This means that for a
brief period of time we end up with one or more of the siblings removed,
so:
struct radix_tree_node.slots[] = [entry][NULL][sibling][sibling]
This causes an issue if you have a reader iterating over the slots in
the tree via radix_tree_for_each_slot() while only under
rcu_read_lock()/rcu_read_unlock() protection. This is a common case in
mm/filemap.c.
The issue is that when __radix_tree_next_slot() => skip_siblings() tries
to skip over the sibling entries in the slots, it currently does so with
an exact match on the slot directly preceding our current slot.
Normally this works:
V preceding slot
struct radix_tree_node.slots[] = [entry][sibling][sibling][sibling]
^ current slot
This lets you find the first sibling, and you skip them all in order.
But in the case where one of the siblings is NULL, that slot is skipped
and then our sibling detection is interrupted:
V preceding slot
struct radix_tree_node.slots[] = [entry][NULL][sibling][sibling]
^ current slot
This means that the sibling pointers aren't recognized since they point
all the way back to 'entry', so we think that they are normal internal
radix tree pointers. This causes us to think we need to walk down to a
struct radix_tree_node starting at the address of 'entry'.
In a real running kernel this will crash the thread with a GP fault when
you try and dereference the slots in your broken node starting at
'entry'.
We fix this race by fixing the way that skip_siblings() detects sibling
nodes. Instead of testing against the preceding slot we instead look
for siblings via is_sibling_entry() which compares against the position
of the struct radix_tree_node.slots[] array. This ensures that sibling
entries are properly identified, even if they are no longer contiguous
with the 'entry' they point to.
Link: http://lkml.kernel.org/r/20180503192430.7582-6-ross.zwisler@linux.intel.com
Fixes: 148deab223 ("radix-tree: improve multiorder iterators")
Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Reported-by: CR, Sapthagirish <sapthagirish.cr@intel.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c1d2a31397 upstream.
Similarly to opal_event_shutdown, opal_nvram_write can be called in
the crash path with irqs disabled. Special case the delay to avoid
sleeping in invalid context.
Fixes: 3b8070335f ("powerpc/powernv: Fix OPAL NVRAM driver OPAL_BUSY loops")
Cc: stable@vger.kernel.org # v3.2
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 90d6176333 upstream.
The code is doing monolithic reads for all chunks except the last one
which is wrong since a monolithic read will issue the
READ0+ADDRS+READ_START sequence. It not only takes longer because it
forces the NAND chip to reload the page content into its internal
cache, but by doing that we also reset the column pointer to 0, which
means we'll always read the first chunk instead of moving to the next
one.
Rework the code to do a monolithic read only for the first chunk,
then switch to naked reads for all intermediate chunks and finally
issue a last naked read for the last chunk.
Fixes: 02f26ecf8c mtd: nand: add reworked Marvell NAND controller driver
Cc: stable@vger.kernel.org
Reported-by: Chris Packham <chris.packham@alliedtelesis.co.nz>
Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com>
Tested-by: Chris Packham <chris.packham@alliedtelesis.co.nz>
Acked-by: Miquel Raynal <miquel.raynal@bootlin.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 06cb616b1b upstream.
Not all revisions of DW I2C controller implement the enable status register.
On platforms where that's the case (e.g. BG2CD and SPEAr ARM SoCs), waiting
for enable will time out as reading the unimplemented register yields zero.
It was observed that reading the IC_ENABLE_STATUS register once suffices to
avoid getting it stuck on Bay Trail hardware, so replace polling with one
dummy read of the register.
Fixes: fba4adbbf6 ("i2c: designware: must wait for enable")
Signed-off-by: Alexander Monakov <amonakov@ispras.ru>
Tested-by: Ben Gardner <gardner.ben@gmail.com>
Acked-by: Jarkko Nikula <jarkko.nikula@linux.intel.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Cc: stable@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f43194c144 upstream.
Marvell PPv2.2 controller present on CP-110 need the extra "mg_core_clk"
clock to avoid system hangs when powering some network interfaces up.
This issue appeared after a recent clock rework on Armada 7K/8K platforms.
This commit adds the new clock and updates the documentation accordingly.
[gregory.clement: use the real first commit to fix and add the cc:stable
flag]
Fixes: e3af9f7c6e ("RM64: dts: marvell: armada-cp110: Fix clock resources for various node")
Cc: <stable@vger.kernel.org>
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: Gregory CLEMENT <gregory.clement@bootlin.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a057344806 upstream.
The Marvell XSMI controller needs 3 clocks to operate correctly :
- The MG clock (clk 5)
- The MG Core clock (clk 6)
- The GOP clock (clk 18)
This commit adds them, to avoid system hangs when using these
interfaces.
[gregory.clement: use the real first commit to fix and add the cc:stable
flag]
Fixes: f66b2aff46 ("arm64: dts: marvell: add xmdio nodes for 7k/8k")
Cc: <stable@vger.kernel.org>
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: Gregory CLEMENT <gregory.clement@bootlin.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 569ccae68b upstream.
rules in nftables a free'd using kfree, but protected by rcu, i.e. we
must wait for a grace period to elapse.
Normal removal patch does this, but nf_tables_newrule() doesn't obey
this rule during error handling.
It calls nft_trans_rule_add() *after* linking rule, and, if that
fails to allocate memory, it unlinks the rule and then kfree() it --
this is unsafe.
Switch order -- first add rule to transaction list, THEN link it
to public list.
Note: nft_trans_rule_add() uses GFP_KERNEL; it will not fail so this
is not a problem in practice (spotted only during code review).
Fixes: 0628b123c9 ("netfilter: nfnetlink: add batch support and use it from nf_tables")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2f6adf4815 upstream.
set->name must be free'd here in case ops->init fails.
Fixes: 387454901b ("netfilter: nf_tables: Allow set names of up to 255 chars")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit bb765d1c33 upstream.
Bump the file's refcount before moving the reference into the fd table,
not afterwards. The old code could drop the file's refcount to zero for a
short moment before calling get_file() via get_dma_buf().
This code can only be triggered on ARM systems that use Linaro's OP-TEE.
Fixes: 967c9cca2c ("tee: generic TEE subsystem")
Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: Jens Wiklander <jens.wiklander@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 85f4f12d51 upstream.
Reviewing Tobin's patches for getting pointers out early before
entropy has been established, I noticed that there's a lone smp_mb() in
the code. As with most lone memory barriers, this one appears to be
incorrectly used.
We currently basically have this:
get_random_bytes(&ptr_key, sizeof(ptr_key));
/*
* have_filled_random_ptr_key==true is dependent on get_random_bytes().
* ptr_to_id() needs to see have_filled_random_ptr_key==true
* after get_random_bytes() returns.
*/
smp_mb();
WRITE_ONCE(have_filled_random_ptr_key, true);
And later we have:
if (unlikely(!have_filled_random_ptr_key))
return string(buf, end, "(ptrval)", spec);
/* Missing memory barrier here. */
hashval = (unsigned long)siphash_1u64((u64)ptr, &ptr_key);
As the CPU can perform speculative loads, we could have a situation
with the following:
CPU0 CPU1
---- ----
load ptr_key = 0
store ptr_key = random
smp_mb()
store have_filled_random_ptr_key
load have_filled_random_ptr_key = true
BAD BAD BAD! (you're so bad!)
Because nothing prevents CPU1 from loading ptr_key before loading
have_filled_random_ptr_key.
But this race is very unlikely, but we can't keep an incorrect smp_mb() in
place. Instead, replace the have_filled_random_ptr_key with a static_branch
not_filled_random_ptr_key, that is initialized to true and changed to false
when we get enough entropy. If the update happens in early boot, the
static_key is updated immediately, otherwise it will have to wait till
entropy is filled and this happens in an interrupt handler which can't
enable a static_key, as that requires a preemptible context. In that case, a
work_queue is used to enable it, as entropy already took too long to
establish in the first place waiting a little more shouldn't hurt anything.
The benefit of using the static key is that the unlikely branch in
vsprintf() now becomes a nop.
Link: http://lkml.kernel.org/r/20180515100558.21df515e@gandalf.local.home
Cc: stable@vger.kernel.org
Fixes: ad67b74d24 ("printk: hash addresses printed with %p")
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 45dd9b0666 upstream.
Doing an audit of trace events, I discovered two trace events in the xen
subsystem that use a hack to create zero data size trace events. This is not
what trace events are for. Trace events add memory footprint overhead, and
if all you need to do is see if a function is hit or not, simply make that
function noinline and use function tracer filtering.
Worse yet, the hack used was:
__array(char, x, 0)
Which creates a static string of zero in length. There's assumptions about
such constructs in ftrace that this is a dynamic string that is nul
terminated. This is not the case with these tracepoints and can cause
problems in various parts of ftrace.
Nuke the trace events!
Link: http://lkml.kernel.org/r/20180509144605.5a220327@gandalf.local.home
Cc: stable@vger.kernel.org
Fixes: 95a7d76897 ("xen/mmu: Use Xen specific TLB flush instead of the generic one.")
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d66a735571 upstream.
If the translation of a channel program fails, we may end up attempting
to clean up (free, unpin) stuff that never got translated (and allocated,
pinned) in the first place.
By adjusting the lengths of the chains accordingly (so the element that
failed, and all subsequent elements are excluded) cleanup activities
based on false assumptions can be avoided.
Let's make sure cp_free works properly after cp_prefetch returns with an
error by setting ch_len of a ccw chain to the number of the translated
CCWs on that chain.
Cc: stable@vger.kernel.org #v4.12+
Acked-by: Pierre Morel <pmorel@linux.vnet.ibm.com>
Reviewed-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com>
Signed-off-by: Halil Pasic <pasic@linux.vnet.ibm.com>
Signed-off-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com>
Message-Id: <20180423110113.59385-2-bjsdjshi@linux.vnet.ibm.com>
[CH: fixed typos]
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3b031622f5 upstream.
The SMN (System Management Network) on Family 17h AMD CPUs is also accessed
from other drivers, specifically EDAC. Accessing it directly is racy.
On top of that, accessing the SMN through root bridge 00:00 is wrong on
multi-die CPUs and may result in reading the temperature from the wrong
die. Use available API functions to fix the problem.
For this to work, add dependency on AMD_NB. Also change the Raven Ridge
PCI device ID to point to Data Fabric Function 3, since this ID is used
by the API functions to find the CPU node.
Cc: stable@vger.kernel.org # v4.16+
Tested-by: Gabriel Craciunescu <nix.or.die@gmail.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit bf308242ab upstream.
kvm_read_guest() will eventually look up in kvm_memslots(), which requires
either to hold the kvm->slots_lock or to be inside a kvm->srcu critical
section.
In contrast to x86 and s390 we don't take the SRCU lock on every guest
exit, so we have to do it individually for each kvm_read_guest() call.
Provide a wrapper which does that and use that everywhere.
Note that ending the SRCU critical section before returning from the
kvm_read_guest() wrapper is safe, because the data has been *copied*, so
we don't need to rely on valid references to the memslot anymore.
Cc: Stable <stable@vger.kernel.org> # 4.8+
Reported-by: Jan Glauber <jan.glauber@caviumnetworks.com>
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Acked-by: Christoffer Dall <christoffer.dall@arm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 711702b57c upstream.
kvm_read_guest() will eventually look up in kvm_memslots(), which requires
either to hold the kvm->slots_lock or to be inside a kvm->srcu critical
section.
In contrast to x86 and s390 we don't take the SRCU lock on every guest
exit, so we have to do it individually for each kvm_read_guest() call.
Use the newly introduced wrapper for that.
Cc: Stable <stable@vger.kernel.org> # 4.12+
Reported-by: Jan Glauber <jan.glauber@caviumnetworks.com>
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Acked-by: Christoffer Dall <christoffer.dall@arm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9c4188762f upstream.
Apparently the development of update_affinity() overlapped with the
promotion of irq_lock to be _irqsave, so the patch didn't convert this
lock over. This will make lockdep complain.
Fix this by disabling IRQs around the lock.
Cc: stable@vger.kernel.org
Fixes: 08c9fd0421 ("KVM: arm/arm64: vITS: Add a helper to update the affinity of an LPI")
Reported-by: Jan Glauber <jan.glauber@caviumnetworks.com>
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Acked-by: Christoffer Dall <christoffer.dall@arm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 388d435968 upstream.
As Jan reported [1], lockdep complains about the VGIC not being bullet
proof. This seems to be due to two issues:
- When commit 006df0f349 ("KVM: arm/arm64: Support calling
vgic_update_irq_pending from irq context") promoted irq_lock and
ap_list_lock to _irqsave, we forgot two instances of irq_lock.
lockdeps seems to pick those up.
- If a lock is _irqsave, any other locks we take inside them should be
_irqsafe as well. So the lpi_list_lock needs to be promoted also.
This fixes both issues by simply making the remaining instances of those
locks _irqsave.
One irq_lock is addressed in a separate patch, to simplify backporting.
[1] http://lists.infradead.org/pipermail/linux-arm-kernel/2018-May/575718.html
Cc: stable@vger.kernel.org
Fixes: 006df0f349 ("KVM: arm/arm64: Support calling vgic_update_irq_pending from irq context")
Reported-by: Jan Glauber <jan.glauber@caviumnetworks.com>
Acked-by: Christoffer Dall <christoffer.dall@arm.com>
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 64f7a11586 upstream.
Update SECONDARY_EXEC_DESC for UMIP emulation if and only UMIP
is actually being emulated. Skipping the VMCS update eliminates
unnecessary VMREAD/VMWRITE when UMIP is supported in hardware,
and on platforms that don't have SECONDARY_VM_EXEC_CONTROL. The
latter case resolves a bug where KVM would fill the kernel log
with warnings due to failed VMWRITEs on older platforms.
Fixes: 0367f205a3 ("KVM: vmx: add support for emulating UMIP")
Cc: stable@vger.kernel.org #4.16
Reported-by: Paolo Zeppegno <pzeppegno@gmail.com>
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Suggested-by: Radim KrÄmář <rkrcmar@redhat.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5eb9a07a4a upstream.
Added fix for probing of spi-nor device non-zero chip selects. Set
MSPI_CDRAM_PCS (peripheral chip select) with spi master for MSPI
controller and not for MSPI/BSPI spi-nor master controller. Ensure
setting of cs bit in chip select register on chip select change.
Fixes: fa236a7ef2 ("spi: bcm-qspi: Add Broadcom MSPI driver")
Signed-off-by: Kamal Dasu <kdasu.kdev@gmail.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit efc4a13724 upstream.
Currently the 32-bit device address only is supported for DMA. However,
starting from Intel Sunrisepoint PCH the DMA address of the device FIFO
can be 64-bit.
Change the respective variable to be compatible with DMA engine
expectations, i.e. to phys_addr_t.
Fixes: 34cadd9c1b ("spi: pxa2xx: Add support for Intel Sunrisepoint")
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3f12888dfa upstream.
In snd_ctl_elem_add_compat(), the fields of the struct 'data' need to be
copied from the corresponding fields of the struct 'data32' in userspace.
This is achieved by invoking copy_from_user() and get_user() functions. The
problem here is that the 'type' field is copied twice. One is by
copy_from_user() and one is by get_user(). Given that the 'type' field is
not used between the two copies, the second copy is *completely* redundant
and should be removed for better performance and cleanup. Also, these two
copies can cause inconsistent data: as the struct 'data32' resides in
userspace and a malicious userspace process can race to change the 'type'
field between the two copies to cause inconsistent data. Depending on how
the data is used in the future, such an inconsistency may cause potential
security risks.
For above reasons, we should take out the second copy.
Signed-off-by: Wenwen Wang <wang6495@umn.edu>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 21493316a3 upstream.
Currently it's not possible to set volume lower than 26% (it just mutes).
Also fixes this warning:
Warning! Unlikely big volume range (=9472), cval->res is probably wrong.
[13] FU [PCM Playback Volume] ch = 2, val = -9473/-1/1
, and volume works fine for full range.
Signed-off-by: Federico Cuello <fedux@fedux.com.ar>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c171654caa upstream.
stub_probe() calls put_busid_priv() in an error path when device isn't
found in the busid_table. Fix it by making put_busid_priv() safe to be
called with null struct bus_id_priv pointer.
This problem happens when "usbip bind" is run without loading usbip_host
driver and then running modprobe. The first failed bind attempt unbinds
the device from the original driver and when usbip_host is modprobed,
stub_probe() runs and doesn't find the device in its busid table and calls
put_busid_priv(0 with null bus_id_priv pointer.
usbip-host 3-10.2: 3-10.2 is not in match_busid table... skip!
[ 367.359679] =====================================
[ 367.359681] WARNING: bad unlock balance detected!
[ 367.359683] 4.17.0-rc4+ #5 Not tainted
[ 367.359685] -------------------------------------
[ 367.359688] modprobe/2768 is trying to release lock (
[ 367.359689]
==================================================================
[ 367.359696] BUG: KASAN: null-ptr-deref in print_unlock_imbalance_bug+0x99/0x110
[ 367.359699] Read of size 8 at addr 0000000000000058 by task modprobe/2768
[ 367.359705] CPU: 4 PID: 2768 Comm: modprobe Not tainted 4.17.0-rc4+ #5
Fixes: 22076557b0 ("usbip: usbip_host: fix NULL-ptr deref and use-after-free errors") in usb-linus
Signed-off-by: Shuah Khan (Samsung OSG) <shuah@kernel.org>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 22076557b0 upstream.
usbip_host updates device status without holding lock from stub probe,
disconnect and rebind code paths. When multiple requests to import a
device are received, these unprotected code paths step all over each
other and drive fails with NULL-ptr deref and use-after-free errors.
The driver uses a table lock to protect the busid array for adding and
deleting busids to the table. However, the probe, disconnect and rebind
paths get the busid table entry and update the status without holding
the busid table lock. Add a new finer grain lock to protect the busid
entry. This new lock will be held to search and update the busid entry
fields from get_busid_idx(), add_match_busid() and del_match_busid().
match_busid_show() does the same to access the busid entry fields.
get_busid_priv() changed to return the pointer to the busid entry holding
the busid lock. stub_probe(), stub_disconnect() and stub_device_rebind()
call put_busid_priv() to release the busid lock before returning. This
changes fixes the unprotected code paths eliminating the race conditions
in updating the busid entries.
Reported-by: Jakub Jirasek
Signed-off-by: Shuah Khan (Samsung OSG) <shuah@kernel.org>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7510df3f29 upstream.
After removing usbip_host module, devices it releases are left without
a driver. For example, when a keyboard or a mass storage device are
bound to usbip_host when it is removed, these devices are no longer
bound to any driver.
Fix it to run device_attach() from the module exit routine to restore
the devices to their original drivers. This includes cleanup changes
and moving device_attach() code to a common routine to be called from
rebind_store() and usbip_host_exit().
Signed-off-by: Shuah Khan (Samsung OSG) <shuah@kernel.org>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1e180f167d upstream.
Device is left in the busid_table after unbind and rebind. Rebind
initiates usb bus scan and the original driver claims the device.
After rescan the device should be deleted from the busid_table as
it no longer belongs to usbip_host.
Fix it to delete the device after device_attach() succeeds.
Signed-off-by: Shuah Khan (Samsung OSG) <shuah@kernel.org>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2278446e2b upstream.
Hub driver will try to disable a USB3 device twice at logical disconnect,
racing with xhci_free_dev() callback from the first port disable.
This can be triggered with "udisksctl power-off --block-device <disk>"
or by writing "1" to the "remove" sysfs file for a USB3 device
in 4.17-rc4.
USB3 devices don't have a similar disabled link state as USB2 devices,
and use a U3 suspended link state instead. In this state the port
is still enabled and connected.
hub_port_connect() first disconnects the device, then later it notices
that device is still enabled (due to U3 states) it will try to disable
the port again (set to U3).
The xhci_free_dev() called during device disable is async, so checking
for existing xhci->devs[i] when setting link state to U3 the second time
was successful, even if device was being freed.
The regression was caused by, and whole thing revealed by,
Commit 44a182b9d1 ("xhci: Fix use-after-free in xhci_free_virt_device")
which sets xhci->devs[i]->udev to NULL before xhci_virt_dev() returned.
and causes a NULL pointer dereference the second time we try to set U3.
Fix this by checking xhci->devs[i]->udev exists before setting link state.
The original patch went to stable so this fix needs to be applied there as
well.
Fixes: 44a182b9d1 ("xhci: Fix use-after-free in xhci_free_virt_device")
Cc: <stable@vger.kernel.org>
Reported-by: Jordan Glover <Golden_Miller83@protonmail.ch>
Tested-by: Jordan Glover <Golden_Miller83@protonmail.ch>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7f7ccc2ccc upstream.
proc_pid_cmdline_read() and environ_read() directly access the target
process' VM to retrieve the command line and environment. If this
process remaps these areas onto a file via mmap(), the requesting
process may experience various issues such as extra delays if the
underlying device is slow to respond.
Let's simply refuse to access file-backed areas in these functions.
For this we add a new FOLL_ANON gup flag that is passed to all calls
to access_remote_vm(). The code already takes care of such failures
(including unmapped areas). Accesses via /proc/pid/mem were not
changed though.
This was assigned CVE-2018-1120.
Note for stable backports: the patch may apply to kernels prior to 4.11
but silently miss one location; it must be checked that no call to
access_remote_vm() keeps zero as the last argument.
Reported-by: Qualys Security Advisory <qsa@qualys.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 52c5cd1bf0 ]
In an SFP EEPROM values can be read to get information about a given SFP
module. One of those is the bitrate, which can be determined using a
nominal bitrate in addition with min and max values (in %). The SFP code
currently compute both BR,min and BR,max values thanks to this nominal
and min,max values.
This patch fixes the BR,min computation as the min value should be
subtracted to the nominal one, not added.
Fixes: 9962acf7fb ("sfp: add support for 1000Base-PX and 1000Base-BX10")
Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 6082d9c9c9 ]
Adding the vector offset when calling to mlx5_vector2eqn() is wrong.
This is because mlx5_vector2eqn() checks if EQ index is equal to vector number
and the fact that the internal completion vectors that mlx5 allocates
don't get an EQ index.
The second problem here is that using effective_affinity_mask gives the same
CPU for different vectors.
This leads to unmapped queues when calling it from blk_mq_rdma_map_queues().
This doesn't happen when using affinity_hint mask.
Fixes: 2572cf57d7 ("mlx5: fix mlx5_get_vector_affinity to start from completion vector 0")
Fixes: 05e0cc84e0 ("net/mlx5: Fix get vector affinity helper function")
Signed-off-by: Israel Rukshin <israelr@mellanox.com>
Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 8ccc113172 ]
Resources are not freed in the reverse order of the allocation.
Labels are also mixed-up.
Fix it and reorder code and labels in the error handling path of
'mlxsw_core_bus_device_register()'
Fixes: ef3116e540 ("mlxsw: spectrum: Register KVD resources with devlink")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 0e8411e426 ]
After route cache is flushed via ipv4_sysctl_rtcache_flush(), we forget
to reset fnhe_mtu_locked in rt_bind_exception(). When pmtu is updated
in __ip_rt_update_pmtu(), it will return directly since the pmtu is
still locked. e.g.
+ ip netns exec client ping 10.10.1.1 -c 1 -s 1400 -M do
PING 10.10.1.1 (10.10.1.1) 1400(1428) bytes of data.
>From 10.10.0.254 icmp_seq=1 Frag needed and DF set (mtu = 0)
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 55be9f25be ]
On older windows hosts the net_device instance is returned to
the caller of rndis_filter_device_add() without having the presence
bit set first. This would cause any subsequent calls to network device
operations (e.g. MTU change, channel change) to fail after the device
is detached once, returning -ENODEV.
Instead of returning the device instabce, we take the exit path where
we call netif_device_attach()
Fixes: 7b2ee50c0c ("hv_netvsc: common detach logic")
Signed-off-by: Mohammed Gamal <mgamal@redhat.com>
Reviewed-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 6a9a27d539 ]
When processing a duplicate cookie-echo chunk, sctp moves the new
temp asoc's stream out/in into the old asoc, and later frees this
new temp asoc.
But now after this move, the new temp asoc's stream->outcnt is not
cleared while stream->out is set to NULL, which would cause a same
crash as the one fixed in Commit 79d0895140 ("sctp: fix error
path in sctp_stream_init") when freeing this asoc later.
This fix is to clear this outcnt in sctp_stream_update.
Fixes: f952be79ce ("sctp: introduce struct sctp_stream_out_ext")
Reported-by: Jianwen Ji <jiji@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 50a5852a65 ]
Firmware requires that the ttl value for an encapsulating ipv4 tunnel
header be included as an action field. Prior to the support of Geneve
tunnel encap (when ttl set was removed completely), ttl value was
extracted from the tunnel key. However, tests have shown that this can
still produce a ttl of 0.
Fix the issue by setting the namespace default value for each new tunnel.
Follow up patch for net-next will do a full route lookup.
Fixes: 3ca3059dc3 ("nfp: flower: compile Geneve encap actions")
Fixes: b27d6a95a7 ("nfp: compile flower vxlan tunnel set actions")
Signed-off-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 1f3ccc3c3f ]
While adding the DSA notifier, we will be sending DSA notifications with
info->master that is going to point to a particular net_device instance.
Our logic in bcm_sysport_map_queues() correctly disambiguates net_device
instances that are not covered by our own driver, but it will not make
sure that info->master points to a particular driver instance that we
are interested in. In a system where e.g: two or more SYSTEMPORT
instances are registered, this would lead in programming two or more
times the queue mapping, completely messing with the logic which does
the queue/port allocation and tracking.
Fix this by looking at the notifier_block pointer which is unique per
instance and allows us to go back to our driver private structure, and
in turn to the backing net_device instance.
Fixes: d156576362 ("net: systemport: Establish lower/upper queue mapping")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 35f80acb24 ]
When the trust state is set to dscp and the netdev is down, the inline
header size is not updated. When netdev is up, the inline header size
stays at L2 instead of IP.
Fix this issue by updating the private parameter when the netdev is in
down so that when netdev is up, it picks up the right header size.
Fixes: fbcb127e89 ("net/mlx5e: Support DSCP trust state ...")
Signed-off-by: Huy Nguyen <huyn@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit c7f46cca8c ]
When IGMP snooping is enabled on a bridge, traffic forwarded by an MDB
entry should be sent to both ports member in the MDB's ports list and
mrouter ports.
In case a port needs to be removed from an MDB's ports list, but this
port is also configured as an mrouter port, then do not update the
device so that it will continue to forward traffic through that port.
Fix a copy-paste error that checked that IGMP snooping is enabled twice
instead of checking the port's mrouter state.
Fixes: ded711c87a ("mlxsw: spectrum_switchdev: Consider mrouter status for mdb changes")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reported-by: Colin King <colin.king@canonical.com>
Reviewed-by: Nogah Frankel <nogahf@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 69678bcd4d ]
Damir reported a breakage of SO_BINDTODEVICE for UDP sockets.
In absence of VRF devices, after commit fb74c27735 ("net:
ipv4: add second dif to udp socket lookups") the dif mismatch
isn't fatal anymore for UDP socket lookup with non null
sk_bound_dev_if, breaking SO_BINDTODEVICE semantics.
This changeset addresses the issue making the dif match mandatory
again in the above scenario.
Reported-by: Damir Mansurov <dnman@oktetlabs.ru>
Fixes: fb74c27735 ("net: ipv4: add second dif to udp socket lookups")
Fixes: 1801b570dd ("net: ipv6: add second dif to udp socket lookups")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 1ccef350db ]
For ICMPv4, the checksum is calculated from the ICMP headers and data.
Since the ICMPv4 checksum doesn't cover the IP header, we can allow to
do L3 header re-write for this protocol.
Fixes: bdd66ac0ae ('net/mlx5e: Disallow TC offloading of unsupported match/action combinations')
Signed-off-by: Jianbo Liu <jianbol@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 97f3efb643 ]
The hyper-v transparent bonding should have used master_dev_link.
The netvsc device should look like a master bond device not
like the upper side of a tunnel.
This makes the semantics the same so that userspace applications
looking at network devices see the correct master relationshipship.
Fixes: 0c195567a8 ("netvsc: transparent VF management")
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 9c26f5f89d ]
When we fail to initialize the RX root namespace, we need
to clean only that and not the entire flow steering.
Currently the code may try to clean the flow steering twice
on error witch leads to null pointer deference.
Make sure we clean correctly.
Fixes: fba53f7b57 ("net/mlx5: Introduce mlx5_flow_steering structure")
Signed-off-by: Talat Batheesh <talatb@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit d9a96ec362 ]
In case of a dma_mapping_error, do not use wi->num_dma
as a parameter for dma unmap function because it's yet
to be set, and holds an out-of-date value.
Use actual value (local variable num_dma) instead.
Fixes: 34802a42b3 ("net/mlx5e: Do not modify the TX SKB")
Fixes: e586b3b0ba ("net/mlx5: Ethernet Datapath files")
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit d68d75fdc3 ]
In case modules are not configured, error out when tp->ops is null
and prevent later null pointer dereference.
Fixes: 33a48927c1 ("sched: push TC filter protocol creation into a separate function")
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 21706ee8a4 ]
There was a regression at some point from the intended functionality of
commit f60c3704e8 ("bonding: Fix alb mode to only use first level
vlans.")
Given the return value vlan_get_encap_level() we need to store the nest
level of the bond device, and then compare the vlan's encap level to
this. Without this, this check always fails and learning packets are
never sent.
In addition, this same commit caused a regression in the behavior of
balance_alb, which requires learning packets be sent for all interfaces
using the slave's mac in order to load balance properly. For vlan's
that have not set a user mac, we can send after checking one bit.
Otherwise we need send the set mac, albeit defeating rx load balancing
for that vlan.
Signed-off-by: Debabrata Banerjee <dbanerje@akamai.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 4fa8667ca3 ]
Make sure multicast, broadcast, and zero mac's cannot be the output of rlb
updates, which should all be directed arps. Receive load balancing will be
collapsed if any of these happen, as the switch will broadcast.
Signed-off-by: Debabrata Banerjee <dbanerje@akamai.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit d89a2adb8b ]
tg3_free_consistent() calls dma_free_coherent() to free tp->hw_stats
under spinlock and can trigger BUG_ON() in vunmap() because vunmap()
may sleep. Fix it by removing the spinlock and relying on the
TG3_FLAG_INIT_COMPLETE flag to prevent race conditions between
tg3_get_stats64() and tg3_free_consistent(). TG3_FLAG_INIT_COMPLETE
is always cleared under tp->lock before tg3_free_consistent()
and therefore tg3_get_stats64() can safely access tp->hw_stats
under tp->lock if TG3_FLAG_INIT_COMPLETE is set.
Fixes: f5992b72eb ("tg3: Fix race condition in tg3_get_stats64().")
Reported-by: Zumeng Chen <zumeng.chen@gmail.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 16ae6aa170 ]
The TCP repair sequence of operation is to first set the socket in
repair mode, then inject the TCP stats into the socket with repair
socket options, then call connect() to re-activate the socket. The
connect syscall simply returns and set state to ESTABLISHED
mode. As a result Fast Open is meaningless for TCP repair.
However allowing sendto() system call with MSG_FASTOPEN flag half-way
during the repair operation could unexpectedly cause data to be
sent, before the operation finishes changing the internal TCP stats
(e.g. MSS). This in turn triggers TCP warnings on inconsistent
packet accounting.
The fix is to simply disallow Fast Open operation once the socket
is in the repair mode.
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Reviewed-by: Neal Cardwell <ncardwell@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit e6e6a278b1 ]
Previously the bbr->idle_restart tracking was zeroing out the
bbr->idle_restart bit upon ACKs that did not SACK or ACK anything,
e.g. receiving incoming data or receiver window updates. In such
situations BBR would forget that this was a restart-from-idle
situation, and if the min_rtt had expired it would unnecessarily enter
PROBE_RTT (even though we were actually restarting from idle but had
merely forgotten that fact).
The fix is simple: we need to remember we are restarting from idle
until we receive a S/ACK for some data (a S/ACK for the first flight
of data we send as we are restarting).
This commit is a stable candidate for kernels back as far as 4.9.
Fixes: 0f8782ea14 ("tcp_bbr: add BBR congestion control")
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: Priyaranjan Jha <priyarjha@google.com>
Signed-off-by: Yousuk Seung <ysseung@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 46e16d4b95 ]
When processing a duplicate cookie-echo chunk, for case 'D', sctp will
not process the param from this chunk. It means old asoc has nothing
to be updated, and the new temp asoc doesn't have the complete info.
So there's no reason to use the new asoc when creating the cookie-ack
chunk. Otherwise, like when auth is enabled for cookie-ack, the chunk
can not be set with auth, and it will definitely be dropped by peer.
This issue is there since very beginning, and we fix it by using the
old asoc instead.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 6910e25de2 ]
In Commit 1f45f78f8e ("sctp: allow GSO frags to access the chunk too"),
it held the chunk in sctp_ulpevent_make_rcvmsg to access it safely later
in recvmsg. However, it also added sctp_chunk_put in fail_mark err path,
which is only triggered before holding the chunk.
syzbot reported a use-after-free crash happened on this err path, where
it shouldn't call sctp_chunk_put.
This patch simply removes this call.
Fixes: 1f45f78f8e ("sctp: allow GSO frags to access the chunk too")
Reported-by: syzbot+141d898c5f24489db4aa@syzkaller.appspotmail.com
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit d625329b06 ]
Since sctp ipv6 socket also supports v4 addrs, it's possible to
compare two v4 addrs in pf v6 .cmp_addr, sctp_inet6_cmp_addr.
However after Commit 1071ec9d45 ("sctp: do not check port in
sctp_inet6_cmp_addr"), it no longer calls af1->cmp_addr, which
in this case is sctp_v4_cmp_addr, but calls __sctp_v6_cmp_addr
where it handles them as two v6 addrs. It would cause a out of
bounds crash.
syzbot found this crash when trying to bind two v4 addrs to a
v6 socket.
This patch fixes it by adding the process for two v4 addrs in
sctp_inet6_cmp_addr.
Fixes: 1071ec9d45 ("sctp: do not check port in sctp_inet6_cmp_addr")
Reported-by: syzbot+cd494c1dd681d4d93ebb@syzkaller.appspotmail.com
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit ce402f044e ]
When auth is enabled for cookie-ack chunk, in sctp_inq_pop, sctp
processes auth chunk first, then continues to the next chunk in
this packet if chunk_end + chunk_hdr size < skb_tail_pointer().
Otherwise, it will go to the next packet or discard this chunk.
However, it missed the fact that cookie-ack chunk's size is equal
to chunk_hdr size, which couldn't match that check, and thus this
chunk would not get processed.
This patch fixes it by changing the check to chunk_end + chunk_hdr
size <= skb_tail_pointer().
Fixes: 26b87c7881 ("net: sctp: fix remote memory pressure from excessive queueing")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 59d8d4434f ]
Now sctp only delays the authentication for the normal cookie-echo
chunk by setting chunk->auth_chunk in sctp_endpoint_bh_rcv(). But
for the duplicated one with auth, in sctp_assoc_bh_rcv(), it does
authentication first based on the old asoc, which will definitely
fail due to the different auth info in the old asoc.
The duplicated cookie-echo chunk will create a new asoc with the
auth info from this chunk, and the authentication should also be
done with the new asoc's auth info for all of the collision 'A',
'B' and 'D'. Otherwise, the duplicated cookie-echo chunk with auth
will never pass the authentication and create the new connection.
This issue exists since very beginning, and this fix is to make
sctp_assoc_bh_rcv() follow the way sctp_endpoint_bh_rcv() does
for the normal cookie-echo chunk to delay the authentication.
While at it, remove the unused params from sctp_sf_authenticate()
and define sctp_auth_chunk_verify() used for all the places that
do the delayed authentication.
v1->v2:
fix the typo in changelog as Marcelo noticed.
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 3148dedfe7 ]
Since commit a92a08499b "r8169: improve runtime pm in general and
suspend unused ports" interfaces w/o link are runtime-suspended after
10s. On systems where drivers take longer to load this can lead to the
situation that the interface is runtime-suspended already when it's
initially brought up.
This shouldn't be a problem because rtl_open() resumes MAC/PHY.
However with at least one chip version the interface doesn't properly
come up, as reported here:
https://bugzilla.kernel.org/show_bug.cgi?id=199549
The vendor driver uses a delay to give certain chip versions some
time to resume before starting the PHY configuration. So let's do
the same. I don't know which chip versions may be affected,
therefore apply this delay always.
This patch was reported to fix the issue for RTL8168h.
I was able to reproduce the issue on an Asus H310I-Plus which also
uses a RTL8168h. Also in my case the patch fixed the issue.
Reported-by: Slava Kardakov <ojab@ojab.ru>
Tested-by: Slava Kardakov <ojab@ojab.ru>
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 5697db4a69 ]
The USB_DEVICE_INTERFACE_NUMBER matching macro assumes that
the { vendorid, productid, interfacenumber } set uniquely
identifies one specific function. This has proven to fail
for some configurable devices. One example is the Quectel
EM06/EP06 where the same interface number can be either
QMI or MBIM, without the device ID changing either.
Fix by requiring the vendor-specific class for interface number
based matching. Functions of other classes can and should use
class based matching instead.
Fixes: 03304bcb5e ("net: qmi_wwan: use fixed interface number matching")
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 72f17baf23 ]
If an OVS_ATTR_NESTED attribute type is found while walking
through netlink attributes, we call nlattr_set() recursively
passing the length table for the following nested attributes, if
different from the current one.
However, once we're done with those sub-nested attributes, we
should continue walking through attributes using the current
table, instead of using the one related to the sub-nested
attributes.
For example, given this sequence:
1 OVS_KEY_ATTR_PRIORITY
2 OVS_KEY_ATTR_TUNNEL
3 OVS_TUNNEL_KEY_ATTR_ID
4 OVS_TUNNEL_KEY_ATTR_IPV4_SRC
5 OVS_TUNNEL_KEY_ATTR_IPV4_DST
6 OVS_TUNNEL_KEY_ATTR_TTL
7 OVS_TUNNEL_KEY_ATTR_TP_SRC
8 OVS_TUNNEL_KEY_ATTR_TP_DST
9 OVS_KEY_ATTR_IN_PORT
10 OVS_KEY_ATTR_SKB_MARK
11 OVS_KEY_ATTR_MPLS
we switch to the 'ovs_tunnel_key_lens' table on attribute #3,
and we don't switch back to 'ovs_key_lens' while setting
attributes #9 to #11 in the sequence. As OVS_KEY_ATTR_MPLS
evaluates to 21, and the array size of 'ovs_tunnel_key_lens' is
15, we also get this kind of KASan splat while accessing the
wrong table:
[ 7654.586496] ==================================================================
[ 7654.594573] BUG: KASAN: global-out-of-bounds in nlattr_set+0x164/0xde9 [openvswitch]
[ 7654.603214] Read of size 4 at addr ffffffffc169ecf0 by task handler29/87430
[ 7654.610983]
[ 7654.612644] CPU: 21 PID: 87430 Comm: handler29 Kdump: loaded Not tainted 3.10.0-866.el7.test.x86_64 #1
[ 7654.623030] Hardware name: Dell Inc. PowerEdge R730/072T6D, BIOS 2.1.7 06/16/2016
[ 7654.631379] Call Trace:
[ 7654.634108] [<ffffffffb65a7c50>] dump_stack+0x19/0x1b
[ 7654.639843] [<ffffffffb53ff373>] print_address_description+0x33/0x290
[ 7654.647129] [<ffffffffc169b37b>] ? nlattr_set+0x164/0xde9 [openvswitch]
[ 7654.654607] [<ffffffffb53ff812>] kasan_report.part.3+0x242/0x330
[ 7654.661406] [<ffffffffb53ff9b4>] __asan_report_load4_noabort+0x34/0x40
[ 7654.668789] [<ffffffffc169b37b>] nlattr_set+0x164/0xde9 [openvswitch]
[ 7654.676076] [<ffffffffc167ef68>] ovs_nla_get_match+0x10c8/0x1900 [openvswitch]
[ 7654.684234] [<ffffffffb61e9cc8>] ? genl_rcv+0x28/0x40
[ 7654.689968] [<ffffffffb61e7733>] ? netlink_unicast+0x3f3/0x590
[ 7654.696574] [<ffffffffc167dea0>] ? ovs_nla_put_tunnel_info+0xb0/0xb0 [openvswitch]
[ 7654.705122] [<ffffffffb4f41b50>] ? unwind_get_return_address+0xb0/0xb0
[ 7654.712503] [<ffffffffb65d9355>] ? system_call_fastpath+0x1c/0x21
[ 7654.719401] [<ffffffffb4f41d79>] ? update_stack_state+0x229/0x370
[ 7654.726298] [<ffffffffb4f41d79>] ? update_stack_state+0x229/0x370
[ 7654.733195] [<ffffffffb53fe4b5>] ? kasan_unpoison_shadow+0x35/0x50
[ 7654.740187] [<ffffffffb53fe62a>] ? kasan_kmalloc+0xaa/0xe0
[ 7654.746406] [<ffffffffb53fec32>] ? kasan_slab_alloc+0x12/0x20
[ 7654.752914] [<ffffffffb53fe711>] ? memset+0x31/0x40
[ 7654.758456] [<ffffffffc165bf92>] ovs_flow_cmd_new+0x2b2/0xf00 [openvswitch]
[snip]
[ 7655.132484] The buggy address belongs to the variable:
[ 7655.138226] ovs_tunnel_key_lens+0xf0/0xffffffffffffd400 [openvswitch]
[ 7655.145507]
[ 7655.147166] Memory state around the buggy address:
[ 7655.152514] ffffffffc169eb80: 00 00 00 00 00 00 00 00 00 00 fa fa fa fa fa fa
[ 7655.160585] ffffffffc169ec00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 7655.168644] >ffffffffc169ec80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 fa fa
[ 7655.176701] ^
[ 7655.184372] ffffffffc169ed00: fa fa fa fa 00 00 00 00 fa fa fa fa 00 00 00 05
[ 7655.192431] ffffffffc169ed80: fa fa fa fa 00 00 00 00 00 00 00 00 00 00 00 00
[ 7655.200490] ==================================================================
Reported-by: Hangbin Liu <liuhangbin@gmail.com>
Fixes: 982b527004 ("openvswitch: Fix mask generation for nested attributes.")
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 080324c36a ]
In the case of writing a partial tls record we forgot to clear the
ctx->in_tcp_sendpages flag, causing some connections to stall.
Fixes: c212d2c7fc ("net/tls: Don't recursively call push_record during tls_write_space callbacks")
Signed-off-by: Andre Tomt <andre@tomt.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit c212d2c7fc ]
It is reported that in some cases, write_space may be called in
do_tcp_sendpages, such that we recursively invoke do_tcp_sendpages again:
[ 660.468802] ? do_tcp_sendpages+0x8d/0x580
[ 660.468826] ? tls_push_sg+0x74/0x130 [tls]
[ 660.468852] ? tls_push_record+0x24a/0x390 [tls]
[ 660.468880] ? tls_write_space+0x6a/0x80 [tls]
...
tls_push_sg already does a loop over all sending sg's, so ignore
any tls_write_space notifications until we are done sending.
We then have to call the previous write_space to wake up
poll() waiters after we are done with the send loop.
Reported-by: Andre Tomt <andre@tomt.net>
Signed-off-by: Dave Watson <davejwatson@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 784813aed6 ]
The smc_poll code tries to finish connect() if the socket is in
state SMC_INIT and polling of the internal CLC-socket returns with
EPOLLOUT. This makes sense for a select/poll call following a connect
call, but not without preceding connect().
With this patch smc_poll starts connect logic only, if the CLC-socket
is no longer in its initial state TCP_CLOSE.
In addition, a poll error on the internal CLC-socket is always
propagated to the SMC socket.
With this patch the code path mentioned by syzbot
https://syzkaller.appspot.com/bug?extid=03faa2dc16b8b64be396
is no longer possible.
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Reported-by: syzbot+03faa2dc16b8b64be396@syzkaller.appspotmail.com
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 7df40c2673 ]
Normally, a socket can not be freed/reused unless all its TX packets
left qdisc and were TX-completed. However connect(AF_UNSPEC) allows
this to happen.
With commit fc59d5bdf1 ("pkt_sched: fq: clear time_next_packet for
reused flows") we cleared f->time_next_packet but took no special
action if the flow was still in the throttled rb-tree.
Since f->time_next_packet is the key used in the rb-tree searches,
blindly clearing it might break rb-tree integrity. We need to make
sure the flow is no longer in the rb-tree to avoid this problem.
Fixes: fc59d5bdf1 ("pkt_sched: fq: clear time_next_packet for reused flows")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit a52956dfc5 ]
When application fails to pass flags in netlink TLV when replacing
existing skbmod action, the kernel will leak refcnt:
$ tc actions get action skbmod index 1
total acts 0
action order 0: skbmod pipe set smac 00:11:22:33:44:55
index 1 ref 1 bind 0
For example, at this point a buggy application replaces the action with
index 1 with new smac 00:aa:22:33:44:55, it fails because of zero flags,
however refcnt gets bumped:
$ tc actions get actions skbmod index 1
total acts 0
action order 0: skbmod pipe set smac 00:11:22:33:44:55
index 1 ref 2 bind 0
$
Tha patch fixes this by calling tcf_idr_release() on existing actions.
Fixes: 86da71b573 ("net_sched: Introduce skbmod action")
Signed-off-by: Roman Mashak <mrv@mojatatu.com>
Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit f85900c3e1 ]
The HW doesn't support matching on frag first/later, return error if we are
asked to offload that.
Fixes: 3f7d0eb42d ("net/mlx5e: Offload TC matching on packets being IP fragments")
Signed-off-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 6ad4e91c6d ]
Add check of coalescing parameters received through ethtool are within
range of values supported by the HW.
Driver gets the coalescing rx/tx-usecs and rx/tx-frames as set by the
users through ethtool. The ethtool support up to 32 bit value for each.
However, mlx4 modify cq limits the coalescing time parameter and
coalescing frames parameters to 16 bits.
Return out of range error if user tries to set these parameters to
higher values.
Change type of sample-interval and adaptive_rx_coal parameters in mlx4
driver to u32 as the ethtool holds them as u32 and these parameters are
not limited due to mlx4 HW.
Fixes: c27a02cd94 ('mlx4_en: Add driver for Mellanox ConnectX 10GbE NIC')
Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit a577d868b7 ]
If an error occurs, 'mlx4_en_destroy_netdev()' is called.
It then calls 'mlx4_en_free_resources()' which does the needed resources
cleanup.
So, doing some explicit kfree in the error handling path would lead to
some double kfree.
Simplify code to avoid such a case.
Fixes: 67f8b1dcb9 ("net/mlx4_en: Refactor the XDP forwarding rings scheme")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 5e5add172e ]
In dual_mac mode packets arrived on one port should not be forwarded by
switch hw to another port. Only Linux Host can forward packets between
ports. The below test case (reported in [1]) shows that packet arrived on
one port can be leaked to anoter (reproducible with dual port evms):
- connect port 1 (eth0) to linux Host 0 and run tcpdump or Wireshark
- connect port 2 (eth1) to linux Host 1 with vlan 1 configured
- ping <IPx> from Host 1 through vlan 1 interface.
ARP packets will be seen on Host 0.
Issue happens because dual_mac mode is implemnted using two vlans: 1 (Port
1+Port 0) and 2 (Port 2+Port 0), so there are vlan records created for for
each vlan. By default, the ALE will find valid vlan record in its table
when vlan 1 tagged packet arrived on Port 2 and so forwards packet to all
ports which are vlan 1 members (like Port.
To avoid such behaviorr the ALE VLAN ID Ingress Check need to be enabled
for each external CPSW port (ALE_PORTCTLn.VID_INGRESS_CHECK) so ALE will
drop ingress packets if Rx port is not VLAN member.
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 14224923c3 ]
Currently, skb->len and skb->data_len are set to the page size, not
the packet size. This causes the frame check sequence to not be
located at the "end" of the packet resulting in ethernet frame check
errors. The driver does work currently, but stricter kernel facing
networking solutions like OpenVSwitch will drop these packets as
invalid.
These changes set the packet size correctly so that these errors no
longer occur. The length does not include the frame check sequence, so
that subtraction was removed.
Tested on Oracle/SUN Multithreaded 10-Gigabit Ethernet Network
Controller [108e:abcd] and validated in wireshark.
Signed-off-by: Rob Taglang <rob@taglang.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 1b97013bfb ]
Fix more memory leaks in ip_cmsg_send() callers. Part of them were fixed
earlier in 919483096b.
* udp_sendmsg one was there since the beginning when linux sources were
first added to git;
* ping_v4_sendmsg one was copy/pasted in c319b4d76b.
Whenever return happens in udp_sendmsg() or ping_v4_sendmsg() IP options
have to be freed if they were allocated previously.
Add label so that future callers (if any) can use it instead of kfree()
before return that is easy to forget.
Fixes: c319b4d76b (net: ipv4: add IPPROTO_ICMP socket kind)
Signed-off-by: Andrey Ignatov <rdna@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 94720e3aee ]
Allow some non-cached routes to use non-expired fnhe:
1. ip_del_fnhe: moved above and now called by find_exception.
The 4.5+ commit deed49df73 expires fnhe only when caching
routes. Change that to:
1.1. use fnhe for non-cached local output routes, with the help
from (2)
1.2. allow __mkroute_input to detect expired fnhe (outdated
fnhe_gw, for example) when do_cache is false, eg. when itag!=0
for unicast destinations.
2. __mkroute_output: keep fi to allow local routes with orig_oif != 0
to use fnhe info even when the new route will not be cached into fnhe.
After commit 839da4d989 ("net: ipv4: set orig_oif based on fib
result for local traffic") it means all local routes will be affected
because they are not cached. This change is used to solve a PMTU
problem with IPVS (and probably Netfilter DNAT) setups that redirect
local clients from target local IP (local route to Virtual IP)
to new remote IP target, eg. IPVS TUN real server. Loopback has
64K MTU and we need to create fnhe on the local route that will
keep the reduced PMTU for the Virtual IP. Without this change
fnhe_pmtu is updated from ICMP but never exposed to non-cached
local routes. This includes routes with flowi4_oif!=0 for 4.6+ and
with flowi4_oif=any for 4.14+).
3. update_or_create_fnhe: make sure fnhe_expires is not 0 for
new entries
Fixes: 839da4d989 ("net: ipv4: set orig_oif based on fib result for local traffic")
Fixes: d6d5e999e5 ("route: do not cache fib route info on local routes with oif")
Fixes: deed49df73 ("route: check and remove route cache when we get route")
Cc: David Ahern <dsahern@gmail.com>
Cc: Xin Long <lucien.xin@gmail.com>
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit e8238fc2bd ]
When we set a bond slave's master to bridge via ioctl, we only check
the IFF_BRIDGE_PORT flag. Although we will find the slave's real master
at netdev_master_upper_dev_link() later, it already does some settings
and allocates some resources. It would be better to return as early
as possible.
v1 -> v2:
use netdev_master_upper_dev_get() instead of netdev_has_any_upper_dev()
to check if we have a master, because not all upper devs are masters,
e.g. vlan device.
Reported-by: syzbot+de73361ee4971b6e6f75@syzkaller.appspotmail.com
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4e50d9ebae upstream.
If a controller reset is requested while the device has no namespaces,
we were incorrectly returning ENETRESET. This patch adds the check for
ADMIN_ONLY controller state to indicate a successful reset.
Fixes: 8000d1fdb0 ("nvme-rdma: fix sysfs invoked reset_ctrl error flow ")
Cc: <stable@vger.kernel.org>
Signed-off-by: Charles Machalow <charles.machalow@intel.com>
[changelog]
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9abd68ef45 upstream.
Some P3100 drives have a bug where they think WRRU (weighted round robin)
is always enabled, even though the host doesn't set it. Since they think
it's enabled, they also look at the submission queue creation priority. We
used to set that to MEDIUM by default, but that was removed in commit
81c1cd9835. This causes various issues on that drive. Add a quirk to
still set MEDIUM priority for that controller.
Fixes: 81c1cd9835 ("nvme/pci: Don't set reserved SQ create flags")
Cc: stable@vger.kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c8da6cdef5 upstream.
tmu_read() in case of Exynos4210 might return error for out of bound
values. Current code ignores such value, what leads to reporting critical
temperature value. Add proper error code propagation to exynos_get_temp()
function.
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
CC: stable@vger.kernel.org # v4.6+
Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 88fc6f73fd upstream.
When thermal sensor is not yet enabled, reading temperature might return
random value. This might even result in stopping system booting when such
temperature is higher than the critical value. Fix this by checking if TMU
has been actually enabled before reading the temperature.
This change fixes booting of Exynos4210-based board with TMU enabled (for
example Samsung Trats board), which was broken since v4.4 kernel release.
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Fixes: 9e4249b403 ("thermal: exynos: Fix first temperature read after registering sensor")
CC: stable@vger.kernel.org # v4.6+
Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit fc54910280 upstream.
Jeremy Cline correctly points out in rhbz#1514836 that a device where the
QCA rome chipset needs the USB_QUIRK_RESET_RESUME quirk, may also ship
with a different wifi/bt chipset in some configurations.
If that is the case then we are needlessly penalizing those other chipsets
with a reset-resume quirk, typically causing 0.4W extra power use because
this disables runtime-pm.
This commit moves the DMI table check to a btusb_check_needs_reset_resume()
helper (so that we can easily also call it for other chipsets) and calls
this new helper only for QCA_ROME chipsets for now.
BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=1514836
Cc: stable@vger.kernel.org
Cc: Jeremy Cline <jcline@redhat.com>
Suggested-by: Jeremy Cline <jcline@redhat.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9a62dcf486 upstream.
Commit d50f4630c2 ("arm: dts: Remove p1010-flexcan compatible from imx
series dts") removed the fallback compatible "fsl,p1010-flexcan" from
the imx device trees. As the flexcan cores on i.MX25, i.MX35 and i.MX53
are identical, introduce the first as fallback for the two latter ones.
Fixes: d50f4630c2 ("arm: dts: Remove p1010-flexcan compatible from imx series dts")
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Cc: linux-stable <stable@vger.kernel.org> # >= v4.16
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 97739501f2 upstream.
If the next_freq field of struct sugov_policy is set to UINT_MAX,
it shouldn't be used for updating the CPU frequency (this is a
special "invalid" value), but after commit b7eaf1aab9 (cpufreq:
schedutil: Avoid reducing frequency of busy CPUs prematurely) it
may be passed as the new frequency to sugov_update_commit() in
sugov_update_single().
Fix that by adding an extra check for the special UINT_MAX value
of next_freq to sugov_update_single().
Fixes: b7eaf1aab9 (cpufreq: schedutil: Avoid reducing frequency of busy CPUs prematurely)
Reported-by: Viresh Kumar <viresh.kumar@linaro.org>
Cc: 4.12+ <stable@vger.kernel.org> # 4.12+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit cfcadfaad7 upstream.
Commit 0847684cfc (PCI / PM: Simplify device wakeup settings code)
went too far and dropped the device_may_wakeup() check from
pci_enable_wake() which causes wakeup to be enabled during system
suspend, hibernation or shutdown for some PCI devices that are not
allowed by user space to wake up the system from sleep (or power off).
As a result of this, excessive power is drawn by some of the affected
systems while in sleep states or off.
Restore the device_may_wakeup() check in pci_enable_wake(), but make
sure that the PCI bus type's runtime suspend callback will not call
device_may_wakeup() which is about system wakeup from sleep and not
about device wakeup from runtime suspend.
Fixes: 0847684cfc (PCI / PM: Simplify device wakeup settings code)
Reported-by: Joseph Salisbury <joseph.salisbury@canonical.com>
Cc: 4.13+ <stable@vger.kernel.org> # 4.13+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8feaec33b9 upstream.
USB controller ASM1042 stops working after commit de3ef1eb1c (PM /
core: Drop run_wake flag from struct dev_pm_info).
The device in question is not power managed by platform firmware,
furthermore, it only supports PME# from D3cold:
Capabilities: [78] Power Management version 3
Flags: PMEClk- DSI- D1- D2- AuxCurrent=55mA PME(D0-,D1-,D2-,D3hot-,D3cold+)
Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
Before commit de3ef1eb1c, the device never gets runtime suspended.
After that commit, the device gets runtime suspended to D3hot, which can
not generate any PME#.
usb_hcd_pci_probe() unconditionally calls device_wakeup_enable(), hence
device_can_wakeup() in pci_dev_run_wake() always returns true.
So pci_dev_run_wake() needs to check PME wakeup capability as its first
condition.
In addition, change wakeup flag passed to pci_target_state() from false
to true, because we want to find the deepest state different from D3cold
that the device can still generate PME#. In this case, it's D0 for the
device in question.
Fixes: de3ef1eb1c (PM / core: Drop run_wake flag from struct dev_pm_info)
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Cc: 4.13+ <stable@vger.kernel.org> # 4.13+
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2be147f745 upstream.
pool can be indirectly controlled by user-space, hence leading to
a potential exploitation of the Spectre variant 1 vulnerability.
This issue was detected with the help of Smatch:
drivers/atm/zatm.c:1462 zatm_ioctl() warn: potential spectre issue
'zatm_dev->pool_info' (local cap)
Fix this by sanitizing pool before using it to index
zatm_dev->pool_info
Notice that given that speculation windows are large, the policy is
to kill the speculation on the first load and not worry if it can be
completed with a dependent load/store [1].
[1] https://marc.info/?l=linux-kernel&m=152449131114778&w=2
Cc: stable@vger.kernel.org
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit acf784bd0c upstream.
ioc_data.dev_num can be controlled by user-space, hence leading to
a potential exploitation of the Spectre variant 1 vulnerability.
This issue was detected with the help of Smatch:
net/atm/lec.c:702 lec_vcc_attach() warn: potential spectre issue
'dev_lec'
Fix this by sanitizing ioc_data.dev_num before using it to index
dev_lec. Also, notice that there is another instance in which array
dev_lec is being indexed using ioc_data.dev_num at line 705:
lec_vcc_added(netdev_priv(dev_lec[ioc_data.dev_num]),
Notice that given that speculation windows are large, the policy is
to kill the speculation on the first load and not worry if it can be
completed with a dependent load/store [1].
[1] https://marc.info/?l=linux-kernel&m=152449131114778&w=2
Cc: stable@vger.kernel.org
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f0b408eebc upstream.
Clear the old_state and new_state pointers for every object in
drm_atomic_state_default_clear(). Otherwise
drm_atomic_get_{new,old}_*_state() will hand out stale pointers to
anyone who hasn't first confirmed that the object is in fact part of
the current atomic transcation, if they are called after we've done
the ww backoff dance while hanging on to the same drm_atomic_state.
For example, handle_conflicting_encoders() looks like it could hit
this since it iterates the full connector list and just calls
drm_atomic_get_new_connector_state() for each.
And I believe we have now witnessed this happening at least once in
i915 check_digital_port_conflicts(). Commit 8b69449d26 ("drm/i915:
Remove last references to drm_atomic_get_existing* macros") changed
the safe drm_atomic_get_existing_connector_state() to the unsafe
drm_atomic_get_new_connector_state(), which opened the doors for
this particular bug there as well.
v2: Split private objs out to a separate patch (Daniel)
Cc: stable@vger.kernel.org
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Cc: Abhay Kumar <abhay.kumar@intel.com>
Fixes: 581e49fe6b ("drm/atomic: Add new iterators over all state, v3.")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180502183247.5746-1-ville.syrjala@linux.intel.com
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Sean Paul <seanpaul@chromium.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9d219554d9 upstream.
On intel_dp_compute_config() we were calculating the needed vco
for eDP on gen9 and we stashing it in
intel_atomic_state.cdclk.logical.vco
However few moments later on intel_modeset_checks() we fully
replace entire intel_atomic_state.cdclk.logical with
dev_priv->cdclk.logical fully overwriting the logical desired
vco for eDP on gen9.
So, with wrong VCO value we end up with wrong desired cdclk, but
also it will raise a lot of WARNs: On gen9, when we read
CDCLK_CTL to verify if we configured properly the desired
frequency the CD Frequency Select bits [27:26] == 10b can mean
337.5 or 308.57 MHz depending on the VCO. So if we have wrong
VCO value stashed we will believe the frequency selection didn't
stick and start to raise WARNs of cdclk mismatch.
[ 42.857519] [drm:intel_dump_cdclk_state [i915]] Changing CDCLK to 308571 kHz, VCO 8640000 kHz, ref 24000 kHz, bypass 24000 kHz, voltage level 0
[ 42.897269] cdclk state doesn't match!
[ 42.901052] WARNING: CPU: 5 PID: 1116 at drivers/gpu/drm/i915/intel_cdclk.c:2084 intel_set_cdclk+0x5d/0x110 [i915]
[ 42.938004] RIP: 0010:intel_set_cdclk+0x5d/0x110 [i915]
[ 43.155253] WARNING: CPU: 5 PID: 1116 at drivers/gpu/drm/i915/intel_cdclk.c:2084 intel_set_cdclk+0x5d/0x110 [i915]
[ 43.170277] [drm:intel_dump_cdclk_state [i915]] [hw state] 337500 kHz, VCO 8100000 kHz, ref 24000 kHz, bypass 24000 kHz, voltage level 0
[ 43.182566] [drm:intel_dump_cdclk_state [i915]] [sw state] 308571 kHz, VCO 8640000 kHz, ref 24000 kHz, bypass 24000 kHz, voltage level 0
v2: Move the entire eDP's vco logical adjustment to inside
the skl_modeset_calc_cdclk as suggested by Ville.
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Fixes: bb0f4aab0e ("drm/i915: Track full cdclk state for the logical and actual cdclk frequencies")
Cc: <stable@vger.kernel.org> # v4.12+
Link: https://patchwork.freedesktop.org/patch/msgid/20180502175255.5344-1-rodrigo.vivi@intel.com
(cherry picked from commit 3297234a05)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e8f48f96db upstream.
Fix `[drm:intel_enable_lvds] *ERROR* timed out waiting for panel to
power on` in kernel log at boot time.
Toshiba Satellite Z930 laptops needs between 1 and 2 seconds to power
on its screen during Intel i915 DRM initialization. This currently
results in a `[drm:intel_enable_lvds] *ERROR* timed out waiting for
panel to power on` message appearing in the kernel log during boot
time and when stopping the machine.
This change increases the timeout of the `intel_enable_lvds` function
from 1 to 5 seconds, letting enough time for the Satellite 930 LCD
screen to power on, and suppressing the error message from the kernel
log.
This patch has been successfully tested on Linux 4.14 running on a
Toshiba Satellite Z930.
[vsyrjala: bump the timeout from 2 to 5 seconds to match the DP
code and properly cover the max hw timeout of ~4 seconds, and
drop the comment about the specific machine since this is not
a particulary surprising issue, nor specific to that one machine]
Signed-off-by: Florent Flament <contact@florentflament.com>
Cc: stable@vger.kernel.org
Cc: Pavel Petrovic <ppetrovic@acm.org>
Cc: Sérgio M. Basto <sergio@serjux.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103414
References: https://bugzilla.kernel.org/show_bug.cgi?id=57591
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180419160700.19828-1-ville.syrjala@linux.intel.com
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
(cherry picked from commit 280b54ade5)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit da291320ba upstream.
GFP_TRANSHUGE tries very hard to allocate huge pages, which can result
in long delays with high memory pressure. I have observed firefox
freezing for up to around a minute due to this while restic was taking
a full system backup.
Since we don't really need huge pages, use GFP_TRANSHUGE_LIGHT |
__GFP_NORETRY instead, in order to fail quickly when there are no huge
pages available.
Set __GFP_KSWAPD_RECLAIM as well, in order for huge pages to be freed
up in the background if necessary.
With these changes, I'm no longer seeing freezes during a restic backup.
Cc: stable@vger.kernel.org
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a2ee41fd95 upstream.
One layout supported by the Marvell NAND controller supports NAND pages
of 2048 bytes, all handled in one single chunk when using BCH with a
strength of 4-bit per 512 bytes. In this case, instead of the generic
XTYPE_WRITE_DISPATCH/XTYPE_LAST_NAKED_RW couple, the controller expects
to receive XTYPE_MONOLITHIC_RW.
This fixes problems at boot like:
[ 1.315475] Scanning device for bad blocks
[ 3.203108] marvell-nfc f10d0000.flash: Timeout waiting for RB signal
[ 3.209564] nand_bbt: error while writing BBT block -110
[ 4.243106] marvell-nfc f10d0000.flash: Timeout waiting for RB signal
[ 5.283106] marvell-nfc f10d0000.flash: Timeout waiting for RB signal
[ 5.289562] nand_bbt: error -110 while marking block 2047 bad
[ 6.323106] marvell-nfc f10d0000.flash: Timeout waiting for RB signal
[ 6.329559] nand_bbt: error while writing BBT block -110
[ 7.363106] marvell-nfc f10d0000.flash: Timeout waiting for RB signal
[ 8.403105] marvell-nfc f10d0000.flash: Timeout waiting for RB signal
[ 8.409559] nand_bbt: error -110 while marking block 2046 bad
...
Fixes: 02f26ecf8c ("mtd: nand: add reworked Marvell NAND controller driver")
Cc: stable@vger.kernel.org
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Tested-by: Chris Packham <chris.packham@alliedtelesis.co.nz>
Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 32bee8f48f upstream.
When sending packets as fast as possible using "cangen -g 0 -i -x", the
HI-3110 occasionally latches the interrupt pin high on completion of a
packet, but doesn't set the TXCPLT bit in the INTF register. The INTF
register contains 0x00 as if no interrupt has occurred. Even waiting
for a few milliseconds after the interrupt doesn't help.
Work around this apparent erratum by instead checking the TXMTY bit in
the STATF register ("TX FIFO empty"). We know that we've queued up a
packet for transmission if priv->tx_len is nonzero. If the TX FIFO is
empty, transmission of that packet must have completed.
Note that this is congruent with our handling of received packets, which
likewise gleans from the STATF register whether a packet is waiting in
the RX FIFO, instead of looking at the INTF register.
Cc: Mathias Duckeck <m.duckeck@kunbus.de>
Cc: Akshay Bhat <akshay.bhat@timesys.com>
Cc: Casey Fitzpatrick <casey.fitzpatrick@timesys.com>
Cc: stable@vger.kernel.org # v4.12+
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Acked-by: Akshay Bhat <akshay.bhat@timesys.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5cec9425b4 upstream.
hi3110_get_berr_counter() may run concurrently to the rest of the driver
but neglects to acquire the lock protecting access to the SPI device.
As a result, it and the rest of the driver may clobber each other's tx
and rx buffers.
We became aware of this issue because transmission of packets with
"cangen -g 0 -i -x" frequently hung. It turns out that agetty executes
->do_get_berr_counter every few seconds via the following call stack:
CPU: 2 PID: 1605 Comm: agetty
[<7f3f7500>] (hi3110_get_berr_counter [hi311x])
[<7f130204>] (can_fill_info [can_dev])
[<80693bc0>] (rtnl_fill_ifinfo)
[<806949ec>] (rtnl_dump_ifinfo)
[<806b4834>] (netlink_dump)
[<806b4bc8>] (netlink_recvmsg)
[<8065f180>] (sock_recvmsg)
[<80660f90>] (___sys_recvmsg)
[<80661e7c>] (__sys_recvmsg)
[<80661ec0>] (SyS_recvmsg)
[<80108b20>] (ret_fast_syscall+0x0/0x1c)
agetty listens to netlink messages in order to update the login prompt
when IP addresses change (if /etc/issue contains \4 or \6 escape codes):
https://git.kernel.org/pub/scm/utils/util-linux/util-linux.git/commit/?id=e36deb6424e8
It's a useful feature, though it seems questionable that it causes CAN
bit error statistics to be queried.
Be that as it may, if hi3110_get_berr_counter() is invoked while a frame
is sent by hi3110_hw_tx(), bogus SPI transfers like the following may
occur:
=> 12 00 (hi3110_get_berr_counter() wanted to transmit
EC 00 to query the transmit error counter,
but the first byte was overwritten by
hi3110_hw_tx_frame())
=> EA 00 3E 80 01 FB (hi3110_hw_tx_frame() wanted to transmit a
frame, but the first byte was overwritten by
hi3110_get_berr_counter() because it wanted
to query the receive error counter)
This sequence hangs the transmission because the driver believes it has
sent a frame and waits for the interrupt signaling completion, but in
reality the chip has never sent away the frame since the commands it
received were malformed.
Fix by acquiring the SPI lock in hi3110_get_berr_counter().
I've scrutinized the entire driver for further unlocked SPI accesses but
found no others.
Cc: Mathias Duckeck <m.duckeck@kunbus.de>
Cc: Akshay Bhat <akshay.bhat@timesys.com>
Cc: Casey Fitzpatrick <casey.fitzpatrick@timesys.com>
Cc: Stef Walter <stefw@redhat.com>
Cc: Karel Zak <kzak@redhat.com>
Cc: stable@vger.kernel.org # v4.12+
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Reviewed-by: Akshay Bhat <akshay.bhat@timesys.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0e030a373d upstream.
In commit 88462d2a78 ("can: flexcan: Remodel FlexCAN register r/w APIs
for big endian FlexCAN controllers.") the following logic was
implemented:
if the dt property "big-endian" is given or
the device is compatible to "fsl,p1010-flexcan":
use big-endian mode;
else
use little-endian mode;
This relies on commit d50f4630c2 ("arm: dts: Remove p1010-flexcan
compatible from imx series dts") which was applied a few commits later.
Without this commit (or an old device tree used for booting a new
kernel) the flexcan devices on i.MX25, i.MX28, i.MX35 and i.MX53 match
the 'the device is compatible to "fsl,p1010-flexcan"' test and so are
switched erroneously to big endian mode.
Instead of the check above put a quirk in devtype data and rely on
of_match_device yielding the most compatible match
Fixes: 88462d2a78 ("can: flexcan: Remodel FlexCAN register r/w APIs for big endian FlexCAN controllers.")
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Tested-by: Gavin Schenk <g.schenk@eckelmann.de>
Cc: linux-stable <stable@vger.kernel.org> # >= v4.16
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3a15b38fd2 upstream.
rsize/wsize cap should be applied before ceph_osdc_new_request() is
called. Otherwise, if the size is limited by the cap instead of the
stripe unit, ceph_osdc_new_request() would setup an extent op that is
bigger than what dio_get_pages_alloc() would pin and add to the page
vector, triggering asserts in the messenger.
Cc: stable@vger.kernel.org
Fixes: 95cca2b44e ("ceph: limit osd write size")
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: "Yan, Zheng" <zyan@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 27ae357fa8 upstream.
Since exit_mmap() is done without the protection of mm->mmap_sem, it is
possible for the oom reaper to concurrently operate on an mm until
MMF_OOM_SKIP is set.
This allows munlock_vma_pages_all() to concurrently run while the oom
reaper is operating on a vma. Since munlock_vma_pages_range() depends
on clearing VM_LOCKED from vm_flags before actually doing the munlock to
determine if any other vmas are locking the same memory, the check for
VM_LOCKED in the oom reaper is racy.
This is especially noticeable on architectures such as powerpc where
clearing a huge pmd requires serialize_against_pte_lookup(). If the pmd
is zapped by the oom reaper during follow_page_mask() after the check
for pmd_none() is bypassed, this ends up deferencing a NULL ptl or a
kernel oops.
Fix this by manually freeing all possible memory from the mm before
doing the munlock and then setting MMF_OOM_SKIP. The oom reaper can not
run on the mm anymore so the munlock is safe to do in exit_mmap(). It
also matches the logic that the oom reaper currently uses for
determining when to set MMF_OOM_SKIP itself, so there's no new risk of
excessive oom killing.
This issue fixes CVE-2018-1000200.
Link: http://lkml.kernel.org/r/alpine.DEB.2.21.1804241526320.238665@chino.kir.corp.google.com
Fixes: 2129258024 ("mm: oom: let oom_reap_task and exit_mmap run concurrently")
Signed-off-by: David Rientjes <rientjes@google.com>
Suggested-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: <stable@vger.kernel.org> [4.14+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit dc432c3d7f upstream.
The regex match function regex_match_front() in the tracing filter logic,
was fixed to test just the pattern length from testing the entire test
string. That is, it went from strncmp(str, r->pattern, len) to
strcmp(str, r->pattern, r->len).
The issue is that str is not guaranteed to be nul terminated, and if r->len
is greater than the length of str, it can access more memory than is
allocated.
The solution is to add a simple test if (len < r->len) return 0.
Cc: stable@vger.kernel.org
Fixes: 285caad415 ("tracing/filters: Fix MATCH_FRONT_ONLY filter matching")
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 184add2ca2 upstream.
Richard Jones has reported that using med_power_with_dipm on a T450s
with a Sandisk SD7UB3Q256G1001 SSD (firmware version X2180501) is
causing the machine to hang.
Switching the LPM to max_performance fixes this, so it seems that
this Sandisk SSD does not handle LPM well.
Note in the past there have been bug-reports about the following
Sandisk models not working with min_power, so we may need to extend
the quirk list in the future: name - firmware
Sandisk SD6SB2M512G1022I - X210400
Sandisk SD6PP4M-256G-1006 - A200906
Cc: stable@vger.kernel.org
Cc: Richard W.M. Jones <rjones@redhat.com>
Reported-and-tested-by: Richard W.M. Jones <rjones@redhat.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ab3dbcf78f upstream.
If the main loop in linehandle_create() encounters an error, it
unwinds completely by freeing all previously requested GPIO
descriptors. However, if the error occurs in the beginning of
the loop before that GPIO is requested, then the exit code
attempts to free a null descriptor. If extrachecks is enabled,
gpiod_free() triggers a WARN_ON.
Instead, keep a separate count of legitimate GPIOs so that only
those are freed.
Cc: stable@vger.kernel.org
Fixes: d7c51b47ac ("gpio: userspace ABI for reading/writing GPIO lines")
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Timur Tabi <timur@codeaurora.org>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0a0b987344 upstream.
Commit 3a4d44b616 ("ntp: Move adjtimex related compat syscalls to
native counterparts") removed the memset() in compat_get_timex(). Since
then, the compat adjtimex syscall can invoke do_adjtimex() with an
uninitialized ->tai.
If do_adjtimex() doesn't write to ->tai (e.g. because the arguments are
invalid), compat_put_timex() then copies the uninitialized ->tai field
to userspace.
Fix it by adding the memset() back.
Fixes: 3a4d44b616 ("ntp: Move adjtimex related compat syscalls to native counterparts")
Signed-off-by: Jann Horn <jannh@google.com>
Acked-by: Kees Cook <keescook@chromium.org>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b8b784958e upstream.
Syzbot has reported that it can hit a NULL pointer dereference in
wb_workfn() due to wb->bdi->dev being NULL. This indicates that
wb_workfn() was called for an already unregistered bdi which should not
happen as wb_shutdown() called from bdi_unregister() should make sure
all pending writeback works are completed before bdi is unregistered.
Except that wb_workfn() itself can requeue the work with:
mod_delayed_work(bdi_wq, &wb->dwork, 0);
and if this happens while wb_shutdown() is waiting in:
flush_delayed_work(&wb->dwork);
the dwork can get executed after wb_shutdown() has finished and
bdi_unregister() has cleared wb->bdi->dev.
Make wb_workfn() use wakeup_wb() for requeueing the work which takes all
the necessary precautions against racing with bdi unregistration.
CC: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
CC: Tejun Heo <tj@kernel.org>
Fixes: 839a8e8660
Reported-by: syzbot <syzbot+9873874c735f2892e7e9@syzkaller.appspotmail.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f53823c181 upstream.
syzbot is reporting use after free bug in debugfs_remove() [1].
This is because fault injection made memory allocation for
debugfs_create_file() from bdi_debug_register() from bdi_register_va()
fail and continued with setting WB_registered. But when debugfs_remove()
is called from debugfs_remove(bdi->debug_dir) from bdi_debug_unregister()
from bdi_unregister() from release_bdi() because WB_registered was set
by bdi_register_va(), IS_ERR_OR_NULL(bdi->debug_dir) == false despite
debugfs_remove(bdi->debug_dir) was already called from bdi_register_va().
Fix this by making IS_ERR_OR_NULL(bdi->debug_dir) == true.
[1] https://syzkaller.appspot.com/bug?id=5ab4efd91a96dcea9b68104f159adf4af2a6dfc1
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Reported-by: syzbot <syzbot+049cb4ae097049dac137@syzkaller.appspotmail.com>
Fixes: 97f0769793 ("bdi: convert bdi_debug_register to int")
Cc: weiping zhang <zhangweiping@didichuxing.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 23a27722b5 upstream.
i2cdev_ioctl_rdwr() allocates i2c_msg.buf using memdup_user(), which
returns ZERO_SIZE_PTR if i2c_msg.len is zero.
Currently i2cdev_ioctl_rdwr() always dereferences the buf pointer in case
of I2C_M_RD | I2C_M_RECV_LEN transfer. That causes a kernel oops in
case of zero len.
Let's check the len against zero before dereferencing buf pointer.
This issue was triggered by syzkaller.
Signed-off-by: Alexander Popov <alex.popov@linux.com>
Reviewed-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
[wsa: use '< 1' instead of '!' for easier readability]
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4eaf431f6f upstream.
syzbot has triggered a NULL ptr dereference when allocation fault
injection enforces a failure and alloc_mem_cgroup_per_node_info
initializes memcg->nodeinfo only half way through.
But __mem_cgroup_free still tries to free all per-node data and
dereferences pn->lruvec_stat_cpu unconditioanlly even if the specific
per-node data hasn't been initialized.
The bug is quite unlikely to hit because small allocations do not fail
and we would need quite some numa nodes to make struct
mem_cgroup_per_node large enough to cross the costly order.
Link: http://lkml.kernel.org/r/20180406100906.17790-1-mhocko@kernel.org
Reported-by: syzbot+8a5de3cce7cdc70e9ebe@syzkaller.appspotmail.com
Fixes: 00f3ca2c2d ("mm: memcontrol: per-lruvec stats infrastructure")
Signed-off-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3a38bb98d9 upstream.
syzbot reported a possible deadlock in perf_event_detach_bpf_prog.
The error details:
======================================================
WARNING: possible circular locking dependency detected
4.16.0-rc7+ #3 Not tainted
------------------------------------------------------
syz-executor7/24531 is trying to acquire lock:
(bpf_event_mutex){+.+.}, at: [<000000008a849b07>] perf_event_detach_bpf_prog+0x92/0x3d0 kernel/trace/bpf_trace.c:854
but task is already holding lock:
(&mm->mmap_sem){++++}, at: [<0000000038768f87>] vm_mmap_pgoff+0x198/0x280 mm/util.c:353
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #1 (&mm->mmap_sem){++++}:
__might_fault+0x13a/0x1d0 mm/memory.c:4571
_copy_to_user+0x2c/0xc0 lib/usercopy.c:25
copy_to_user include/linux/uaccess.h:155 [inline]
bpf_prog_array_copy_info+0xf2/0x1c0 kernel/bpf/core.c:1694
perf_event_query_prog_array+0x1c7/0x2c0 kernel/trace/bpf_trace.c:891
_perf_ioctl kernel/events/core.c:4750 [inline]
perf_ioctl+0x3e1/0x1480 kernel/events/core.c:4770
vfs_ioctl fs/ioctl.c:46 [inline]
do_vfs_ioctl+0x1b1/0x1520 fs/ioctl.c:686
SYSC_ioctl fs/ioctl.c:701 [inline]
SyS_ioctl+0x8f/0xc0 fs/ioctl.c:692
do_syscall_64+0x281/0x940 arch/x86/entry/common.c:287
entry_SYSCALL_64_after_hwframe+0x42/0xb7
-> #0 (bpf_event_mutex){+.+.}:
lock_acquire+0x1d5/0x580 kernel/locking/lockdep.c:3920
__mutex_lock_common kernel/locking/mutex.c:756 [inline]
__mutex_lock+0x16f/0x1a80 kernel/locking/mutex.c:893
mutex_lock_nested+0x16/0x20 kernel/locking/mutex.c:908
perf_event_detach_bpf_prog+0x92/0x3d0 kernel/trace/bpf_trace.c:854
perf_event_free_bpf_prog kernel/events/core.c:8147 [inline]
_free_event+0xbdb/0x10f0 kernel/events/core.c:4116
put_event+0x24/0x30 kernel/events/core.c:4204
perf_mmap_close+0x60d/0x1010 kernel/events/core.c:5172
remove_vma+0xb4/0x1b0 mm/mmap.c:172
remove_vma_list mm/mmap.c:2490 [inline]
do_munmap+0x82a/0xdf0 mm/mmap.c:2731
mmap_region+0x59e/0x15a0 mm/mmap.c:1646
do_mmap+0x6c0/0xe00 mm/mmap.c:1483
do_mmap_pgoff include/linux/mm.h:2223 [inline]
vm_mmap_pgoff+0x1de/0x280 mm/util.c:355
SYSC_mmap_pgoff mm/mmap.c:1533 [inline]
SyS_mmap_pgoff+0x462/0x5f0 mm/mmap.c:1491
SYSC_mmap arch/x86/kernel/sys_x86_64.c:100 [inline]
SyS_mmap+0x16/0x20 arch/x86/kernel/sys_x86_64.c:91
do_syscall_64+0x281/0x940 arch/x86/entry/common.c:287
entry_SYSCALL_64_after_hwframe+0x42/0xb7
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0 CPU1
---- ----
lock(&mm->mmap_sem);
lock(bpf_event_mutex);
lock(&mm->mmap_sem);
lock(bpf_event_mutex);
*** DEADLOCK ***
======================================================
The bug is introduced by Commit f371b304f1 ("bpf/tracing: allow
user space to query prog array on the same tp") where copy_to_user,
which requires mm->mmap_sem, is called inside bpf_event_mutex lock.
At the same time, during perf_event file descriptor close,
mm->mmap_sem is held first and then subsequent
perf_event_detach_bpf_prog needs bpf_event_mutex lock.
Such a senario caused a deadlock.
As suggested by Daniel, moving copy_to_user out of the
bpf_event_mutex lock should fix the problem.
Fixes: f371b304f1 ("bpf/tracing: allow user space to query prog array on the same tp")
Reported-by: syzbot+dc5ca0e4c9bfafaf2bae@syzkaller.appspotmail.com
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b1993a2de1 upstream.
syzbot reported :
BUG: KMSAN: uninit-value in rtnh_ok include/net/nexthop.h:11 [inline]
BUG: KMSAN: uninit-value in fib_count_nexthops net/ipv4/fib_semantics.c:469 [inline]
BUG: KMSAN: uninit-value in fib_create_info+0x554/0x8d20 net/ipv4/fib_semantics.c:1091
@remaining is an integer, coming from user space.
If it is negative we want rtnh_ok() to return false.
Fixes: 4e902c5741 ("[IPv4]: FIB configuration using struct fib_config")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 53d0e83f93 upstream.
rds_tcp_connection allocation/free management has the potential to be
called from __rds_conn_create after IRQs have been disabled, so
spin_[un]lock_bh cannot be used with rds_tcp_conn_lock.
Bottom-halves that need to synchronize for critical sections protected
by rds_tcp_conn_lock should instead use rds_destroy_pending() correctly.
Reported-by: syzbot+c68e51bb5e699d3f8d91@syzkaller.appspotmail.com
Fixes: ebeeb1ad9b ("rds: tcp: use rds_destroy_pending() to synchronize
netns/module teardown and rds connection/workq management")
Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 269bd202bc upstream.
The introduction of support for CLK_SET_RATE_PARENT flag for clkctrl
clocks used a generic clock flag, which causes a conflict with the
rest of the clkctrl flags, namely the NO_IDLEST flag. This can cause
boot failures on certain platforms where this flag is introduced, by
omitting the wait for the clockctrl module to be fully enabled before
proceeding with rest of the code.
Fix this by moving all the clkctrl specific flags to their own bit-range.
Signed-off-by: Tero Kristo <t-kristo@ti.com>
Fixes: 49159a9dc3 ("clk: ti: add support for CLK_SET_RATE_PARENT flag")
Reported-by: Christophe Lyon <christophe.lyon@linaro.org>
Tested-by: Tony Lindgren <tony@atomide.com
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
Cc: Sam Protsenko <semen.protsenko@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3f1e53abff upstream.
Dmitry reports 32bit ebtables on 64bit kernel got broken by
a recent change that returns -EINVAL when ruleset has no entries.
ebtables however only counts user-defined chains, so for the
initial table nentries will be 0.
Don't try to allocate the compat array in this case, as no user
defined rules exist no rule will need 64bit translation.
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Fixes: 7d7d7e0211 ("netfilter: compat: reject huge allocation requests")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0c92c7a3c5 upstream.
As Miklos reported and suggested:
This pattern repeats two times in trace_uprobe.c and in
kernel/events/core.c as well:
ret = kern_path(filename, LOOKUP_FOLLOW, &path);
if (ret)
goto fail_address_parse;
inode = igrab(d_inode(path.dentry));
path_put(&path);
And it's wrong. You can only hold a reference to the inode if you
have an active ref to the superblock as well (which is normally
through path.mnt) or holding s_umount.
This way unmounting the containing filesystem while the tracepoint is
active will give you the "VFS: Busy inodes after unmount..." message
and a crash when the inode is finally put.
Solution: store path instead of inode.
This patch fixes two instances in trace_uprobe.c. struct path is added to
struct trace_uprobe to keep the inode and containing mount point
referenced.
Link: http://lkml.kernel.org/r/20180423172135.4050588-1-songliubraving@fb.com
Fixes: f3f096cfed ("tracing: Provide trace events interface for uprobes")
Fixes: 33ea4b2427 ("perf/core: Implement the 'perf_uprobe' PMU")
Cc: stable@vger.kernel.org
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Howard McLauchlan <hmclauchlan@fb.com>
Cc: Josef Bacik <jbacik@fb.com>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Acked-by: Miklos Szeredi <mszeredi@redhat.com>
Reported-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2aae7bcfa4 upstream.
Because of how the code flips between tsc-early and tsc clocksources
it might need to mark one or both unstable. The current code in
mark_tsc_unstable() only worked because previously it registered the
tsc clocksource once and then never touched it.
Since it now unregisters the tsc-early clocksource, it needs to know
if a clocksource got unregistered and the current cs->mult test
doesn't work for that. Instead use list_empty(&cs->list) to test for
registration.
Furthermore, since clocksource_mark_unstable() needs to place the cs
on the wd_list, it links the cs->list and cs->wd_list serialization.
It must not see a clocsource registered (!empty cs->list) but already
past dequeue_watchdog(). So place {en,de}queue{,_watchdog}() under the
same lock.
Provided cs->list is initialized to empty, this then allows us to
unconditionally use clocksource_mark_unstable(), regardless of the
registration state.
Fixes: aa83c45762 ("x86/tsc: Introduce early tsc clocksource")
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Tested-by: Diego Viola <diego.viola@gmail.com>
Cc: len.brown@intel.com
Cc: rjw@rjwysocki.net
Cc: diego.viola@gmail.com
Cc: rui.zhang@intel.com
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20180502135312.GS12217@hirez.programming.kicks-ass.net
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ecf08dad72 upstream.
Since the commit "8003c9ae204e: add APIC Timer periodic/oneshot mode VMX
preemption timer support", a Windows 10 guest has some erratic timer
spikes.
Here the results on a 150000 times 1ms timer without any load:
Before 8003c9ae20 | After 8003c9ae20
Max 1834us | 86000us
Mean 1100us | 1021us
Deviation 59us | 149us
Here the results on a 150000 times 1ms timer with a cpu-z stress test:
Before 8003c9ae20 | After 8003c9ae20
Max 32000us | 140000us
Mean 1006us | 1997us
Deviation 140us | 11095us
The root cause of the problem is starting hrtimer with an expiry time
already in the past can take more than 20 milliseconds to trigger the
timer function. It can be solved by forward such past timers
immediately, rather than submitting them to hrtimer_start().
In case the timer is periodic, update the target expiration and call
hrtimer_start with it.
v2: Check if the tsc deadline is already expired. Thank you Mika.
v3: Execute the past timers immediately rather than submitting them to
hrtimer_start().
v4: Rearm the periodic timer with advance_periodic_target_expiration() a
simpler version of set_target_expiration(). Thank you Paolo.
Cc: Mika Penttilä <mika.penttila@nextfour.com>
Cc: Wanpeng Li <kernellwp@gmail.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Anthoine Bourgeois <anthoine.bourgeois@blade-group.com>
8003c9ae20 ("KVM: LAPIC: add APIC Timer periodic/oneshot mode VMX preemption timer support")
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7fe3fa3b5e upstream.
As reported by Randy Dunlap:
>> WARNING: unmet direct dependencies detected for DELL_SMBIOS
>> Depends on [m]: X86 [=y] && X86_PLATFORM_DEVICES [=y]
>> && (DCDBAS [=m] ||
>> DCDBAS [=m]=n) && (ACPI_WMI [=n] || ACPI_WMI [=n]=n)
>> Selected by [y]:
>> - DELL_LAPTOP [=y] && X86 [=y] && X86_PLATFORM_DEVICES [=y]
>> && DMI [=y]
>> && BACKLIGHT_CLASS_DEVICE [=y] && (ACPI_VIDEO [=n] ||
>> ACPI_VIDEO [=n]=n)
>> && (RFKILL [=n] || RFKILL [=n]=n) && SERIO_I8042 [=y]
>>
Right now it's possible to set dell laptop to compile in but this
causes dell-smbios to compile in which breaks if dcdbas is a module.
Dell laptop shouldn't select dell-smbios anymore, but depend on it.
Fixes: 32d7b19bad (platform/x86: dell-smbios: Resolve dependency error on DCDBAS)
Reported-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Mario Limonciello <mario.limonciello@dell.com>
Cc: stable@vger.kernel.org
Signed-off-by: Darren Hart (VMware) <dvhart@infradead.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9aea9b6cc7 upstream.
The usb_request pointer could be NULL in musb_g_tx(), where the
tracepoint call would trigger the NULL pointer dereference failure when
parsing the members of the usb_request pointer.
Move the tracepoint call to where the usb_request pointer is already
checked to solve the issue.
Fixes: fc78003e53 ("usb: musb: gadget: add usb-request tracepoints")
Cc: stable@vger.kernel.org # v4.8+
Signed-off-by: Bin Liu <b-liu@ti.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2b63f1329d upstream.
musb_start_urb() doesn't check the pass-in parameter if it is NULL. But
in musb_bulk_nak_timeout() the parameter passed to musb_start_urb() is
returned from first_qh(), which could be NULL.
So wrap the musb_start_urb() call here with a if condition check to
avoid the potential NULL pointer dereference.
Fixes: f283862f3b ("usb: musb: NAK timeout scheme on bulk TX endpoint")
Cc: stable@vger.kernel.org # v3.7+
Signed-off-by: Bin Liu <b-liu@ti.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c3a65808f0 upstream.
Reimplement interface masking using device flags stored directly in the
device-id table. This will make it easier to add and maintain device-id
entries by using a more compact and readable notation compared to the
current implementation (which manages pairs of masks in separate
blacklist structs).
Two convenience macros are used to flag an interface as either reserved
or as not supporting modem-control requests:
{ USB_DEVICE(TELIT_VENDOR_ID, TELIT_PRODUCT_ME910_DUAL_MODEM),
.driver_info = NCTRL(0) | RSVD(3) },
For now, we limit the highest maskable interface number to seven, which
allows for (up to 16) additional device flags to be added later should
need arise.
Note that this will likely need to be backported to stable in order to
make future device-id backports more manageable.
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit fb5ee84ea7 upstream.
Some non-compliant high-speed USB devices have bulk endpoints with a
1024-byte maxpacket size. Although such endpoints don't work with
xHCI host controllers, they do work with EHCI controllers. We used to
accept these invalid sizes (with a warning), but we no longer do
because of an unintentional change introduced by commit aed9d65ac3
("USB: validate wMaxPacketValue entries in endpoint descriptors").
This patch restores the old behavior, so that people with these
peculiar devices can use them without patching their kernels by hand.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Suggested-by: Elvinas <elvinas@veikia.lt>
Fixes: aed9d65ac3 ("USB: validate wMaxPacketValue entries in endpoint descriptors")
CC: <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 96bd39df29 upstream.
dwc3_ep_dequeue() waits for completion of End Transfer command using
wait_event_lock_irq(), which will release the dwc3->lock while waiting
and reacquire after completion. This allows a potential race condition
with ep_disable() which also removes all requests from started_list
and pending_list.
The check for NULL r->trb should catch this but currently it exits to
the wrong 'out1' label which calls dwc3_gadget_giveback(). Since its
list entry was already removed, if CONFIG_DEBUG_LIST is enabled a
'list_del corruption' bug is thrown since its next/prev pointers are
already LIST_POISON1/2. If r->trb is NULL it should simply exit to
'out0'.
Fixes: cf3113d893 ("usb: dwc3: gadget: properly increment dequeue pointer on ep_dequeue")
Cc: stable@vger.kernel.org # v4.12+
Signed-off-by: Mayank Rana <mrana@codeaurora.org>
Signed-off-by: Jack Pham <jackp@codeaurora.org>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4842ed5bfc upstream.
If we get an invalid device configuration from a palm 3 type device, we
might incorrectly parse things, and we have the potential to crash in
"interesting" ways.
Fix this up by verifying the size of the configuration passed to us by
the device, and only if it is correct, will we handle it.
Note that this also fixes an information leak of slab data.
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Reviewed-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ johan: add comment about the info leak ]
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 44a182b9d1 upstream.
KASAN found a use-after-free in xhci_free_virt_device+0x33b/0x38e
where xhci_free_virt_device() sets slot id to 0 if udev exists:
if (dev->udev && dev->udev->slot_id)
dev->udev->slot_id = 0;
dev->udev will be true even if udev is freed because dev->udev is
not set to NULL.
set dev->udev pointer to NULL in xhci_free_dev()
The original patch went to stable so this fix needs to be applied
there as well.
Fixes: a400efe455 ("xhci: zero usb device slot_id member when disabling and freeing a xhci slot")
Cc: <stable@vger.kernel.org>
Reported-by: Guenter Roeck <linux@roeck-us.net>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e538409257 upstream.
Commit 65c7923057 tried to clear the custom firmware path on exit by
writing a single space to the firmware_class.path parameter. This
doesn't work because nothing strips this space from the value stored
and fw_get_filesystem_firmware() only ignores zero-length paths.
Instead, write a null byte.
Fixes: 0a8adf5847 ("test: add firmware_class loader test")
Fixes: 65c7923057 ("test_firmware: fix setting old custom fw path back on exit")
Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Acked-by: Luis R. Rodriguez <mcgrof@kernel.org>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f7aef1c207 upstream.
Commit b9f19259b8 ("drm/vc4: Add the DRM_IOCTL_VC4_GEM_MADVISE ioctl")
introduced a mechanism to mark some BOs as purgeable to allow the driver
to drop them under memory pressure. In order to implement this feature
we had to add a mechanism to mark BOs as currently used by a piece of
hardware which materialized through the ->usecnt counter.
Plane code is supposed to increment usecnt when it attaches a BO to a
plane and decrement it when it's done with this BO, which was done in
the ->prepare_fb() and ->cleanup_fb() hooks. The problem is, async page
flip logic does not go through the regular atomic update path, and
->prepare_fb() and ->cleanup_fb() are not called in this case.
Fix that by manually calling vc4_bo_{inc,dec}_usecnt() in the
async-page-flip path.
Note that all this should go away as soon as we get generic async page
flip support in the core, in the meantime, this fix should do the
trick.
Fixes: b9f19259b8 ("drm/vc4: Add the DRM_IOCTL_VC4_GEM_MADVISE ioctl")
Reported-by: Peter Robinson <pbrobinson@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com>
Signed-off-by: Eric Anholt <eric@anholt.net>
Link: https://patchwork.freedesktop.org/patch/msgid/20180430133232.32457-1-boris.brezillon@bootlin.com
Link: https://patchwork.freedesktop.org/patch/msgid/20180430133232.32457-1-boris.brezillon@bootlin.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 998ac6d21c upstream.
In preivous patch:
Btrfs: kill trans in run_delalloc_nocow and btrfs_cross_ref_exist
We avoid starting btrfs transaction and get this information from
fs_info->running_transaction directly.
When accessing running_transaction in check_delayed_ref, there's a
chance that current transaction will be freed by commit transaction
after the NULL pointer check of running_transaction is passed.
After looking all the other places using fs_info->running_transaction,
they are either protected by trans_lock or holding the transactions.
Fix this by using trans_lock and increasing the use_count.
Fixes: e4c3b2dcd1 ("Btrfs: kill trans in run_delalloc_nocow and btrfs_cross_ref_exist")
CC: stable@vger.kernel.org # 4.14+
Signed-off-by: ethanwu <ethanwu@synology.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2df19e19ae upstream.
When a CQ is shared by multiple QPs, c4iw_flush_hw_cq() needs to acquire
corresponding QP lock before moving the CQEs into its corresponding SW
queue and accessing the SQ contents for completing a WR.
Ignore CQEs if corresponding QP is already flushed.
Cc: stable@vger.kernel.org
Signed-off-by: Potnuri Bharat Teja <bharat@chelsio.com>
Reviewed-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 45d924571a upstream.
When an invalid num_vls is used as a module parameter, the code
execution follows an exception path where the macro dd_dev_err()
expects dd->pcidev->dev not to be NULL in hfi1_init_dd(). This
causes a NULL pointer dereference.
Fix hfi1_init_dd() by initializing dd->pcidev and dd->pcidev->dev
earlier in the code. If a dd exists, then dd->pcidev and
dd->pcidev->dev always exists.
BUG: unable to handle kernel NULL pointer dereference
at 00000000000000f0
IP: __dev_printk+0x15/0x90
Workqueue: events work_for_cpu_fn
RIP: 0010:__dev_printk+0x15/0x90
Call Trace:
dev_err+0x6c/0x90
? hfi1_init_pportdata+0x38d/0x3f0 [hfi1]
hfi1_init_dd+0xdd/0x2530 [hfi1]
? pci_conf1_read+0xb2/0xf0
? pci_read_config_word.part.9+0x64/0x80
? pci_conf1_write+0xb0/0xf0
? pcie_capability_clear_and_set_word+0x57/0x80
init_one+0x141/0x490 [hfi1]
local_pci_probe+0x3f/0xa0
work_for_cpu_fn+0x10/0x20
process_one_work+0x152/0x350
worker_thread+0x1cf/0x3e0
kthread+0xf5/0x130
? max_active_store+0x80/0x80
? kthread_bind+0x10/0x10
? do_syscall_64+0x6e/0x1a0
? SyS_exit_group+0x10/0x10
ret_from_fork+0x35/0x40
Cc: <stable@vger.kernel.org> # 4.9.x
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Sebastian Sanchez <sebastian.sanchez@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0a0bcb046b upstream.
AHG may be armed to use the stored header, which by design is limited
to edits in the PSN/A 32 bit word (bth2).
When the code is trying to send a BECN, the use of the stored header
will lose the BECN bit.
Fix by avoiding AHG when getting ready to send a BECN. This is
accomplished by always claiming the packet is not a middle packet which
is an AHG precursor. BECNs are not a normal case and this should not
hurt AHG optimizations.
Cc: <stable@vger.kernel.org> # 4.14.x
Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f59fb9e051 upstream.
The code for handling a marked UD packet unconditionally returns the
dlid in the header of the FECN marked packet. This is not correct
for multicast packets where the DLID is in the multicast range.
The subsequent attempt to send the CNP with the multicast lid will
cause the chip to halt the ack send context because the source
lid doesn't match the chip programming. The send context will
be halted and flush any other pending packets in the pio ring causing
the CNP to not be sent.
A part of investigating the fix, it was determined that the 16B work
broke the FECN routine badly with inconsistent use of 16 bit and 32 bits
types for lids and pkeys. Since the port's source lid was correctly 32
bits the type mixmatches need to be dealt with at the same time as
fixing the CNP header issue.
Fix these issues by:
- Using the ports lid for as the SLID for responding to FECN marked UD
packets
- Insure pkey is always 16 bit in this and subordinate routines
- Insure lids are 32 bits in this and subordinate routines
Cc: <stable@vger.kernel.org> # 4.14.x
Fixes: 88733e3b84 ("IB/hfi1: Add 16B UD support")
Reviewed-by: Don Hiatt <don.hiatt@intel.com>
Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b3fe6c62bc upstream.
Fix build errors when INFINIBAND_USER_ACCESS=m and MLX5_INFINIBAND=y.
The build error occurs when the mlx5 driver code attempts to use
USER_ACCESS interfaces, which are built as a loadable module.
Fixes these build errors:
drivers/infiniband/hw/mlx5/main.o: In function `populate_specs_root':
../drivers/infiniband/hw/mlx5/main.c:4982: undefined reference to `uverbs_default_get_objects'
../drivers/infiniband/hw/mlx5/main.c:4994: undefined reference to `uverbs_alloc_spec_tree'
drivers/infiniband/hw/mlx5/main.o: In function `depopulate_specs_root':
../drivers/infiniband/hw/mlx5/main.c:5001: undefined reference to `uverbs_free_spec_tree'
Build-tested with multiple config combinations.
Fixes: 8c84660bb4 ("IB/mlx5: Initialize the parsing tree root without the help of uverbs")
Cc: stable@vger.kernel.org # reported against 4.16
Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4f32ac2e45 upstream.
Before the change, if the user passed a static rate value different
than zero and the FW doesn't support static rate,
it would end up configuring rate of 2.5 GBps.
Fix this by using rate 0; unlimited, in cases where FW
doesn't support static rate configuration.
Cc: <stable@vger.kernel.org> # 3.10
Fixes: e126ba97db ("mlx5: Add driver for Mellanox Connect-IB adapters")
Reviewed-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Danit Goldberg <danitg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4f9ca2d868 upstream.
Despite being advertised to user space application, the RSS inner
header flag was filtered by checks at the beginning of QP creation
routine.
Cc: <stable@vger.kernel.org> # 4.15
Fixes: 4d02ebd9bb ("IB/mlx4: Fix RSS hash fields restrictions")
Fixes: 07d84f7b6a ("IB/mlx4: Add support to RSS hash for inner headers")
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b4bd701ac4 upstream.
Failure in rereg MR releases UMEM but leaves the MR to be destroyed
by the user. As a result the following scenario may happen:
"create MR -> rereg MR with failure -> call to rereg MR again" and
hit "NULL-ptr deref or user memory access" errors.
Ensure that rereg MR is only performed on a non-dead MR.
Cc: syzkaller <syzkaller@googlegroups.com>
Cc: <stable@vger.kernel.org> # 4.5
Fixes: 395a8e4c32 ("IB/mlx5: Refactoring register MR code")
Reported-by: Noa Osherovich <noaos@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 09abfe7b5b upstream.
The RDMA CM will select a source device and address by consulting
the routing table if no source address is passed into
rdma_resolve_address(). Userspace will ask for this by passing an
all-zero source address in the RESOLVE_IP command. Unfortunately
the new check for non-zero address size rejects this with EINVAL,
which breaks valid userspace applications.
Fix this by explicitly allowing a zero address family for the source.
Fixes: 2975d5de64 ("RDMA/ucma: Check AF family prior resolving address")
Cc: <stable@vger.kernel.org>
Signed-off-by: Roland Dreier <roland@purestorage.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 26bff1bd74 upstream.
The c4iw_rdev_close() logic was not releasing all the hw
resources (PBL and RQT memory) during the device removal
event (driver unload / system reboot). This can cause panic
in gen_pool_destroy().
The module remove function will wait for all the hw
resources to be released during the device removal event.
Fixes c12a67fe(iw_cxgb4: free EQ queue memory on last deref)
Signed-off-by: Raju Rangoju <rajur@chelsio.com>
Reviewed-by: Steve Wise <swise@opengridcomputing.com>
Cc: stable@vger.kernel.org
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7553961631 upstream.
Commit 7ed1c1901f (tools: fix cross-compile var clobbering) removed
setting of LD to $(CROSS_COMPILE)gcc. This broke build of acpica
(acpidump) in power/acpi:
ld: unrecognized option '-D_LINUX'
The tools pass CFLAGS to the linker (incl. -D_LINUX), so revert this
particular change and let LD be $(CC) again. Note that the old behaviour
was a bit different, it used $(CROSS_COMPILE)gcc which was eliminated by
the commit 7ed1c1901f. We use $(CC) for that reason.
Fixes: 7ed1c1901f (tools: fix cross-compile var clobbering)
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Cc: 4.16+ <stable@vger.kernel.org> # 4.16+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7d83fb1425 upstream.
During the "insert range" fallocate operation, i_size grows by the
specified 'len' bytes. XFS verifies that i_size + len < s_maxbytes, as
it should. But this comparison is done using the signed 'loff_t', and
'i_size + len' can wrap around to a negative value, causing the check to
incorrectly pass, resulting in an inode with "negative" i_size. This is
possible on 64-bit platforms, where XFS sets s_maxbytes = LLONG_MAX.
ext4 and f2fs don't run into this because they set a smaller s_maxbytes.
Fix it by using subtraction instead.
Reproducer:
xfs_io -f file -c "truncate $(((1<<63)-1))" -c "finsert 0 4096"
Fixes: a904b1ca57 ("xfs: Add support FALLOC_FL_INSERT_RANGE for fallocate")
Cc: <stable@vger.kernel.org> # v4.1+
Originally-From: Eric Biggers <ebiggers@google.com>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
[darrick: fix signed integer addition overflow too]
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit af8a41cccf upstream.
Some HP laptops have only a single wifi antenna. This would not be a
problem except that they were shipped with an incorrectly encoded
EFUSE. It should have been possible to open the computer and transfer
the antenna connection to the other terminal except that such action
might void the warranty, and moving the antenna broke the Windows
driver. The fix was to add a module option that would override the
EFUSE encoding. That was done with commit c18d8f5095 ("rtlwifi:
rtl8723be: Add antenna select module parameter"). There was still a
problem with Bluetooth coexistence, which was addressed with commit
baa1702290 ("rtlwifi: btcoexist: Implement antenna selection").
There were still problems, thus there were commit 0ff78adeef
("rtlwifi: rtl8723be: fix ant_sel code") and commit 6d62269283
("rtlwifi: btcoexist: Fix antenna selection code"). Despite all these
attempts at fixing the problem, the code is not yet right. A proper
fix is important as there are now instances of laptops having
RTL8723DE chips with the same problem.
The module parameter ant_sel is used to control antenna number and path.
At present enum ANT_{X2,X1} is used to define the antenna number, but
this choice is not intuitive, thus change to a new enum ANT_{MAIN,AUX}
to make it more readable. This change showed examples where incorrect
values were used. It was also possible to remove a workaround in
halbtcoutsrc.c.
The experimental results with single antenna connected to specific path
are now as follows:
ant_sel ANT_MAIN(#1) ANT_AUX(#2)
0 -8 -62
1 -62 -10
2 -6 -60
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Fixes: c18d8f5095 ("rtlwifi: rtl8723be: Add antenna select module parameter")
Fixes: baa1702290 ("rtlwifi: btcoexist: Implement antenna selection")
Fixes: 0ff78adeef ("rtlwifi: rtl8723be: fix ant_sel code")
Fixes: 6d62269283 ("rtlwifi: btcoexist: Fix antenna selection code")
Cc: Stable <stable@vger.kernel.org> # 4.7+
Reviewed-by: Larry Finger <Larry.Finger@lwfinger.net>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f372b81101 upstream.
This patch adds the correct platform data information for the Caroline
Chromebook, so that the mouse button does not get stuck in pressed state
after the first click.
The Samus button keymap and platform data definition are the correct
ones for Caroline, so they have been reused here.
Signed-off-by: Vittorio Gambaletta <linuxbugs@vittgam.net>
Signed-off-by: Salvatore Bellizzi <lkml@seppia.net>
Tested-by: Guenter Roeck <groeck@chromium.org>
Cc: stable@vger.kernel.org
[dtor: adjusted vendor spelling to match shipping firmware]
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6bd6ae6396 upstream.
UI_SET_LEDBIT ioctl() causes the following KASAN splat when used with
led > LED_CHARGING:
[ 1274.663418] BUG: KASAN: slab-out-of-bounds in input_leds_connect+0x611/0x730 [input_leds]
[ 1274.663426] Write of size 8 at addr ffff88003377b2c0 by task ckb-next-daemon/5128
This happens because we were writing to the led structure before making
sure that it exists.
Reported-by: Tasos Sahanidis <tasos@tasossah.com>
Tested-by: Tasos Sahanidis <tasos@tasossah.com>
Cc: stable@vger.kernel.org
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b4678df184 upstream.
The errseq_t infrastructure assumes that errors which occurred before
the file descriptor was opened are of no interest to the application.
This turns out to be a regression for some applications, notably Postgres.
Before errseq_t, a writeback error would be reported exactly once (as
long as the inode remained in memory), so Postgres could open a file,
call fsync() and find out whether there had been a writeback error on
that file from another process.
This patch changes the errseq infrastructure to report errors to all
file descriptors which are opened after the error occurred, but before
it was reported to any file descriptor. This restores the user-visible
behaviour.
Cc: stable@vger.kernel.org
Fixes: 5660e13d2f ("fs: new infrastructure for writeback error handling and reporting")
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 76b3421b39 upstream.
Some control API callbacks in aloop driver are too lazy to take the
loopback->cable_lock and it results in possible races of cable access
while it's being freed. It eventually lead to a UAF, as reported by
fuzzer recently.
This patch covers such control API callbacks and add the proper mutex
locks.
Reported-by: DaeRyong Jeong <threeearcat@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 306a4f3ca7 upstream.
Show paused ALSA aloop device as inactive, i.e. the control
"PCM Slave Active" set as false. Notification sent upon state change.
This makes it possible for client capturing from aloop device to know if
data is expected. Without it the client expects data even if playback
is paused.
Signed-off-by: Robert Rosengren <robert.rosengren@axis.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 52759c0963 upstream.
At a commit f91c9d7610 ('ALSA: firewire-lib: cache maximum length of
payload to reduce function calls'), maximum size of payload for tx
isochronous packet is cached to reduce the number of function calls.
This cache was programmed to updated at a first callback of ohci1394 IR
context. However, the maximum size is required to queueing packets before
starting the isochronous context.
As a result, the cached value is reused to queue packets in next time to
starting the isochronous context. Then the cache is updated in a first
callback of the isochronous context. This can cause kernel NULL pointer
dereference in a below call graph:
(sound/firewire/amdtp-stream.c)
amdtp_stream_start()
->queue_in_packet()
->queue_packet()
(drivers/firewire/core-iso.c)
->fw_iso_context_queue()
->struct fw_card_driver.queue_iso()
(drivers/firewire/ohci.c)
= ohci_queue_iso()
->queue_iso_packet_per_buffer()
buffer->pages[page]
The issued dereference occurs in a case that:
- target unit supports different stream formats for sampling transmission
frequency.
- maximum length of payload for tx stream in a first trial is bigger
than the length in a second trial.
In this case, correct number of pages are allocated for DMA and the 'pages'
array has enough elements, while index of the element is wrongly calculated
according to the old value of length of payload in a call of
'queue_in_packet()'. Then it causes the issue.
This commit fixes the critical bug. This affects all of drivers in ALSA
firewire stack in Linux kernel v4.12 or later.
[12665.302360] BUG: unable to handle kernel NULL pointer dereference at 0000000000000030
[12665.302415] IP: ohci_queue_iso+0x47c/0x800 [firewire_ohci]
[12665.302439] PGD 0
[12665.302440] P4D 0
[12665.302450]
[12665.302470] Oops: 0000 [#1] SMP PTI
[12665.302487] Modules linked in: ...
[12665.303096] CPU: 1 PID: 12760 Comm: jackd Tainted: P OE 4.13.0-38-generic #43-Ubuntu
[12665.303154] Hardware name: /DH77DF, BIOS KCH7710H.86A.0069.2012.0224.1825 02/24/2012
[12665.303215] task: ffff9ce87da2ae80 task.stack: ffffb5b8823d0000
[12665.303258] RIP: 0010:ohci_queue_iso+0x47c/0x800 [firewire_ohci]
[12665.303301] RSP: 0018:ffffb5b8823d3ab8 EFLAGS: 00010086
[12665.303337] RAX: ffff9ce4f4876930 RBX: 0000000000000008 RCX: ffff9ce88a3955e0
[12665.303384] RDX: 0000000000000000 RSI: 0000000034877f00 RDI: 0000000000000000
[12665.303427] RBP: ffffb5b8823d3b68 R08: ffff9ce8ccb390a0 R09: ffff9ce877639ab0
[12665.303475] R10: 0000000000000108 R11: 0000000000000000 R12: 0000000000000003
[12665.303513] R13: 0000000000000000 R14: ffff9ce4f4876950 R15: 0000000000000000
[12665.303554] FS: 00007f2ec467f8c0(0000) GS:ffff9ce8df280000(0000) knlGS:0000000000000000
[12665.303600] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[12665.303633] CR2: 0000000000000030 CR3: 00000002dcf90004 CR4: 00000000000606e0
[12665.303674] Call Trace:
[12665.303698] fw_iso_context_queue+0x18/0x20 [firewire_core]
[12665.303735] queue_packet+0x88/0xe0 [snd_firewire_lib]
[12665.303770] amdtp_stream_start+0x19b/0x270 [snd_firewire_lib]
[12665.303811] start_streams+0x276/0x3c0 [snd_dice]
[12665.303840] snd_dice_stream_start_duplex+0x1bf/0x480 [snd_dice]
[12665.303882] ? vma_gap_callbacks_rotate+0x1e/0x30
[12665.303914] ? __rb_insert_augmented+0xab/0x240
[12665.303936] capture_prepare+0x3c/0x70 [snd_dice]
[12665.303961] snd_pcm_do_prepare+0x1d/0x30 [snd_pcm]
[12665.303985] snd_pcm_action_single+0x3b/0x90 [snd_pcm]
[12665.304009] snd_pcm_action_nonatomic+0x68/0x70 [snd_pcm]
[12665.304035] snd_pcm_prepare+0x68/0x90 [snd_pcm]
[12665.304058] snd_pcm_common_ioctl1+0x4c0/0x940 [snd_pcm]
[12665.304083] snd_pcm_capture_ioctl1+0x19b/0x250 [snd_pcm]
[12665.304108] snd_pcm_capture_ioctl+0x27/0x40 [snd_pcm]
[12665.304131] do_vfs_ioctl+0xa8/0x630
[12665.304148] ? entry_SYSCALL_64_after_hwframe+0xe9/0x139
[12665.304172] ? entry_SYSCALL_64_after_hwframe+0xe2/0x139
[12665.304195] ? entry_SYSCALL_64_after_hwframe+0xdb/0x139
[12665.304218] ? entry_SYSCALL_64_after_hwframe+0xd4/0x139
[12665.304242] ? entry_SYSCALL_64_after_hwframe+0xcd/0x139
[12665.304265] ? entry_SYSCALL_64_after_hwframe+0xc6/0x139
[12665.304288] ? entry_SYSCALL_64_after_hwframe+0xbf/0x139
[12665.304312] ? entry_SYSCALL_64_after_hwframe+0xb8/0x139
[12665.304335] ? entry_SYSCALL_64_after_hwframe+0xb1/0x139
[12665.304358] SyS_ioctl+0x79/0x90
[12665.304374] ? entry_SYSCALL_64_after_hwframe+0x72/0x139
[12665.304397] entry_SYSCALL_64_fastpath+0x24/0xab
[12665.304417] RIP: 0033:0x7f2ec3750ef7
[12665.304433] RSP: 002b:00007fff99e31388 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[12665.304465] RAX: ffffffffffffffda RBX: 00007fff99e312f0 RCX: 00007f2ec3750ef7
[12665.304494] RDX: 0000000000000000 RSI: 0000000000004140 RDI: 0000000000000007
[12665.304522] RBP: 0000556ebc63fd60 R08: 0000556ebc640560 R09: 0000000000000000
[12665.304553] R10: 0000000000000001 R11: 0000000000000246 R12: 0000556ebc63fcf0
[12665.304584] R13: 0000000000000000 R14: 0000000000000007 R15: 0000000000000000
[12665.304612] Code: 01 00 00 44 89 eb 45 31 ed 45 31 db 66 41 89 1e 66 41 89 5e 0c 66 45 89 5e 0e 49 8b 49 08 49 63 d4 4d 85 c0 49 63 ff 48 8b 14 d1 <48> 8b 72 30 41 8d 14 37 41 89 56 04 48 63 d3 0f 84 ce 00 00 00
[12665.304713] RIP: ohci_queue_iso+0x47c/0x800 [firewire_ohci] RSP: ffffb5b8823d3ab8
[12665.304743] CR2: 0000000000000030
[12665.317701] ---[ end trace 9d55b056dd52a19f ]---
Fixes: f91c9d7610 ('ALSA: firewire-lib: cache maximum length of payload to reduce function calls')
Cc: <stable@vger.kernel.org> # v4.12+
Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8f22e52528 upstream.
The sequencer virmidi code has an open race at its output trigger
callback: namely, virmidi keeps only one event packet for processing
while it doesn't protect for concurrent output trigger calls.
snd_virmidi_output_trigger() tries to process the previously
unfinished event before starting encoding the given MIDI stream, but
this is done without any lock. Meanwhile, if another rawmidi stream
starts the output trigger, this proceeds further, and overwrites the
event package that is being processed in another thread. This
eventually corrupts and may lead to the invalid memory access if the
event type is like SYSEX.
The fix is just to move the spinlock to cover both the pending event
and the new stream.
The bug was spotted by a new fuzzer, RaceFuzzer.
BugLink: http://lkml.kernel.org/r/20180426045223.GA15307@dragonet.kaist.ac.kr
Reported-by: DaeRyong Jeong <threeearcat@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f13876e2c3 upstream.
Since snd_pcm_ioctl_xfern_compat() has no PCM state check, it may go
further and hit the sanity check pcm_sanity_check() when the ioctl is
called right after open. It may eventually spew a kernel warning, as
triggered by syzbot, depending on kconfig.
The lack of PCM state check there was just an oversight. Although
it's no real crash, the spurious kernel warning is annoying, so let's
add the proper check.
Reported-by: syzbot+1dac3a4f6bc9c1c675d4@syzkaller.appspotmail.com
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6a30abaa40 upstream.
The commit c469652bb5 ("ALSA: hda - Use IS_REACHABLE() for
dependency on input") simplified the dependencies with IS_REACHABLE()
macro, but it broke due to its incorrect usage: it should have been
IS_REACHABLE(CONFIG_INPUT) instead of IS_REACHABLE(INPUT).
Fixes: c469652bb5 ("ALSA: hda - Use IS_REACHABLE() for dependency on input")
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ece1397cbc upstream.
Some variants of the Arm Cortex-55 cores (r0p0, r0p1, r1p0) suffer
from an erratum 1024718, which causes incorrect updates when DBM/AP
bits in a page table entry is modified without a break-before-make
sequence. The work around is to skip enabling the hardware DBM feature
on the affected cores. The hardware Access Flag management features
is not affected. There are some other cores suffering from this
errata, which could be added to the midr_list to trigger the work
around.
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: ckadabi@codeaurora.org
Reviewed-by: Dave Martin <dave.martin@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ac1e55b1fd upstream.
Modules such as nouveau.ko and i915.ko have a link time dependency on
acpi_lid_open(), and due to its use of acpi_bus_register_driver(),
the button.ko module that provides it is only loadable when booted in
ACPI mode. However, the ACPI button driver can be built into the core
kernel as well, in which case the dependency can always be satisfied,
and the dependent modules can be loaded regardless of whether the
system was booted in ACPI mode or not.
So let's fix this asymmetry by making the ACPI button driver loadable
as a module even if not booted in ACPI mode, so it can provide the
acpi_lid_open() symbol in the same way as when built into the kernel.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
[ rjw: Minor adjustments of comments, whitespace and names. ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 85bd0ba1ff upstream.
Although we've implemented PSCI 0.1, 0.2 and 1.0, we expose either 0.1
or 1.0 to a guest, defaulting to the latest version of the PSCI
implementation that is compatible with the requested version. This is
no different from doing a firmware upgrade on KVM.
But in order to give a chance to hypothetical badly implemented guests
that would have a fit by discovering something other than PSCI 0.2,
let's provide a new API that allows userspace to pick one particular
version of the API.
This is implemented as a new class of "firmware" registers, where
we expose the PSCI version. This allows the PSCI version to be
save/restored as part of a guest migration, and also set to
any supported version if the guest requires it.
Cc: stable@vger.kernel.org #4.16
Reviewed-by: Christoffer Dall <cdall@kernel.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1f71addd34 upstream.
Kaike reported that in tests rdma hrtimers occasionaly stopped working. He
did great debugging, which provided enough context to decode the problem.
CPU 3 CPU 2
idle
start sched_timer expires = 712171000000
queue->next = sched_timer
start rdmavt timer. expires = 712172915662
lock(baseof(CPU3))
tick_nohz_stop_tick()
tick = 716767000000 timerqueue_add(tmr)
hrtimer_set_expires(sched_timer, tick);
sched_timer->expires = 716767000000 <---- FAIL
if (tmr->expires < queue->next->expires)
hrtimer_start(sched_timer) queue->next = tmr;
lock(baseof(CPU3))
unlock(baseof(CPU3))
timerqueue_remove()
timerqueue_add()
ts->sched_timer is queued and queue->next is pointing to it, but then
ts->sched_timer.expires is modified.
This not only corrupts the ordering of the timerqueue RB tree, it also
makes CPU2 see the new expiry time of timerqueue->next->expires when
checking whether timerqueue->next needs to be updated. So CPU2 sees that
the rdma timer is earlier than timerqueue->next and sets the rdma timer as
new next.
Depending on whether it had also seen the new time at RB tree enqueue, it
might have queued the rdma timer at the wrong place and then after removing
the sched_timer the RB tree is completely hosed.
The problem was introduced with a commit which tried to solve inconsistency
between the hrtimer in the tick_sched data and the underlying hardware
clockevent. It split out hrtimer_set_expires() to store the new tick time
in both the NOHZ and the NOHZ + HIGHRES case, but missed the fact that in
the NOHZ + HIGHRES case the hrtimer might still be queued.
Use hrtimer_start(timer, tick...) for the NOHZ + HIGHRES case which sets
timer->expires after canceling the timer and move the hrtimer_set_expires()
invocation into the NOHZ only code path which is not affected as it merily
uses the hrtimer as next event storage so code pathes can be shared with
the NOHZ + HIGHRES case.
Fixes: d4af6d933c ("nohz: Fix spurious warning when hrtimer and clockevent get out of sync")
Reported-by: "Wan Kaike" <kaike.wan@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Frederic Weisbecker <frederic@kernel.org>
Cc: "Marciniszyn Mike" <mike.marciniszyn@intel.com>
Cc: Anna-Maria Gleixner <anna-maria@linutronix.de>
Cc: linux-rdma@vger.kernel.org
Cc: "Dalessandro Dennis" <dennis.dalessandro@intel.com>
Cc: "Fleck John" <john.fleck@intel.com>
Cc: stable@vger.kernel.org
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: "Weiny Ira" <ira.weiny@intel.com>
Cc: "linux-rdma@vger.kernel.org"
Link: https://lkml.kernel.org/r/alpine.DEB.2.21.1804241637390.1679@nanos.tec.linutronix.de
Link: https://lkml.kernel.org/r/alpine.DEB.2.21.1804242119210.1597@nanos.tec.linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 09e182d17e upstream.
Vitezslav reported a case where the
"Timeout during microcode update!"
panic would hit. After a deeper look, it turned out that his .config had
CONFIG_HOTPLUG_CPU disabled which practically made save_mc_for_early() a
no-op.
When that happened, the discovered microcode patch wasn't saved into the
cache and the late loading path wouldn't find any.
This, then, lead to early exit from __reload_late() and thus CPUs waiting
until the timeout is reached, leading to the panic.
In hindsight, that function should have been written so it does not return
before the post-synchronization. Oh well, I know better now...
Fixes: bb8c13d61a ("x86/microcode: Fix CPU synchronization routine")
Reported-by: Vitezslav Samel <vitezslav@samel.cz>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Vitezslav Samel <vitezslav@samel.cz>
Tested-by: Ashok Raj <ashok.raj@intel.com>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/20180418081140.GA2439@pc11.op.pod.cz
Link: https://lkml.kernel.org/r/20180421081930.15741-2-bp@alien8.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit da6fa7ef67 upstream.
Recent AMD systems support using MWAIT for C1 state. However, MWAIT will
not allow deeper cstates than C1 on current systems.
play_dead() expects to use the deepest state available. The deepest state
available on AMD systems is reached through SystemIO or HALT. If MWAIT is
available, it is preferred over the other methods, so the CPU never reaches
the deepest possible state.
Don't try to use MWAIT to play_dead() on AMD systems. Instead, use CPUIDLE
to enter the deepest state advertised by firmware. If CPUIDLE is not
available then fallback to HALT.
Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Cc: stable@vger.kernel.org
Cc: Yazen Ghannam <Yazen.Ghannam@amd.com>
Link: https://lkml.kernel.org/r/20180403140228.58540-1-Yazen.Ghannam@amd.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1a512c0882 upstream.
A bugfix broke the x32 shmid64_ds and msqid64_ds data structure layout
(as seen from user space) a few years ago: Originally, __BITS_PER_LONG
was defined as 64 on x32, so we did not have padding after the 64-bit
__kernel_time_t fields, After __BITS_PER_LONG got changed to 32,
applications would observe extra padding.
In other parts of the uapi headers we seem to have a mix of those
expecting either 32 or 64 on x32 applications, so we can't easily revert
the path that broke these two structures.
Instead, this patch decouples x32 from the other architectures and moves
it back into arch specific headers, partially reverting the even older
commit 73a2d096fd ("x86: remove all now-duplicate header files").
It's not clear whether this ever made any difference, since at least
glibc carries its own (correct) copy of both of these header files,
so possibly no application has ever observed the definitions here.
Based on a suggestion from H.J. Lu, I tried out the tool from
https://github.com/hjl-tools/linux-header to find other such
bugs, which pointed out the same bug in statfs(), which also has
a separate (correct) copy in glibc.
Fixes: f4b4aae182 ("x86/headers/uapi: Fix __BITS_PER_LONG value for x32 builds")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: "H . J . Lu" <hjl.tools@gmail.com>
Cc: Jeffrey Walton <noloader@gmail.com>
Cc: stable@vger.kernel.org
Cc: "H. Peter Anvin" <hpa@zytor.com>
Link: https://lkml.kernel.org/r/20180424212013.3967461-1-arnd@arndb.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f287765680 upstream.
The below commit
"drm/atomic: Try to preserve the crtc enabled state in drm_atomic_remove_fb, v2"
introduces a slight behavioral change to rmfb. Instead of disabling a crtc
when the primary plane is disabled, it now preserves it.
Since DC is currently not equipped to handle this we need to fail such
a commit, otherwise we might see a corrupted screen.
This is based on Shirish's previous approach but avoids adding all
planes to the new atomic state which leads to a full update in DC for
any commit, and is not what we intend.
Theoretically DM should be able to deal with states with fully populated planes,
even for simple updates, such as cursor updates. This should still be
addressed in the future.
Signed-off-by: Harry Wentland <harry.wentland@amd.com>
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Tony Cheng <Tony.Cheng@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c7b8de0038 upstream.
We shouldn't attempt to read EDID in atomic_check. We really shouldn't
even be modifying the connector object, or any other non-state object,
but this is a start at least.
Moving EDID cleanup to dm_dp_mst_connector_destroy from
dm_dp_destroy_mst_connector to ensure the EDID is still available for
headless mode.
Signed-off-by: Harry Wentland <harry.wentland@amd.com>
Reviewed-by: Tony Cheng <Tony.Cheng@amd.com>
Acked-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 75569c182e upstream.
Otherwise, the SQ may skip some of the register writes, or shader waves may
be allocated where we don't expect them, so that as a result we don't actually
reset all of the register SRAMs. This can lead to spurious ECC errors later on
if a shader uses an uninitialized register.
Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 682e6b4da5 upstream.
The OPAL RTC driver does not sleep in case it gets OPAL_BUSY or
OPAL_BUSY_EVENT from firmware, which causes large scheduling
latencies, up to 50 seconds have been observed here when RTC stops
responding (BMC reboot can do it).
Fix this by converting it to the standard form OPAL_BUSY loop that
sleeps.
Fixes: 628daa8d5a ("powerpc/powernv: Add RTC and NVRAM support plus RTAS fallbacks")
Cc: stable@vger.kernel.org # v3.2+
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Acked-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c0f7f5b6c6 upstream.
gpstate_timer_handler() uses synchronous smp_call to set the pstate
on the requested core. This causes the below hard lockup:
smp_call_function_single+0x110/0x180 (unreliable)
smp_call_function_any+0x180/0x250
gpstate_timer_handler+0x1e8/0x580
call_timer_fn+0x50/0x1c0
expire_timers+0x138/0x1f0
run_timer_softirq+0x1e8/0x270
__do_softirq+0x158/0x3e4
irq_exit+0xe8/0x120
timer_interrupt+0x9c/0xe0
decrementer_common+0x114/0x120
-- interrupt: 901 at doorbell_global_ipi+0x34/0x50
LR = arch_send_call_function_ipi_mask+0x120/0x130
arch_send_call_function_ipi_mask+0x4c/0x130
smp_call_function_many+0x340/0x450
pmdp_invalidate+0x98/0xe0
change_huge_pmd+0xe0/0x270
change_protection_range+0xb88/0xe40
mprotect_fixup+0x140/0x340
SyS_mprotect+0x1b4/0x350
system_call+0x58/0x6c
One way to avoid this is removing the smp-call. We can ensure that the
timer always runs on one of the policy-cpus. If the timer gets
migrated to a cpu outside the policy then re-queue it back on the
policy->cpus. This way we can get rid of the smp-call which was being
used to set the pstate on the policy->cpus.
Fixes: 7bc54b652f ("timers, cpufreq/powernv: Initialize the gpstate timer as pinned")
Cc: stable@vger.kernel.org # v4.8+
Reported-by: Nicholas Piggin <npiggin@gmail.com>
Reported-by: Pridhiviraj Paidipeddi <ppaidipe@linux.vnet.ibm.com>
Signed-off-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com>
Acked-by: Nicholas Piggin <npiggin@gmail.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit dd709e72cb upstream.
Commit 99492c39f3 ("earlycon: Fix __earlycon_table stride") tried to fix
__earlycon_table stride by forcing the earlycon_id struct alignment to 32
and asking the linker to 32-byte align the __earlycon_table symbol. This
fix was based on commit 07fca0e57f ("tracing: Properly align linker
defined symbols") which tried a similar fix for the tracing subsystem.
However, this fix doesn't quite work because there is no guarantee that
gcc will place structures packed into an array format. In fact, gcc 4.9
chooses to 64-byte align these structs by inserting additional padding
between the entries because it has no clue that they are supposed to be in
an array. If we are unlucky, the linker will assign symbol
"__earlycon_table" to a 32-byte aligned address which does not correspond
to the 64-byte aligned contents of section "__earlycon_table".
To address this same problem, the fix to the tracing system was
subsequently re-implemented using a more robust table of pointers approach
by commits:
3d56e331b6 ("tracing: Replace syscall_meta_data struct array with pointer array")
6549864629 ("tracepoints: Fix section alignment using pointer array")
e4a9ea5ee7 ("tracing: Replace trace_event struct array with pointer array")
Let's use this same "array of pointers to structs" approach for
EARLYCON_TABLE.
Fixes: 99492c39f3 ("earlycon: Fix __earlycon_table stride")
Signed-off-by: Daniel Kurtz <djkurtz@chromium.org>
Suggested-by: Aaron Durbin <adurbin@chromium.org>
Reviewed-by: Rob Herring <robh@kernel.org>
Tested-by: Guenter Roeck <groeck@chromium.org>
Reviewed-by: Guenter Roeck <groeck@chromium.org>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit be71eda538 upstream.
Reading file /proc/modules shows the correct address:
[root@s35lp76 ~]# cat /proc/modules | egrep '^qeth_l2'
qeth_l2 94208 1 - Live 0x000003ff80401000
and reading file /sys/module/qeth_l2/sections/.text
[root@s35lp76 ~]# cat /sys/module/qeth_l2/sections/.text
0x0000000018ea8363
displays a random address.
This breaks the perf tool which uses this address on s390
to calculate start of .text section in memory.
Fix this by printing the correct (unhashed) address.
Thanks to Jessica Yu for helping on this.
Fixes: ef0010a309 ("vsprintf: don't use 'restricted_pointer()' when not restricting")
Cc: <stable@vger.kernel.org> # v4.15+
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Cc: Jessica Yu <jeyu@kernel.org>
Signed-off-by: Jessica Yu <jeyu@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 881c93c0fb upstream.
If the driver module is loaded when FPGA is configured, the FPGA
is reset because nconfig is pulled low (low-active gpio inited
with GPIOD_OUT_HIGH activates the signal which means setting its
value to low). Init nconfig with GPIOD_OUT_LOW to prevent this.
Signed-off-by: Anatolij Gustschin <agust@denx.de>
Acked-by: Alan Tull <atull@kernel.org>
Signed-off-by: Moritz Fischer <mdf@kernel.org>
Cc: stable <stable@vger.kernel.org> # 4.14+
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit faf6a2a441 upstream.
It is not possible to get DMA32 zone memory through kmalloc, causing
the vboxguest driver to malfunction due to getting memory above
4G which the PCI device cannot handle.
This commit changes the kmalloc calls where the 4G limit matters to
using __get_free_pages() fixing vboxguest not working on x86_64 guests
with more then 4G RAM.
Cc: stable@vger.kernel.org
Reported-by: Eloy Coto Pereiro <eloy.coto@gmail.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f6f9885b05 upstream.
This is a preparation patch for fixing issues on x86_64 virtual-machines
with more then 4G of RAM, atm we pass __GFP_DMA32 to kmalloc, but kmalloc
does not honor that, so we need to switch to get_pages, which means we
will not be able to use kfree to free memory allocated with vbg_alloc_req.
While at it also remove a comment on a vbg_alloc_req call which talks
about Windows (inherited from the vbox upstream cross-platform code).
Cc: stable@vger.kernel.org
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 02cfde67df upstream.
Move the declarations of functions from vboxguest_utils.c which are only
meant for vboxguest internal use from include/linux/vbox_utils.h to
drivers/virt/vboxguest/vboxguest_core.h.
Cc: stable@vger.kernel.org
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ae860a19f3 upstream.
If a driver uses DPM_FLAG_SMART_SUSPEND and the device is already
runtime suspended when hibernate is started PCI core skips runtime
resuming the device but still clears pci_dev->state_saved. After the
hibernation image is written pci_pm_thaw_noirq() makes sure subsequent
thaw phases for the device are also skipped leaving it runtime suspended
with pci_dev->state_saved == false.
When the device is eventually runtime resumed pci_pm_runtime_resume()
restores config space by calling pci_restore_standard_config(), however
because pci_dev->state_saved == false pci_restore_state() never actually
restores the config space leaving the device in a state that is not what
the driver might expect.
For example here is what happens for intel-lpss I2C devices once the
hibernation snapshot is taken:
intel-lpss 0000:00:15.0: power state changed by ACPI to D0
intel-lpss 0000:00:1e.0: power state changed by ACPI to D3cold
video LNXVIDEO:00: Restoring backlight state
PM: hibernation exit
i2c_designware i2c_designware.1: Unknown Synopsys component type: 0xffffffff
i2c_designware i2c_designware.0: Unknown Synopsys component type: 0xffffffff
i2c_designware i2c_designware.1: timeout in disabling adapter
i2c_designware i2c_designware.0: timeout in disabling adapter
Since PCI config space is not restored the device is still in D3hot
making MMIO register reads return 0xffffffff.
Fix this by clearing pci_dev->state_saved only if we actually end up
runtime resuming the device.
Fixes: c4b65157ae (PCI / PM: Take SMART_SUSPEND driver flag into account)
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Cc: 4.15+ <stable@vger.kernel.org> # 4.15+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9c55ad1c21 upstream.
ceph_con_workfn() validates con->state before calling try_read() and
then try_write(). However, try_read() temporarily releases con->mutex,
notably in process_message() and ceph_con_in_msg_alloc(), opening the
window for ceph_con_close() to sneak in, close the connection and
release con->sock. When try_write() is called on the assumption that
con->state is still valid (i.e. not STANDBY or CLOSED), a NULL sock
gets passed to the networking stack:
BUG: unable to handle kernel NULL pointer dereference at 0000000000000020
IP: selinux_socket_sendmsg+0x5/0x20
Make sure con->state is valid at the top of try_write() and add an
explicit BUG_ON for this, similar to try_read().
Cc: stable@vger.kernel.org
Link: https://tracker.ceph.com/issues/23706
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Jason Dillaman <dillaman@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7b4c443d13 upstream.
If we go without an established session for a while, backoff delay will
climb to 30 seconds. The keepalive timeout is also 30 seconds, so it's
pretty easily hit after a prolonged hunting for a monitor: we don't get
a chance to send out a keepalive in time, which means we never get back
a keepalive ack in time, cutting an established session and attempting
to connect to a different monitor every 30 seconds:
[Sun Apr 1 23:37:05 2018] libceph: mon0 10.80.20.99:6789 session established
[Sun Apr 1 23:37:36 2018] libceph: mon0 10.80.20.99:6789 session lost, hunting for new mon
[Sun Apr 1 23:37:36 2018] libceph: mon2 10.80.20.103:6789 session established
[Sun Apr 1 23:38:07 2018] libceph: mon2 10.80.20.103:6789 session lost, hunting for new mon
[Sun Apr 1 23:38:07 2018] libceph: mon1 10.80.20.100:6789 session established
[Sun Apr 1 23:38:37 2018] libceph: mon1 10.80.20.100:6789 session lost, hunting for new mon
[Sun Apr 1 23:38:37 2018] libceph: mon2 10.80.20.103:6789 session established
[Sun Apr 1 23:39:08 2018] libceph: mon2 10.80.20.103:6789 session lost, hunting for new mon
The regular keepalive interval is 10 seconds. After ->hunting is
cleared in finish_hunting(), call __schedule_delayed() to ensure we
send out a keepalive after 10 seconds.
Cc: stable@vger.kernel.org # 4.7+
Link: http://tracker.ceph.com/issues/23537
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Jason Dillaman <dillaman@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit facb9f6eba upstream.
This means that if we do some backoff, then authenticate, and are
healthy for an extended period of time, a subsequent failure won't
leave us starting our hunting sequence with a large backoff.
Mirrors ceph.git commit d466bc6e66abba9b464b0b69687cf45c9dccf383.
Cc: stable@vger.kernel.org # 4.7+
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Jason Dillaman <dillaman@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c656941df9 upstream.
When the desired ratio is less than 256, the savesub (tolerance)
in the calculation would become 0. This will then fail the loop-
search immediately without reporting any errors.
But if the ratio is smaller enough, there is no need to calculate
the tolerance because PM divisor alone is enough to get the ratio.
So a simple fix could be just to set PM directly instead of going
into the loop-search.
Reported-by: Marek Vasut <marex@denx.de>
Signed-off-by: Nicolin Chen <nicoleotsuka@gmail.com>
Tested-by: Marek Vasut <marex@denx.de>
Reviewed-by: Fabio Estevam <fabio.estevam@nxp.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d0cf9b561c upstream.
The NPU has a limited number of address translation shootdown (ATSD)
registers and the GPU has limited bandwidth to process ATSDs. This can
result in contention of ATSD registers leading to soft lockups on some
threads, particularly when invalidating a large address range in
pnv_npu2_mn_invalidate_range().
At some threshold it becomes more efficient to flush the entire GPU
TLB for the given MM context (PID) than individually flushing each
address in the range. This patch will result in ranges greater than
2MB being converted from 32+ ATSDs into a single ATSD which will flush
the TLB for the given PID on each GPU.
Fixes: 1ab66d1fba ("powerpc/powernv: Introduce address translation services for Nvlink2")
Cc: stable@vger.kernel.org # v4.12+
Signed-off-by: Alistair Popple <alistair@popple.id.au>
Acked-by: Balbir Singh <bsingharora@gmail.com>
Tested-by: Balbir Singh <bsingharora@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 75ecfb4951 upstream.
The current code extracts the physical address for UE errors and then
hooks it up into memory failure infrastructure. On successful
extraction of physical address it wrongly sets "handled = 1" which
means this UE error has been recovered. Since MCE handler gets return
value as handled = 1, it assumes that error has been recovered and
goes back to same NIP. This causes MCE interrupt again and again in a
loop leading to hard lockup.
Also, initialize phys_addr to ULONG_MAX so that we don't end up
queuing undesired page to hwpoison.
Without this patch we see:
Severe Machine check interrupt [Recovered]
NIP: [000000001002588c] PID: 7109 Comm: find
Initiator: CPU
Error type: UE [Load/Store]
Effective address: 00007fffd2755940
Physical address: 000020181a080000
...
Severe Machine check interrupt [Recovered]
NIP: [000000001002588c] PID: 7109 Comm: find
Initiator: CPU
Error type: UE [Load/Store]
Effective address: 00007fffd2755940
Physical address: 000020181a080000
Severe Machine check interrupt [Recovered]
NIP: [000000001002588c] PID: 7109 Comm: find
Initiator: CPU
Error type: UE [Load/Store]
Effective address: 00007fffd2755940
Physical address: 000020181a080000
Memory failure: 0x20181a08: recovery action for dirty LRU page: Recovered
Memory failure: 0x20181a08: already hardware poisoned
Memory failure: 0x20181a08: already hardware poisoned
Memory failure: 0x20181a08: already hardware poisoned
Memory failure: 0x20181a08: already hardware poisoned
Memory failure: 0x20181a08: already hardware poisoned
Memory failure: 0x20181a08: already hardware poisoned
...
Watchdog CPU:38 Hard LOCKUP
After this patch we see:
Severe Machine check interrupt [Not recovered]
NIP: [00007fffaae585f4] PID: 7168 Comm: find
Initiator: CPU
Error type: UE [Load/Store]
Effective address: 00007fffaafe28ac
Physical address: 00002017c0bd0000
find[7168]: unhandled signal 7 at 00007fffaae585f4 nip 00007fffaae585f4 lr 00007fffaae585e0 code 4
Memory failure: 0x2017c0bd: recovery action for dirty LRU page: Recovered
Fixes: 01eaac2b05 ("powerpc/mce: Hookup ierror (instruction) UE errors")
Fixes: ba41e1e1cc ("powerpc/mce: Hookup derror (load/store) UE errors")
Cc: stable@vger.kernel.org # v4.15+
Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Balbir Singh <bsingharora@gmail.com>
Reviewed-by: Balbir Singh <bsingharora@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit fb5924fddf upstream.
This patch adds support for flushing potentially dirty cache lines
when memory is hot-plugged/hot-un-plugged. The support is currently
limited to 64 bit systems.
The bug was exposed when mappings for a device were actually
hot-unplugged and plugged in back later. A similar issue was observed
during the development of memtrace, but memtrace does it's own
flushing of region via a custom routine.
These patches do a flush both on hotplug/unplug to clear any stale
data in the cache w.r.t mappings, there is a small race window where a
clean cache line may be created again just prior to tearing down the
mapping.
The patches were tested by disabling the flush routines in memtrace
and doing I/O on the trace file. The system immediately
checkstops (quite reliablly if prior to the hot-unplug of the memtrace
region, we memset the regions we are about to hot unplug). After these
patches no custom flushing is needed in the memtrace code.
Fixes: 9d5171a8f2 ("powerpc/powernv: Enable removal of memory for in memory tracing")
Cc: stable@vger.kernel.org # v4.14+
Signed-off-by: Balbir Singh <bsingharora@gmail.com>
Acked-by: Reza Arbab <arbab@linux.ibm.com>
Reviewed-by: Rashmica Gupta <rashmica.g@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e33bbe6914 upstream.
With gcc-4.1.2:
slimbus/messaging.c: In function ‘slim_slicesize’:
slimbus/messaging.c:186: warning: statement with no effect
Indeed, clamp() is a macro not operating in-place, but returning the
clamped value. Hence the value is not clamped at all, which may lead to
an out-of-bounds access.
Fix this by assigning the clamped value.
Fixes: afbdcc7c38 ("slimbus: Add messaging APIs to slimbus framework")
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f0cf47d939 upstream.
Before entering the guest, we check whether our VMID is still
part of the current generation. In order to avoid taking a lock,
we start with checking that the generation is still current, and
only if not current do we take the lock, recheck, and update the
generation and VMID.
This leaves open a small race: A vcpu can bump up the global
generation number as well as the VM's, but has not updated
the VMID itself yet.
At that point another vcpu from the same VM comes in, checks
the generation (and finds it not needing anything), and jumps
into the guest. At this point, we end-up with two vcpus belonging
to the same VM running with two different VMIDs. Eventually, the
VMID used by the second vcpu will get reassigned, and things will
really go wrong...
A simple solution would be to drop this initial check, and always take
the lock. This is likely to cause performance issues. A middle ground
is to convert the spinlock to a rwlock, and only take the read lock
on the fast path. If the check fails at that point, drop it and
acquire the write lock, rechecking the condition.
This ensures that the above scenario doesn't occur.
Cc: stable@vger.kernel.org
Reported-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Shannon Zhao <zhaoshenglong@huawei.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d2ffed5185 upstream.
When printing the driver_override parameter when it is 4095 and 4094
bytes long, the printing code would access invalid memory because we
need count + 1 bytes for printing.
Cfr. commits 4efe874aac ("PCI: Don't read past the end of sysfs
"driver_override" buffer") and bf563b01c2 ("driver core: platform:
Don't read past the end of "driver_override" buffer").
Fixes: 3cf3857134 ("ARM: 8256/1: driver coamba: add device binding path 'driver_override'")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Todd Kjos <tkjos@google.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5f53624662 upstream.
For AMBA devices with unconfigured driver override, the
"driver_override" sysfs virtual file is empty, while it contains
"(null)" for platform and PCI devices.
Make AMBA consistent with other buses by dropping the test for a NULL
pointer.
Note that contrary to popular belief, sprintf() handles NULL pointers
fine; they are printed as "(null)".
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Cc: stable <stable@vger.kernel.org>
Reviewed-by: Todd Kjos <tkjos@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3430f924a6 upstream.
The Aardvark has two interrupts sets:
- first set is bit[23:16] of PCIe ISR 0 register(RD0074840h)
- second set is bit[11:8] of PCIe ISR 1 register(RD0074848h)
Only one set should be used, while another set should be masked.
The second set, ISR1, is more advanced, the Legacy INT_X status bit is
asserted once Assert_INTX message is received, and de-asserted after
Deassert_INTX message is received which matches what the driver is
currently doing in the ->irq_mask() and ->irq_unmask() functions.
The ISR0 requires additional work to deassert the interrupt, which the
driver does not currently implement, therefore it needs fixing.
Update the driver to use ISR1 register set, fixing current
implementation.
Fixes: 8c39d71036 ("PCI: aardvark: Add Aardvark PCI host controller driver")
Link: https://bugzilla.kernel.org/show_bug.cgi?id=196339
Signed-off-by: Victor Gu <xigu@marvell.com>
[Thomas: tweak commit log.]
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
[lorenzo.pieralisi@arm.com: updated the commit log]
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Reviewed-by: Evan Wang <xswang@marvell.com>
Reviewed-by: Nadav Haklai <nadavh@marvell.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4fa3999ee6 upstream.
When setting the PIO_ADDR_LS register during a configuration read, we
were properly passing the device number, function number and register
number, but not the bus number, causing issues when reading the
configuration of PCIe devices.
Fixes: 8c39d71036 ("PCI: aardvark: Add Aardvark PCI host controller driver")
Cc: <stable@vger.kernel.org>
Signed-off-by: Victor Gu <xigu@marvell.com>
Reviewed-by: Wilson Ding <dingwei@marvell.com>
Reviewed-by: Nadav Haklai <nadavh@marvell.com>
[Thomas: tweak commit log.]
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 660661afcd upstream.
The PCI configuration space read/write functions were special casing
the situation where PCI_SLOT(devfn) != 0, and returned
PCIBIOS_DEVICE_NOT_FOUND in this case.
However, while this is what is intended for the root bus, it is not
intended for the child busses, as it prevents discovering devices with
PCI_SLOT(x) != 0. Therefore, we return PCIBIOS_DEVICE_NOT_FOUND only
if we're on the root bus.
Fixes: 8c39d71036 ("PCI: aardvark: Add Aardvark PCI host controller driver")
Cc: <stable@vger.kernel.org>
Signed-off-by: Victor Gu <xigu@marvell.com>
Reviewed-by: Wilson Ding <dingwei@marvell.com>
Reviewed-by: Nadav Haklai <nadavh@marvell.com>
[Thomas: tweak commit log.]
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0d5ec281c0 upstream.
The preauth hash was not being recalculated properly on reconnect
of SMB3.11 dialect mounts (which caused access denied repeatedly
on auto-reconnect).
Fixes: 8bd68c6e47 ("CIFS: implement v3.11 preauth integrity")
Signed-off-by: Steve French <smfrench@gmail.com>
CC: Stable <stable@vger.kernel.org>
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1dc3039bc8 upstream.
When blk_queue_enter() waits for a queue to unfreeze, or unset the
PREEMPT_ONLY flag, do not allow it to be interrupted by a signal.
The PREEMPT_ONLY flag was introduced later in commit 3a0a529971
("block, scsi: Make SCSI quiesce and resume work reliably"). Note the SCSI
device is resumed asynchronously, i.e. after un-freezing userspace tasks.
So that commit exposed the bug as a regression in v4.15. A mysterious
SIGBUS (or -EIO) sometimes happened during the time the device was being
resumed. Most frequently, there was no kernel log message, and we saw Xorg
or Xwayland killed by SIGBUS.[1]
[1] E.g. https://bugzilla.redhat.com/show_bug.cgi?id=1553979
Without this fix, I get an IO error in this test:
# dd if=/dev/sda of=/dev/null iflag=direct & \
while killall -SIGUSR1 dd; do sleep 0.1; done & \
echo mem > /sys/power/state ; \
sleep 5; killall dd # stop after 5 seconds
The interruptible wait was added to blk_queue_enter in
commit 3ef28e83ab ("block: generic request_queue reference counting").
Before then, the interruptible wait was only in blk-mq, but I don't think
it could ever have been correct.
Reviewed-by: Bart Van Assche <bart.vanassche@wdc.com>
Cc: stable@vger.kernel.org
Signed-off-by: Alan Jenkins <alan.christopher.jenkins@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 72961c4e60 upstream.
Even if we don't have an IO context attached to a request, we still
need to clear the priv[0..1] pointers, as they could be pointing
to previously used bic/bfqq structures. If we don't do so, we'll
either corrupt memory on dispatching a request, or cause an
imbalance in counters.
Inspired by a fix from Kees.
Reported-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Reported-by: Kees Cook <keescook@chromium.org>
Cc: stable@vger.kernel.org
Fixes: aee69d78de ("block, bfq: introduce the BFQ-v0 I/O scheduler as an extra scheduler")
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f4560231ec upstream.
rq->gstate and rq->aborted_gstate both are zero before rqs are
allocated. If we have a small timeout, when the timer fires,
there could be rqs that are never allocated, and also there could
be rq that has been allocated but not initialized and started. At
the moment, the rq->gstate and rq->aborted_gstate both are 0, thus
the blk_mq_terminate_expired will identify the rq is timed out and
invoke .timeout early.
For scsi, this will cause scsi_times_out to be invoked before the
scsi_cmnd is not initialized, scsi_cmnd->device is still NULL at
the moment, then we will get crash.
Cc: Bart Van Assche <bart.vanassche@wdc.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Ming Lei <ming.lei@redhat.com>
Cc: Martin Steigerwald <Martin@Lichtvoll.de>
Cc: stable@vger.kernel.org
Signed-off-by: Jianchao Wang <jianchao.w.wang@oracle.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ccce20fc79 upstream.
Since SCSI scanning occurs asynchronously, since sd_revalidate_disk() is
called from sd_probe_async() and since sd_revalidate_disk() calls
sd_zbc_read_zones() it can happen that sd_zbc_read_zones() is called
concurrently with blkdev_report_zones() and/or blkdev_reset_zones(). That can
cause these functions to fail with -EIO because sd_zbc_read_zones() e.g. sets
q->nr_zones to zero before restoring it to the actual value, even if no drive
characteristics have changed. Avoid that this can happen by making the
following changes:
- Protect the code that updates zone information with blk_queue_enter()
and blk_queue_exit().
- Modify sd_zbc_setup_seq_zones_bitmap() and sd_zbc_setup() such that
these functions do not modify struct scsi_disk before all zone
information has been obtained.
Note: since commit 055f6e18e0 ("block: Make q_usage_counter also track
legacy requests"; kernel v4.15) the request queue freezing mechanism also
affects legacy request queues.
Fixes: 89d9475610 ("sd: Implement support for ZBC devices")
Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Damien Le Moal <damien.lemoal@wdc.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Hannes Reinecke <hare@suse.com>
Cc: stable@vger.kernel.org # v4.16
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f6997bec6a upstream.
The block responsible of parsing the DT for the number of chip-select
lines uses an 'if/else if/else if' block. The content of the second and
third 'else if' conditions are:
1/ the actual condition to enter the sub-block and
2/ the operation to do in this sub-block.
[...]
else if (condition1_to_enter && action1() == failed)
raise_error();
else if (condition2_to_enter && action2() == failed)
raise_error();
[...]
In case of failure, the sub-block is entered and an error raised.
Otherwise, in case of success, the code would continue erroneously in
the next 'else if' statement because it did not failed (and did not
enter the first 'else if' sub-block).
The first 'else if' refers to legacy bindings while the second 'else if'
refers to new bindings. The second 'else if', which is entered
erroneously, checks for the 'reg' property, which, for old bindings,
does not mean anything because it would not be the number of CS
available, but the regular register map of almost any DT node. This
being said, the content of the 'reg' property being the register map
offset and length, it has '2' values, so the number of CS in this
situation is assumed to be '2'.
When running nand_scan_ident() with 2 CS, the core will check for an
array of chips. It will first issue a RESET and then a READ_ID. Of
course this will trigger two timeouts because there is no chip in front
of the second CS:
[ 1.367460] marvell-nfc f2720000.nand: Timeout on CMDD (NDSR: 0x00000080)
[ 1.474292] marvell-nfc f2720000.nand: Timeout on CMDD (NDSR: 0x00000280)
Indeed, this is harmless and the core will then assume there is only one
valid CS.
Fix the logic in the whole block by entering each sub-block just on the
'is legacy' condition, doing the action inside the sub-block. This way,
when the action succeeds, the whole block is left.
Furthermore, for both the old bindings and the new bindings the same
logic was applied to retrieve the number of CS lines:
using of_get_property() to get a size in bytes, converted in the actual
number of lines by dividing it per sizeof(u32) (4 bytes).
This is fine for the 'reg' property which is a list of the CS IDs but
not for the 'num-cs' property which is directly the value of the number
of CS.
Anyway, no existing DT uses another value than 'num-cs = <1>' and no
other value has ever been supported by the old driver (pxa3xx_nand.c).
Remove this condition and apply a number of 1 CS anyway, as already
described in the bindings.
Finally, the 'reg' property of a 'nand' node (with the new bindings)
gives the IDs of each CS line in use. marvell_nand.c driver first look
at the number of CS lines that are present in this property.
Better use of_property_count_elems_of_size() than dividing by 4 the size
of the number of bytes returned by of_get_property().
Fixes: 02f26ecf8c ("mtd: nand: add reworked Marvell NAND controller driver")
Cc: stable@vger.kernel.org
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Tested-by: Chris Packham <chris.packham@alliedtelesis.co.nz>
Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 47016b341f upstream.
The current Cadence QSPI driver caused a kernel panic when loading
a Root Filesystem from QSPI. The problem was caused by reading more
bytes than needed because the QSPI operated on 4 bytes at a time.
<snip>
[ 7.947754] spi_nor_read[1048]:from 0x037cad74, len 1 [bfe07fff]
[ 7.956247] cqspi_read[910]:offset 0x58502516, buffer=bfe07fff
[ 7.956247]
[ 7.966046] Unable to handle kernel paging request at virtual
address bfe08002
[ 7.973239] pgd = eebfc000
[ 7.975931] [bfe08002] *pgd=2fffb811, *pte=00000000, *ppte=00000000
</snip>
Notice above how only 1 byte needed to be read but by reading 4 bytes
into the end of a mapped page, an unrecoverable page fault occurred.
This patch uses a temporary buffer to hold the 4 bytes read and then
copies only the bytes required into the buffer. A min() function is
used to limit the length to prevent buffer overflows.
Request testing of this patch on other platforms. This was tested
on the Intel Arria10 SoCFPGA DevKit.
Fixes: 0cf1725676 ("mtd: spi-nor: cqspi: Fix build on arches missing readsl/writesl")
Signed-off-by: Thor Thayer <thor.thayer@linux.intel.com>
Cc: <stable@vger.kernel.org>
Reviewed-by: Marek Vasut <marek.vasut@gmail.com>
Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 65811834ba upstream.
On this Lenovo ThinkCentre machine. There are two front mics,
we change the location for one of them.
Relation: f33f79f3d0 ("ALSA: hda/realtek - change the location for
one of two front microphones")
Signed-off-by: Kailang Yang <kailang@realtek.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ea04a1dbf8 upstream.
Fill COEF to change EAPD to verb control.
Assigned codec type.
This is an additional fix over 92f974df34 ("ALSA: hda/realtek - New
vendor ID for ALC233").
[ More notes:
according to Kailang, the chip is 10ec:0235 bonding for ALC233b,
which is equivalent with ALC255. It's only used for Lenovo.
The chip needs no alc_process_coef_fw() for headset unlike ALC255. ]
Signed-off-by: Kailang Yang <kailang@realtek.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 69fa6f19b9 upstream.
As recently Smatch suggested, one place in HD-audio hwdep ioctl codes
may expand the array directly from the user-space value with
speculation:
sound/pci/hda/hda_local.h:467 get_wcaps() warn: potential spectre issue 'codec->wcaps'
As get_wcaps() itself is a fairly frequently called inline function,
and there is only one single call with a user-space value, we replace
only the latter one to open-code locally with array_index_nospec()
hardening in this patch.
BugLink: https://marc.info/?l=linux-kernel&m=152411496503418&w=2
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8d218dd811 upstream.
As Smatch recently suggested, a few places in OSS sequencer codes may
expand the array directly from the user-space value with speculation,
namely there are a significant amount of references to either
info->ch[] or dp->synths[] array:
sound/core/seq/oss/seq_oss_event.c:315 note_on_event() warn: potential spectre issue 'info->ch' (local cap)
sound/core/seq/oss/seq_oss_event.c:362 note_off_event() warn: potential spectre issue 'info->ch' (local cap)
sound/core/seq/oss/seq_oss_synth.c:470 snd_seq_oss_synth_load_patch() warn: potential spectre issue 'dp->synths' (local cap)
sound/core/seq/oss/seq_oss_event.c:293 note_on_event() warn: potential spectre issue 'dp->synths'
sound/core/seq/oss/seq_oss_event.c:353 note_off_event() warn: potential spectre issue 'dp->synths'
sound/core/seq/oss/seq_oss_synth.c:506 snd_seq_oss_synth_sysex() warn: potential spectre issue 'dp->synths'
sound/core/seq/oss/seq_oss_synth.c:580 snd_seq_oss_synth_ioctl() warn: potential spectre issue 'dp->synths'
Although all these seem doing only the first load without further
reference, we may want to stay in a safer side, so hardening with
array_index_nospec() would still make sense.
We may put array_index_nospec() at each place, but here we take a
different approach:
- For dp->synths[], change the helpers to retrieve seq_oss_synthinfo
pointer directly instead of the array expansion at each place
- For info->ch[], harden in a normal way, as there are only a couple
of places
As a result, the existing helper, snd_seq_oss_synth_is_valid() is
replaced with snd_seq_oss_synth_info(). Also, we cover MIDI device
where a similar array expansion is done, too, although it wasn't
reported by Smatch.
BugLink: https://marc.info/?l=linux-kernel&m=152411496503418&w=2
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f5e94b4c6e upstream.
When get_synthdev() is called for a MIDI device, it returns the fixed
midi_synth_dev without the use refcounting. OTOH, the caller is
supposed to unreference unconditionally after the usage, so this would
lead to unbalanced refcount.
This patch corrects the behavior and keep up the refcount balance also
for the MIDI synth device.
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f853dcaae2 upstream.
It looks like a simple mistake that this struct member
was forgotten.
Audio_tstamp isn't used much, and on some archs (such as x86) this
ioctl is not used by default, so that might be the reason why this
has slipped for so long.
Fixes: 4eeaaeaea1 ("ALSA: core: add hooks for audio timestamps")
Signed-off-by: David Henningsson <diwic@ubuntu.com>
Reviewed-by: Takashi Sakamoto <o-takashi@sakamocchi.jp>
Cc: <stable@vger.kernel.org> # v3.8+
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 912e4c3320 upstream.
The commit c2c86a9717 ("ALSA: pcm: Remove set_fs() in PCM core code")
changed SNDRV_PCM_IOCTL_DELAY to return an inconsistent error instead of a
negative delay. Originally the call would succeed and return the negative
delay. The Chromium OS Audio Server (CRAS) gets confused and hangs when
the error is returned instead of the negative delay.
Help CRAS avoid the issue by rolling back the behavior to return a
negative delay instead of an error.
Fixes: c2c86a9717 ("ALSA: pcm: Remove set_fs() in PCM core code")
Signed-off-by: Jeffery Miller <jmiller@neverware.com>
Cc: <stable@vger.kernel.org> # v4.13+
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 088e861edf upstream.
As recently Smatch suggested, a few places in ALSA control core codes
may expand the array directly from the user-space value with
speculation:
sound/core/control.c:1003 snd_ctl_elem_lock() warn: potential spectre issue 'kctl->vd'
sound/core/control.c:1031 snd_ctl_elem_unlock() warn: potential spectre issue 'kctl->vd'
sound/core/control.c:844 snd_ctl_elem_info() warn: potential spectre issue 'kctl->vd'
sound/core/control.c:891 snd_ctl_elem_read() warn: potential spectre issue 'kctl->vd'
sound/core/control.c:939 snd_ctl_elem_write() warn: potential spectre issue 'kctl->vd'
Although all these seem doing only the first load without further
reference, we may want to stay in a safer side, so hardening with
array_index_nospec() would still make sense.
In this patch, we put array_index_nospec() to the common
snd_ctl_get_ioff*() helpers instead of each caller. These helpers are
also referred from some drivers, too, and basically all usages are to
calculate the array index from the user-space value, hence it's better
to cover there.
BugLink: https://marc.info/?l=linux-kernel&m=152411496503418&w=2
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f526afcd8f upstream.
As recently Smatch suggested, one place in RME9652 driver may expand
the array directly from the user-space value with speculation:
sound/pci/rme9652/rme9652.c:2074 snd_rme9652_channel_info() warn: potential spectre issue 'rme9652->channel_map' (local cap)
This patch puts array_index_nospec() for hardening against it.
BugLink: https://marc.info/?l=linux-kernel&m=152411496503418&w=2
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 10513142a7 upstream.
As recently Smatch suggested, a couple of places in HDSP MADI driver
may expand the array directly from the user-space value with
speculation:
sound/pci/rme9652/hdspm.c:5717 snd_hdspm_channel_info() warn: potential spectre issue 'hdspm->channel_map_out' (local cap)
sound/pci/rme9652/hdspm.c:5734 snd_hdspm_channel_info() warn: potential spectre issue 'hdspm->channel_map_in' (local cap)
This patch puts array_index_nospec() for hardening against them.
BugLink: https://marc.info/?l=linux-kernel&m=152411496503418&w=2
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f9d94b57e3 upstream.
As recently Smatch suggested, a couple of places in ASIHPI driver may
expand the array directly from the user-space value with speculation:
sound/pci/asihpi/hpimsginit.c:70 hpi_init_response() warn: potential spectre issue 'res_size' (local cap)
sound/pci/asihpi/hpioctl.c:189 asihpi_hpi_ioctl() warn: potential spectre issue 'adapters'
This patch puts array_index_nospec() for hardening against them.
BugLink: https://marc.info/?l=linux-kernel&m=152411496503418&w=2
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8a7d6003df upstream.
When CONFIG_SND_DYNAMIC_MINORS isn't set, there are only limited
number of devices available, and HD-audio, especially with HDMI/DP
codec, will fail to create more than two devices.
The driver warns about the lack of such devices and skips the PCM
device creations, but the HDMI driver still tries to create the
corresponding JACK, SPDIF and ELD controls even for the non-existing
PCM substreams. This results in confusion on user-space, and even may
break the operation.
Similarly, Intel HDMI/DP codec builds the ELD notification from i915
graphics driver, and this may be broken if a notification is sent for
the non-existing PCM stream.
This patch adds the check of the existence of the assigned PCM
substream in the both scenarios above, and skips the further operation
if the PCM substream is not assigned.
Fixes: 9152085def ("ALSA: hda - add DP MST audio support")
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0f925660a7 upstream.
In error path of snd_dice_stream_init_duplex(), stream data for incoming
packet can be left to be initialized.
This commit fixes it.
Fixes: 436b5abe22 ('ALSA: dice: handle whole available isochronous streams')
Cc: <stable@vger.kernel.org> # v4.6+
Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit bb4c041947 upstream.
SMB server will not sign data transferred through RDMA read/write. When
signing is used, it's a good idea to have all the data signed.
In this case, use RDMA send/recv for all data transfers. This will degrade
performance as this is not generally configured in RDMA environemnt. So
warn the user on signing and RDMA send/recv.
Signed-off-by: Long Li <longli@microsoft.com>
Acked-by: Ronnie Sahlberg <lsahlber@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Steve French <smfrench@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e9ec225479 upstream.
Commit ea3d8465ab ("tty: n_gsm: Allow ADM response in addition to UA for
control dlci") added support for DLCI to stay in Asynchronous Disconnected
Mode (ADM). But we still get long delays waiting for commands to other
DLCI to complete:
--> 5) C: SABM(P)
Q> 0) C: UIH(F)
Q> 0) C: UIH(F)
Q> 0) C: UIH(F)
...
This happens because gsm_control_send() sets cretries timer to T2 that is
by default set to 34. This will cause resend for T2 times for the control
frame. In ADM mode, we will never get a response so the control frame, so
retries are just delaying all the commands.
Let's fix the issue by setting DLCI_MODE_ADM flag after detecting the ADM
mode for the control DLCI. Then we can use that in gsm_control_send() to
set retries to 1. This means the control frame will be sent once allowing
the other end at an opportunity to switch from ADM to ABM mode.
Note that retries will be decremented in gsm_control_retransmit() so
we don't want to set it to 0 here.
Fixes: ea3d8465ab ("tty: n_gsm: Allow ADM response in addition to UA for control dlci")
Cc: linux-serial@vger.kernel.org
Cc: Alan Cox <alan@llwyncelyn.cymru>
Cc: Dan Williams <dcbw@redhat.com>
Cc: Jiri Prchal <jiri.prchal@aksignal.cz>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Marcel Partap <mpartap@gmx.net>
Cc: Merlijn Wajer <merlijn@wizzup.org>
Cc: Michael Nazzareno Trimarchi <michael@amarulasolutions.com>
Cc: Michael Scott <michael.scott@linaro.org>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Peter Hurley <peter@hurleysoftware.com>
Cc: Russ Gorby <russ.gorby@intel.com>
Cc: Sascha Hauer <s.hauer@pengutronix.de>
Cc: Sebastian Reichel <sre@kernel.org>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5c60300d68 upstream.
When out of memory and we can't add ctrl vq buffers,
probe fails. Unfortunately the error handling is
out of spec: it calls del_vqs without bothering
to reset the device first.
To fix, call the full cleanup function in this case.
Cc: stable@vger.kernel.org
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a7a69ec0d8 upstream.
Console driver is out of spec. The spec says:
A driver MUST NOT decrement the available idx on a live
virtqueue (ie. there is no way to “unexpose” buffers).
and it does exactly that by trying to detach unused buffers
without doing a device reset first.
Defer detaching the buffers until device unplug.
Of course this means we might get an interrupt for
a vq without an attached port now. Handle that by
discarding the consumed buffer.
Reported-by: Tiwei Bie <tiwei.bie@intel.com>
Fixes: b3258ff1d6 ("virtio: Decrement avail idx on buffer detach")
Cc: stable@vger.kernel.org
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2855b33514 upstream.
an allocated buffer doesn't need to be tied to a vq -
only vq->vdev is ever used. Pass the function the
just what it needs - the vdev.
Cc: stable@vger.kernel.org
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1d8d6428d1 upstream.
The Dell Dock USB-audio device with 0bda:4014 is behaving notoriously
bad, and we have already applied some workaround to avoid the firmware
hiccup. Yet we still need to skip one thing, the Extension Unit at ID
4, which doesn't react correctly to the mixer ctl access.
Bugzilla: https://bugzilla.suse.com/show_bug.cgi?id=1090658
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 83a62c51ba upstream.
On chromebooks we depend on wakeup count to identify the wakeup source.
But currently USB devices do not increment the wakeup count when they
trigger the remote wake. This patch addresses the same.
Resume condition is reported differently on USB 2.0 and USB 3.0 devices.
On USB 2.0 devices, a wake capable device, if wake enabled, drives
resume signal to indicate a remote wake (USB 2.0 spec section 7.1.7.7).
The upstream facing port then sets C_PORT_SUSPEND bit and reports a
port change event (USB 2.0 spec section 11.24.2.7.2.3). Thus if a port
has resumed before driving the resume signal from the host and
C_PORT_SUSPEND is set, then the device attached to the given port might
be the reason for the last system wakeup. Increment the wakeup count for
the same.
On USB 3.0 devices, a function may signal that it wants to exit from device
suspend by sending a Function Wake Device Notification to the host (USB3.0
spec section 8.5.6.4) Thus on receiving the Function Wake, increment the
wakeup count.
Signed-off-by: Ravi Chandra Sadineni <ravisadineni@chromium.org>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 46c6975a1f upstream.
Commit 68a0db1d7d reworked the baud rate selection, but also added
a (not so) subtle change in the way the local flags (c_lflag in the
termios structure) are handled, forcing the new flags to always be the
same as the old ones.
The reason for that particular change is both obscure and undocumented.
It also completely breaks userspace. Something as trivial as getty is
unusable:
<example>
Debian GNU/Linux 9 sy-borg ttyMV0
sy-borg login: root
root
[timeout]
Debian GNU/Linux 9 sy-borg ttyMV0
</example>
which is quite obvious in retrospect: getty cannot get in control of
the echo mode, is stuck in canonical mode, and times out without ever
seeing anything valid. It also begs the question of how this change was
ever tested.
The fix is pretty obvious: stop messing with c_lflag, and the world
will be a happier place.
Cc: stable@vger.kernel.org # 4.15+
Fixes: 68a0db1d7d ("serial: mvebu-uart: add function to change baudrate")
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 470b5d6f0c upstream.
Arrow USB Blaster integrated on MAX1000 board uses the same vendor ID
(0x0403) and product ID (0x6010) as the "original" FTDI device.
This patch avoids picking up by ftdi_sio of the first interface of this
USB device. After that this device can be used by Arrow user-space JTAG
driver.
Signed-off-by: Vasyl Vavrychuk <vvavrychuk@gmail.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 621faf4f6a upstream.
The Dell Inspiron 5775 is a Raven Ridge. The Enable Slot command timed
out when a USB device gets plugged:
[ 212.156326] xhci_hcd 0000:03:00.3: Error while assigning device slot ID
[ 212.156340] xhci_hcd 0000:03:00.3: Max number of devices this xHCI host supports is 64.
[ 212.156348] usb usb2-port3: couldn't allocate usb_device
AMD suggests that a delay before xHC suspends can fix the issue.
I can confirm it fixes the issue, so use the suspend delay quirk for
Raven Ridge's xHC.
Cc: stable@vger.kernel.org
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8ef35c866f upstream.
Until the primary_crng is fully initialized, don't initialize the NUMA
crng nodes. Otherwise users of /dev/urandom on NUMA systems before
the CRNG is fully initialized can get very bad quality randomness. Of
course everyone should move to getrandom(2) where this won't be an
issue, but there's a lot of legacy code out there. This related to
CVE-2018-1108.
Reported-by: Jann Horn <jannh@google.com>
Fixes: 1e7f583af6 ("random: make /dev/urandom scalable for silly...")
Cc: stable@kernel.org # 4.8+
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 22be37acce upstream.
Currently in ext4_valid_block_bitmap() we expect the bitmap to be
positioned anywhere between 0 and s_blocksize clusters, but that's
wrong because the bitmap can be placed anywhere in the block group. This
causes false positives when validating bitmaps on perfectly valid file
system layouts. Fix it by checking whether the bitmap is within the group
boundary.
The problem can be reproduced using the following
mkfs -t ext3 -E stride=256 /dev/vdb1
mount /dev/vdb1 /mnt/test
cd /mnt/test
wget https://cdn.kernel.org/pub/linux/kernel/v4.x/linux-4.16.3.tar.xz
tar xf linux-4.16.3.tar.xz
This will result in the warnings in the logs
EXT4-fs error (device vdb1): ext4_validate_block_bitmap:399: comm tar: bg 84: block 2774529: invalid block bitmap
[ Changed slightly for clarity and to not drop a overflow test -- TYT ]
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reported-by: Ilya Dryomov <idryomov@gmail.com>
Fixes: 7dac4a1726 ("ext4: add validity checks for bitmap block numbers")
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b2569260d5 upstream.
If ext4 tries to start a reserved handle via
jbd2_journal_start_reserved(), and the journal has been aborted, this
can result in a NULL pointer dereference. This is because the fields
h_journal and h_transaction in the handle structure share the same
memory, via a union, so jbd2_journal_start_reserved() will clear
h_journal before calling start_this_handle(). If this function fails
due to an aborted handle, h_journal will still be NULL, and the call
to jbd2_journal_free_reserved() will pass a NULL journal to
sub_reserve_credits().
This can be reproduced by running "kvm-xfstests -c dioread_nolock
generic/475".
Cc: stable@kernel.org # 3.11
Fixes: 8f7d89f368 ("jbd2: transaction reservation support")
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Andreas Dilger <adilger@dilger.ca>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 349fa7d6e1 upstream.
During the "insert range" fallocate operation, extents starting at the
range offset are shifted "right" (to a higher file offset) by the range
length. But, as shown by syzbot, it's not validated that this doesn't
cause extents to be shifted beyond EXT_MAX_BLOCKS. In that case
->ee_block can wrap around, corrupting the extent tree.
Fix it by returning an error if the space between the end of the last
extent and EXT4_MAX_BLOCKS is smaller than the range being inserted.
This bug can be reproduced by running the following commands when the
current directory is on an ext4 filesystem with a 4k block size:
fallocate -l 8192 file
fallocate --keep-size -o 0xfffffffe000 -l 4096 -n file
fallocate --insert-range -l 8192 file
Then after unmounting the filesystem, e2fsck reports corruption.
Reported-by: syzbot+06c885be0edcdaeab40c@syzkaller.appspotmail.com
Fixes: 331573febb ("ext4: Add support FALLOC_FL_INSERT_RANGE for fallocate")
Cc: stable@vger.kernel.org # v4.2+
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 53fa1f6e8a upstream.
Commit 5928c28152 (ACPI / video: Default lcd_only to true on Win8-ready
and newer machines) made only_lcd default to true on all machines where
acpi_osi_is_win8() returns true, including laptops.
The purpose of this is to avoid the bogus / non-working acpi backlight
interface which many newer BIOS-es define on desktop machines.
But this is causing a regression on some laptops, specifically on the
Dell XPS 13 2013 model, which does not have the LCD flag set for its
fully functional ACPI backlight interface.
Rather then DMI quirking our way out of this, this commits changes the
logic for setting only_lcd to true, to only do this on machines with
a desktop (or server) dmi chassis-type.
Note that we cannot simply only check the chassis-type and not register
the backlight interface based on that as there are some laptops and
tablets which have their chassis-type set to "3" aka desktop. Hopefully
the combination of checking the LCD flag, but only on devices with
a desktop(ish) chassis-type will avoid the needs for DMI quirks for this,
or at least limit the amount of DMI quirks which we need to a minimum.
Fixes: 5928c28152 (ACPI / video: Default lcd_only to true on Win8-ready and newer machines)
Reported-and-tested-by: James Hogan <jhogan@kernel.org>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Cc: 4.15+ <stable@vger.kernel.org> # 4.15+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit bcbd385b61 upstream.
File /sys/kernel/debug/kprobes/blacklist displays random addresses:
[root@s8360046 linux]# cat /sys/kernel/debug/kprobes/blacklist
0x0000000047149a90-0x00000000bfcb099a print_type_x8
....
This breaks 'perf probe' which uses the blacklist file to prohibit
probes on certain functions by checking the address range.
Fix this by printing the correct (unhashed) address.
The file mode is read all but this is not an issue as the file
hierarchy points out:
# ls -ld /sys/ /sys/kernel/ /sys/kernel/debug/ /sys/kernel/debug/kprobes/
/sys/kernel/debug/kprobes/blacklist
dr-xr-xr-x 12 root root 0 Apr 19 07:56 /sys/
drwxr-xr-x 8 root root 0 Apr 19 07:56 /sys/kernel/
drwx------ 16 root root 0 Apr 19 06:56 /sys/kernel/debug/
drwxr-xr-x 2 root root 0 Apr 19 06:56 /sys/kernel/debug/kprobes/
-r--r--r-- 1 root root 0 Apr 19 06:56 /sys/kernel/debug/kprobes/blacklist
Everything in and below /sys/kernel/debug is rwx to root only,
no group or others have access.
Background:
Directory /sys/kernel/debug/kprobes is created by debugfs_create_dir()
which sets the mode bits to rwxr-xr-x. Maybe change that to use the
parent's directory mode bits instead?
Link: http://lkml.kernel.org/r/20180419105556.86664-1-tmricht@linux.ibm.com
Fixes: ad67b74d24 ("printk: hash addresses printed with %p")
Cc: stable@vger.kernel.org
Cc: <stable@vger.kernel.org> # v4.15+
Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Cc: David S Miller <davem@davemloft.net>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: acme@kernel.org
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 783c3b53b9 upstream.
Implement s390 specific arch_uretprobe_is_alive() to avoid SIGSEGVs
observed with uretprobes in combination with setjmp/longjmp.
See commit 2dea1d9c38 ("powerpc/uprobes: Implement
arch_uretprobe_is_alive()") for more details.
With this implemented all test cases referenced in the above commit
pass.
Reported-by: Ziqian SUN <zsun@redhat.com>
Cc: <stable@vger.kernel.org> # v4.3+
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5d27a2bf6e upstream.
When a new CKD storage volume is defined at the storage server, Linux
may be relying on outdated information about that volume, which leads to
the following errors:
1. Command Reject Errors for minidisk on z/VM:
dasd-eckd.b3193d: 0.0.XXXX: An error occurred in the DASD device driver,
reason=09
dasd(eckd): I/O status report for device 0.0.XXXX:
dasd(eckd): in req: 00000000XXXXXXXX CC:00 FC:04 AC:00 SC:17 DS:02 CS:00
RC:0
dasd(eckd): device 0.0.2046: Failing CCW: 00000000XXXXXXXX
dasd(eckd): Sense(hex) 0- 7: 80 00 00 00 00 00 00 00
dasd(eckd): Sense(hex) 8-15: 00 00 00 00 00 00 00 00
dasd(eckd): Sense(hex) 16-23: 00 00 00 00 e1 00 0f 00
dasd(eckd): Sense(hex) 24-31: 00 00 40 e2 00 00 00 00
dasd(eckd): 24 Byte: 0 MSG 0, no MSGb to SYSOP
2. Equipment Check errors on LPAR or for dedicated devices on z/VM:
dasd(eckd): I/O status report for device 0.0.XXXX:
dasd(eckd): in req: 00000000XXXXXXXX CC:00 FC:04 AC:00 SC:17 DS:0E CS:40
fcxs:01 schxs:00 RC:0
dasd(eckd): device 0.0.9713: Failing TCW: 00000000XXXXXXXX
dasd(eckd): Sense(hex) 0- 7: 10 00 00 00 13 58 4d 0f
dasd(eckd): Sense(hex) 8-15: 67 00 00 00 00 00 00 04
dasd(eckd): Sense(hex) 16-23: e5 18 05 33 97 01 0f 0f
dasd(eckd): Sense(hex) 24-31: 00 00 40 e2 00 04 58 0d
dasd(eckd): 24 Byte: 0 MSG f, no MSGb to SYSOP
Fix this problem by using the up-to-date information provided during
online processing via the device specific SNEQ to detect the case of
outdated LCU data. If there is a difference, perform a re-read of that
data.
Cc: stable@vger.kernel.org
Reviewed-by: Jan Hoeppner <hoeppner@linux.ibm.com>
Signed-off-by: Stefan Haberland <sth@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit af2e460ade upstream.
Channel path descriptors have been seen as something stable (as
long as the chpid is configured). Recent tests have shown that the
descriptor can also be altered when the link state of a channel path
changes. Thus it is necessary to update the descriptor during
handling of resource accessibility events.
Cc: <stable@vger.kernel.org>
Signed-off-by: Sebastian Ott <sebott@linux.ibm.com>
Reviewed-by: Peter Oberparleiter <oberpar@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1b59788979 upstream.
Ryzen 2700X has a temperature offset of 10 degrees C. If bit 19 of the
Temperature Control register is set, there is an additional offset of
49 degrees C. Take this into account as well.
Cc: stable@vger.kernel.org # v4.16+
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5a13388d7a upstream.
Reading to the end of a 720K disk results in an IO error instead of EOF
because the block layer thinks the disk has 2880 sectors. (Partly this
is a result of inverted logic of the ONEMEG_MEDIA bit that's now fixed.)
Initialize the density and head count in swim_add_floppy() to agree
with the device size passed to set_capacity() during drive probe.
Call set_capacity() again upon device open, after refreshing the density
and head count values.
Cc: Laurent Vivier <lvivier@redhat.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: stable@vger.kernel.org # v4.14+
Tested-by: Stan Johnson <userm57@yahoo.com>
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Acked-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 56a1c5ee54 upstream.
The Sony drive status bits use active-low logic. The swim_readbit()
function converts that to 'C' logic for readability. Hence, the
sense of the names of the status bit macros should not be inverted.
Mostly they are correct. However, the TWOMEG_DRIVE, MFM_MODE and
TWOMEG_MEDIA macros have inverted sense (like MkLinux). Fix this
inconsistency and make the following patches less confusing.
The same problem affects swim3.c so fix that too.
No functional change.
The FDHD drive status bits are documented in sonydriv.cpp from MAME
and in swimiii.h from MkLinux.
Cc: Laurent Vivier <lvivier@redhat.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: linuxppc-dev@lists.ozlabs.org
Cc: Jens Axboe <axboe@kernel.dk>
Cc: stable@vger.kernel.org # v4.14+
Tested-by: Stan Johnson <userm57@yahoo.com>
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Acked-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8e2ab5a4ef upstream.
The 'eject' shell command may send various different ioctl commands.
This leads to error messages on the console even though the FDEJECT
ioctl succeeds.
~# eject floppy
SWIM floppy_ioctl: unknown cmd 21257
SWIM floppy_ioctl: unknown cmd 1
Don't log an error message for an invalid ioctl, just do as the
swim3 driver does and return -ENOTTY.
Cc: Laurent Vivier <lvivier@redhat.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: stable@vger.kernel.org # v4.14+
Tested-by: Stan Johnson <userm57@yahoo.com>
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Acked-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d90a10e244 upstream.
fsnotify() acquires a reference to a fsnotify_mark_connector through
the SRCU-protected pointer to_tell->i_fsnotify_marks. However, it
appears that no precautions are taken in fsnotify_put_mark() to
ensure that fsnotify() drops its reference to this
fsnotify_mark_connector before assigning a value to its 'destroy_next'
field. This can result in fsnotify_put_mark() assigning a value
to a connector's 'destroy_next' field right before fsnotify() tries to
traverse the linked list referenced by the connector's 'list' field.
Since these two fields are members of the same union, this behavior
results in a kernel panic.
This issue is resolved by moving the connector's 'destroy_next' field
into the object pointer union. This should work since the object pointer
access is protected by both a spinlock and the value of the 'flags'
field, and the 'flags' field is cleared while holding the spinlock in
fsnotify_put_mark() before 'destroy_next' is updated. It shouldn't be
possible for another thread to accidentally read from the object pointer
after the 'destroy_next' field is updated.
The offending behavior here is extremely unlikely; since
fsnotify_put_mark() removes references to a connector (specifically,
it ensures that the connector is unreachable from the inode it was
formerly attached to) before updating its 'destroy_next' field, a
sizeable chunk of code in fsnotify_put_mark() has to execute in the
short window between when fsnotify() acquires the connector reference
and saves the value of its 'list' field. On the HEAD kernel, I've only
been able to reproduce this by inserting a udelay(1) in fsnotify().
However, I've been able to reproduce this issue without inserting a
udelay(1) anywhere on older unmodified release kernels, so I believe
it's worth fixing at HEAD.
References: https://bugzilla.kernel.org/show_bug.cgi?id=199437
Fixes: 08991e83b7
CC: stable@vger.kernel.org
Signed-off-by: Robert Kolchmeyer <rkolchmeyer@google.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9de4ee4054 upstream.
This cast is wrong. "cdi->capacity" is an int and "arg" is an unsigned
long. The way the check is written now, if one of the high 32 bits is
set then we could read outside the info->slots[] array.
This bug is pretty old and it predates git.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: stable@vger.kernel.org
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit d7fad4c840 ]
Programming vids (adding or removing them) still passes
guest-endian values in the DMA buffer. That's wrong
if guest is big-endian and when virtio 1 is enabled.
Note: this is on top of a previous patch:
virtio_net: split out ctrl buffer
Fixes: 9465a7a6f ("virtio_net: enable v1.0 support")
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 12e5716938 ]
When sending control commands, virtio net sets up several buffers for
DMA. The buffers are all part of the net device which means it's
actually allocated by kvmalloc so it's in theory (on extreme memory
pressure) possible to get a vmalloc'ed buffer which on some platforms
means we can't DMA there.
Fix up by moving the DMA buffers into a separate structure.
Reported-by: Mikulas Patocka <mpatocka@redhat.com>
Suggested-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 9a11aff25f ]
In case netdev is closed at the moment of pci shutdown, aq_nic_stop
gets called second time. napi_disable in that case hangs indefinitely.
In other case, if device was never opened at all, we get oops because
of null pointer access.
We should invoke aq_nic_stop conditionally, only if device is running
at the moment of shutdown.
Reported-by: David Arcari <darcari@redhat.com>
Fixes: 90869ddfef ("net: aquantia: Implement pci shutdown callback")
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 81c895072d ]
Bogus trimming in tun_net_xmit() causes truncated vlan packets.
skb->len is correct whether or not skb_vlan_tag_present() is true. There
is no more reason to adjust the skb length on xmit in this driver than
any other driver. tun_put_user() adds 4 bytes to the total for tagged
packets because it transmits the tag inline to userspace. This is
similar to a nic transmitting the tag inline on the wire.
Reproducing the bug by sending any tagged packet through back-to-back
connected tap interfaces:
socat TUN,tun-type=tap,iff-up,tun-name=in TUN,tun-type=tap,iff-up,tun-name=out &
ip link add link in name in.20 type vlan id 20
ip addr add 10.9.9.9/24 dev in.20
ip link set in.20 up
tshark -nxxi in -f arp -c1 2>/dev/null &
tshark -nxxi out -f arp -c1 2>/dev/null &
ping -c 1 10.9.9.5 >/dev/null 2>&1
The output from the 'in' and 'out' interfaces are different when the
bug is present:
Capturing on 'in'
0000 ff ff ff ff ff ff 76 cf 76 37 d5 0a 81 00 00 14 ......v.v7......
0010 08 06 00 01 08 00 06 04 00 01 76 cf 76 37 d5 0a ..........v.v7..
0020 0a 09 09 09 00 00 00 00 00 00 0a 09 09 05 ..............
Capturing on 'out'
0000 ff ff ff ff ff ff 76 cf 76 37 d5 0a 81 00 00 14 ......v.v7......
0010 08 06 00 01 08 00 06 04 00 01 76 cf 76 37 d5 0a ..........v.v7..
0020 0a 09 09 09 00 00 00 00 00 00 ..........
Fixes: aff3d70a07 ("tun: allow to attach ebpf socket filter")
Cc: Jason Wang <jasowang@redhat.com>
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit cce96d1883 ]
On ASUS XG-C100C with 1.5.44 firmware a special mode called "dirty wake"
is active. With this mode when motherboard gets powered (but no poweron
happens yet), NIC automatically enables powersave link and watches
for WOL packet.
This normally allows to powerup the PC after AC power failures.
Not all motherboards or bios settings gives power to PCI slots,
so this mode is not enabled on all the hardware.
4.16 linux driver introduced full hardware reset sequence
This is required since before that we had no NIC hardware
reset implemented and there were side effects of "not clean start".
But this full reset is incompatible with "dirty wake" WOL feature
it keeps the PHY link in a special mode forever. As a consequence,
driver sees no link and no traffic.
To fix this we forcibly change FW state to idle state before doing
the full reset. This makes FW to restore link state.
Fixes: c8c82eb net: aquantia: Introduce global AQC hardware reset sequence
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 05e489b159 ]
Commit c1eef220c1 ("vsock: always call
vsock_init_tables()") introduced a module_init() function without a
corresponding module_exit() function.
Modules with an init function can only be removed if they also have an
exit function. Therefore the vsock module was considered "permanent"
and could not be removed.
This patch adds an empty module_exit() function so that "rmmod vsock"
works. No explicit cleanup is required because:
1. Transports call vsock_core_exit() upon exit and cannot be removed
while sockets are still alive.
2. vsock_diag.ko does not perform any action that requires cleanup by
vsock.ko.
Fixes: c1eef220c1 ("vsock: always call vsock_init_tables()")
Reported-by: Xiumei Mu <xmu@redhat.com>
Cc: Cong Wang <xiyou.wangcong@gmail.com>
Cc: Jorgen Hansen <jhansen@vmware.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Jorgen Hansen <jhansen@vmware.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 9267c430c6 ]
We tends to batch submitting packets during XDP_TX. This requires to
kick virtqueue after a batch, we tried to do it through
xdp_do_flush_map() which only makes sense for devmap not XDP_TX. So
explicitly kick the virtqueue in this case.
Reported-by: Kimitoshi Takahashi <ktaka@nii.ac.jp>
Tested-by: Kimitoshi Takahashi <ktaka@nii.ac.jp>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Fixes: 186b3c998c ("virtio-net: support XDP_REDIRECT")
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit a60faa60da ]
In some firmware images, the length of BNX_DIR_TYPE_PKG_LOG nvram type
could be greater than the fixed buffer length of 4096 bytes allocated by
the driver. This was causing HWRM_NVM_READ to copy more data to the buffer
than the allocated size, causing general protection fault.
Fix the issue by allocating the exact buffer length returned by
HWRM_NVM_FIND_DIR_ENTRY, instead of 4096. Move the kzalloc() call
into the bnxt_get_pkgver() function.
Fixes: 3ebf6f0a09 ("bnxt_en: Add installed-package firmware version reporting via Ethtool GDRVINFO")
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 5e391dc5a8 ]
The CPDMA_TX_PRIORITY_MAP in real is vlan pcp field priority mapping
register and basically replaces vlan pcp field for tagged packets.
So, set it to be 1:1 mapping. Otherwise, it will cause unexpected
change of egress vlan tagged packets, like prio 2 -> prio 5.
Fixes: e05107e6b7 ("net: ethernet: ti: cpsw: add multi queue support")
Reviewed-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit a936b1ef37 ]
Creating the global workqueue during driver init may fail, deal with it.
Also, destroy the created workqueue on any subsequent error.
Fixes: 0f54761d16 ("qeth: Support VEPA mode")
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 901e3f49fa ]
For control IO, qeth currently tracks the index of the buffer that it
expects to complete the next IO on each qeth_channel. If the channel
presents an IRQ while this buffer has not yet completed, no completion
processing for _any_ completed buffer takes place.
So if the 'next buffer' is skipped for any sort of reason* (eg. when it
is released due to error conditions, before the IO is started), the
buffer obviously won't switch to PROCESSED until it is eventually
allocated for a _different_ IO and completes.
Until this happens, all completion processing on that channel stalls
and pending requests possibly time out.
As a fix, remove the whole 'next buffer' logic and simply process any
IO buffer right when it completes. A channel will never have more than
one IO pending, so there's no risk of processing out-of-sequence.
*Note: currently just one location in the code really handles this problem,
by advancing the 'next' index manually.
Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 686c97ee29 ]
Make sure to check both return code fields before(!) processing the
command response. Otherwise we risk operating on invalid data.
This matches an earlier fix for SETASSPARMS commands, see
commit ad3cbf6133 ("s390/qeth: fix error handling in checksum cmd callback").
Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 3a04ce7130 ]
For SOCK_ZAPPED socket, we don't need to care about llc->sap,
so we should just skip these refcount functions in this case.
Fixes: f7e4367268 ("llc: hold llc_sap before release_sock()")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit f7e4367268 ]
syzbot reported we still access llc->sap in llc_backlog_rcv()
after it is freed in llc_sap_remove_socket():
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x1b9/0x294 lib/dump_stack.c:113
print_address_description+0x6c/0x20b mm/kasan/report.c:256
kasan_report_error mm/kasan/report.c:354 [inline]
kasan_report.cold.7+0x242/0x2fe mm/kasan/report.c:412
__asan_report_load1_noabort+0x14/0x20 mm/kasan/report.c:430
llc_conn_ac_send_sabme_cmd_p_set_x+0x3a8/0x460 net/llc/llc_c_ac.c:785
llc_exec_conn_trans_actions net/llc/llc_conn.c:475 [inline]
llc_conn_service net/llc/llc_conn.c:400 [inline]
llc_conn_state_process+0x4e1/0x13a0 net/llc/llc_conn.c:75
llc_backlog_rcv+0x195/0x1e0 net/llc/llc_conn.c:891
sk_backlog_rcv include/net/sock.h:909 [inline]
__release_sock+0x12f/0x3a0 net/core/sock.c:2335
release_sock+0xa4/0x2b0 net/core/sock.c:2850
llc_ui_release+0xc8/0x220 net/llc/af_llc.c:204
llc->sap is refcount'ed and llc_sap_remove_socket() is paired
with llc_sap_add_socket(). This can be amended by holding its refcount
before llc_sap_remove_socket() and releasing it after release_sock().
Reported-by: <syzbot+6e181fc95081c2cf9051@syzkaller.appspotmail.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 5411b6187a ]
Commit 0e0c3fee3a ("l2tp: hold reference on tunnels printed in pppol2tp proc file")
assumed that if pppol2tp_seq_stop() was called with non-NULL private
data (the 'v' pointer), then pppol2tp_seq_start() would not be called
again. It turns out that this isn't guaranteed, and overflowing the
seq_file's buffer in pppol2tp_seq_show() is a way to get into this
situation.
Therefore, pppol2tp_seq_stop() needs to reset pd->tunnel, so that
pppol2tp_seq_start() won't drop a reference again if it gets called.
We also have to clear pd->session, because the rest of the code expects
a non-NULL tunnel when pd->session is set.
The l2tp_debugfs module has the same issue. Fix it in the same way.
Fixes: 0e0c3fee3a ("l2tp: hold reference on tunnels printed in pppol2tp proc file")
Fixes: f726214d9b ("l2tp: hold reference on tunnels printed in l2tp/tunnels debugfs file")
Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit f726214d9b ]
Use l2tp_tunnel_get_nth() instead of l2tp_tunnel_find_nth(), to be safe
against concurrent tunnel deletion.
Use the same mechanism as in l2tp_ppp.c for dropping the reference
taken by l2tp_tunnel_get_nth(). That is, drop the reference just
before looking up the next tunnel. In case of error, drop the last
accessed tunnel in l2tp_dfs_seq_stop().
That was the last use of l2tp_tunnel_find_nth().
Fixes: 0ad6614048 ("l2tp: Add debugfs files for dumping l2tp debug info")
Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 0e0c3fee3a ]
Use l2tp_tunnel_get_nth() instead of l2tp_tunnel_find_nth(), to be safe
against concurrent tunnel deletion.
Unlike sessions, we can't drop the reference held on tunnels in
pppol2tp_seq_show(). Tunnels are reused across several calls to
pppol2tp_seq_start() when iterating over sessions. These iterations
need the tunnel for accessing the next session. Therefore the only safe
moment for dropping the reference is just before searching for the next
tunnel.
Normally, the last invocation of pppol2tp_next_tunnel() doesn't find
any new tunnel, so it drops the last tunnel without taking any new
reference. However, in case of error, pppol2tp_seq_stop() is called
directly, so we have to drop the reference there.
Fixes: fd558d186d ("l2tp: Split pppol2tp patch into separate l2tp and ppp parts")
Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 5846c131c3 ]
l2tp_tunnel_find_nth() is unsafe: no reference is held on the returned
tunnel, therefore it can be freed whenever the caller uses it.
This patch defines l2tp_tunnel_get_nth() which works similarly, but
also takes a reference on the returned tunnel. The caller then has to
drop it after it stops using the tunnel.
Convert netlink dumps to make them safe against concurrent tunnel
deletion.
Fixes: 309795f4be ("l2tp: Add netlink control API for L2TP")
Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit d57493d6d1 ]
This patch checks if sk buffer is available to dererence ife header. If
not then NULL will returned to signal an malformed ife packet. This
avoids to crashing the kernel from outside.
Signed-off-by: Alexander Aring <aring@mojatatu.com>
Reviewed-by: Yotam Gigi <yotam.gi@gmail.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit cc74eddd0f ]
There is currently no handling to check on a invalid tlv length. This
patch adds such handling to avoid killing the kernel with a malformed
ife packet.
Signed-off-by: Alexander Aring <aring@mojatatu.com>
Reviewed-by: Yotam Gigi <yotam.gi@gmail.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 117df655f8 ]
The SFP eeprom indicates the transceiver signals (Rx LOS, Tx Fault, etc.)
that it supports. Update the driver to include checking the eeprom data
when deciding whether to use a transceiver signal.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 96f4d430c5 ]
Update xgbe-phy-v2.c to make use of the auto-negotiation (AN) phy hooks
to improve the ability to successfully complete Clause 73 AN when running
at 10gbps. Hardware can sometimes have issues with CDR lock when the
AN DME page exchange is being performed.
The AN and KR training hooks are used as follows:
- The pre AN hook is used to disable CDR tracking in the PHY so that the
DME page exchange can be successfully and consistently completed.
- The post KR training hook is used to re-enable the CDR tracking so that
KR training can successfully complete.
- The post AN hook is used to check for an unsuccessful AN which will
increase a CDR tracking enablement delay (up to a maximum value).
Add two debugfs entries to allow control over use of the CDR tracking
workaround. The debugfs entries allow the CDR tracking workaround to
be disabled and determine whether to re-enable CDR tracking before or
after link training has been initiated.
Also, with these changes the receiver reset cycle that is performed during
the link status check can be performed less often.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 4d945663a6 ]
Add hooks to the driver auto-negotiation (AN) flow to allow the different
phy implementations to perform any steps necessary to improve AN.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 65ec0bd1c7 ]
vmxnet3_get_hdr_len() is used to calculate the header length which in
turn is used to calculate the gso_size for skb. When rxvlan offload is
disabled, vlan tag is present in the header and the function references
ip header from sizeof(ethhdr) and leads to incorrect pointer reference.
This patch fixes this issue by taking sizeof(vlan_ethhdr) into account
if vlan tag is present and correctly references the ip hdr.
Signed-off-by: Ronak Doshi <doshir@vmware.com>
Acked-by: Guolin Yang <gyang@vmware.com>
Acked-by: Louis Luo <llouis@vmware.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 9cf2f437ca ]
The same fix in Commit dbe173079a ("bridge: fix netconsole
setup over bridge") is also needed for team driver.
While at it, remove the unnecessary parameter *team from
team_port_enable_netpoll().
v1->v2:
- fix it in a better way, as does bridge.
Fixes: 0fb52a27a0 ("team: cleanup netpoll clode")
Reported-by: João Avelino Bellomo Filho <jbellomo@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 9d0c75bf6e ]
strp_data_ready resets strp->need_bytes to 0 if strp_peek_len indicates
that the remainder of the message has been received. However,
do_strp_work does not reset strp->need_bytes to 0. If do_strp_work
completes a partial message, the value of strp->need_bytes will continue
to reflect the needed bytes of the previous message, causing
future invocations of strp_data_ready to return early if
strp->need_bytes is less than strp_peek_len. Resetting strp->need_bytes
to 0 in __strp_recv on handing a full message to the upper layer solves
this problem.
__strp_recv also calculates strp->need_bytes using stm->accum_len before
stm->accum_len has been incremented by cand_len. This can cause
strp->need_bytes to be equal to the full length of the message instead
of the full length minus the accumulated length. This, in turn, causes
strp_data_ready to return early, even when there is sufficient data to
complete the partial message. Incrementing stm->accum_len before using
it to calculate strp->need_bytes solves this problem.
Found while testing net/tls_sw recv path.
Fixes: 43a0c6751a ("strparser: Stream parser for messages")
Signed-off-by: Doron Roberts-Kedes <doronrk@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 7c5aba211d ]
struct sock's sk_rcvtimeo is initialized to
LONG_MAX/MAX_SCHEDULE_TIMEOUT in sock_init_data. Calling
mod_delayed_work with a timeout of LONG_MAX causes spurious execution of
the work function. timer->expires is set equal to jiffies + LONG_MAX.
When timer_base->clk falls behind the current value of jiffies,
the delta between timer_base->clk and jiffies + LONG_MAX causes the
expiration to be in the past. Returning early from strp_start_timer if
timeo == LONG_MAX solves this problem.
Found while testing net/tls_sw recv path.
Fixes: 43a0c6751a ("strparser: Stream parser for messages")
Reviewed-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Doron Roberts-Kedes <doronrk@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 1071ec9d45 ]
pf->cmp_addr() is called before binding a v6 address to the sock. It
should not check ports, like in sctp_inet_cmp_addr.
But sctp_inet6_cmp_addr checks the addr by invoking af(6)->cmp_addr,
sctp_v6_cmp_addr where it also compares the ports.
This would cause that setsockopt(SCTP_SOCKOPT_BINDX_ADD) could bind
multiple duplicated IPv6 addresses after Commit 40b4f0fd74 ("sctp:
lack the check for ports in sctp_v6_cmp_addr").
This patch is to remove af->cmp_addr called in sctp_inet6_cmp_addr,
but do the proper check for both v6 addrs and v4mapped addrs.
v1->v2:
- define __sctp_v6_cmp_addr to do the common address comparison
used for both pf and af v6 cmp_addr.
Fixes: 40b4f0fd74 ("sctp: lack the check for ports in sctp_v6_cmp_addr")
Reported-by: Jianwen Ji <jiji@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit bd28899dd3 ]
This patch is just wrong, sorry. I was trying to fix a static checker
warning and misread the code. The reference taken in macsec_newlink()
is released in macsec_free_netdev() when the netdevice is destroyed.
This reverts commit 5dcd840088.
Reported-by: Laura Abbott <labbott@redhat.com>
Fixes: 5dcd840088 ("macsec: missing dev_put() on error in macsec_newlink()")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit a6361f0ca4 ]
Updates to the bitfields in struct packet_sock are not atomic.
Serialize these read-modify-write cycles.
Move po->running into a separate variable. Its writes are protected by
po->bind_lock (except for one startup case at packet_create). Also
replace a textual precondition warning with lockdep annotation.
All others are set only in packet_setsockopt. Serialize these
updates by holding the socket lock. Analogous to other field updates,
also hold the lock when testing whether a ring is active (pg_vec).
Fixes: 8dc4194474 ("[PACKET]: Add optional checksum computation for recvmsg")
Reported-by: DaeRyong Jeong <threeearcat@gmail.com>
Reported-by: Byoungyoung Lee <byoungyoung@purdue.edu>
Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 565020aaee ]
ACS Feature is currently enabled for GMAC >= 4 but the llc_snap status
is never checked in descriptor rx_status callback. This will cause
stmmac to always strip packets even that ACS feature is already
stripping them.
Lets be safe and disable the ACS feature for GMAC >= 4 and always strip
the packets for this GMAC version.
Fixes: 477286b53f ("stmmac: add GMAC4 core support")
Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Joao Pinto <jpinto@synopsys.com>
Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Cc: Alexandre Torgue <alexandre.torgue@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 1255fcb2a6 ]
Calling shutdown with SHUT_RD and SHUT_RDWR for a listening SMC socket
crashes, because
commit 127f497058 ("net/smc: release clcsock from tcp_listen_worker")
releases the internal clcsock in smc_close_active() and sets smc->clcsock
to NULL.
For SHUT_RD the smc_close_active() call is removed.
For SHUT_RDWR the kernel_sock_shutdown() call is omitted, since the
clcsock is already released.
Fixes: 127f497058 ("net/smc: release clcsock from tcp_listen_worker")
Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
Reported-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit da42bb2713 ]
PPv2 TX/RX descriptors uses 40bits DMA addresses, but 41 bits masks were
used (GENMASK_ULL(40, 0)).
This commit fixes that by using the correct mask.
Fixes: e7c5359f2e ("net: mvpp2: introduce PPv2.2 HW descriptors and adapt accessors")
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 53b76cdf7e ]
When coming from ndisc_netdev_event() in net/ipv6/ndisc.c,
neigh_ifdown() is called with &nd_tbl, locking this while
clearing the proxy neighbor entries when eg. deleting an
interface. Calling the table's pndisc_destructor() with the
lock still held, however, can cause a deadlock: When a
multicast listener is available an IGMP packet of type
ICMPV6_MGM_REDUCTION may be sent out. When reaching
ip6_finish_output2(), if no neighbor entry for the target
address is found, __neigh_create() is called with &nd_tbl,
which it'll want to lock.
Move the elements into their own list, then unlock the table
and perform the destruction.
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=199289
Fixes: 6fd6ce2056 ("ipv6: Do not depend on rt->n in ip6_finish_output2().")
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 5171b37d95 ]
In order to remove the race caught by syzbot [1], we need
to lock the socket before using po->tp_version as this could
change under us otherwise.
This means lock_sock() and release_sock() must be done by
packet_set_ring() callers.
[1] :
BUG: KMSAN: uninit-value in packet_set_ring+0x1254/0x3870 net/packet/af_packet.c:4249
CPU: 0 PID: 20195 Comm: syzkaller707632 Not tainted 4.16.0+ #83
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:17 [inline]
dump_stack+0x185/0x1d0 lib/dump_stack.c:53
kmsan_report+0x142/0x240 mm/kmsan/kmsan.c:1067
__msan_warning_32+0x6c/0xb0 mm/kmsan/kmsan_instr.c:676
packet_set_ring+0x1254/0x3870 net/packet/af_packet.c:4249
packet_setsockopt+0x12c6/0x5a90 net/packet/af_packet.c:3662
SYSC_setsockopt+0x4b8/0x570 net/socket.c:1849
SyS_setsockopt+0x76/0xa0 net/socket.c:1828
do_syscall_64+0x309/0x430 arch/x86/entry/common.c:287
entry_SYSCALL_64_after_hwframe+0x3d/0xa2
RIP: 0033:0x449099
RSP: 002b:00007f42b5307ce8 EFLAGS: 00000246 ORIG_RAX: 0000000000000036
RAX: ffffffffffffffda RBX: 000000000070003c RCX: 0000000000449099
RDX: 0000000000000005 RSI: 0000000000000107 RDI: 0000000000000003
RBP: 0000000000700038 R08: 000000000000001c R09: 0000000000000000
R10: 00000000200000c0 R11: 0000000000000246 R12: 0000000000000000
R13: 000000000080eecf R14: 00007f42b53089c0 R15: 0000000000000001
Local variable description: ----req_u@packet_setsockopt
Variable was created at:
packet_setsockopt+0x13f/0x5a90 net/packet/af_packet.c:3612
SYSC_setsockopt+0x4b8/0x570 net/socket.c:1849
Fixes: f6fb8f100b ("af-packet: TPACKET_V3 flexible buffer implementation.")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit b905ef9ab9 ]
The connection timers of an llc sock could be still flying
after we delete them in llc_sk_free(), and even possibly
after we free the sock. We could just wait synchronously
here in case of troubles.
Note, I leave other call paths as they are, since they may
not have to wait, at least we can change them to synchronously
when needed.
Also, move the code to net/llc/llc_conn.c, which is apparently
a better place.
Reported-by: <syzbot+f922284c18ea23a8e457@syzkaller.appspotmail.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 9c438d7a3a ]
Adding a dns_resolver key whose payload contains a very long option name
resulted in that string being printed in full. This hit the WARN_ONCE()
in set_precision() during the printk(), because printk() only supports a
precision of up to 32767 bytes:
precision 1000000 too large
WARNING: CPU: 0 PID: 752 at lib/vsprintf.c:2189 vsnprintf+0x4bc/0x5b0
Fix it by limiting option strings (combined name + value) to a much more
reasonable 128 bytes. The exact limit is arbitrary, but currently the
only recognized option is formatted as "dnserror=%lu" which fits well
within this limit.
Also ratelimit the printks.
Reproducer:
perl -e 'print "#", "A" x 1000000, "\x00"' | keyctl padd dns_resolver desc @s
This bug was found using syzkaller.
Reported-by: Mark Rutland <mark.rutland@arm.com>
Fixes: 4a2d789267 ("DNS: If the DNS server returns an error, allow that to be cached [ver #2]")
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit aa8f877849 ]
KMSAN reported use of uninit-value that I tracked to lack
of proper size check on RTA_TABLE attribute.
I also believe RTA_PREFSRC lacks a similar check.
Fixes: 86872cb579 ("[IPv6] route: FIB6 configuration using struct fib6_config")
Fixes: c3968a857a ("ipv6: RTA_PREFSRC support for ipv6 route source address selection")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit ab913455dd ]
The name of the following proc/sysctl entries were incorrectly
documented:
/proc/sys/net/ipv6/conf/<interface>/max_dst_opts_number
/proc/sys/net/ipv6/conf/<interface>/max_hbt_opts_number
/proc/sys/net/ipv6/conf/<interface>/max_dst_opts_length
/proc/sys/net/ipv6/conf/<interface>/max_hbt_length
Their name was set to the name of the symbol in the .data field of the
control table instead of their .proc name.
Signed-off-by: Olivier Gayot <olivier.gayot@sigexec.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit ddea788c63 ]
After Commit 8a8efa22f5 ("bonding: sync netpoll code with bridge"), it
would set slave_dev npinfo in slave_enable_netpoll when enslaving a dev
if bond->dev->npinfo was set.
However now slave_dev npinfo is set with bond->dev->npinfo before calling
slave_enable_netpoll. With slave_dev npinfo set, __netpoll_setup called
in slave_enable_netpoll will not call slave dev's .ndo_netpoll_setup().
It causes that the lower dev of this slave dev can't set its npinfo.
One way to reproduce it:
# modprobe bonding
# brctl addbr br0
# brctl addif br0 eth1
# ifconfig bond0 192.168.122.1/24 up
# ifenslave bond0 eth2
# systemctl restart netconsole
# ifenslave bond0 br0
# ifconfig eth2 down
# systemctl restart netconsole
The netpoll won't really work.
This patch is to remove that slave_dev npinfo setting in bond_enslave().
Fixes: 8a8efa22f5 ("bonding: sync netpoll code with bridge")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e2fb992d82 upstream.
TPM2 can return TPM2_RC_RETRY to any command and when it does we get
unexpected failures inside the kernel that surprise users (this is
mostly observed in the trusted key handling code). The UEFI 2.6 spec
has advice on how to handle this:
The firmware SHALL not return TPM2_RC_RETRY prior to the completion
of the call to ExitBootServices().
Implementer’s Note: the implementation of this function should check
the return value in the TPM response and, if it is TPM2_RC_RETRY,
resend the command. The implementation may abort if a sufficient
number of retries has been done.
So we follow that advice in our tpm_transmit() code using
TPM2_DURATION_SHORT as the initial wait duration and
TPM2_DURATION_LONG as the maximum wait time. This should fix all the
in-kernel use cases and also means that user space TSS implementations
don't have to have their own retry handling.
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: stable@vger.kernel.org
Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Tested-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 888d867df4 upstream.
The correct sequence is to first request locality and only after
that perform cmd_ready handshake, otherwise the hardware will drop
the subsequent message as from the device point of view the cmd_ready
handshake wasn't performed. Symmetrically locality has to be relinquished
only after going idle handshake has completed, this requires that
go_idle has to poll for the completion and as well locality
relinquish has to poll for completion so it is not overridden
in back to back commands flow.
Two wrapper functions are added (request_locality relinquish_locality)
to simplify the error handling.
The issue is only visible on devices that support multiple localities.
Fixes: 877c57d0d0 ("tpm_crb: request and relinquish locality 0")
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Reviewed-by: Jarkko Sakkinen <jarkko.sakkine@linux.intel.com>
Tested-by: Jarkko Sakkinen <jarkko.sakkine@linux.intel.com>
Signed-off-by: Jarkko Sakkinen <jarkko.sakkine@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 028daf8011 upstream.
Fix for "Resource temporarily unavailable" problem when virsh is
trying to attach a device to VM. When the VF driver is loaded on
host and virsh is trying to attach it to the VM and set a MAC
address, it ends with a race condition between i40e_reset_vf and
i40e_ndo_set_vf_mac functions. The bug is fixed by adding polling
in i40e_ndo_set_vf_mac function For when the VF is in Reset mode.
Signed-off-by: Paweł Jabłoński <pawel.jablonski@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Cc: Sinan Kaya <okaya@codeaurora.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This reverts commit f5a26acf01
Mike writes:
It seems that commit f5a26acf01 ("pinctrl: intel: Initialize GPIO
properly when used through irqchip") can cause problems on some Skylake
systems with Sunrisepoint PCH-H. Namely on certain systems it may turn
the backlight PWM pin from native mode to GPIO which makes the screen
blank during boot.
There is more information here:
https://bugzilla.redhat.com/show_bug.cgi?id=1543769
The actual reason is that GPIO numbering used in BIOS is using "Windows"
numbers meaning that they don't match the hardware 1:1 and because of
this a wrong pin (backlight PWM) is picked and switched to GPIO mode.
There is a proper fix for this but since it has quite many dependencies
on commits that cannot be considered stable material, I suggest we
revert commit f5a26acf01 from stable trees 4.9, 4.14 and 4.15 to
prevent the backlight issue.
Reported-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Fixes: f5a26acf01 ("pinctrl: intel: Initialize GPIO properly when used through irqchip")
Cc: Daniel Drake <drake@endlessm.com>
Cc: Chris Chiu <chiu@endlessm.com>
Cc: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8cfd36a0b5 upstream.
When destroying a net namespace, all hwsim interfaces, which are not
created in default namespace are deleted. But the async deletion of the
interfaces could last longer than the actual destruction of the
namespace, which results to an use after free bug. Therefore use
synchronous deletion in this case.
Fixes: 100cb9ff40 ("mac80211_hwsim: Allow managing radios from non-initial namespaces")
Reported-by: syzbot+70ce058e01259de7bb1d@syzkaller.appspotmail.com
Signed-off-by: Benjamin Beichler <benjamin.beichler@uni-rostock.de>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2c151b2544 upstream.
The bug that led to commit 95e057e258
was a benign warning (no adverse affects other than the warning
itself) that was detected by syzkaller. Further inspection shows
that the WARN_ON in question, in handle_ept_misconfig(), is
unnecessary and flawed (this was also briefly discussed in the
original patch: https://patchwork.kernel.org/patch/10204649).
* The WARN_ON is unnecessary as kvm_mmu_page_fault() will WARN
if reserved bits are set in the SPTEs, i.e. it covers the case
where an EPT misconfig occurred because of a KVM bug.
* The WARN_ON is flawed because it will fire on any system error
code that is hit while handling the fault, e.g. -ENOMEM can be
returned by mmu_topup_memory_caches() while handling a legitmate
MMIO EPT misconfig.
The original behavior of returning -EFAULT when userspace munmaps
an HVA without first removing the memslot is correct and desirable,
i.e. KVM is letting userspace know it has generated a bad address.
Returning RET_PF_EMULATE masks the WARN_ON in the EPT misconfig path,
but does not fix the underlying bug, i.e. the WARN_ON is bogus.
Furthermore, returning RET_PF_EMULATE has the unwanted side effect of
causing KVM to attempt to emulate an instruction on any page fault
with an invalid HVA translation, e.g. a not-present EPT violation
on a VM_PFNMAP VMA whose fault handler failed to insert a PFN.
* There is no guarantee that the fault is directly related to the
instruction, i.e. the fault could have been triggered by a side
effect memory access in the guest, e.g. while vectoring a #DB or
writing a tracing record. This could cause KVM to effectively
mask the fault if KVM doesn't model the behavior leading to the
fault, i.e. emulation could succeed and resume the guest.
* If emulation does fail, KVM will return EMULATION_FAILED instead
of -EFAULT, which is a red herring as the user will either debug
a bogus emulation attempt or scratch their head wondering why we
were attempting emulation in the first place.
TL;DR: revert to returning -EFAULT and remove the bogus WARN_ON in
handle_ept_misconfig in a future patch.
This reverts commit 95e057e258.
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9d5c12a7c0 upstream.
This is a very conservative limit (134217728 rules), but good
enough to not trigger frequent oom from syzkaller.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7d7d7e0211 upstream.
no need to bother even trying to allocating huge compat offset arrays,
such ruleset is rejected later on anyway becaus we refuse to allocate
overly large rule blobs.
However, compat translation happens before blob allocation, so we should
add a check there too.
This is supposed to help with fuzzing by avoiding oom-killer.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9782a11efc upstream.
should have no impact, function still always returns 0.
This patch is only to ease review.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c84ca954ac upstream.
allows to have size checks in a single spot.
This is supposed to reduce oom situations when fuzz-testing xtables.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 19926968ea upstream.
Arbitrary limit, however, this still allows huge rulesets
(> 1 million rules). This helps with automated fuzzer as it prevents
oom-killer invocation.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8e04944f0e upstream.
syzbot is catching so many bugs triggered by commit 9ee332d99e
("sget(): handle failures of register_shrinker()"). That commit expected
that calling kill_sb() from deactivate_locked_super() without successful
fill_super() is safe, but the reality was different; some callers assign
attributes which are needed for kill_sb() after sget() succeeds.
For example, [1] is a report where sb->s_mode (which seems to be either
FMODE_READ | FMODE_EXCL | FMODE_WRITE or FMODE_READ | FMODE_EXCL) is not
assigned unless sget() succeeds. But it does not worth complicate sget()
so that register_shrinker() failure path can safely call
kill_block_super() via kill_sb(). Making alloc_super() fail if memory
allocation for register_shrinker() failed is much simpler. Let's avoid
calling deactivate_locked_super() from sget_userns() by preallocating
memory for the shrinker and making register_shrinker() in sget_userns()
never fail.
[1] https://syzkaller.appspot.com/bug?id=588996a25a2587be2e3a54e8646728fb9cae44e7
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Reported-by: syzbot <syzbot+5a170e19c963a2e0df79@syzkaller.appspotmail.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Michal Hocko <mhocko@suse.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit bd03143007 upstream.
syszbot reported the following debugobjects splat:
ODEBUG: object is on stack, but not annotated
WARNING: CPU: 0 PID: 4185 at lib/debugobjects.c:328
RIP: 0010:debug_object_is_on_stack lib/debugobjects.c:327 [inline]
debug_object_init+0x17/0x20 lib/debugobjects.c:391
debug_hrtimer_init kernel/time/hrtimer.c:410 [inline]
debug_init kernel/time/hrtimer.c:458 [inline]
hrtimer_init+0x8c/0x410 kernel/time/hrtimer.c:1259
alarm_init kernel/time/alarmtimer.c:339 [inline]
alarm_timer_nsleep+0x164/0x4d0 kernel/time/alarmtimer.c:787
SYSC_clock_nanosleep kernel/time/posix-timers.c:1226 [inline]
SyS_clock_nanosleep+0x235/0x330 kernel/time/posix-timers.c:1204
do_syscall_64+0x281/0x940 arch/x86/entry/common.c:287
entry_SYSCALL_64_after_hwframe+0x42/0xb7
This happens because the hrtimer for the alarm nanosleep is on stack, but
the code does not use the proper debug objects initialization.
Split out the code for the allocated use cases and invoke
hrtimer_init_on_stack() for the nanosleep related functions.
Reported-by: syzbot+a3e0726462b2e346a31d@syzkaller.appspotmail.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: John Stultz <john.stultz@linaro.org>
Cc: syzkaller-bugs@googlegroups.com
Link: https://lkml.kernel.org/r/alpine.DEB.2.21.1803261528270.1585@nanos.tec.linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7eb2c4dd54 upstream.
LSPCON adapters in low-power state may ignore the first I2C write during
TMDS output buffer enabling, resulting in a blank screen even with an
otherwise enabled pipe. Fix this by reading back and validating the
written value a few times.
The problem was noticed on GLK machines with an onboard LSPCON adapter
after entering/exiting DC5 power state. Doing an I2C read of the adapter
ID as the first transaction - instead of the I2C write to enable the
TMDS buffers - returns the correct value. Based on this we assume that
the transaction itself is sent properly, it's only the adapter that is
not ready for some reason to accept this first write after waking from
low-power state. In my case the second I2C write attempt always
succeeded.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105854
Cc: Clinton Taylor <clinton.a.taylor@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: stable@vger.kernel.org
Signed-off-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180416155309.11100-1-imre.deak@intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d3878e164d upstream.
The TSC calibration code uses HPET as reference. The conversion normalizes
the delta of two HPET timestamps:
hpetref = ((tshpet1 - tshpet2) * HPET_PERIOD) / 1e6
and then divides the normalized delta of the corresponding TSC timestamps
by the result to calulate the TSC frequency.
tscfreq = ((tstsc1 - tstsc2 ) * 1e6) / hpetref
This uses do_div() which takes an u32 as the divisor, which worked so far
because the HPET frequency was low enough that 'hpetref' never exceeded
32bit.
On Skylake machines the HPET frequency increased so 'hpetref' can exceed
32bit. do_div() truncates the divisor, which causes the calibration to
fail.
Use div64_u64() to avoid the problem.
[ tglx: Fixes whitespace mangled patch and rewrote changelog ]
Signed-off-by: Xiaoming Gao <newtongao@tencent.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Cc: peterz@infradead.org
Cc: hpa@zytor.com
Link: https://lkml.kernel.org/r/38894564-4fc9-b8ec-353f-de702839e44e@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c3bca5d450 upstream.
Commit a9445e47d8 ("posix-cpu-timers: Make set_process_cpu_timer()
more robust") moved the check into the 'if' statement. Unfortunately,
it did so on the right side of an && which means that it may get short
circuited and never evaluated. This is easily reproduced with:
$ cat loop.c
void main() {
struct rlimit res;
/* set the CPU time limit */
getrlimit(RLIMIT_CPU,&res);
res.rlim_cur = 2;
res.rlim_max = 2;
setrlimit(RLIMIT_CPU,&res);
while (1);
}
Which will hang forever instead of being killed. Fix this by pulling the
evaluation out of the if statement but checking the return value instead.
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1568337
Fixes: a9445e47d8 ("posix-cpu-timers: Make set_process_cpu_timer() more robust")
Signed-off-by: Laura Abbott <labbott@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Cc: "Max R . P . Grossmann" <m@max.pm>
Cc: John Stultz <john.stultz@linaro.org>
Link: https://lkml.kernel.org/r/20180417215742.2521-1-labbott@redhat.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5e388e9581 upstream.
When the delayed refs for a head are all run, eventually
cleanup_ref_head is called which (in case of deletion) obtains a
reference for the relevant btrfs_space_info struct by querying the bg
for the range. This is problematic because when the last extent of a
bg is deleted a race window emerges between removal of that bg and the
subsequent invocation of cleanup_ref_head. This can result in cache being null
and either a null pointer dereference or assertion failure.
task: ffff8d04d31ed080 task.stack: ffff9e5dc10cc000
RIP: 0010:assfail.constprop.78+0x18/0x1a [btrfs]
RSP: 0018:ffff9e5dc10cfbe8 EFLAGS: 00010292
RAX: 0000000000000044 RBX: 0000000000000000 RCX: 0000000000000000
RDX: ffff8d04ffc1f868 RSI: ffff8d04ffc178c8 RDI: ffff8d04ffc178c8
RBP: ffff8d04d29e5ea0 R08: 00000000000001f0 R09: 0000000000000001
R10: ffff9e5dc0507d58 R11: 0000000000000001 R12: ffff8d04d29e5ea0
R13: ffff8d04d29e5f08 R14: ffff8d04efe29b40 R15: ffff8d04efe203e0
FS: 00007fbf58ead500(0000) GS:ffff8d04ffc00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fe6c6975648 CR3: 0000000013b2a000 CR4: 00000000000006f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
__btrfs_run_delayed_refs+0x10e7/0x12c0 [btrfs]
btrfs_run_delayed_refs+0x68/0x250 [btrfs]
btrfs_should_end_transaction+0x42/0x60 [btrfs]
btrfs_truncate_inode_items+0xaac/0xfc0 [btrfs]
btrfs_evict_inode+0x4c6/0x5c0 [btrfs]
evict+0xc6/0x190
do_unlinkat+0x19c/0x300
do_syscall_64+0x74/0x140
entry_SYSCALL_64_after_hwframe+0x3d/0xa2
RIP: 0033:0x7fbf589c57a7
To fix this, introduce a new flag "is_system" to head_ref structs,
which is populated at insertion time. This allows to decouple the
querying for the spaceinfo from querying the possibly deleted bg.
Fixes: d7eae3403f ("Btrfs: rework delayed ref total_bytes_pinned accounting")
CC: stable@vger.kernel.org # 4.14+
Suggested-by: Omar Sandoval <osandov@osandov.com>
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 92d3217084 upstream.
The last update to readdir introduced a temporary buffer to store the
emitted readdir data, but as there are file names of variable length,
there's a lot of unaligned access.
This was observed on a sparc64 machine:
Kernel unaligned access at TPC[102f3080] btrfs_real_readdir+0x51c/0x718 [btrfs]
Fixes: 23b5ec7494 ("btrfs: fix readdir deadlock with pagefault")
CC: stable@vger.kernel.org # 4.14+
Reported-and-tested-by: René Rebe <rene@exactcode.com>
Reviewed-by: Liu Bo <bo.liu@linux.alibaba.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1d0cffa674 upstream.
RHBZ: 1453123
Since at least the 3.10 kernel and likely a lot earlier we have
not been able to create unix domain sockets in a cifs share
when mounted using the SFU mount option (except when mounted
with the cifs unix extensions to Samba e.g.)
Trying to create a socket, for example using the af_unix command from
xfstests will cause :
BUG: unable to handle kernel NULL pointer dereference at 00000000
00000040
Since no one uses or depends on being able to create unix domains sockets
on a cifs share the easiest fix to stop this vulnerability is to simply
not allow creation of any other special files than char or block devices
when sfu is used.
Added update to Ronnie's patch to handle a tcon link leak, and
to address a buf leak noticed by Gustavo and Colin.
Acked-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
CC: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
Reported-by: Eryu Guan <eguan@redhat.com>
Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <smfrench@gmail.com>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2e898e4c0a upstream.
lock_page_memcg()/unlock_page_memcg() use spin_lock_irqsave/restore() if
the page's memcg is undergoing move accounting, which occurs when a
process leaves its memcg for a new one that has
memory.move_charge_at_immigrate set.
unlocked_inode_to_wb_begin,end() use spin_lock_irq/spin_unlock_irq() if
the given inode is switching writeback domains. Switches occur when
enough writes are issued from a new domain.
This existing pattern is thus suspicious:
lock_page_memcg(page);
unlocked_inode_to_wb_begin(inode, &locked);
...
unlocked_inode_to_wb_end(inode, locked);
unlock_page_memcg(page);
If both inode switch and process memcg migration are both in-flight then
unlocked_inode_to_wb_end() will unconditionally enable interrupts while
still holding the lock_page_memcg() irq spinlock. This suggests the
possibility of deadlock if an interrupt occurs before unlock_page_memcg().
truncate
__cancel_dirty_page
lock_page_memcg
unlocked_inode_to_wb_begin
unlocked_inode_to_wb_end
<interrupts mistakenly enabled>
<interrupt>
end_page_writeback
test_clear_page_writeback
lock_page_memcg
<deadlock>
unlock_page_memcg
Due to configuration limitations this deadlock is not currently possible
because we don't mix cgroup writeback (a cgroupv2 feature) and
memory.move_charge_at_immigrate (a cgroupv1 feature).
If the kernel is hacked to always claim inode switching and memcg
moving_account, then this script triggers lockup in less than a minute:
cd /mnt/cgroup/memory
mkdir a b
echo 1 > a/memory.move_charge_at_immigrate
echo 1 > b/memory.move_charge_at_immigrate
(
echo $BASHPID > a/cgroup.procs
while true; do
dd if=/dev/zero of=/mnt/big bs=1M count=256
done
) &
while true; do
sync
done &
sleep 1h &
SLEEP=$!
while true; do
echo $SLEEP > a/cgroup.procs
echo $SLEEP > b/cgroup.procs
done
The deadlock does not seem possible, so it's debatable if there's any
reason to modify the kernel. I suggest we should to prevent future
surprises. And Wang Long said "this deadlock occurs three times in our
environment", so there's more reason to apply this, even to stable.
Stable 4.4 has minor conflicts applying this patch. For a clean 4.4 patch
see "[PATCH for-4.4] writeback: safer lock nesting"
https://lkml.org/lkml/2018/4/11/146
Wang Long said "this deadlock occurs three times in our environment"
[gthelen@google.com: v4]
Link: http://lkml.kernel.org/r/20180411084653.254724-1-gthelen@google.com
[akpm@linux-foundation.org: comment tweaks, struct initialization simplification]
Change-Id: Ibb773e8045852978f6207074491d262f1b3fb613
Link: http://lkml.kernel.org/r/20180410005908.167976-1-gthelen@google.com
Fixes: 682aa8e1a6 ("writeback: implement unlocked_inode_to_wb transaction and use it for stat updates")
Signed-off-by: Greg Thelen <gthelen@google.com>
Reported-by: Wang Long <wanglong19@meituan.com>
Acked-by: Wang Long <wanglong19@meituan.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: <stable@vger.kernel.org> [v4.2+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
[natechancellor: Adjust context due to lack of b93b016313]
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b658912cb0 upstream.
i2c_hid_command() returns non-zero in error cases (the actual
errno). Error handling in for I2C_HID_QUIRK_RESEND_REPORT_DESCR
case in i2c_hid_resume() had the check inverted; fix that.
Fixes: 3e83eda467 ("HID: i2c-hid: Fix resume issue on Raydium touchscreen device")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Cc: Aaron Ma <aaron.ma@canonical.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit cd7e61b93d upstream.
There is one issue relates to Coarse Power Gating(CPG) on KBL NUC in GVT-g,
vgpu can't get the correct default context by updating the registers before
inhibit context submission. It always get back the hardware default value
unless the inhibit context submission happened before the 1st time
forcewake put. With this wrong default context, vgpu will run with
incorrect state and meet unknown issues.
The solution is initialize these mmios by adding lri command in ring buffer
of the inhibit context, then gpu hardware has no chance to go down RC6 when
lri commands are right being executed, and then vgpu can get correct
default context for further use.
v3:
- fix code fault, use 'for' to loop through mmio render list(Zhenyu)
v4:
- save the count of engine mmio need to be restored for inhibit context and
refine some comments. (Kevin)
v5:
- code rebase
Cc: Kevin Tian <kevin.tian@intel.com>
Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
Signed-off-by: Weinan Li <weinan.z.li@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit abc1be13fd upstream.
f2fs specifies the __GFP_ZERO flag for allocating some of its pages.
Unfortunately, the page cache also uses the mapping's GFP flags for
allocating radix tree nodes. It always masked off the __GFP_HIGHMEM
flag, and masks off __GFP_ZERO in some paths, but not all. That causes
radix tree nodes to be allocated with a NULL list_head, which causes
backtraces like:
__list_del_entry+0x30/0xd0
list_lru_del+0xac/0x1ac
page_cache_tree_insert+0xd8/0x110
The __GFP_DMA and __GFP_DMA32 flags would also be able to sneak through
if they are ever used. Fix them all by using GFP_RECLAIM_MASK at the
innermost location, and remove it from earlier in the callchain.
Link: http://lkml.kernel.org/r/20180411060320.14458-2-willy@infradead.org
Fixes: 449dd6984d ("mm: keep page cache radix tree nodes in check")
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
Reported-by: Chris Fries <cfries@google.com>
Debugged-by: Minchan Kim <minchan@kernel.org>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ef84230223 upstream.
MAP_SYNC is a nop for device-dax. Allow MAP_SYNC to succeed on device-dax
to eliminate special casing between device-dax and fs-dax as to when the
flag can be specified. Device-dax users already implicitly assume that they do
not need to call fsync(), and this enables them to explicitly check for this
capability.
Cc: <stable@vger.kernel.org>
Fixes: b6fb293f24 ("mm: Define MAP_SYNC and VM_SYNC flags")
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e7c5a571a8 upstream.
The new support for the standard _LSR and _LSW methods neglected to also
update the nvdimm_init_config_data() and nvdimm_set_config_data() to
return the translated error code from failed commands. This precision is
necessary because the locked status that was previously returned on
ND_CMD_GET_CONFIG_SIZE commands is now returned on
ND_CMD_{GET,SET}_CONFIG_DATA commands.
If the kernel misses this indication it can inadvertently fall back to
label-less mode when it should otherwise avoid all access to locked
regions.
Cc: <stable@vger.kernel.org>
Fixes: 4b27db7e26 ("acpi, nfit: add support for the _LSI, _LSR, and...")
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4a3877c4ce upstream.
if we ever hit rpc_gssd_dummy_depopulate() dentry passed to
it has refcount equal to 1. __rpc_rmpipe() drops it and
dput() done after that hits an already freed dentry.
Cc: stable@kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5deae91911 upstream.
Turns out the VLV/CHV fixed function sprite CSC expects full range
data as input. We've been feeding it limited range data to it all
along. To expand the data out to full range we'll use the color
correction registers (brightness, contrast, and saturation).
On CHV pipe B we were actually doing the right thing already because we
progammed the custom CSC matrix to do expect limited range input. Now
that well pre-expand the data out with the color correction unit, we
need to change the CSC matrix to operate with full range input instead.
This should make the sprite output of the other pipes match the sprite
output of pipe B reasonably well. Looking at the resulting pipe CRCs,
there can be a slight difference in the output, but as I don't know
the formula used by the fixed function CSC of the other pipes, I don't
think it's worth the effort to try to match the output exactly. It
might not even be possible due to difference in internal precision etc.
One slight caveat here is that the color correction registers are single
bufferred, so we should really be updating them during vblank, but we
still don't have a mechanism for that, so just toss in another FIXME.
v2: Rebase
v3: s/bri/brightness/ s/con/contrast/ (Shashank)
v4: Clarify the constants and math (Shashank)
Cc: Harry Wentland <harry.wentland@amd.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Daniel Stone <daniel@fooishbar.org>
Cc: Russell King - ARM Linux <linux@armlinux.org.uk>
Cc: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: Hans Verkuil <hverkuil@xs4all.nl>
Cc: Shashank Sharma <shashank.sharma@intel.com>
Cc: Uma Shankar <uma.shankar@intel.com>
Cc: Jyri Sarha <jsarha@ti.com>
Cc: "Tang, Jun" <jun.tang@intel.com>
Reported-by: "Tang, Jun" <jun.tang@intel.com>
Cc: stable@vger.kernel.org
Fixes: 7f1f3851fe ("drm/i915: sprite support for ValleyView v4")
Reviewed-by: Shashank Sharma <shashank.sharma@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180214192327.3250-5-ville.syrjala@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 300efa9eea upstream.
After
commit dd9f31c7a3
Author: Imre Deak <imre.deak@intel.com>
Date: Wed Aug 16 17:46:07 2017 +0300
drm/i915/gen9+: Set same power state before hibernation image
save/restore
during hibernation/suspend the power domain functionality got disabled,
after which resume could leave it incorrectly disabled if the ACPI
target state was S0 during suspend and i915 was not loaded by the loader
kernel.
This was caused by not considering if we resumed from hibernation as the
condition for power domains reiniting.
Fix this by simply tracking if we suspended power domains during system
suspend and reinit power domains accordingly during resume. This will
result in reiniting power domains always when resuming from hibernation,
regardless of the platform and whether or not i915 is loaded by the
loader kernel.
The reason we didn't catch this earlier is that the enabled/disabled
state of power domains during PMSG_FREEZE/PMSG_QUIESCE is platform
and kernel config dependent: on my SKL the target state is S4
during PMSG_FREEZE and (with the driver loaded in the loader kernel)
S0 during PMSG_QUIESCE. On the reporter's machine it's S0 during
PMSG_FREEZE but (contrary to this) power domains are not initialized
during PMSG_QUIESCE since i915 is not loaded in the loader kernel, or
it's loaded but without the DMC firmware being available.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105196
Reported-and-tested-by: amn-bas@hotmail.com
Fixes: dd9f31c7a3 ("drm/i915/gen9+: Set same power state before hibernation image save/restore")
Cc: amn-bas@hotmail.com
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180322143642.26883-1-imre.deak@intel.com
(cherry picked from commit 0f90603c33)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 300ad89929 upstream.
Commit c31165d740 ("mmc: sdhci-pci: Add support for HS200 tuning mode
on AMD, eMMC-4.5.1") added a HS200 tuning method for use with AMD SDHCI
controllers. As described in the commit subject, this tuning is specific
for HS200. However, as implemented, this method is used for all host
timings, because platform_execute_tuning, if it exists, is called
unconditionally by sdhci_execute_tuning(). This breaks tuning when using
the AMD controller with, for example, a DDR50 SD card.
Instead, we can implement an amd execute_tuning wrapper callback, and
then conditionally do the HS200 specific tuning for HS200, and otherwise
call back to the standard sdhci_execute_tuning().
Signed-off-by: Daniel Kurtz <djkurtz@chromium.org>
Acked-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Fixes: c31165d740 ("mmc: sdhci-pci: Add support for HS200 tuning mode on AMD, eMMC-4.5.1")
Cc: stable@vger.kernel.org # v4.11+
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 54a307ba8d upstream.
When event on child inodes are sent to the parent inode mark and
parent inode mark was not marked with FAN_EVENT_ON_CHILD, the event
will not be delivered to the listener process. However, if the same
process also has a mount mark, the event to the parent inode will be
delivered regadless of the mount mark mask.
This behavior is incorrect in the case where the mount mark mask does
not contain the specific event type. For example, the process adds
a mark on a directory with mask FAN_MODIFY (without FAN_EVENT_ON_CHILD)
and a mount mark with mask FAN_CLOSE_NOWRITE (without FAN_ONDIR).
A modify event on a file inside that directory (and inside that mount)
should not create a FAN_MODIFY event, because neither of the marks
requested to get that event on the file.
Fixes: 1968f5eed5 ("fanotify: use both marks when possible")
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 44f06ba829 upstream.
OSTA UDF specification does not mention whether the CS0 charset in case
of two bytes per character encoding should be treated in UTF-16 or
UCS-2. The sample code in the standard does not treat UTF-16 surrogates
in any special way but on systems such as Windows which work in UTF-16
internally, filenames would be treated as being in UTF-16 effectively.
In Linux it is more difficult to handle characters outside of Base
Multilingual plane (beyond 0xffff) as NLS framework works with 2-byte
characters only. Just make sure we don't leak UTF-16 surrogates into the
resulting string when loading names from the filesystem for now.
CC: stable@vger.kernel.org # >= v4.6
Reported-by: Mingye Wang <arthur200126@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b8858581fe upstream.
When we patch an alternate feature section, we have to adjust any
relative branches that branch out of the alternate section.
But currently we have a bug if we have a branch that points to past
the last instruction of the alternate section, eg:
FTR_SECTION_ELSE
1: b 2f
or 6,6,6
2:
ALT_FTR_SECTION_END(...)
nop
This will result in a relative branch at 1 with a target that equals
the end of the alternate section.
That branch does not need adjusting when it's moved to the non-else
location. Currently we do adjust it, resulting in a branch that goes
off into the link-time location of the else section, which is junk.
The fix is to not patch branches that have a target == end of the
alternate section.
Fixes: d20fe50a7b ("KVM: PPC: Book3S HV: Branch inside feature section")
Fixes: 9b1a735de6 ("powerpc: Add logic to patch alternative feature sections")
Cc: stable@vger.kernel.org # v2.6.27+
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b32e56e5a8 upstream.
When setting up a CPU, we "push" (activate) a pool VP for it.
However it's an error to do so if it already has an active
pool VP.
This happens when doing soft CPU hotplug on powernv since we
don't tear down the CPU on unplug. The HW flags the error which
gets captured by the diagnostics.
Fix this by making sure to "pull" out any already active pool
first.
Fixes: 243e25112d ("powerpc/xive: Native exploitation of the XIVE interrupt controller")
Cc: stable@vger.kernel.org # v4.12+
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 13a83eac37 upstream.
On boot we save the configuration space of PCIe bridges. We do this so
when we get an EEH event and everything gets reset that we can restore
them.
Unfortunately we save this state before we've enabled the MMIO space
on the bridges. Hence if we have to reset the bridge when we come back
MMIO is not enabled and we end up taking an PE freeze when the driver
starts accessing again.
This patch forces the memory/MMIO and bus mastering on when restoring
bridges on EEH. Ideally we'd do this correctly by saving the
configuration space writes later, but that will have to come later in
a larger EEH rewrite. For now we have this simple fix.
The original bug can be triggered on a boston machine by doing:
echo 0x8000000000000000 > /sys/kernel/debug/powerpc/PCI0001/err_injct_outbound
On boston, this PHB has a PCIe switch on it. Without this patch,
you'll see two EEH events, 1 expected and 1 the failure we are fixing
here. The second EEH event causes the anything under the PHB to
disappear (i.e. the i40e eth).
With this patch, only 1 EEH event occurs and devices properly recover.
Fixes: 652defed48 ("powerpc/eeh: Check PCIe link after reset")
Cc: stable@vger.kernel.org # v3.11+
Reported-by: Pridhiviraj Paidipeddi <ppaidipe@linux.vnet.ibm.com>
Signed-off-by: Michael Neuling <mikey@neuling.org>
Acked-by: Russell Currey <ruscur@russell.cc>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c96eebf076 upstream.
The label .Llast_fixup\@ is jumped to on page fault within the final
byte set loop of memset (on < MIPSR6 architectures). For some reason, in
this fault handler, the v1 register is randomly set to a2 & STORMASK.
This clobbers v1 for the calling function. This can be observed with the
following test code:
static int __init __attribute__((optimize("O0"))) test_clear_user(void)
{
register int t asm("v1");
char *test;
int j, k;
pr_info("\n\n\nTesting clear_user\n");
test = vmalloc(PAGE_SIZE);
for (j = 256; j < 512; j++) {
t = 0xa5a5a5a5;
if ((k = clear_user(test + PAGE_SIZE - 256, j)) != j - 256) {
pr_err("clear_user (%px %d) returned %d\n", test + PAGE_SIZE - 256, j, k);
}
if (t != 0xa5a5a5a5) {
pr_err("v1 was clobbered to 0x%x!\n", t);
}
}
return 0;
}
late_initcall(test_clear_user);
Which demonstrates that v1 is indeed clobbered (MIPS64):
Testing clear_user
v1 was clobbered to 0x1!
v1 was clobbered to 0x2!
v1 was clobbered to 0x3!
v1 was clobbered to 0x4!
v1 was clobbered to 0x5!
v1 was clobbered to 0x6!
v1 was clobbered to 0x7!
Since the number of bytes that could not be set is already contained in
a2, the andi placing a value in v1 is not necessary and actively
harmful in clobbering v1.
Reported-by: James Hogan <jhogan@kernel.org>
Signed-off-by: Matt Redfearn <matt.redfearn@mips.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: stable@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/19109/
Signed-off-by: James Hogan <jhogan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit daf70d89f8 upstream.
The __clear_user function is defined to return the number of bytes that
could not be cleared. From the underlying memset / bzero implementation
this means setting register a2 to that number on return. Currently if a
page fault is triggered within the memset_partial block, the value
loaded into a2 on return is meaningless.
The label .Lpartial_fixup\@ is jumped to on page fault. In order to work
out how many bytes failed to copy, the exception handler should find how
many bytes left in the partial block (andi a2, STORMASK), add that to
the partial block end address (a2), and subtract the faulting address to
get the remainder. Currently it incorrectly subtracts the partial block
start address (t1), which has additionally been clobbered to generate a
jump target in memset_partial. Fix this by adding the block end address
instead.
This issue was found with the following test code:
int j, k;
for (j = 0; j < 512; j++) {
if ((k = clear_user(NULL, j)) != j) {
pr_err("clear_user (NULL %d) returned %d\n", j, k);
}
}
Which now passes on Creator Ci40 (MIPS32) and Cavium Octeon II (MIPS64).
Suggested-by: James Hogan <jhogan@kernel.org>
Signed-off-by: Matt Redfearn <matt.redfearn@mips.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: stable@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/19108/
Signed-off-by: James Hogan <jhogan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8a8158c85e upstream.
The MIPS kernel memset / bzero implementation includes a small_memset
branch which is used when the region to be set is smaller than a long (4
bytes on 32bit, 8 bytes on 64bit). The current small_memset
implementation uses a simple store byte loop to write the destination.
There are 2 issues with this implementation:
1. When EVA mode is active, user and kernel address spaces may overlap.
Currently the use of the sb instruction means kernel mode addressing is
always used and an intended write to userspace may actually overwrite
some critical kernel data.
2. If the write triggers a page fault, for example by calling
__clear_user(NULL, 2), instead of gracefully handling the fault, an OOPS
is triggered.
Fix these issues by replacing the sb instruction with the EX() macro,
which will emit EVA compatible instuctions as required. Additionally
implement a fault fixup for small_memset which sets a2 to the number of
bytes that could not be cleared (as defined by __clear_user).
Reported-by: Chuanhua Lei <chuanhua.lei@intel.com>
Signed-off-by: Matt Redfearn <matt.redfearn@mips.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: stable@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/18975/
Signed-off-by: James Hogan <jhogan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a955358d54 upstream.
Doing `ioctl(HIDIOCGFEATURE)` in a tight loop on a hidraw device
and then disconnecting the device, or unloading the driver, can
cause a NULL pointer dereference.
When a hidraw device is destroyed it sets 0 to `dev->exist`.
Most functions check 'dev->exist' before doing its work, but
`hidraw_get_report()` was missing that check.
Cc: stable@vger.kernel.org
Signed-off-by: Rodrigo Rivas Costa <rodrigorivascosta@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2e210bbb74 upstream.
The commit 581c448476 ("HID: input: map digitizer battery usage")
assumed that devices having input (qas opposed to feature) report for
battery strength would report the data on their own, without the need to
be polled by the kernel; unfortunately it is not so. Many wireless mice
do not send unsolicited reports with battery strength data and have to
be polled explicitly. As a complication, stylus devices on digitizers
are not normally connected to the base and thus can not be polled - the
base can only determine battery strength in the stylus when it is in
proximity.
To solve this issue, we add a special flag that tells the kernel
to avoid polling the device (and expect unsolicited reports) and set it
when report field with physical usage of digitizer stylus (HID_DG_STYLUS).
Unless this flag is set, and we have not seen the unsolicited reports,
the kernel will attempt to poll the device when userspace attempts to
read "capacity" and "state" attributes of power_supply object
corresponding to the devices battery.
Fixes: 581c448476 ("HID: input: map digitizer battery usage")
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=198095
Cc: stable@vger.kernel.org
Reported-and-tested-by: Martin van Es <martin@mrvanes.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3e83eda467 upstream.
When Rayd touchscreen resumed from S3, it issues too many errors like:
i2c_hid i2c-RAYD0001:00: i2c_hid_get_input: incomplete report (58/5442)
And all the report data are corrupted, touchscreen is unresponsive.
Fix this by re-sending report description command after resume.
Add device ID as a quirk.
Cc: stable@vger.kernel.org
Signed-off-by: Aaron Ma <aaron.ma@canonical.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit dc12baacb9 upstream.
add_device_randomness() use of crng_fast_load() was highly
problematic. Some callers of add_device_randomness() can pass in a
large amount of static information. This would immediately promote
the crng_init state from 0 to 1, without really doing much to
initialize the primary_crng's internal state with something even
vaguely unpredictable.
Since we don't have the speed constraints of add_interrupt_randomness(),
we can do a better job mixing in the what unpredictability a device
driver or architecture maintainer might see fit to give us, and do it
in a way which does not bump the crng_init_cnt variable.
Also, since add_device_randomness() doesn't bump any entropy
accounting in crng_init state 0, mix the device randomness into the
input_pool entropy pool as well. This is related to CVE-2018-1108.
Reported-by: Jann Horn <jannh@google.com>
Fixes: ee7998c50c ("random: do not ignore early device randomness")
Cc: stable@kernel.org # 4.13+
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 43838a23a0 upstream.
The crng_init variable has three states:
0: The CRNG is not initialized at all
1: The CRNG has a small amount of entropy, hopefully good enough for
early-boot, non-cryptographical use cases
2: The CRNG is fully initialized and we are sure it is safe for
cryptographic use cases.
The crng_ready() function should only return true once we are in the
last state. This addresses CVE-2018-1108.
Reported-by: Jann Horn <jannh@google.com>
Fixes: e192be9d9a ("random: replace non-blocking pool...")
Cc: stable@kernel.org # 4.8+
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Jann Horn <jannh@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a3dafb2200 upstream.
There are two front mics on this machine, if we don't adjust the
location for one of them, they will have the same mixer name,
pulseaudio can't handle this situation.
After applying this FIXUP, they will have different mixer name,
then pulseaudio can handle them correctly.
Cc: <stable@vger.kernel.org>
Signed-off-by: Hui Wang <hui.wang@canonical.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3ce0d5aa26 upstream.
Otherwise, the pin will be regarded as microphone, and the jack name
is "Mic Phantom", it is always on in the pulseaudio even nothing is
plugged into the jack. So the UI is confusing to users since the
microphone always shows up in the UI even there is no microphone
plugged.
After adding this flag, the jack name is "Headset Mic Phantom", then
the pulseaudio can handle its detection correctly.
Fixes: f0ba9d699e ("ALSA: hda/realtek - Fix Dell headset Mic can't record")
Cc: <stable@vger.kernel.org>
Signed-off-by: Hui Wang <hui.wang@canonical.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8a56ef4f3f upstream.
Some rawmidi compat ioctls lack of the input substream checks
(although they do check only for rfile->output). This many eventually
lead to an Oops as NULL substream is passed to the rawmidi core
functions.
Fix it by adding the proper checks before each function call.
The bug was spotted by syzkaller.
Reported-by: syzbot+f7a0348affc3b67bc617@syzkaller.appspotmail.com
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7ecb46e9ee upstream.
Sending MIDI messages to a PODxt through the USB connection shows
"usb_submit_urb failed" in dmesg and the message is not received by
the POD.
The error is caused because in the funcion send_midi_async() in midi.c
there is a call to usb_sndbulkpipe() for endpoint 3 OUT, but the PODxt
USB descriptor shows that this endpoint it's an interrupt endpoint.
Patch tested with PODxt only.
[ The bug has been present from the very beginning in the staging
driver time, but Fixes below points to the commit moving to sound/
directory so that the fix can be cleanly applied -- tiwai ]
Fixes: 61864d844c ("ALSA: move line6 usb driver into sound/usb")
Signed-off-by: Fabián Inostroza <fabianinostroza@udec.cl>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 85e290d92b upstream.
Two years ago I tried an AMD Radeon E8860 embedded GPU with the drm driver.
The dmesg output included driver warnings about an invalid PCIe lane width.
Tracking the problem back led to si_set_pcie_lane_width_in_smc().
The calculation of the lane widths via ATOM_PPLIB_PCIE_LINK_WIDTH_MASK and
ATOM_PPLIB_PCIE_LINK_WIDTH_SHIFT macros did not increment the resulting
value, per the comment in pptable.h ("lanes - 1"), and per usage elsewhere.
Applying the increment silenced the warnings.
The code has not changed since, so either my analysis was incorrect or the
bug has gone unnoticed. Hence submitting this as an RFC.
Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Paul Parsons <lost.distance@yahoo.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5f9e93fed4 upstream.
Calling request_irq() followed by disable_irq() is usually a bad idea,
specially if the interrupt can be pending, and you're not yet in a
position to handle it.
This is exactly what happens on my kevin system when rebooting in a
second kernel using kexec: Some interrupt is left pending from
the previous kernel, and we take it too early, before disable_irq()
could do anything.
Let's clear the pending interrupts as we initialize the HW, and move
the interrupt request after that point. This ensures that we're in
a sane state when the interrupt is requested.
Cc: stable@vger.kernel.org
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
[adapted to recent rockchip-drm changes]
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20180220130120.5254-2-marc.zyngier@arm.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a20ee0b1f8 upstream.
If these bos are evicted and are in the validated list
things blow up, so do not put them in there. Notably,
that tries to add the bo to the LRU twice, which results
in a BUG_ON in ttm_bo.c.
While for the bo_list an alternative would be to not allow
always valid bos in there, that does not work for the user
fence.
v2: Fixed whitespace issue pointed out by checkpatch.pl
Signed-off-by: Bas Nieuwenhuizen <basni@chromium.org>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit cf1ba1d73a upstream.
When device boots with T > T_trip_1 and requests interrupt,
the race condition takes place. The interrupt comes before
THERMAL_DEVICE_ENABLED is set. This leads to an attempt to
reading sensor value from irq and disabling the sensor, based on
the data->mode field, which expected to be THERMAL_DEVICE_ENABLED,
but still stays as THERMAL_DEVICE_DISABLED. Afher this issue
sensor is never re-enabled, as the driver state is wrong.
Fix this problem by setting the 'data' members prior to
requesting the interrupts.
Fixes: 37713a1e8e ("thermal: imx: implement thermal alarm interrupt handling")
Cc: <stable@vger.kernel.org>
Signed-off-by: Mikhail Lappo <mikhail.lappo@esrlabs.com>
Signed-off-by: Fabio Estevam <fabio.estevam@nxp.com>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
Acked-by: Dong Aisheng <aisheng.dong@nxp.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 360cc03656 upstream.
Since the offset for both registers, PWMDWIDTH and PWMTHRES, used to
control PWM4 or PWM5 are distinct from the other PWMs, whose wrong
programming on PWM hardware causes waveform cannot be output as expected.
Thus, the patch adds the extra condition for fixing up the weird case to
let PWM4 or PWM5 able to work on MT7623.
v1 -> v2: use pwm45_fixup naming instead of pwm45_quirk
v2 -> v3: add more tags for Reviewed-by, Fixes, and Cc stable
Cc: stable@vger.kernel.org
Fixes: caf065f8fd ("pwm: Add MediaTek PWM support")
Signed-off-by: Sean Wang <sean.wang@mediatek.com>
Reviewed-by: Matthias Brugger <matthias.bgg@gmail.com>
Cc: Zhi Mao <zhi.mao@mediatek.com>
Cc: John Crispin <john@phrozen.org>
Cc: Matthias Brugger <matthias.bgg@gmail.com>
Signed-off-by: Thierry Reding <thierry.reding@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6225f9c64b upstream.
This patch fixes an issue that is possible to set mismatch value to duty
for R-Car PWM if we input the following commands:
# cd /sys/class/pwm/<pwmchip>/
# echo 0 > export
# cd pwm0
# echo 30 > period
# echo 30 > duty_cycle
# echo 0 > duty_cycle
# cat duty_cycle
0
# echo 1 > enable
--> Then, the actual duty_cycle is 30, not 0.
So, this patch adds a condition into rcar_pwm_config() to fix this
issue.
Signed-off-by: Ryo Kodama <ryo.kodama.vz@renesas.com>
[shimoda: revise the commit log and add Fixes and Cc tags]
Fixes: ed6c1476bf ("pwm: Add support for R-Car PWM Timer")
Cc: Cc: <stable@vger.kernel.org> # v4.4+
Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Signed-off-by: Thierry Reding <thierry.reding@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 753872373b upstream.
In order to enable a PLL, not only the PLL has to be powered up and
locked, but you also have to de-assert the reset signal. The last part
was missing. Add it so PLLs that were not enabled by the FW/bootloader
can be enabled from Linux.
Fixes: 41691b8862 ("clk: bcm2835: Add support for programming the audio domain clocks")
Cc: <stable@vger.kernel.org>
Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 89cd7aec21 upstream.
The clock for which all PWM devices on MT7623 or MT2701 actually depending
on has to be divided by four from its parent clock axi_sel in the clock
path prior to PWM devices.
Consequently, adding a fixed-factor clock axisel_d4 as one-fourth of
clock axi_sel allows that PWM devices can have the correct resolution
calculation.
Cc: stable@vger.kernel.org
Fixes: e986211827 ("clk: mediatek: Add MT2701 clock support")
Signed-off-by: Sean Wang <sean.wang@mediatek.com>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ce33f28493 upstream.
When we build this driver with on x86-32, gcc produces a false-positive warning:
drivers/clk/renesas/clk-sh73a0.c: In function 'sh73a0_cpg_clocks_init':
drivers/clk/renesas/clk-sh73a0.c:155:10: error: 'parent_name' may be used uninitialized in this function [-Werror=maybe-uninitialized]
return clk_register_fixed_factor(NULL, name, parent_name, 0,
We can work around that warning by adding a fake initialization, I tried
and failed to come up with any better workaround. This is currently one
of few remaining warnings for a 4.14.y randconfig build, so it would be
good to also have it backported at least to that version. Older versions
have more randconfig warnings, so we might not care.
I had not noticed this earlier, because one patch in my randconfig test
tree removes the '-ffreestanding' option on x86-32, and that avoids
the warning. The -ffreestanding flag was originally global but moved
into arch/i386 by Andi Kleen in commit 6edfba1b33 ("[PATCH] x86_64:
Don't define string functions to builtin") as a 'temporary workaround'.
Like many temporary hacks, this turned out to be rather long-lived, from
all I can tell we still need a simple fix to asm/string_32.h before it
can be removed, but I'm not sure about how to best do that.
Cc: stable@vger.kernel.org
Cc: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6a4a459580 upstream.
Clearfog boards can come with a CPU clocked at 1600MHz (commercial)
or 1333MHz (industrial).
They have also some dip-switches to select a different clock (666, 800,
1066, 1200).
The funny thing is that the recovery button is on the MPP34 fq selector.
So, when booting an industrial board with this button down, the frequency
666MHz is selected (and the kernel didn't boot).
This patch add all the missing clocks.
The only mode I didn't test is 2GHz (uboot found 4294MHz instead :/ ).
Fixes: 0e85aeced4 ("clk: mvebu: add clock support for Armada 380/385")
Cc: <stable@vger.kernel.org> # 3.16.x: 9593f4f56c: clk: mvebu: armada-38x: add support for 1866MHz variants
Cc: <stable@vger.kernel.org> # 3.16.x
Signed-off-by: Richard Genoud <richard.genoud@gmail.com>
Acked-by: Gregory CLEMENT <gregory.clement@bootlin.com>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1b30dfd376 upstream.
Per PCIe r3.1, sec 2.2.6.2 and 7.8.4, a Requester may not use 8-bit Tags
unless its Extended Tag Field Enable is set, but all Receivers/Completers
must handle 8-bit Tags correctly regardless of their Extended Tag Field
Enable.
Some devices do not handle 8-bit Tags as Completers, so add a quirk for
them. If we find such a device, we disable Extended Tags for the entire
hierarchy to make peer-to-peer DMA possible.
The Broadcom HT1100/HT2000/HT2100 seems to have issues with handling 8-bit
tags. Mark it as broken.
This fixes Xorg hangs and unresponsive keyboards with errors like this:
radeon 0000:06:00.0: GPU lockup (current fence id 0x000000000000000e last fence id 0x0000000000000
[drm:r600_ring_test [radeon]] *ERROR* radeon: ring 0 test failed (scratch(0x8504)=0xCAFEDEAD)
[drm:r600_resume [radeon]] *ERROR* r600 startup failed on resume
Fixes: 60db3a4d8c ("PCI: Enable PCIe Extended Tags if supported")
Link: https://bugzilla.kernel.org/show_bug.cgi?id=196197
Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
Signed-off-by: Bjorn Helgaas <helgaas@kernel.org>
CC: stable@vger.kernel.org # v4.11: 62ce94a7a5 PCI: Mark Broadcom HT2100 Root Port Extended Tags as broken
CC: stable@vger.kernel.org # v4.11
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a04f0017c2 upstream.
A spinlock is held while updating the internal copy of the IRQ mask,
but not while writing it to the actual IMASK register. After the lock
is released, an IRQ can occur before the IMASK register is written.
If handling this IRQ causes the mask to be changed, when the handler
returns back to the middle of the first mask update, a stale value
will be written to the mask register.
If this causes an IRQ to become unmasked that cannot have its status
cleared by writing a 1 to it in the IREG register, e.g. the SDIO IRQ,
then we can end up stuck with the same IRQ repeatedly being fired but
not handled. Normally the MMC IRQ handler attempts to clear any
unexpected IRQs by writing IREG, but for those that cannot be cleared
in this way then the IRQ will just repeatedly fire.
This was resulting in lockups after a while of using Wi-Fi on the
CI20 (GitHub issue #19).
Resolve by holding the spinlock until after the IMASK register has
been updated.
Cc: stable@vger.kernel.org
Link: https://github.com/MIPS/CI20_linux/issues/19
Fixes: 61bfbdb856 ("MMC: Add support for the controller on JZ4740 SoCs.")
Tested-by: Mathieu Malaterre <malat@debian.org>
Signed-off-by: Alex Smith <alex.smith@imgtec.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d0a0852b9f upstream.
Upon module load, mmc_block allocates a bus with bus_registeri() in
mmc_blk_init(). This reference never gets freed during module unload, which
leads to subsequent re-insertions of the module fails and a WARN() splat is
triggered.
Fix the bug by dropping the reference for the bus in mmc_blk_exit().
Signed-off-by: Alexander Kappner <agk@godking.net>
Fixes: 97548575be ("mmc: block: Convert RPMB to a character device")
Cc: <stable@vger.kernel.org>
Reviewed-by: Shawn Lin <shawn.lin@rock-chips.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4d1a535b8e upstream.
glibc 2.26 removed the 'struct ucontext' to "improve" POSIX compliance
and break programs, including User Mode Linux. Fix User Mode Linux
by using POSIX ucontext_t.
This fixes:
arch/um/os-Linux/signal.c: In function 'hard_handler':
arch/um/os-Linux/signal.c:163:22: error: dereferencing pointer to incomplete type 'struct ucontext'
mcontext_t *mc = &uc->uc_mcontext;
arch/x86/um/stub_segv.c: In function 'stub_segv_handler':
arch/x86/um/stub_segv.c:16:13: error: dereferencing pointer to incomplete type 'struct ucontext'
&uc->uc_mcontext);
Cc: stable@vger.kernel.org
Signed-off-by: Krzysztof Mazur <krzysiek@podlesie.net>
Signed-off-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 530ba6c7cb upstream.
Recent libcs have gotten a bit more strict, so we actually need to
include the right headers and use the right types. This enables UML to
compile again.
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Cc: stable@vger.kernel.org
Signed-off-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2a872fa4e9 upstream.
The ring buffer is made up of a link list of pages. When making the ring
buffer bigger, it will allocate all the pages it needs before adding to the
ring buffer, and if it fails, it frees them and returns an error. This makes
increasing the ring buffer size an all or nothing action. When this was
first created, the pages were allocated with "NORETRY". This was to not
cause any Out-Of-Memory (OOM) actions from allocating the ring buffer. But
NORETRY was too strict, as the ring buffer would fail to expand even when
there's memory available, but was taken up in the page cache.
Commit 848618857d ("tracing/ring_buffer: Try harder to allocate") changed
the allocating from NORETRY to RETRY_MAYFAIL. The RETRY_MAYFAIL would
allocate from the page cache, but if there was no memory available, it would
simple fail the allocation and not trigger an OOM.
This worked fine, but had one problem. As the ring buffer would allocate one
page at a time, it could take up all memory in the system before it failed
to allocate and free that memory. If the allocation is happening and the
ring buffer allocates all memory and then tries to take more than available,
its allocation will not trigger an OOM, but if there's any allocation that
happens someplace else, that could trigger an OOM, even though once the ring
buffer's allocation fails, it would free up all the previous memory it tried
to allocate, and allow other memory allocations to succeed.
Commit d02bd27bd3 ("mm/page_alloc.c: calculate 'available' memory in a
separate function") separated out si_mem_availble() as a separate function
that could be used to see how much memory is available in the system. Using
this function to make sure that the ring buffer could be allocated before it
tries to allocate pages we can avoid allocating all memory in the system and
making it vulnerable to OOMs if other allocations are taking place.
Link: http://lkml.kernel.org/r/1522320104-6573-1-git-send-email-zhaoyang.huang@spreadtrum.com
CC: stable@vger.kernel.org
Cc: linux-mm@kvack.org
Fixes: 848618857d ("tracing/ring_buffer: Try harder to allocate")
Requires: d02bd27bd3 ("mm/page_alloc.c: calculate 'available' memory in a separate function")
Reported-by: Zhaoyang Huang <huangzhaoyang@gmail.com>
Tested-by: Joel Fernandes <joelaf@google.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0731de476a upstream.
Per the ACPI specification the only functional purpose for a DIMM
Control Region to be mapped into the system physical address space, from
an OSPM perspective, is to support block-apertures. However, there are
some BIOSen that publish DIMM Control Region SPA entries for pre-boot
environment consumption. Undo the kernel policy of generating disabled
'ndblk' regions when this configuration is detected.
Cc: <stable@vger.kernel.org>
Fixes: 1f7df6f88b ("libnvdimm, nfit: regions (block-data-window...)")
Reviewed-by: Toshi Kani <toshi.kani@hpe.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 78727137fd upstream.
There is a small window whereby ARS scan requests can schedule work that
userspace will miss when polling scrub_show. Hold the init_mutex lock
over calls to report the status to close this potential escape. Also,
make sure that requests to cancel the ARS workqueue are treated as an
idle event.
Cc: <stable@vger.kernel.org>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Fixes: 37b137ff8c ("nfit, libnvdimm: allow an ARS scrub...")
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1e6338cfb5 upstream.
Commit 841a915d20 ("printf: Do not have bprintf dereference pointers")
would preprocess various pointers that are dereferenced in the bprintf()
because the recording and printing are done at two different times. Some
pointers stayed dereferenced in the ring buffer because user space could
handle them (namely "%pS" and friends). Pointers that are not dereferenced
should not be processed immediately but instead just saved directly.
Cc: stable@vger.kernel.org
Fixes: 841a915d20 ("printf: Do not have bprintf dereference pointers")
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4f8672201b upstream.
The following NULL dereference results from incorrectly assuming that
ndd is valid in this print:
struct nvdimm_drvdata *ndd = to_ndd(&nd_region->mapping[i]);
/*
* Give up if we don't find an instance of a uuid at each
* position (from 0 to nd_region->ndr_mappings - 1), or if we
* find a dimm with two instances of the same uuid.
*/
dev_err(&nd_region->dev, "%s missing label for %pUb\n",
dev_name(ndd->dev), nd_label->uuid);
BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
IP: nd_region_register_namespaces+0xd67/0x13c0 [libnvdimm]
PGD 0 P4D 0
Oops: 0000 [#1] SMP PTI
CPU: 43 PID: 673 Comm: kworker/u609:10 Not tainted 4.16.0-rc4+ #1
[..]
RIP: 0010:nd_region_register_namespaces+0xd67/0x13c0 [libnvdimm]
[..]
Call Trace:
? devres_add+0x2f/0x40
? devm_kmalloc+0x52/0x60
? nd_region_activate+0x9c/0x320 [libnvdimm]
nd_region_probe+0x94/0x260 [libnvdimm]
? kernfs_add_one+0xe4/0x130
nvdimm_bus_probe+0x63/0x100 [libnvdimm]
Switch to using the nvdimm device directly.
Fixes: 0e3b0d123c ("libnvdimm, namespace: allow multiple pmem...")
Cc: <stable@vger.kernel.org>
Reported-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c31898c8c7 upstream.
At initialization time the 'dimm' driver caches a copy of the memory
device's label area and reserves address space for each of the
namespaces defined.
However, as can be seen below, the reservation occurs even when the
index blocks are invalid:
nvdimm nmem0: nvdimm_init_config_data: len: 131072 rc: 0
nvdimm nmem0: config data size: 131072
nvdimm nmem0: __nd_label_validate: nsindex0 labelsize 1 invalid
nvdimm nmem0: __nd_label_validate: nsindex1 labelsize 1 invalid
nvdimm nmem0: : pmem-6025e505: 0x1000000000 @ 0xf50000000 reserve <-- bad
Gate dpa reservation on the presence of valid index blocks.
Cc: <stable@vger.kernel.org>
Fixes: 4a826c83db ("libnvdimm: namespace indices: read and validate")
Reported-by: Krzysztof Rusocki <krzysztof.rusocki@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0803d7befa upstream.
The Acer Acer Veriton X4110G has a TPM device detected as:
tpm_tis 00:0b: 1.2 TPM (device-id 0xFE, rev-id 71)
After the first S3 suspend, the following error appears during resume:
tpm tpm0: A TPM error(38) occurred continue selftest
Any following S3 suspend attempts will now fail with this error:
tpm tpm0: Error (38) sending savestate before suspend
PM: Device 00:0b failed to suspend: error 38
Error 38 is TPM_ERR_INVALID_POSTINIT which means the TPM is
not in the correct state. This indicates that the platform BIOS
is not sending the usual TPM_Startup command during S3 resume.
>From this point onwards, all TPM commands will fail.
The same issue was previously reported on Foxconn 6150BK8MC and
Sony Vaio TX3.
The platform behaviour seems broken here, but we should not break
suspend/resume because of this.
When the unexpected TPM state is encountered, set a flag to skip the
affected TPM_SaveState command on later suspends.
Cc: stable@vger.kernel.org
Signed-off-by: Chris Chiu <chiu@endlessm.com>
Signed-off-by: Daniel Drake <drake@endlessm.com>
Link: http://lkml.kernel.org/r/CAB4CAwfSCvj1cudi+MWaB5g2Z67d9DwY1o475YOZD64ma23UiQ@mail.gmail.com
Link: https://lkml.org/lkml/2011/3/28/192
Link: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=591031
Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ad7b4e8022 upstream.
cxllib_handle_fault() is called by an external driver when it needs to
have the host resolve page faults for a buffer. The buffer can cover
several pages and VMAs. The function iterates over all the pages used
by the buffer, based on the page size of the VMA.
To ensure some stability while processing the faults, the thread T1
grabs the mm->mmap_sem semaphore with read access (R1). However, when
processing a page fault for a single page, one of the underlying
functions, copro_handle_mm_fault(), also grabs the same semaphore with
read access (R2). So the thread T1 takes the semaphore twice.
If another thread T2 tries to access the semaphore in write mode W1
(say, because it wants to allocate memory and calls 'brk'), then that
thread T2 will have to wait because there's a reader (R1). If the
thread T1 is processing a new page at that time, it won't get an
automatic grant at R2, because there's now a writer thread
waiting (T2). And we have a deadlock.
The timeline is:
1. thread T1 owns the semaphore with read access R1
2. thread T2 requests write access W1 and waits
3. thread T1 requests read access R2 and waits
The fix is for the thread T1 to release the semaphore R1 once it got
the information it needs from the current VMA. The address space/VMAs
could evolve while T1 iterates over the full buffer, but in the
unlikely case where T1 misses a page, the external driver will raise a
new page fault when retrying the memory access.
Fixes: 3ced8d7300 ("cxl: Export library to support IBM XSL")
Cc: stable@vger.kernel.org # 4.13+
Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c5637476bb upstream.
Despite the efforts made to correctly read the NDA and CUBC registers,
the order in which the registers are read could sometimes lead to an
inconsistent state.
Re-using the timeline from the comments, this following timing of
registers reads could lead to reading NDA with value "@desc2" and
CUBC with value "MAX desc1":
INITD -------- ------------
|____________________|
_______________________ _______________
NDA @desc2 \/ @desc3
_______________________/\_______________
__________ ___________ _______________
CUBC 0 \/ MAX desc1 \/ MAX desc2
__________/\___________/\_______________
| | | |
Events:(1)(2) (3)(4)
(1) check_nda = @desc2
(2) initd = 1
(3) cur_ubc = MAX desc1
(4) cur_nda = @desc2
This is allowed by the condition ((check_nda == cur_nda) && initd),
despite cur_ubc and cur_nda being in the precise state we don't want.
This error leads to incorrect residue computation.
Fix it by inversing the order in which CUBC and INITD are read. This
makes sure that NDA and CUBC are always read together either _before_
INITD goes to 0 or _after_ it is back at 1.
The case where NDA is read before INITD is at 0 and CUBC is read after
INITD is back at 1 will be rejected by check_nda and cur_nda being
different.
Fixes: 53398f4888 ("dmaengine: at_xdmac: fix residue corruption")
Cc: stable@vger.kernel.org
Signed-off-by: Maxime Jayat <maxime.jayat@mobile-devices.fr>
Acked-by: Ludovic Desroches <ludovic.desroches@microchip.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 880bcce0dc upstream.
Fix a race for "nosync" activations providing "aa.." device health
characters and "0/N" sync ratio rather than "AA..." and "N/N". Occurs
when status for the raid set is retrieved during resume before the MD
sync thread starts and clears the MD_RECOVERY_NEEDED flag.
Cc: stable@vger.kernel.org # 4.16+
Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 06892cc190 upstream.
gcc-4.4.4 has issues with initialization of anonymous unions:
drivers/infiniband/ulp/srpt/ib_srpt.c: In function 'srpt_zerolength_write':
drivers/infiniband/ulp/srpt/ib_srpt.c:854: error: unknown field 'wr_cqe' specified in initializer
drivers/infiniband/ulp/srpt/ib_srpt.c:854: warning: initialization makes integer from pointer without a cast
Work aound this.
Fixes: 2a78cb4db4 ("IB/srpt: Fix an out-of-bounds stack access in srpt_zerolength_write()")
Cc: Bart Van Assche <bart.vanassche@wdc.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Jason Gunthorpe <jgg@mellanox.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6ee687735e upstream.
gcc-4.4.4 has issues with initialization of anonymous unions.
drivers/infiniband/core/verbs.c: In function '__ib_drain_sq':
drivers/infiniband/core/verbs.c:2204: error: unknown field 'wr_cqe' specified in initializer
drivers/infiniband/core/verbs.c:2204: warning: initialization makes integer from pointer without a cast
Work around this.
Fixes: a1ae7d0345 ("RDMA/core: Avoid that ib_drain_qp() triggers an out-of-bounds stack access")
Cc: Bart Van Assche <bart.vanassche@wdc.com>
Cc: Steve Wise <swise@opengridcomputing.com>
Cc: Sagi Grimberg <sagi@grimberg.me>
Cc: Jason Gunthorpe <jgg@mellanox.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3a148896b2 upstream.
Ensure that cv_end is equal to ibdev->num_comp_vectors for the
NUMA node with the highest index. This patch improves spreading
of RDMA channels over completion vectors and thereby improves
performance, especially on systems with only a single NUMA node.
This patch drops support for the comp_vector login parameter by
ignoring the value of that parameter since I have not found a
good way to combine support for that parameter and automatic
spreading of RDMA channels over completion vectors.
Fixes: d92c0da71a ("IB/srp: Add multichannel support")
Reported-by: Alexander Schmid <alex@modula-shop-systems.de>
Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Cc: Alexander Schmid <alex@modula-shop-systems.de>
Cc: stable@vger.kernel.org
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e68088e78d upstream.
Before commit e494f6a728 ("[SCSI] improved eh timeout handler") it
did not really matter whether or not abort handlers like srp_abort()
called .scsi_done() when returning another value than SUCCESS. Since
that commit however this matters. Hence only call .scsi_done() when
returning SUCCESS.
Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Cc: stable@vger.kernel.org
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e15dc99dbb upstream.
The commit 02a5d6925c ("ALSA: pcm: Avoid potential races between OSS
ioctls and read/write") split the PCM preparation code to a locked
version, and it added a sanity check of runtime->oss.prepare flag
along with the change. This leaded to an endless loop when the stream
gets XRUN: namely, snd_pcm_oss_write3() and co call
snd_pcm_oss_prepare() without setting runtime->oss.prepare flag and
the loop continues until the PCM state reaches to another one.
As the function is supposed to execute the preparation
unconditionally, drop the invalid state check there.
The bug was triggered by syzkaller.
Fixes: 02a5d6925c ("ALSA: pcm: Avoid potential races between OSS ioctls and read/write")
Reported-by: syzbot+150189c103427d31a053@syzkaller.appspotmail.com
Reported-by: syzbot+7e3f31a52646f939c052@syzkaller.appspotmail.com
Reported-by: syzbot+4f2016cf5185da7759dc@syzkaller.appspotmail.com
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a820ccbe21 upstream.
The PCM runtime object is created and freed dynamically at PCM stream
open / close time. This is tracked via substream->runtime, and it's
cleared at snd_pcm_detach_substream().
The runtime object assignment is protected by PCM open_mutex, so for
all PCM operations, it's safely handled. However, each PCM substream
provides also an ALSA timer interface, and user-space can access to
this while closing a PCM substream. This may eventually lead to a
UAF, as snd_pcm_timer_resolution() tries to access the runtime while
clearing it in other side.
Fortunately, it's the only concurrent access from the PCM timer, and
it merely reads runtime->timer_resolution field. So, we can avoid the
race by reordering kfree() and wrapping the substream->runtime
clearance with the corresponding timer lock.
Reported-by: syzbot+8e62ff4e07aa2ce87826@syzkaller.appspotmail.com
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f6d297df4d upstream.
The previous fix 40cab6e88c ("ALSA: pcm: Return -EBUSY for OSS
ioctls changing busy streams") introduced some mutex unbalance; the
check of runtime->oss.rw_ref was inserted in a wrong place after the
mutex lock.
This patch fixes the inconsistency by rewriting with the helper
functions to lock/unlock parameters with the stream check.
Fixes: 40cab6e88c ("ALSA: pcm: Return -EBUSY for OSS ioctls changing busy streams")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 40cab6e88c upstream.
OSS PCM stream management isn't modal but it allows ioctls issued at
any time for changing the parameters. In the previous hardening
patch ("ALSA: pcm: Avoid potential races between OSS ioctls and
read/write"), we covered these races and prevent the corruption by
protecting the concurrent accesses via params_lock mutex. However,
this means that some ioctls that try to change the stream parameter
(e.g. channels or format) would be blocked until the read/write
finishes, and it may take really long.
Basically changing the parameter while reading/writing is an invalid
operation, hence it's even more user-friendly from the API POV if it
returns -EBUSY in such a situation.
This patch adds such checks in the relevant ioctls with the addition
of read/write access refcount.
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 02a5d6925c upstream.
Although we apply the params_lock mutex to the whole read and write
operations as well as snd_pcm_oss_change_params(), we may still face
some races.
First off, the params_lock is taken inside the read and write loop.
This is intentional for avoiding the too long locking, but it allows
the in-between parameter change, which might lead to invalid
pointers. We check the readiness of the stream and set up via
snd_pcm_oss_make_ready() at the beginning of read and write, but it's
called only once, by assuming that it remains ready in the rest.
Second, many ioctls that may change the actual parameters
(i.e. setting runtime->oss.params=1) aren't protected, hence they can
be processed in a half-baked state.
This patch is an attempt to plug these holes. The stream readiness
check is moved inside the read/write inner loop, so that the stream is
always set up in a proper state before further processing. Also, each
ioctl that may change the parameter is wrapped with the params_lock
for avoiding the races.
The issues were triggered by syzkaller in a few different scenarios,
particularly the one below appearing as GPF in loopback_pos_update.
Reported-by: syzbot+c4227aec125487ec3efa@syzkaller.appspotmail.com
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2552428863 upstream.
Michal Kalderon has found some corner cases around device unload
with active NFS mounts that I didn't have the imagination to test
when xprtrdma device removal was added last year.
- The ULP device removal handler is responsible for deallocating
the PD. That wasn't clear to me initially, and my own testing
suggested it was not necessary, but that is incorrect.
- The transport destruction path can no longer assume that there
is a valid ID.
- When destroying a transport, ensure that ib_free_cq() is not
invoked on a CQ that was already released.
Reported-by: Michal Kalderon <Michal.Kalderon@cavium.com>
Fixes: bebd031866 ("xprtrdma: Support unplugging an HCA from ...")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Cc: stable@vger.kernel.org # v4.12+
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6720a89933 upstream.
With v4.15, on one of my NFS/RDMA clients I measured a nearly
doubling in the latency of small read and write system calls. There
was no change in server round trip time. The extra latency appears
in the whole RPC execution path.
"git bisect" settled on commit ccede75985 ("xprtrdma: Spread reply
processing over more CPUs") .
After some experimentation, I found that leaving the WQ bound and
allowing the scheduler to pick the dispatch CPU seems to eliminate
the long latencies, and it does not introduce any new regressions.
The fix is implemented by reverting only the part of
commit ccede75985 ("xprtrdma: Spread reply processing over more
CPUs") that dispatches RPC replies specifically on the CPU where the
matching RPC call was made.
Interestingly, saving the CPU number and later queuing reply
processing there was effective _only_ for a NFS READ and WRITE
request. On my NUMA client, in-kernel RPC reply processing for
asynchronous RPCs was dispatched on the same CPU where the RPC call
was made, as expected. However synchronous RPCs seem to get their
reply dispatched on some other CPU than where the call was placed,
every time.
Fixes: ccede75985 ("xprtrdma: Spread reply processing over ... ")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Cc: stable@vger.kernel.org # v4.15+
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5059353df8 upstream.
dm-crypt consumes an excessive amount memory when the user attempts to
zero a dm-crypt device with "blkdiscard -z". The command "blkdiscard -z"
calls the BLKZEROOUT ioctl, it goes to the function __blkdev_issue_zeroout,
__blkdev_issue_zeroout sends a large amount of write bios that contain
the zero page as their payload.
For each incoming page, dm-crypt allocates another page that holds the
encrypted data, so when processing "blkdiscard -z", dm-crypt tries to
allocate the amount of memory that is equal to the size of the device.
This can trigger OOM killer or cause system crash.
Fix this by limiting the amount of memory that dm-crypt allocates to 2%
of total system memory. This limit is system-wide and is divided by the
number of active dm-crypt devices and each device receives an equal
share.
Cc: stable@vger.kernel.org
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0519c71e8d upstream.
Otherwise, these abnormal IOs would be sent to the DM target
regardless of whether the target advertised support for them.
Factor out __process_abnormal_io() from __split_and_process_non_flush()
so that discards, write same, etc may be conditionally processed.
Fixes: 978e51ba3 ("dm: optimize bio-based NVMe IO submission")
Cc: stable@vger.kernel.org # 4.16
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 54dd0e0a1b upstream.
Add explicit checks in ext4_xattr_block_get() just in case the
e_value_offs and e_value_size fields in the the xattr block are
corrupted in memory after the buffer_verified bit is set on the xattr
block.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9496005d6c upstream.
Add some paranoia checks to make sure we don't stray beyond the end of
the valid memory region containing ext4 xattr entries while we are
scanning for a match.
Also rename the function to xattr_find_entry() since it is static and
thus only used in fs/ext4/xattr.c
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit de05ca8526 upstream.
Refactor the call to EXT4_ERROR_INODE() into ext4_xattr_check_block().
This simplifies the code, and fixes a problem where not all callers of
ext4_xattr_check_block() were not resulting in ext4_error() getting
called when the xattr block is corrupted.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 18db4b4e6f upstream.
If some metadata block, such as an allocation bitmap, overlaps the
superblock, it's very likely that if the file system is mounted
read/write, the results will not be pretty. So disallow r/w mounts
for file systems corrupted in this particular way.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit fe23cb65c2 upstream.
ext4_iomap_begin() has a bug where offset returned in the iomap
structure will be truncated to unsigned long size. On 64-bit
architectures this is fine but on 32-bit architectures obviously not.
Not many places actually use the offset stored in the iomap structure
but one of visible failures is in SEEK_HOLE / SEEK_DATA implementation.
If we create a file like:
dd if=/dev/urandom of=file bs=1k seek=8m count=1
then
lseek64("file", 0x100000000ULL, SEEK_DATA)
wrongly returns 0x100000000 on unfixed kernel while it should return
0x200000000. Avoid the overflow by proper type cast.
Fixes: 545052e9e3 ("ext4: Switch to iomap for SEEK_HOLE / SEEK_DATA")
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@vger.kernel.org # v4.15
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 73fdad00b2 upstream.
i_disksize update should be protected by i_data_sem, by either taking
the lock explicitly or by using ext4_update_i_disksize() helper. But the
i_disksize updates in ext4_direct_IO_write() are not protected at all,
which may be racing with i_disksize updates in writeback path in
delalloc buffer write path.
This is found by code inspection, and I didn't hit any i_disksize
corruption due to this bug. Thanks to Jan Kara for catching this bug and
suggesting the fix!
Reported-by: Jan Kara <jack@suse.cz>
Suggested-by: Jan Kara <jack@suse.cz>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 044e6e3d74 upstream.
When reading the inode or block allocation bitmap, if the bitmap needs
to be initialized, do not update the checksum in the block group
descriptor. That's because we're not set up to journal those changes.
Instead, just set the verified bit on the bitmap block, so that it's
not necessary to validate the checksum.
When a block or inode allocation actually happens, at that point the
checksum will be calculated, and update of the bg descriptor block
will be properly journalled.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit fb7c02445c upstream.
Previously the jbd2 layer assumed that a file system check would be
required after a journal abort. In the case of the deliberate file
system shutdown, this should not be necessary. Allow the jbd2 layer
to distinguish between these two cases by using the ESHUTDOWN errno.
Also add proper locking to __journal_abort_soft().
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a6d9946bb9 upstream.
The msleep() when processing EXT4_GOING_FLAGS_NOLOGFLUSH was a hack to
avoid some races (that are now fixed), but in fact it introduced its
own race.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 576d18ed60 upstream.
The ext4 forced shutdown flag needs to prevent new handles from being
started, but it needs to allow existing handles to complete. So the
forced shutdown flag should not force ext4_journal_get_write_access to
fail.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6e2fb22103 upstream.
Early alpha processors cannot write a single byte or word; they read 8
bytes, modify the value in registers and write back 8 bytes.
The type blk_status_t is defined as one byte, it is often written
asynchronously by I/O completion routines, this asynchronous modification
can corrupt content of nearby bytes if these nearby bytes can be written
simultaneously by another CPU.
- one example of such corruption is the structure dm_io where
"blk_status_t status" is written by an asynchronous completion routine
and "atomic_t io_count" is modified synchronously
- another example is the structure dm_buffer where "unsigned hold_count"
is modified synchronously from process context and "blk_status_t
write_error" is modified asynchronously from bio completion routine
This patch fixes the bug by changing the type blk_status_t to 32 bits if
we are on Alpha and if we are compiling for a processor that doesn't have
the byte-word-extension.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org # 4.13+
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ad49aee401 upstream.
Sometimes (firmware bug?) the V5 boost GPIO is not configured as output
by the BIOS, leading to the 5V boost convertor being permanently on,
Explicitly set the direction and drv flags rather then inheriting them
from the firmware to fix this.
Fixes: 585cb239f4 ("extcon: intel-cht-wc: Disable external 5v boost ...")
Cc: stable@vger.kernel.org
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9f886f4d1d upstream.
This fixes a harmless UBSAN where root could potentially end up
causing an overflow while bumping the entropy_total field (which is
ignored once the entropy pool has been initialized, and this generally
is completed during the boot sequence).
This is marginal for the stable kernel series, but it's a really
trivial patch, and it fixes UBSAN warning that might cause security
folks to get overly excited for no reason.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reported-by: Chen Feng <puck.chen@hisilicon.com>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit aa08192a25 upstream.
Most MMIO GIC register accesses use a 1-hot bit scheme that
avoids requiring any form of locking. This isn't true for the
GICD_ICFGRn registers, which require a RMW sequence.
Unfortunately, we seem to be missing a lock for these particular
accesses, which could result in a race condition if changing the
trigger type on any two interrupts within the same set of 16
interrupts (and thus controlled by the same CFGR register).
Introduce a private lock in the GIC common comde for this
particular case, making it cover both GIC implementations
in one go.
Cc: stable@vger.kernel.org
Signed-off-by: Aniruddha Banerjee <aniruddhab@nvidia.com>
[maz: updated changelog]
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ea9d7bb798 upstream.
On Lenovo ThinkPad Yoga 370 (and possibly some other Lenovo models as
well) the Thunderbolt host controller sometimes comes up in such way
that the ICM firmware is not running properly. This is most likely an
issue in BIOS/firmware but as side-effect driver crashes the kernel due
to NULL pointer dereference:
BUG: unable to handle kernel NULL pointer dereference at 0000000000000980
IP: pci_write_config_dword+0x5/0x20
Call Trace:
pcie2cio_write+0x3b/0x70 [thunderbolt]
icm_driver_ready+0x168/0x260 [thunderbolt]
? tb_ctl_start+0x50/0x70 [thunderbolt]
tb_domain_add+0x73/0xf0 [thunderbolt]
nhi_probe+0x182/0x300 [thunderbolt]
local_pci_probe+0x42/0xa0
? pci_match_device+0xd9/0x100
pci_device_probe+0x146/0x1b0
driver_probe_device+0x315/0x480
...
Instead of crashing update the driver to bail out gracefully if we
encounter such situation.
Fixes: f67cf49117 ("thunderbolt: Add support for Internal Connection Manager (ICM)")
Reported-by: Jordan Glover <Golden_Miller83@protonmail.ch>
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Acked-by: Yehezkel Bernat <yehezkel.bernat@intel.com>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 79fae98751 upstream.
If the system is suspended and user disconnects cable to another host
and connects it to a Thunderbolt device instead we get a warning from
driver core about adding duplicate sysfs attribute and adding the new
device fails.
Handle this properly so that we first remove the existing XDomain
connection before adding new devices.
Fixes: d1ff70241a ("thunderbolt: Add support for XDomain discovery protocol")
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f2a659f7d8 upstream.
The driver misses implementation of PM hook that undoes what
->freeze_noirq() does after the hibernation image is created. This means
the control channel is not resumed properly and the Thunderbolt bus
becomes useless in later stages of hibernation (when the image is stored
or if the operation fails).
Fix this by pointing ->thaw_noirq to driver nhi_resume_noirq(). This
makes sure the control channel is resumed properly.
Fixes: 23dd5bb49d ("thunderbolt: Add suspend/hibernate support")
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a03e828915 upstream.
We need to make sure a new PCIe tunnel is not created in a middle of
previous PCI rescan because otherwise the rescan code might find too
much and fail to reconfigure devices properly. This is important when
native PCIe hotplug is used. In BIOS assisted hotplug there should be no
such issue.
Fixes: f67cf49117 ("thunderbolt: Add support for Internal Connection Manager (ICM)")
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e4be8c9b6a upstream.
Sometimes during cold boot ICM has not yet authenticated the active NVM
image leading to timeout and failing the driver probe. Allow ICM to take
some more time and increase the timeout to 3 seconds before we give up.
While there fix icm_firmware_init() to return the real error code
without overwriting it with -ENODEV.
Fixes: f67cf49117 ("thunderbolt: Add support for Internal Connection Manager (ICM)")
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 267e2c6fd7 upstream.
Fix the topology kcontrol string handling so that string pointer
references are strdup()ed instead of being copied. This fixes issues
with kcontrol templates on the stack or ones that are freed. Remember
and free the strings too when topology is unloaded.
Signed-off-by: Liam Girdwood <liam.r.girdwood@linux.intel.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a01df75ce7 upstream.
SSM2602 driver is broken on recent kernels (at least
since 4.9). User space applications such as amixer or
alsamixer get EIO when attempting to access codec
controls via the relevant IOCTLs.
Root cause of these failures is the regcache_hw_init
function in drivers/base/regmap/regcache.c, which
prevents regmap cache initalization from the
reg_defaults_raw element of the regmap_config structure
when registers are write only. It also disables the
regmap cache entirely when all registers are write only
or volatile as is the case for the SSM2602 driver.
Using the reg_defaults element of the regmap_config
structure rather than the reg_defaults_raw element to
initalize the regmap cache avoids the logic in the
regcache_hw_init function entirely. It also makes this
driver consistent with other ASoC codec drivers, as
this driver was the ONLY codec driver that used the
reg_defaults_raw element to initalize the cache.
Tested on Digilent Zybo Z7 development board which has
a SSM2603 codec chip connected to a Xilinx Zynq SoC.
Signed-off-by: James Kelly <jamespeterkelly@gmail.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6de0b13cc0 upstream.
When size is negative, calling memset will make segment fault.
Declare the size as type u32 to keep memset safe.
size in struct hid_report is unsigned, fix return type of
hid_report_len to u32.
Cc: stable@vger.kernel.org
Signed-off-by: Aaron Ma <aaron.ma@canonical.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3064a03b94 upstream.
Follow the change of return type u32 of hid_report_len,
fix all the types of variables those get the return value of
hid_report_len to u32, and all other code already uses u32.
Cc: stable@vger.kernel.org
Signed-off-by: Aaron Ma <aaron.ma@canonical.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2675c13b29 upstream.
In tlbiel_radix_set_isa300() we use the PPC_TLBIEL() macro to
construct tlbiel instructions. The instruction takes 5 fields, two of
which are registers, and the others are constants. But because it's
constructed with inline asm the compiler doesn't know that.
We got the constraint wrong on the 'r' field, using "r" tells the
compiler to put the value in a register. The value we then get in the
macro is the *register number*, not the value of the field.
That means when we mask the register number with 0x1 we get 0 or 1
depending on which register the compiler happens to put the constant
in, eg:
li r10,1
tlbiel r8,r9,2,0,0
li r7,1
tlbiel r10,r6,0,0,1
If we're unlucky we might generate an invalid instruction form, for
example RIC=0, PRS=1 and R=0, tlbiel r8,r7,0,1,0, this has been
observed to cause machine checks:
Oops: Machine check, sig: 7 [#1]
CPU: 24 PID: 0 Comm: swapper
NIP: 00000000000385f4 LR: 000000000100ed00 CTR: 000000000000007f
REGS: c00000000110bb40 TRAP: 0200
MSR: 9000000000201003 <SF,HV,ME,RI,LE> CR: 48002222 XER: 20040000
CFAR: 00000000000385d0 DAR: 0000000000001c00 DSISR: 00000200 SOFTE: 1
If the machine check happens early in boot while we have MSR_ME=0 it
will escalate into a checkstop and kill the box entirely.
To fix it we could change the inline asm constraint to "i" which
tells the compiler the value is a constant. But a better fix is to just
pass a literal 1 into the macro, which bypasses any problems with inline
asm constraints.
Fixes: d4748276ae ("powerpc/64s: Improve local TLB flush for boot and MCE on POWER9")
Cc: stable@vger.kernel.org # v4.16+
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3b8070335f upstream.
The OPAL NVRAM driver does not sleep in case it gets OPAL_BUSY or
OPAL_BUSY_EVENT from firmware, which causes large scheduling
latencies, and various lockup errors to trigger (again, BMC reboot
can cause it).
Fix this by converting it to the standard form OPAL_BUSY loop that
sleeps.
Fixes: 628daa8d5a ("powerpc/powernv: Add RTC and NVRAM support plus RTAS fallbacks")
Depends-on: 34dd25de9f ("powerpc/powernv: define a standard delay for OPAL_BUSY type retry loops")
Cc: stable@vger.kernel.org # v3.2+
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 34dd25de9f upstream.
This is the start of an effort to tidy up and standardise all the
delays. Existing loops have a range of delay/sleep periods from 1ms
to 20ms, and some have no delay. They all loop forever except rtc,
which times out after 10 retries, and that uses 10ms delays. So use
10ms as our standard delay. The OPAL maintainer agrees 10ms is a
reasonable starting point.
The idea is to use the same recipe everywhere, once this is proven to
work then it will be documented as an OPAL API standard. Then both
firmware and OS can agree, and if a particular call needs something
else, then that can be documented with reasoning.
This is not the end-all of this effort, it's just a relatively easy
change that fixes some existing high latency delays. There should be
provision for standardising timeouts and/or interruptible loops where
possible, so non-fatal firmware errors don't cause hangs.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nathan Chancellor <natechancellor@gmail.com>
Cc: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit bf8a1abc3d upstream.
kexec_file_load() on powerpc doesn't support kdump kernels yet, so it
returns -ENOTSUPP in that case.
I've recently learned that this errno is internal to the kernel and
isn't supposed to be exposed to userspace. Therefore, change to
-EOPNOTSUPP which is defined in an uapi header.
This does indeed make kexec-tools happier. Before the patch, on
ppc64le:
# ~bauermann/src/kexec-tools/build/sbin/kexec -s -p /boot/vmlinuz
kexec_file_load failed: Unknown error 524
After the patch:
# ~bauermann/src/kexec-tools/build/sbin/kexec -s -p /boot/vmlinuz
kexec_file_load failed: Operation not supported
Fixes: a0458284f0 ("powerpc: Add support code for kexec_file_load()")
Cc: stable@vger.kernel.org # v4.10+
Reported-by: Dave Young <dyoung@redhat.com>
Signed-off-by: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
Reviewed-by: Simon Horman <horms@verge.net.au>
Reviewed-by: Dave Young <dyoung@redhat.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e6e133c47e upstream.
Michael Ellerman reported the following call trace when running
ftracetest:
BUG: using __this_cpu_write() in preemptible [00000000] code: ftracetest/6178
caller is opt_pre_handler+0xc4/0x110
CPU: 1 PID: 6178 Comm: ftracetest Not tainted 4.15.0-rc7-gcc6x-gb2cd1df #1
Call Trace:
[c0000000f9ec39c0] [c000000000ac4304] dump_stack+0xb4/0x100 (unreliable)
[c0000000f9ec3a00] [c00000000061159c] check_preemption_disabled+0x15c/0x170
[c0000000f9ec3a90] [c000000000217e84] opt_pre_handler+0xc4/0x110
[c0000000f9ec3af0] [c00000000004cf68] optimized_callback+0x148/0x170
[c0000000f9ec3b40] [c00000000004d954] optinsn_slot+0xec/0x10000
[c0000000f9ec3e30] [c00000000004bae0] kretprobe_trampoline+0x0/0x10
This is showing up since OPTPROBES is now enabled with CONFIG_PREEMPT.
trampoline_probe_handler() considers itself to be a special kprobe
handler for kretprobes. In doing so, it expects to be called from
kprobe_handler() on a trap, and re-enables preemption before returning a
non-zero return value so as to suppress any subsequent processing of the
trap by the kprobe_handler().
However, with optprobes, we don't deal with special handlers (we ignore
the return code) and just try to re-enable preemption causing the above
trace.
To address this, modify trampoline_probe_handler() to not be special.
The only additional processing done in kprobe_handler() is to emulate
the instruction (in this case, a 'nop'). We adjust the value of
regs->nip for the purpose and delegate the job of re-enabling
preemption and resetting current kprobe to the probe handlers
(kprobe_handler() or optimized_callback()).
Fixes: 8a2d71a3f2 ("powerpc/kprobes: Disable preemption before invoking probe handler for optprobes")
Cc: stable@vger.kernel.org # v4.15+
Reported-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Acked-by: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0bfdf59890 upstream.
asm/barrier.h is not always included after asm/synch.h, which meant
it was missing __SUBARCH_HAS_LWSYNC, so in some files smp_wmb() would
be eieio when it should be lwsync. kernel/time/hrtimer.c is one case.
__SUBARCH_HAS_LWSYNC is only used in one place, so just fold it in
to where it's used. Previously with my small simulator config, 377
instances of eieio in the tree. After this patch there are 55.
Fixes: 46d075be58 ("powerpc: Optimise smp_wmb")
Cc: stable@vger.kernel.org # v2.6.29+
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit dbfcf3cb9c upstream.
On POWER9, since commit cc3d294013 ("powerpc/64: Enable use of radix
MMU under hypervisor on POWER9", 2017-01-30), we set both the radix and
HPT bits in the client-architecture-support (CAS) vector, which tells
the hypervisor that we can do either radix or HPT. According to PAPR,
if we use this combination we are promising to do a H_REGISTER_PROC_TBL
hcall later on to let the hypervisor know whether we are doing radix
or HPT. We currently do this call if we are doing radix but not if
we are doing HPT. If the hypervisor is able to support both radix
and HPT guests, it would be entitled to defer allocation of the HPT
until the H_REGISTER_PROC_TBL call, and to fail any attempts to create
HPTEs until the H_REGISTER_PROC_TBL call. Thus we need to do a
H_REGISTER_PROC_TBL call when we are doing HPT; otherwise we may
crash at boot time.
This adds the code to call H_REGISTER_PROC_TBL in this case, before
we attempt to create any HPT entries using H_ENTER.
Fixes: cc3d294013 ("powerpc/64: Enable use of radix MMU under hypervisor on POWER9")
Cc: stable@vger.kernel.org # v4.11+
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Reviewed-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a57ac41183 upstream.
Presently the dt_cpu_ftrs restore_cpu will only add bits to the LPCR
for secondaries, but some bits must be removed (e.g., UPRT for HPT).
Not clearing these bits on secondaries causes checkstops when booting
with disable_radix.
restore_cpu can not just set LPCR, because it is also called by the
idle wakeup code which relies on opal_slw_set_reg to restore the value
of LPCR, at least on P8 which does not save LPCR to stack in the idle
code.
Fix this by including a mask of bits to clear from LPCR as well, which
is used by restore_cpu.
This is a little messy now, but it's a minimal fix that can be
backported. Longer term, the idle SPR save/restore code can be
reworked to completely avoid calls to restore_cpu, then restore_cpu
would be able to unconditionally set LPCR to match boot processor
environment.
Fixes: 5a61ef74f2 ("powerpc/64s: Support new device tree binding for discovering CPU features")
Cc: stable@vger.kernel.org # v4.12+
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f0295e047f upstream.
The current EEH callbacks can race with a driver unbind. This can
result in a backtraces like this:
EEH: Frozen PHB#0-PE#1fc detected
EEH: PE location: S000009, PHB location: N/A
CPU: 2 PID: 2312 Comm: kworker/u258:3 Not tainted 4.15.6-openpower1 #2
Workqueue: nvme-wq nvme_reset_work [nvme]
Call Trace:
dump_stack+0x9c/0xd0 (unreliable)
eeh_dev_check_failure+0x420/0x470
eeh_check_failure+0xa0/0xa4
nvme_reset_work+0x138/0x1414 [nvme]
process_one_work+0x1ec/0x328
worker_thread+0x2e4/0x3a8
kthread+0x14c/0x154
ret_from_kernel_thread+0x5c/0xc8
nvme nvme1: Removing after probe failure status: -19
<snip>
cpu 0x23: Vector: 300 (Data Access) at [c000000ff50f3800]
pc: c0080000089a0eb0: nvme_error_detected+0x4c/0x90 [nvme]
lr: c000000000026564: eeh_report_error+0xe0/0x110
sp: c000000ff50f3a80
msr: 9000000000009033
dar: 400
dsisr: 40000000
current = 0xc000000ff507c000
paca = 0xc00000000fdc9d80 softe: 0 irq_happened: 0x01
pid = 782, comm = eehd
Linux version 4.15.6-openpower1 (smc@smc-desktop) (gcc version 6.4.0 (Buildroot 2017.11.2-00008-g4b6188e)) #2 SM P Tue Feb 27 12:33:27 PST 2018
enter ? for help
eeh_report_error+0xe0/0x110
eeh_pe_dev_traverse+0xc0/0xdc
eeh_handle_normal_event+0x184/0x4c4
eeh_handle_event+0x30/0x288
eeh_event_handler+0x124/0x170
kthread+0x14c/0x154
ret_from_kernel_thread+0x5c/0xc8
The first part is an EEH (on boot), the second half is the resulting
crash. nvme probe starts the nvme_reset_work() worker thread. This
worker thread starts touching the device which see a device error
(EEH) and hence queues up an event in the powerpc EEH worker
thread. nvme_reset_work() then continues and runs
nvme_remove_dead_ctrl_work() which results in unbinding the driver
from the device and hence releases all resources. At the same time,
the EEH worker thread starts doing the EEH .error_detected() driver
callback, which no longer works since the resources have been freed.
This fixes the problem in the same way the generic PCIe AER code (in
drivers/pci/pcie/aer/aerdrv_core.c) does. It makes the EEH code hold
the device_lock() while performing the driver EEH callbacks and
associated code. This ensures either the callbacks are no longer
register, or if they are registered the driver will not be removed
from underneath us.
This has been broken forever. The EEH call backs were first introduced
in 2005 (in 77bd741561) but it's not clear if a lock was needed back
then.
Fixes: 77bd741561 ("[PATCH] powerpc: PCI Error Recovery: PPC64 core recovery routines")
Cc: stable@vger.kernel.org # v2.6.16+
Signed-off-by: Michael Neuling <mikey@neuling.org>
Reviewed-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c130153e45 upstream.
The pkey code added a CPU_FTR_PKEY bit, but did not add it to the
dt_cpu_ftrs feature set. Although capability is supported by all
processors in the base dt_cpu_ftrs set for 64s, it's a significant
and sufficiently well defined feature to make it optional. So add
it as a quirk for now, which can be versioned out then controlled
by the firmware (once dt_cpu_ftrs gains versioning support).
Fixes: cf43d3b264 ("powerpc: Enable pkey subsystem")
Cc: stable@vger.kernel.org # v4.16+
Cc: Ram Pai <linuxram@us.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 70e80655f5 upstream.
It seems this is a copy-paste error and that the proper variable to use
in this particular case is _sha512_ instead of _md5_.
Addresses-Coverity-ID: 1465358 ("Copy-paste error")
Fixes: 1c6614d229e7 ("CIFS: add sha512 secmech")
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Reviewed-by: Aurelien Aptel <aaptel@suse.com>
CC: Stable <stable@vger.kernel.org>
Signed-off-by: Steve French <smfrench@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 82fb82be05 upstream.
shash and sdesc and always allocated and freed together.
* abstract this in new functions cifs_alloc_hash() and cifs_free_hash().
* make smb2/3 crypto allocation independent from each other.
Signed-off-by: Aurelien Aptel <aaptel@suse.com>
Signed-off-by: Steve French <smfrench@gmail.com>
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
CC: Stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f7f6d915a1 upstream.
On some systems, the BIOS expects certain SMBus register values to
match the hardware defaults. Restore these configuration registers at
shutdown time to avoid confusing the BIOS. This avoids hard-locking
such systems upon reboot.
Signed-off-by: Jean Delvare <jdelvare@suse.de>
Tested-by: Jason Andryuk <jandryuk@gmail.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a086bb8317 upstream.
Saving the original value of register SMBSLVCMD in
i801_enable_host_notify() doesn't work, because this function is
called not only at probe time but also at resume time. Do it in
i801_probe() instead, so that the saved value is not overwritten at
resume time.
Signed-off-by: Jean Delvare <jdelvare@suse.de>
Fixes: 22e94bd677 ("i2c: i801: store and restore the SLVCMD register at load and unload")
Reviewed-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Tested-by: Jason Andryuk <jandryuk@gmail.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Cc: stable@vger.kernel.org # v4.10+
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ac75a04104 upstream.
When convert char array with signed int, if the inbuf[x] is negative then
upper bits will be set to 1. Fix this by using u8 instead of char.
ret_size has to be at least 3, hid_input_report use it after minus 2 bytes.
Cc: stable@vger.kernel.org
Signed-off-by: Aaron Ma <aaron.ma@canonical.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7ea884c77e upstream.
Some servers return inode number zero for the root directory, which
causes ls to display incorrect data (missing "." and "..").
If the server returns zero for the inode number of the root directory,
fake an inode number for it.
Signed-off-by: Steve French <smfrench@gmail.com>
Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
CC: Stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 48f238a79f upstream.
During transport reconnect, other processes may have registered memory
and blocked on transport. This creates a deadlock situation because the
transport resources can't be freed, and reconnect is blocked.
Fix this by returning to upper layer on timeout. Before returning,
transport status is set to reconnecting so other processes will release
memory registration resources.
Upper layer will retry the reconnect. This is not in fast I/O path so
setting the timeout to 5 seconds.
Signed-off-by: Long Li <longli@microsoft.com>
Signed-off-by: Steve French <smfrench@gmail.com>
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
CC: Stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 262916bc69 upstream.
We can not use the standard sg_set_buf() fucntion since when
CONFIG_DEBUG_SG=y this adds a check that will BUG_ON for cifs.ko
when we pass it an object from the stack.
Create a new wrapper smb2_sg_set_buf() which avoids doing that particular check
and use it for smb3 encryption instead.
Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <smfrench@gmail.com>
CC: Stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c91815b596 upstream.
This is a requirement which has always existed but, somehow, wasn't
reflected in the documentation and problems weren't found until now
when Tuba Yavuz found a possible deadlock happening between dwc3 and
f_hid. She described the situation as follows:
spin_lock_irqsave(&hidg->write_spinlock, flags); // first acquire
/* we our function has been disabled by host */
if (!hidg->req) {
free_ep_req(hidg->in_ep, hidg->req);
goto try_again;
}
[...]
status = usb_ep_queue(hidg->in_ep, hidg->req, GFP_ATOMIC);
=>
[...]
=> usb_gadget_giveback_request
=>
f_hidg_req_complete
=>
spin_lock_irqsave(&hidg->write_spinlock, flags); // second acquire
Note that this happens because dwc3 would call ->complete() on a
failed usb_ep_queue() due to failed Start Transfer command. This is,
anyway, a theoretical situation because dwc3 currently uses "No
Response Update Transfer" command for Bulk and Interrupt endpoints.
It's still good to make this case impossible to happen even if the "No
Reponse Update Transfer" command is changed.
Reported-by: Tuba Yavuz <tuba@ece.ufl.edu>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 64627388b5 upstream.
USB3 hubs don't support global suspend.
USB3 specification 10.10, Enhanced SuperSpeed hubs only support selective
suspend and resume, they do not support global suspend/resume where the
hub downstream facing ports states are not affected.
When system enters hibernation it first enters freeze process where only
the root hub enters suspend, usb_port_suspend() is not called for other
devices, and suspend status flags are not set for them. Other devices are
expected to suspend globally. Some external USB3 hubs will suspend the
downstream facing port at global suspend. These devices won't be resumed
at thaw as the suspend status flag is not set.
A USB3 removable hard disk connected through a USB3 hub that won't resume
at thaw will fail to synchronize SCSI cache, return “cmd cmplt err -71”
error, and needs a 60 seconds timeout which causing system hang for 60s
before the USB host reset the port for the USB3 removable hard disk to
recover.
Fix this by always calling usb_port_suspend() during freeze for USB3
devices.
Signed-off-by: Zhengjun Xing <zhengjun.xing@linux.intel.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7fafcfdf63 upstream.
It looks like there is a possibility of a double-free vulnerability on an
error path of the f_midi_set_alt function in the f_midi driver. If the
path is feasible then free_ep_req gets called twice:
req->complete = f_midi_complete;
err = usb_ep_queue(midi->out_ep, req, GFP_ATOMIC);
=> ...
usb_gadget_giveback_request
=>
f_midi_complete (CALLBACK)
(inside f_midi_complete, for various cases of status)
free_ep_req(ep, req); // first kfree
if (err) {
ERROR(midi, "%s: couldn't enqueue request: %d\n",
midi->out_ep->name, err);
free_ep_req(midi->out_ep, req); // second kfree
return err;
}
The double-free possibility was introduced with commit ad0d1a058e
("usb: gadget: f_midi: fix leak on failed to enqueue out requests").
Found by MOXCAFE tool.
Signed-off-by: Tuba Yavuz <tuba@ece.ufl.edu>
Fixes: ad0d1a058e ("usb: gadget: f_midi: fix leak on failed to enqueue out requests")
Acked-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 466d1493ea upstream.
Some BIOSen do not handle 0-byte transfer lengths for the _LSR and _LSW
(label storage read/write) methods. This causes Linux to fallback to the
deprecated _DSM path, or otherwise disable label support.
Introduce acpi_nvdimm_has_method() to detect whether a method is
available rather than calling the method, require _LSI and _LSR to be
paired, and require read support before enabling write support.
Cc: <stable@vger.kernel.org>
Fixes: 4b27db7e26 ("acpi, nfit: add support for the _LS...")
Suggested-by: Erik Schmauss <erik.schmauss@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 13d3047c81 upstream.
Mike Lothian reported that plugging in a USB-C device does not work
properly in his Dell Alienware system. This system has an Intel Alpine
Ridge Thunderbolt controller providing USB-C functionality. In these
systems the USB controller (xHCI) is hotplugged whenever a device is
connected to the port using ACPI-based hotplug.
The ACPI description of the root port in question is as follows:
Device (RP01)
{
Name (_ADR, 0x001C0000)
Device (PXSX)
{
Name (_ADR, 0x02)
Method (_RMV, 0, NotSerialized)
{
// ...
}
}
Here _ADR 0x02 means device 0, function 2 on the bus under root port (RP01)
but that seems to be incorrect because device 0 is the upstream port of the
Alpine Ridge PCIe switch and it has no functions other than 0 (the bridge
itself). When we get ACPI Notify() to the root port resulting from
connecting a USB-C device, Linux tries to read PCI_VENDOR_ID from device 0,
function 2 which of course always returns 0xffffffff because there is no
such function and we never find the device.
In Windows this works fine.
Now, since we get ACPI Notify() to the root port and not to the PXSX device
we should actually start our scan from there as well and not from the
non-existent PXSX device. Fix this by checking presence of the slot itself
(function 0) if we fail to do that otherwise.
While there use pci_bus_read_dev_vendor_id() in get_slot_status(), which is
the recommended way to read Device and Vendor IDs of devices on PCI buses.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=198557
Reported-by: Mike Lothian <mike@fireburn.co.uk>
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f00e71091a upstream.
We're supposed to be checking that "val_len" is not too large but
instead we check if it is smaller than the max.
The only function affected would be regmap_i2c_smbus_i2c_write() in
drivers/base/regmap/regmap-i2c.c. Strangely that function has its own
limit check which returns an error if (count >= I2C_SMBUS_BLOCK_MAX) so
it doesn't look like it has ever been able to do anything except return
an error.
Fixes: c335931ed9 ("regmap: Add raw_write/read checks for max_raw_write/read sizes")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 36104cb901 upstream.
Commit 2cc42bac1c ("x86-64/Xen: eliminate W+X mappings") introduced a
call to get_cpu_cap, which is fstack-protected. This is works on x86-64
as commit 4f277295e5 ("x86/xen: init %gs very early to avoid page
faults with stack protector") ensures the stack protector is configured,
but it it did not cover x86-32.
Delay calling get_cpu_cap until after xen_setup_gdt has initialized the
stack canary. Without this, a 32bit PV machine crashes early
in boot.
(XEN) Domain 0 (vcpu#0) crashed on cpu#0:
(XEN) ----[ Xen-4.6.6-xc x86_64 debug=n Tainted: C ]----
(XEN) CPU: 0
(XEN) RIP: e019:[<00000000c10362f8>]
And the PV kernel IP corresponds to init_scattered_cpuid_features
0xc10362f8 <+24>: mov %gs:0x14,%eax
Fixes 2cc42bac1c ("x86-64/Xen: eliminate W+X mappings")
Signed-off-by: Jason Andryuk <jandryuk@gmail.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 639fa43d59 upstream.
When a BRx is provided by a pipeline, the WPF must determine the master
layer. Currently the condition to check this identifies pipe->bru ||
pipe->num_inputs > 1.
The code then moves on to dereference pipe->bru, thus the check fails
static analysers on the possibility that pipe->num_inputs could be
greater than 1 without pipe->bru being set.
The reality is that the pipeline must have a BRx to support more than
one input, thus this could never cause a fault - however it also
identifies that the num_inputs > 1 check is redundant.
Remove the redundant check - and always configure the master layer
appropriately when we have a BRx configured in our pipeline.
Fixes: 6134148f60 ("v4l: vsp1: Add support for the BRS entity")
Cc: stable@vger.kernel.org
Suggested-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Signed-off-by: Kieran Bingham <kieran.bingham+renesas@ideasonboard.com>
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 03703ed1de upstream.
If buffers were prepared or queued and the buffers were released without
starting the queue, the finish mem op (corresponding to the prepare mem
op) was never called to the buffers.
Before commit a136f59c0a there was no need to do this as in such a case
the prepare mem op had not been called yet. Address the problem by
explicitly calling finish mem op when the queue is stopped if the buffer
is in either prepared or queued state.
Fixes: a136f59c0a ("[media] vb2: Move buffer cache synchronisation to prepare from queue")
Cc: stable@vger.kernel.org # for v4.13 and up
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Tested-by: Devin Heitmueller <dheitmueller@kernellabs.com>
Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8d4068810d upstream.
If there is IR in the raw kfifo when ir_raw_event_unregister() is called,
then kthread_stop() causes ir_raw_event_thread to be scheduled, decode
some scancodes and re-arm timer_keyup. The timer_keyup then fires when
the rc device is long gone.
Cc: stable@vger.kernel.org
Signed-off-by: Sean Young <sean@mess.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 613bd1ea38 upstream.
Commit 9b61e30221 (spi: Pick spi bus number from Linux idr or spi alias)
ceased to unregister SPI buses with fixed bus numbers. Moreover this is
visible only if CONFIG_SPI_DEBUG=y is set or when trying to re-register
the same SPI controller.
rmmod spi_pxa2xx_platform (with CONFIG_SPI_DEBUG=y):
[ 26.788362] spi_master spi1: attempting to delete unregistered controller [spi1]
modprobe spi_pxa2xx_platform:
[ 37.883137] sysfs: cannot create duplicate filename '/devices/pci0000:00/0000:00:19.0/pxa2xx-spi.12/spi_master/spi1'
[ 37.894984] CPU: 1 PID: 1467 Comm: modprobe Not tainted 4.16.0-rc4+ #21
[ 37.902384] Call Trace:
...
[ 38.122680] kobject_add_internal failed for spi1 with -EEXIST, don't try to register things with the same name in the same directory.
[ 38.136154] WARNING: CPU: 1 PID: 1467 at lib/kobject.c:238 kobject_add_internal+0x2a5/0x2f0
...
[ 38.513817] pxa2xx-spi pxa2xx-spi.12: problem registering spi master
[ 38.521036] pxa2xx-spi: probe of pxa2xx-spi.12 failed with error -17
Fix this by not returning immediately from spi_unregister_controller() if
idr_find() doesn't find controller with given ID/bus number. It finds
only those controllers that were registered with dynamic SPI bus
numbers. Only conditional cleanup between dynamic and fixed bus numbers
is to remove allocated IDR.
Fixes: 9b61e30221 (spi: Pick spi bus number from Linux idr or spi alias)
Cc: stable@vger.kernel.org
Signed-off-by: Jarkko Nikula <jarkko.nikula@linux.intel.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ce99319a18 upstream.
When SPI transfers can be offloaded using DMA, the SPI core need to
build a scatterlist to make sure that the buffer to be transferred is
dma-able.
This patch fixes the scatterlist entry size computation in the case
where the maximum acceptable scatterlist entry supported by the DMA
controller is less than PAGE_SIZE, when the buffer is vmalloced.
For each entry, the actual size is given by the minimum between the
desc_len (which is the max buffer size supported by the DMA controller)
and the remaining buffer length until we cross a page boundary.
Fixes: 65598c13fd ("spi: Fix per-page mapping of unaligned vmalloc-ed buffer")
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9581329eff upstream.
The datasheet recommends initializing FIFOs before
SPI enable. If we do not do it like this, there may be
a strange behavior. We noticed that DMA does not work properly
with FIFOs if we do not clear them beforehand or enable them
before SPIEN.
Signed-off-by: Eugen Hristev <eugen.hristev@microchip.com>
Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6f4870753f upstream.
The proper name for the property, which assign given device to IOMMU is
'iommus', not 'iommu'. Fix incorrect name and let all GScaler devices
to be properly handled when IOMMU support is enabled.
Reported-by: Andrzej Hajda <a.hajda@samsung.com>
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Fixes: 6cbfdd73a9 ("ARM: dts: add sysmmu nodes for exynos5250")
Cc: <stable@vger.kernel.org> # v4.8+
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0629a01920 upstream.
Fix that USB initialization fails as below runtime log is present during
booting on bananapi-r2 board by adding missing regulators the USB device
requires. Current regulators USB device uses are being updated with the
correct ones to reflect real configurations which are all from fixed
regulators rather than MT6323 one's output.
xhci-mtk 1a1c0000.usb: 1a1c0000.usb supply vbus not found, using dummy regulator
xhci-mtk 1a240000.usb: 1a240000.usb supply vbus not found, using dummy regulator
Cc: stable@vger.kernel.org
Fixes: f4ff257cd1 ("arm: dts: mt7623: add support for Bananapi R2 (BPI-R2) board")
Signed-off-by: Sean Wang <sean.wang@mediatek.com>
[mb: update kernel log in commit message]
Signed-off-by: Matthias Brugger <matthias.bgg@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a7480dbcf9 upstream.
Since commit 04c8b0f82c ("irqchip/gic: Make locking a BL_SWITCHER only
feature") coupled CPU idle freezes from time to time on Exynos4210. Later
commit 313c8c16ee ("PM / CPU: replace raw_notifier with atomic_notifier")
changed the context in which the CPU idle code is executed, what results
in fully reproducible freeze all the time. However, almost the same coupled
CPU idle code works fine on Exynos3250 regardless of the changes made in
the mentioned commits.
It turned out that the IPI call used on Exynos4210 is conflicting with the
change done in the first mentioned commit in GIC. Fix this by using the
same code path as for Exynos3250, instead of the IPI call for
synchronization with second CPU core, call dsb_sev() directly.
Tested on Exynos4210-based Trats and Origen boards.
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
CC: <stable@vger.kernel.org> # v4.13+
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Acked-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7d8b44c54e upstream.
vgic_copy_lpi_list() parses the LPI list and picks LPIs targeting
a given vcpu. We allocate the array containing the intids before taking
the lpi_list_lock, which means we can have an array size that is not
equal to the number of LPIs.
This is particularly obvious when looking at the path coming from
vgic_enable_lpis, which is not a command, and thus can run in parallel
with commands:
vcpu 0: vcpu 1:
vgic_enable_lpis
its_sync_lpi_pending_table
vgic_copy_lpi_list
intids = kmalloc_array(irq_count)
MAPI(lpi targeting vcpu 0)
list_for_each_entry(lpi_list_head)
intids[i++] = irq->intid;
At that stage, we will happily overrun the intids array. Boo. An easy
fix is is to break once the array is full. The MAPI command will update
the config anyway, and we won't miss a thing. We also make sure that
lpi_list_count is read exactly once, so that further updates of that
value will not affect the array bound check.
Cc: stable@vger.kernel.org
Fixes: ccb1d791ab ("KVM: arm64: vgic-its: Fix pending table sync")
Reviewed-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c04ffa71ff upstream.
Different modules maybe installed by the user on the eMMC connector
of the odroid-c2. While the red modules are working without an issue,
it seems some black modules (apparently Samsung based) are having
issue at 200MHz
While the tuning algorithm introduced in v4.14 enables high speed modes
on every other tested designs, it seems a problem remains for this
particular combination of board and eMMC module.
Lowering the maximum frequency of the eMMC on this board until we can
figure out a better solution.
Fixes: d341ca88ee ("mmc: meson-gx: rework tuning function")
Suggested-by: Ellie Reeves <ellierevves@gmail.com>
Signed-off-by: Jerome Brunet <jbrunet@baylibre.com>
Cc: stable@vger.kernel.org
Signed-off-by: Kevin Hilman <khilman@baylibre.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d7119224bf upstream.
The AXP223 PMIC, like the AXP221, does not generate VBUS change
interrupts when N_VBUSEN is used to drive VBUS for the OTG port
on the board.
This was not noticed until recently, as most A23/A33 boards use
a GPIO pin that does not support interrupts for OTG ID detection.
This forces the driver to use polling. However the A33-OlinuXino
uses a pin that does support interrupts, so the driver uses them.
However the VBUS interrupt never fires, and the driver never gets
to update the VBUS status. This results in musb timing out waiting
for VBUS to rise.
This was worked around for the AXP221 by resorting to polling
changes in commit 91d96f06a7 ("phy-sun4i-usb: Add workaround for
missing Vbus det interrupts on A31"). This patch adds the A23 and
A33 to the list of SoCs that need the workaround.
Fixes: fc1f45ed30 ("phy-sun4i-usb: Add support for the usb-phys on the
sun8i-a33 SoC")
Fixes: 123dfdbcfa ("phy-sun4i-usb: Add support for the usb-phys on the
sun8i-a23 SoC")
Cc: <stable@vger.kernel.org> # 4.3.x: 68dbc2ce77 phy-sun4i-usb:
Use of_match_node to get model specific config data
Cc: <stable@vger.kernel.org> # 4.3.x: 5cf700ac9d phy: phy-sun4i-usb:
Fix optional gpios failing probe
Cc: <stable@vger.kernel.org> # 4.3.x: 04e59a0211 phy-sun4i-usb:
Fix irq free conditions to match request conditions
Cc: <stable@vger.kernel.org> # 4.3.x: 91d96f06a7 phy-sun4i-usb:
Add workaround for missing Vbus det interrupts on A31
Cc: <stable@vger.kernel.org> # 4.3.x
Signed-off-by: Chen-Yu Tsai <wens@csie.org>
Acked-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
commit a9f2a846f0 upstream.
cache_reap() is initially scheduled in start_cpu_timer() via
schedule_delayed_work_on(). But then the next iterations are scheduled
via schedule_delayed_work(), i.e. using WORK_CPU_UNBOUND.
Thus since commit ef55718044 ("workqueue: schedule WORK_CPU_UNBOUND
work on wq_unbound_cpumask CPUs") there is no guarantee the future
iterations will run on the originally intended cpu, although it's still
preferred. I was able to demonstrate this with
/sys/module/workqueue/parameters/debug_force_rr_cpu. IIUC, it may also
happen due to migrating timers in nohz context. As a result, some cpu's
would be calling cache_reap() more frequently and others never.
This patch uses schedule_delayed_work_on() with the current cpu when
scheduling the next iteration.
Link: http://lkml.kernel.org/r/20180411070007.32225-1-vbabka@suse.cz
Fixes: ef55718044 ("workqueue: schedule WORK_CPU_UNBOUND work on wq_unbound_cpumask CPUs")
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Pekka Enberg <penberg@kernel.org>
Acked-by: Christoph Lameter <cl@linux.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Stephen Boyd <sboyd@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3f05317d98 upstream.
syzbot reported a use-after-free of shm_file_data(file)->file->f_op in
shm_get_unmapped_area(), called via sys_remap_file_pages().
Unfortunately it couldn't generate a reproducer, but I found a bug which
I think caused it. When remap_file_pages() is passed a full System V
shared memory segment, the memory is first unmapped, then a new map is
created using the ->vm_file. Between these steps, the shm ID can be
removed and reused for a new shm segment. But, shm_mmap() only checks
whether the ID is currently valid before calling the underlying file's
->mmap(); it doesn't check whether it was reused. Thus it can use the
wrong underlying file, one that was already freed.
Fix this by making the "outer" shm file (the one that gets put in
->vm_file) hold a reference to the real shm file, and by making
__shm_open() require that the file associated with the shm ID matches
the one associated with the "outer" file.
Taking the reference to the real shm file is needed to fully solve the
problem, since otherwise sfd->file could point to a freed file, which
then could be reallocated for the reused shm ID, causing the wrong shm
segment to be mapped (and without the required permission checks).
Commit 1ac0b6dec6 ("ipc/shm: handle removed segments gracefully in
shm_mmap()") almost fixed this bug, but it didn't go far enough because
it didn't consider the case where the shm ID is reused.
The following program usually reproduces this bug:
#include <stdlib.h>
#include <sys/shm.h>
#include <sys/syscall.h>
#include <unistd.h>
int main()
{
int is_parent = (fork() != 0);
srand(getpid());
for (;;) {
int id = shmget(0xF00F, 4096, IPC_CREAT|0700);
if (is_parent) {
void *addr = shmat(id, NULL, 0);
usleep(rand() % 50);
while (!syscall(__NR_remap_file_pages, addr, 4096, 0, 0, 0));
} else {
usleep(rand() % 50);
shmctl(id, IPC_RMID, NULL);
}
}
}
It causes the following NULL pointer dereference due to a 'struct file'
being used while it's being freed. (I couldn't actually get a KASAN
use-after-free splat like in the syzbot report. But I think it's
possible with this bug; it would just take a more extraordinary race...)
BUG: unable to handle kernel NULL pointer dereference at 0000000000000058
PGD 0 P4D 0
Oops: 0000 [#1] SMP NOPTI
CPU: 9 PID: 258 Comm: syz_ipc Not tainted 4.16.0-05140-gf8cf2f16a7c95 #189
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-20171110_100015-anatol 04/01/2014
RIP: 0010:d_inode include/linux/dcache.h:519 [inline]
RIP: 0010:touch_atime+0x25/0xd0 fs/inode.c:1724
[...]
Call Trace:
file_accessed include/linux/fs.h:2063 [inline]
shmem_mmap+0x25/0x40 mm/shmem.c:2149
call_mmap include/linux/fs.h:1789 [inline]
shm_mmap+0x34/0x80 ipc/shm.c:465
call_mmap include/linux/fs.h:1789 [inline]
mmap_region+0x309/0x5b0 mm/mmap.c:1712
do_mmap+0x294/0x4a0 mm/mmap.c:1483
do_mmap_pgoff include/linux/mm.h:2235 [inline]
SYSC_remap_file_pages mm/mmap.c:2853 [inline]
SyS_remap_file_pages+0x232/0x310 mm/mmap.c:2769
do_syscall_64+0x64/0x1a0 arch/x86/entry/common.c:287
entry_SYSCALL_64_after_hwframe+0x42/0xb7
[ebiggers@google.com: add comment]
Link: http://lkml.kernel.org/r/20180410192850.235835-1-ebiggers3@gmail.com
Link: http://lkml.kernel.org/r/20180409043039.28915-1-ebiggers3@gmail.com
Reported-by: syzbot+d11f321e7f1923157eac80aa990b446596f46439@syzkaller.appspotmail.com
Fixes: c8d78c1823 ("mm: replace remap_file_pages() syscall with emulation")
Signed-off-by: Eric Biggers <ebiggers@google.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Davidlohr Bueso <dbueso@suse.de>
Cc: Manfred Spraul <manfred@colorfullife.com>
Cc: "Eric W . Biederman" <ebiederm@xmission.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 60bb83b811 upstream.
We've got a bug report indicating a kernel panic at booting on an x86-32
system, and it turned out to be the invalid PCI resource assigned after
reallocation. __find_resource() first aligns the resource start address
and resets the end address with start+size-1 accordingly, then checks
whether it's contained. Here the end address may overflow the integer,
although resource_contains() still returns true because the function
validates only start and end address. So this ends up with returning an
invalid resource (start > end).
There was already an attempt to cover such a problem in the commit
47ea91b405 ("Resource: fix wrong resource window calculation"), but
this case is an overseen one.
This patch adds the validity check of the newly calculated resource for
avoiding the integer overflow problem.
Bugzilla: http://bugzilla.opensuse.org/show_bug.cgi?id=1086739
Link: http://lkml.kernel.org/r/s5hpo37d5l8.wl-tiwai@suse.de
Fixes: 23c570a674 ("resource: ability to resize an allocated resource")
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Reported-by: Michael Henders <hendersm@shaw.ca>
Tested-by: Michael Henders <hendersm@shaw.ca>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Ram Pai <linuxram@us.ibm.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b5094b7f13 upstream.
While UBI and UBIFS seem to work at first sight with MLC NAND, you will
most likely lose all your data upon a power-cut or due to read/write
disturb.
In order to protect users from bad surprises, refuse to attach to MLC
NAND.
Cc: stable@vger.kernel.org
Signed-off-by: Richard Weinberger <richard@nod.at>
Acked-by: Boris Brezillon <boris.brezillon@bootlin.com>
Acked-by: Artem Bityutskiy <dedekind1@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 78a8dfbabb upstream.
When opening a device with write access, ubiblock_open returns an error
code. Currently, this error code is -EPERM, but this is not the right
value.
The open function for other block devices returns -EROFS when opening
read-only devices with FMODE_WRITE set. When used with dm-verity, the
veritysetup userspace tool is expecting EROFS, and refuses to use the
ubiblock device.
Use -EROFS for ubiblock as well. As a result, veritysetup accepts the
ubiblock device as valid.
Cc: stable@vger.kernel.org
Fixes: 9d54c8a33e (UBI: R/O block driver on top of UBI volumes)
Signed-off-by: Romain Izard <romain.izard.pro@gmail.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit aac17948a7 upstream.
If ubifs_wbuf_sync() fails we must not write a master node with the
dirty marker cleared.
Otherwise it is possible that in case of an IO error while syncing we
mark the filesystem as clean and UBIFS refuses to recover upon next
mount.
Cc: <stable@vger.kernel.org>
Fixes: 1e51764a3c ("UBIFS: add new flash file system")
Signed-off-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3d41386d55 upstream.
With commit e948bc8fbe (cpufreq: Cap the default transition delay
value to 10 ms) the cpufreq was not honouring the delay passed via
ACPI (PCCT). Due to which on ARM based platforms using CPPC the
cpufreq governor tries to change the frequency of CPUs faster than
expected.
This leads to continuous error messages like the following.
" ACPI CPPC: PCC check channel failed. Status=0 "
Earlier (without above commit) the default transition delay was
taken form the value passed from PCCT. Use the same value provided
by PCCT to set the transition_delay_us.
Fixes: e948bc8fbe (cpufreq: Cap the default transition delay value to 10 ms)
Signed-off-by: George Cherian <george.cherian@cavium.com>
Cc: 4.14+ <stable@vger.kernel.org> # 4.14+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7972326a26 upstream.
This can be reproduced by bind/unbind the driver multiple times
in AM3517 board.
Analysis revealed that rtl8187_start() was invoked before probe
finishes(ie. before the mutex is initialized).
INFO: trying to register non-static key.
the code is fine but needs lockdep annotation.
turning off the locking correctness validator.
CPU: 0 PID: 821 Comm: wpa_supplicant Not tainted 4.9.80-dirty #250
Hardware name: Generic AM3517 (Flattened Device Tree)
[<c010e0d8>] (unwind_backtrace) from [<c010beac>] (show_stack+0x10/0x14)
[<c010beac>] (show_stack) from [<c017401c>] (register_lock_class+0x4f4/0x55c)
[<c017401c>] (register_lock_class) from [<c0176fe0>] (__lock_acquire+0x74/0x1938)
[<c0176fe0>] (__lock_acquire) from [<c0178cfc>] (lock_acquire+0xfc/0x23c)
[<c0178cfc>] (lock_acquire) from [<c08aa2f8>] (mutex_lock_nested+0x50/0x3b0)
[<c08aa2f8>] (mutex_lock_nested) from [<c05f5bf8>] (rtl8187_start+0x2c/0xd54)
[<c05f5bf8>] (rtl8187_start) from [<c082dea0>] (drv_start+0xa8/0x320)
[<c082dea0>] (drv_start) from [<c084d1d4>] (ieee80211_do_open+0x2bc/0x8e4)
[<c084d1d4>] (ieee80211_do_open) from [<c069be94>] (__dev_open+0xb8/0x120)
[<c069be94>] (__dev_open) from [<c069c11c>] (__dev_change_flags+0x88/0x14c)
[<c069c11c>] (__dev_change_flags) from [<c069c1f8>] (dev_change_flags+0x18/0x48)
[<c069c1f8>] (dev_change_flags) from [<c0710b08>] (devinet_ioctl+0x738/0x840)
[<c0710b08>] (devinet_ioctl) from [<c067925c>] (sock_ioctl+0x164/0x2f4)
[<c067925c>] (sock_ioctl) from [<c02883f8>] (do_vfs_ioctl+0x8c/0x9d0)
[<c02883f8>] (do_vfs_ioctl) from [<c0288da8>] (SyS_ioctl+0x6c/0x7c)
[<c0288da8>] (SyS_ioctl) from [<c0107760>] (ret_fast_syscall+0x0/0x1c)
Unable to handle kernel NULL pointer dereference at virtual address 00000000
pgd = cd1ec000
[00000000] *pgd=8d1de831, *pte=00000000, *ppte=00000000
Internal error: Oops: 817 [#1] PREEMPT ARM
Modules linked in:
CPU: 0 PID: 821 Comm: wpa_supplicant Not tainted 4.9.80-dirty #250
Hardware name: Generic AM3517 (Flattened Device Tree)
task: ce73eec0 task.stack: cd1ea000
PC is at mutex_lock_nested+0xe8/0x3b0
LR is at mutex_lock_nested+0xd0/0x3b0
Cc: stable@vger.kernel.org
Signed-off-by: Sudhir Sreedharan <ssreedharan@mvista.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit bb5208b314 upstream.
Older devices with a serdev attached bcm bt hci, use an Interrupt ACPI
resource to describe the IRQ (rather then a GpioInt resource).
These device seem to all claim the IRQ is active-high and seem to all need
a DMI quirk to treat it as active-low. Instead simply always assume that
Interrupt resource specified IRQs are always active-low.
This fixes the bt device not being able to wake the host from runtime-
suspend on the: Asus T100TAM, Asus T200TA, Lenovo Yoga2 and the Toshiba
Encore, without the need to add 4 new DMI quirks for these models.
This also allows us to remove 2 DMI quirks for the Asus T100TA and Asus
T100CHI series. Likely the 2 remaining quirks can also be removed but I
could not find a DSDT of these devices to verify this.
Cc: stable@vger.kernel.org
Buglink: https://bugzilla.kernel.org/show_bug.cgi?id=198953
Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=1554835
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 09e35a4a1c upstream.
Patch series "mm/get_user_pages_fast fixes, cleanups", v2.
Turns out get_user_pages_fast and __get_user_pages_fast return different
values on error when given a single page: __get_user_pages_fast returns
0. get_user_pages_fast returns either 0 or an error.
Callers of get_user_pages_fast expect an error so fix it up to return an
error consistently.
Stress the difference between get_user_pages_fast and
__get_user_pages_fast to make sure callers aren't confused.
This patch (of 3):
__gup_benchmark_ioctl does not handle the case where get_user_pages_fast
fails:
- a negative return code will cause a buffer overrun
- returning with partial success will cause use of uninitialized
memory.
[akpm@linux-foundation.org: simplification]
Link: http://lkml.kernel.org/r/1522962072-182137-3-git-send-email-mst@redhat.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Huang Ying <ying.huang@intel.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Thorsten Leemhuis <regressions@leemhuis.info>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c61611f709 upstream.
get_user_pages_fast is supposed to be a faster drop-in equivalent of
get_user_pages. As such, callers expect it to return a negative return
code when passed an invalid address, and never expect it to return 0
when passed a positive number of pages, since its documentation says:
* Returns number of pages pinned. This may be fewer than the number
* requested. If nr_pages is 0 or negative, returns 0. If no pages
* were pinned, returns -errno.
When get_user_pages_fast fall back on get_user_pages this is exactly
what happens. Unfortunately the implementation is inconsistent: it
returns 0 if passed a kernel address, confusing callers: for example,
the following is pretty common but does not appear to do the right thing
with a kernel address:
ret = get_user_pages_fast(addr, 1, writeable, &page);
if (ret < 0)
return ret;
Change get_user_pages_fast to return -EFAULT when supplied a kernel
address to make it match expectations.
All callers have been audited for consistency with the documented
semantics.
Link: http://lkml.kernel.org/r/1522962072-182137-4-git-send-email-mst@redhat.com
Fixes: 5b65c4677a ("mm, x86/mm: Fix performance regression in get_user_pages_fast()")
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reported-by: syzbot+6304bf97ef436580fede@syzkaller.appspotmail.com
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Huang Ying <ying.huang@intel.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Thorsten Leemhuis <regressions@leemhuis.info>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0cf1e05157 upstream.
On an Output queue, both EMPTY and PENDING buffer states imply that the
buffer is ready for completion-processing by the upper-layer drivers.
So for a non-QEBSM Output queue, get_buf_states() merges mixed
batches of PENDING and EMPTY buffers into one large batch of EMPTY
buffers. The upper-layer driver (ie. qeth) later distuingishes PENDING
from EMPTY by inspecting the slsb_state for
QDIO_OUTBUF_STATE_FLAG_PENDING.
But the merge logic in get_buf_states() contains a bug that causes us to
erronously also merge ERROR buffers into such a batch of EMPTY buffers
(ERROR is 0xaf, EMPTY is 0xa1; so ERROR & EMPTY == EMPTY).
Effectively, most outbound ERROR buffers are currently discarded
silently and processed as if they had succeeded.
Note that this affects _all_ non-QEBSM device types, not just IQD with CQ.
Fix it by explicitly spelling out the exact conditions for merging.
For extracting the "get initial state" part out of the loop, this relies
on the fact that get_buf_states() is never called with a count of 0. The
QEBSM path already strictly requires this, and the two callers with
variable 'count' make sure of it.
Fixes: 104ea556ee ("qdio: support asynchronous delivery of storage blocks")
Cc: <stable@vger.kernel.org> #v3.2+
Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Reviewed-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
Reviewed-by: Benjamin Block <bblock@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit dae55b6fef upstream.
Immediate retry of EQBS after CCQ 96 means that we potentially misreport
the state of buffers inspected during the first EQBS call.
This occurs when
1. the first EQBS finds all inspected buffers still in the initial state
set by the driver (ie INPUT EMPTY or OUTPUT PRIMED),
2. the EQBS terminates early with CCQ 96, and
3. by the time that the second EQBS comes around, the state of those
previously inspected buffers has changed.
If the state reported by the second EQBS is 'driver-owned', all we know
is that the previous buffers are driver-owned now as well. But we can't
tell if they all have the same state. So for instance
- the second EQBS reports OUTPUT EMPTY, but any number of the previous
buffers could be OUTPUT ERROR by now,
- the second EQBS reports OUTPUT ERROR, but any number of the previous
buffers could be OUTPUT EMPTY by now.
Effectively, this can result in both over- and underreporting of errors.
If the state reported by the second EQBS is 'HW-owned', that doesn't
guarantee that the previous buffers have not been switched to
driver-owned in the mean time. So for instance
- the second EQBS reports INPUT EMPTY, but any number of the previous
buffers could be INPUT PRIMED (or INPUT ERROR) by now.
This would result in failure to process pending work on the queue. If
it's the final check before yielding initiative, this can cause
a (temporary) queue stall due to IRQ avoidance.
Fixes: 25f269f173 ("[S390] qdio: EQBS retry after CCQ 96")
Cc: <stable@vger.kernel.org> #v3.2+
Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Reviewed-by: Benjamin Block <bblock@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8d0d8ed335 upstream.
Commit 1cf03c00e7 "nfit: scrub and register regions in a workqueue"
mistakenly attempts to register a region per BLK aperture. There is
nothing to register for individual apertures as they belong as a set to
a BLK aperture group that are registered with a corresponding
DIMM-control-region. Filter them for registration to prevent some
needless devm_kzalloc() allocations.
Cc: <stable@vger.kernel.org>
Fixes: 1cf03c00e7 ("nfit: scrub and register regions in a workqueue")
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b5beb07ad3 upstream.
Resource auditing is using the peer field which is not available
when the rlim data struct is used, because it is a different element
of the same union. Accessing peer during resource auditing could
cause garbage log entries or even oops the kernel.
Move the rlim data block into the same struct as the peer field
so they can be used together.
CC: <stable@vger.kernel.org>
Fixes: 86b92cb782 ("apparmor: move resource checks to using labels")
Signed-off-by: John Johansen <john.johansen@canonical.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 98cf5bbff4 upstream.
The existence test is not being properly logged as the signal mapping
maps it to the last entry in the named signal table. This is done
to help catch bugs by making the 0 mapped signal value invalid so
that we can catch the signal value not being filled in.
When fixing the off-by-one comparision logic the reporting of the
existence test was broken, because the logic behind the mapped named
table was hidden. Fix this by adding a define for the name lookup
and using it.
Cc: Stable <stable@vger.kernel.org>
Fixes: f7dc4c9a85 ("apparmor: fix off-by-one comparison on MAXMAPPED_SIG")
Signed-off-by: John Johansen <john.johansen@canonical.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6d6340672b upstream.
The code that fixes the crashes in the following commit introduced a small
memory leak:
commit 6a2cf8d366 ("scsi: qla2xxx: Fix crashes in qla2x00_probe_one on probe failure")
Fixing this requires a bit of reworking, which I've explained. Also provide
some code cleanup.
There is a small window in qla2x00_probe_one where if qla2x00_alloc_queues
fails, we end up never freeing req and rsp and leak 0xc0 and 0xc8 bytes
respectively (the sizes of req and rsp).
I originally put in checks to test for this condition which were based on
the incorrect assumption that if ha->rsp_q_map and ha->req_q_map were
allocated, then rsp and req were allocated as well. This is incorrect.
There is a window between these allocations:
ret = qla2x00_mem_alloc(ha, req_length, rsp_length, &req, &rsp);
goto probe_hw_failed;
[if successful, both rsp and req allocated]
base_vha = qla2x00_create_host(sht, ha);
goto probe_hw_failed;
ret = qla2x00_request_irqs(ha, rsp);
goto probe_failed;
if (qla2x00_alloc_queues(ha, req, rsp)) {
goto probe_failed;
[if successful, now ha->rsp_q_map and ha->req_q_map allocated]
To simplify this, we should just set req and rsp to NULL after we free
them. Sounds simple enough? The problem is that req and rsp are pointers
defined in the qla2x00_probe_one and they are not always passed by reference
to the routines that free them.
Here are paths which can free req and rsp:
PATH 1:
qla2x00_probe_one
ret = qla2x00_mem_alloc(ha, req_length, rsp_length, &req, &rsp);
[req and rsp are passed by reference, but if this fails, we currently
do not NULL out req and rsp. Easily fixed]
PATH 2:
qla2x00_probe_one
failing in qla2x00_request_irqs or qla2x00_alloc_queues
probe_failed:
qla2x00_free_device(base_vha);
qla2x00_free_req_que(ha, req)
qla2x00_free_rsp_que(ha, rsp)
PATH 3:
qla2x00_probe_one:
failing in qla2x00_mem_alloc or qla2x00_create_host
probe_hw_failed:
qla2x00_free_req_que(ha, req)
qla2x00_free_rsp_que(ha, rsp)
PATH 1: This should currently work, but it doesn't because rsp and rsp are
not set to NULL in qla2x00_mem_alloc. Easily remedied.
PATH 2: req and rsp aren't passed in at all to qla2x00_free_device but are
derived from ha->req_q_map[0] and ha->rsp_q_map[0]. These are only set up if
qla2x00_alloc_queues succeeds.
In qla2x00_free_queues, we are protected from crashing if these don't exist
because req_qid_map and rsp_qid_map are only set on their allocation. We are
guarded in this way:
for (cnt = 0; cnt < ha->max_req_queues; cnt++) {
if (!test_bit(cnt, ha->req_qid_map))
continue;
PATH 3: This works. We haven't freed req or rsp yet (or they were never
allocated if qla2x00_mem_alloc failed), so we'll attempt to free them here.
To summarize, there are a few small changes to make this work correctly and
(and for some cleanup):
1) (For PATH 1) Set *rsp and *req to NULL in case of failure in
qla2x00_mem_alloc so these are correctly set to NULL back in
qla2x00_probe_one
2) After jumping to probe_failed: and calling qla2x00_free_device,
explicitly set rsp and req to NULL so further calls with these pointers do
not crash, i.e. the free queue calls in the probe_hw_failed section we fall
through to.
3) Fix return code check in the call to qla2x00_alloc_queues. We currently
drop the return code on the floor. The probe fails but the caller of the
probe doesn't have an error code, so it attaches to pci. This can result in
a crash on module shutdown.
4) Remove unnecessary NULL checks in qla2x00_free_req_que,
qla2x00_free_rsp_que, and the egregious NULL checks before kfrees and vfrees
in qla2x00_mem_free.
I tested this out running a scenario where the card breaks at various times
during initialization. I made sure I forced every error exit path in
qla2x00_probe_one.
Cc: <stable@vger.kernel.org> # v4.16
Fixes: 6a2cf8d366 ("scsi: qla2xxx: Fix crashes in qla2x00_probe_one on probe failure")
Signed-off-by: Bill Kuzeja <william.kuzeja@stratus.com>
Acked-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2ee5671e3a upstream.
Currently scsi_dh_lookup() doesn't check for NULL as a device name. This
combined with nvme over dm-mpath results in the following messages
emitted by device-mapper:
device-mapper: multipath: Could not failover device 259:67: Handler scsi_dh_(null) error 14.
Let scsi_dh_lookup() fail fast on NULL names.
[mkp: typo fix]
Cc: <stable@vger.kernel.org> # v4.16
Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Bart Van Assche <bart.vanassche@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 880a3a5325 upstream.
We're neglecting to clear the umask after it's set, which can cause a
later unrelated rpc to (incorrectly) use the same umask if it happens to
be processed by the same thread.
There's a more subtle problem here too:
An NFSv4 compound request is decoded all in one pass before any
operations are executed.
Currently we're setting current->fs->umask at the time we decode the
compound. In theory a single compound could contain multiple creates
each setting a umask. In that case we'd end up using whichever umask
was passed in the *last* operation as the umask for all the creates,
whether that was correct or not.
So, we should just be saving the umask at decode time and waiting to set
it until we actually process the corresponding operation.
In practice it's unlikely any client would do multiple creates in a
single compound. And even if it did they'd likely be from the same
process (hence carry the same umask). So this is a little academic, but
we should get it right anyway.
Fixes: 47057abde5 (nfsd: add support for the umask attribute)
Cc: stable@vger.kernel.org
Reported-by: Lucash Stach <l.stach@pengutronix.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2a22ee6c3a upstream.
Commit fd8aa9095a ("xen: optimize xenbus driver for multiple
concurrent xenstore accesses") made a subtle change to the semantic of
xenbus_dev_request_and_reply() and xenbus_transaction_end().
Before on an error response to XS_TRANSACTION_END
xenbus_dev_request_and_reply() would not decrement the active
transaction counter. But xenbus_transaction_end() has always counted the
transaction as finished regardless of the response.
The new behavior is that xenbus_dev_request_and_reply() and
xenbus_transaction_end() will always count the transaction as finished
regardless the response code (handled in xs_request_exit()).
But xenbus_dev_frontend tries to end a transaction on closing of the
device if the XS_TRANSACTION_END failed before. Trying to close the
transaction twice corrupts the reference count. So fix this by also
considering a transaction closed if we have sent XS_TRANSACTION_END once
regardless of the return code.
Cc: <stable@vger.kernel.org> # 4.11
Fixes: fd8aa9095a ("xen: optimize xenbus driver for multiple concurrent xenstore accesses")
Signed-off-by: Simon Gaiser <simon@invisiblethingslab.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 695b46e76b upstream.
Eddie Horng reported that readdir of an overlayfs directory that
was exported via NFSv3 returns entries with d_type set to DT_UNKNOWN.
The reason is that while preparing the response for readdirplus, nfsd
checks inside encode_entryplus_baggage() that a child dentry's inode
number matches the value of d_ino returns by overlayfs readdir iterator.
Because the overlayfs inodes use arbitrary inode numbers that are not
correlated with the values of st_ino/d_ino, NFSv3 falls back to not
encoding d_type. Although this is an allowed behavior, we can fix it for
the case of all overlayfs layers on the same underlying filesystem.
When NFS export is enabled and d_ino is consistent with st_ino
(samefs), set the same value also to i_ino in ovl_fill_inode() for all
overlayfs inodes, nfsd readdirplus sanity checks will pass.
ovl_fill_inode() may be called from ovl_new_inode(), before real inode
was created with ino arg 0. In that case, i_ino will be updated to real
upper inode i_ino on ovl_inode_init() or ovl_inode_update().
Reported-by: Eddie Horng <eddiehorng.tw@gmail.com>
Tested-by: Eddie Horng <eddiehorng.tw@gmail.com>
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Fixes: 8383f17488 ("ovl: wire up NFS export operations")
Cc: <stable@vger.kernel.org> #v4.16
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3ec9b3fafc upstream.
As of now if we encounter an opaque dir while looking for a dentry, we set
d->last=true. This means that there is no need to look further in any of
the lower layers. This works fine as long as there are no redirets or
relative redircts. But what if there is an absolute redirect on the
children dentry of opaque directory. We still need to continue to look into
next lower layer. This patch fixes it.
Here is an example to demonstrate the issue. Say you have following setup.
upper: /redirect (redirect=/a/b/c)
lower1: /a/[b]/c ([b] is opaque) (c has absolute redirect=/a/b/d/)
lower0: /a/b/d/foo
Now "redirect" dir should merge with lower1:/a/b/c/ and lower0:/a/b/d.
Note, despite the fact lower1:/a/[b] is opaque, we need to continue to look
into lower0 because children c has an absolute redirect.
Following is a reproducer.
Watch me make foo disappear:
$ mkdir lower middle upper work work2 merged
$ mkdir lower/origin
$ touch lower/origin/foo
$ mount -t overlay none merged/ \
-olowerdir=lower,upperdir=middle,workdir=work2
$ mkdir merged/pure
$ mv merged/origin merged/pure/redirect
$ umount merged
$ mount -t overlay none merged/ \
-olowerdir=middle:lower,upperdir=upper,workdir=work
$ mv merged/pure/redirect merged/redirect
Now you see foo inside a twice redirected merged dir:
$ ls merged/redirect
foo
$ umount merged
$ mount -t overlay none merged/ \
-olowerdir=middle:lower,upperdir=upper,workdir=work
After mount cycle you don't see foo inside the same dir:
$ ls merged/redirect
During middle layer lookup, the opaqueness of middle/pure is left in
the lookup state and then middle/pure/redirect is wrongly treated as
opaque.
Fixes: 02b69b284c ("ovl: lookup redirects")
Cc: <stable@vger.kernel.org> #v4.10
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 452061fd45 upstream.
d->last signifies that this is the last layer we are looking into and there
is no more. And that means this allows for some optimzation opportunities
during lookup. For example, in ovl_lookup_single() we don't have to check
for opaque xattr of a directory is this is the last layer we are looking
into (d->last = true).
But knowing for sure whether we are looking into last layer can be very
tricky. If redirects are not enabled, then we can look at poe->numlower and
figure out if the lookup we are about to is last layer or not. But if
redircts are enabled then it is possible poe->numlower suggests that we are
looking in last layer, but there is an absolute redirect present in found
element and that redirects us to a layer in root and that means lookup will
continue in lower layers further.
For example, consider following.
/upperdir/pure (opaque=y)
/upperdir/pure/foo (opaque=y,redirect=/bar)
/lowerdir/bar
In this case pure is "pure upper". When we look for "foo", that time
poe->numlower=0. But that alone does not mean that we will not search for a
merge candidate in /lowerdir. Absolute redirect changes that.
IOW, d->last should not be set just based on poe->numlower if redirects are
enabled. That can lead to setting d->last while it should not have and that
means we will not check for opaque xattr while we should have.
So do this.
- If redirects are not enabled, then continue to rely on poe->numlower
information to determine if it is last layer or not.
- If redirects are enabled, then set d->last = true only if this is the
last layer in root ovl_entry (roe).
Suggested-by: Amir Goldstein <amir73il@gmail.com>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Fixes: 02b69b284c ("ovl: lookup redirects")
Cc: <stable@vger.kernel.org> #v4.10
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit bffa9909a6 upstream.
From commit 4b855ad371 ("blk-mq: Create hctx for each present CPU),
blk-mq doesn't remap queue after CPU topo is changed, that said when
some of these offline CPUs become online, they are still mapped to
hctx 0, then hctx 0 may become the bottleneck of IO dispatch and
completion.
This patch sets up the mapping from the beginning, and aligns to
queue mapping for PCI device (blk_mq_pci_map_queues()).
Cc: Stefan Haberland <sth@linux.vnet.ibm.com>
Cc: Keith Busch <keith.busch@intel.com>
Cc: stable@vger.kernel.org
Fixes: 4b855ad371 ("blk-mq: Create hctx for each present CPU)
Tested-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a1c735fb79 upstream.
From commit 20e4d81393 (blk-mq: simplify queue mapping & schedule
with each possisble CPU), one hctx can be mapped from all offline CPUs,
then hctx->next_cpu can be set as wrong.
This patch fixes this issue by making hctx->next_cpu pointing to the
first CPU in hctx->cpumask if all CPUs in hctx->cpumask are offline.
Cc: Stefan Haberland <sth@linux.vnet.ibm.com>
Tested-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Fixes: 20e4d81393 ("blk-mq: simplify queue mapping & schedule with each possisble CPU")
Cc: stable@vger.kernel.org
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0bca799b92 upstream.
This patch orders getting budget and driver tag by making sure to acquire
driver tag after budget is got, this way can help to avoid the following
race:
1) before dispatch request from scheduler queue, get one budget first, then
dequeue a request, call it request A.
2) in another IO path for dispatching request B which is from hctx->dispatch,
driver tag is got, then try to get budget in blk_mq_dispatch_rq_list(),
unfortunately the budget is held by request A.
3) meantime blk_mq_dispatch_rq_list() is called for dispatching request
A, and try to get driver tag first, unfortunately no driver tag is
available because the driver tag is held by request B
4) both two IO pathes can't move on, and IO stall is caused.
This issue can be observed when running dbench on USB storage.
This patch fixes this issue by always getting budget before getting
driver tag.
Cc: stable@vger.kernel.org
Fixes: de14829740 ("blk-mq: introduce .get_budget and .put_budget in blk_mq_ops")
Cc: Christoph Hellwig <hch@lst.de>
Cc: Bart Van Assche <bart.vanassche@wdc.com>
Cc: Omar Sandoval <osandov@fb.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2079699c10 upstream.
If a task is holding a reference to a namespace on a removed controller,
the head will not be released. If the same controller is added again
later, its namespaces may not be successfully added. Instead, the user
will see kernel message "Duplicate IDs for nsid <X>".
This patch fixes that by skipping heads that don't have namespaces when
considering if a new namespace is safe to add.
Reported-by: Alex Gagniuc <Alex_Gagniuc@Dellteam.com>
Cc: stable@vger.kernel.org
Signed-off-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 818e0fa293 upstream.
scsi_device_quiesce() uses synchronize_rcu() to guarantee that the
effect of blk_set_preempt_only() will be visible for percpu_ref_tryget()
calls that occur after the queue unfreeze by using the approach
explained in https://lwn.net/Articles/573497/. The rcu read lock and
unlock calls in blk_queue_enter() form a pair with the synchronize_rcu()
call in scsi_device_quiesce(). Both scsi_device_quiesce() and
blk_queue_enter() must either use regular RCU or RCU-sched.
Since neither the RCU-protected code in blk_queue_enter() nor
blk_queue_usage_counter_release() sleeps, regular RCU protection
is sufficient. Note: scsi_device_quiesce() does not have to be
modified since it already uses synchronize_rcu().
Reported-by: Tejun Heo <tj@kernel.org>
Fixes: 3a0a529971 ("block, scsi: Make SCSI quiesce and resume work reliably")
Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Acked-by: Tejun Heo <tj@kernel.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Hannes Reinecke <hare@suse.com>
Cc: Ming Lei <ming.lei@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Johannes Thumshirn <jthumshirn@suse.de>
Cc: Oleksandr Natalenko <oleksandr@natalenko.name>
Cc: Martin Steigerwald <martin@lichtvoll.de>
Cc: stable@vger.kernel.org # v4.15
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f3aefb6a70 upstream.
make_checksum_hmac_md5() is allocating an HMAC transform and doing
crypto API calls in the following order:
crypto_ahash_init()
crypto_ahash_setkey()
crypto_ahash_digest()
This is wrong because it makes no sense to init() the request before a
key has been set, given that the initial state depends on the key. And
digest() is short for init() + update() + final(), so in this case
there's no need to explicitly call init() at all.
Before commit 9fa68f6200 ("crypto: hash - prevent using keyed hashes
without setting key") the extra init() had no real effect, at least for
the software HMAC implementation. (There are also hardware drivers that
implement HMAC-MD5, and it's not immediately obvious how gracefully they
handle init() before setkey().) But now the crypto API detects this
incorrect initialization and returns -ENOKEY. This is breaking NFS
mounts in some cases.
Fix it by removing the incorrect call to crypto_ahash_init().
Reported-by: Michael Young <m.a.young@durham.ac.uk>
Fixes: 9fa68f6200 ("crypto: hash - prevent using keyed hashes without setting key")
Fixes: fffdaef2eb ("gss_krb5: Add support for rc4-hmac encryption")
Cc: stable@vger.kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a774635db5 upstream.
The APIC ID as parsed from ACPI MADT is validity checked with the
apic->apic_id_valid() callback, which depends on the selected APIC type.
For non X2APIC types APIC IDs >= 0xFF are invalid, but values > 0x7FFFFFFF
are detected as valid. This happens because the 'apicid' argument of the
apic_id_valid() callback is type 'int'. So the resulting comparison
apicid < 0xFF
evaluates to true for all unsigned int values > 0x7FFFFFFF which are handed
to default_apic_id_valid(). As a consequence, invalid APIC IDs in !X2APIC
mode are considered valid and accounted as possible CPUs.
Change the apicid argument type of the apic_id_valid() callback to u32 so
the evaluation is unsigned and returns the correct result.
[ tglx: Massaged changelog ]
Signed-off-by: Li RongQing <lirongqing@baidu.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Cc: jgross@suse.com
Cc: Dou Liyang <douly.fnst@cn.fujitsu.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: hpa@zytor.com
Link: https://lkml.kernel.org/r/1523322966-10296-1-git-send-email-lirongqing@baidu.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9820e1c337 upstream.
Consistently use types provided by <linux/types.h> to fix the following
asm/bootparam.h userspace compilation errors:
/usr/include/asm/bootparam.h:140:2: error: unknown type name 'u16'
u16 version;
/usr/include/asm/bootparam.h:141:2: error: unknown type name 'u16'
u16 compatible_version;
/usr/include/asm/bootparam.h:142:2: error: unknown type name 'u16'
u16 pm_timer_address;
/usr/include/asm/bootparam.h:143:2: error: unknown type name 'u16'
u16 num_cpus;
/usr/include/asm/bootparam.h:144:2: error: unknown type name 'u64'
u64 pci_mmconfig_base;
/usr/include/asm/bootparam.h:145:2: error: unknown type name 'u32'
u32 tsc_khz;
/usr/include/asm/bootparam.h:146:2: error: unknown type name 'u32'
u32 apic_khz;
/usr/include/asm/bootparam.h:147:2: error: unknown type name 'u8'
u8 standard_ioapic;
/usr/include/asm/bootparam.h:148:2: error: unknown type name 'u8'
u8 cpu_ids[255];
Signed-off-by: Dmitry V. Levin <ldv@altlinux.org>
Acked-by: Jan Kiszka <jan.kiszka@siemens.com>
Cc: <stable@vger.kernel.org> # v4.16
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: 4a362601ba ("x86/jailhouse: Add infrastructure for running in non-root cell")
Link: http://lkml.kernel.org/r/20180405043210.GA13254@altlinux.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 182b191710 upstream.
When ath9k was switched over to use the mac80211 intermediate queues,
node cleanup now drains the mac80211 queues. However, this call path is
not protected by rcu_read_lock() as it was previously entirely internal
to the driver which uses its own locking.
This leads to a possible rcu_dereference() without holding
rcu_read_lock(); but only if a station is cleaned up while having
packets queued on the TXQ. Fix this by adding the rcu_read_lock() to the
caller in ath9k.
Fixes: 50f08edf98 ("ath9k: Switch to using mac80211 intermediate software queues.")
Cc: stable@vger.kernel.org
Reported-by: Ben Greear <greearb@candelatech.com>
Signed-off-by: Toke Høiland-Jørgensen <toke@toke.dk>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 68627a697c upstream.
Currently, bank 4 is reserved on Fam17h, so we chose not to initialize
bank 4 in the smca_banks array. This means that when we check if a bank
is initialized, like during boot or resume, we will see that bank 4 is
not initialized and try to initialize it.
This will cause a call trace, when resuming from suspend, due to
rdmsr_*on_cpu() calls in the init path. The rdmsr_*on_cpu() calls issue
an IPI but we're running with interrupts disabled. This triggers:
WARNING: CPU: 0 PID: 11523 at kernel/smp.c:291 smp_call_function_single+0xdc/0xe0
...
Reserved banks will be read-as-zero, so their MCA_IPID register will be
zero. So, like the smca_banks array, the threshold_banks array will not
have an entry for a reserved bank since all its MCA_MISC* registers will
be zero.
Enumerate a "Reserved" bank type that matches on a HWID_MCATYPE of 0,0.
Use the "Reserved" type when checking if a bank is reserved. It's
possible that other bank numbers may be reserved on future systems.
Don't try to find the block address on reserved banks.
Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: <stable@vger.kernel.org> # 4.14.x
Cc: Borislav Petkov <bp@alien8.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/20180221101900.10326-7-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c02216acf4 upstream.
In randconfig testing, we sometimes get this warning:
drivers/gpu/drm/radeon/radeon_object.c: In function 'radeon_bo_create':
drivers/gpu/drm/radeon/radeon_object.c:242:2: error: #warning Please enable CONFIG_MTRR and CONFIG_X86_PAT for better performance thanks to write-combining [-Werror=cpp]
#warning Please enable CONFIG_MTRR and CONFIG_X86_PAT for better performance \
This is rather annoying since almost all other code produces no build-time
output unless we have found a real bug. We already fixed this in the
amdgpu driver in commit 31bb90f1cd ("drm/amdgpu: shut up #warning for
compile testing") by adding a CONFIG_COMPILE_TEST check last year and
agreed to do the same here, but both Michel and I then forgot about it
until I came across the issue again now.
For stable kernels, as this is one of very few remaining randconfig
warnings in 4.14.
Cc: stable@vger.kernel.org
Link: https://patchwork.kernel.org/patch/9550009/
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 91d29b288a upstream.
timestamp_insn_cnt is used to estimate the timestamp based on the number of
instructions since the last known timestamp.
If the estimate is not accurate enough decoding might not be correctly
synchronized with side-band events causing more trace errors.
However there are always timestamps following an overflow, so the
estimate is not needed and can indeed result in more errors.
Suppress the estimate by setting timestamp_insn_cnt to zero.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/1520431349-30689-5-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 63d8e38f6a upstream.
sync_switch is a facility to synchronize decoding more closely with the
point in the kernel when the context actually switched.
The flag when sync_switch is enabled was global to the decoding, whereas
it is really specific to the CPU.
The trace data for different CPUs is put on different queues, so add
sync_switch to the intel_pt_queue structure and use that in preference
to the global setting in the intel_pt structure.
That fixes problems decoding one CPU's trace because sync_switch was
disabled on a different CPU's queue.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/1520431349-30689-3-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 19ce7909ed upstream.
This crashes with a "Bad real address for load" attempting to load
from the vmalloc region in realmode (faulting address is in DAR).
Oops: Bad interrupt in KVM entry/exit code, sig: 6 [#1]
LE SMP NR_CPUS=2048 NUMA PowerNV
CPU: 53 PID: 6582 Comm: qemu-system-ppc Not tainted 4.16.0-01530-g43d1859f0994
NIP: c0000000000155ac LR: c0000000000c2430 CTR: c000000000015580
REGS: c000000fff76dd80 TRAP: 0200 Not tainted (4.16.0-01530-g43d1859f0994)
MSR: 9000000000201003 <SF,HV,ME,RI,LE> CR: 48082222 XER: 00000000
CFAR: 0000000102900ef0 DAR: d00017fffd941a28 DSISR: 00000040 SOFTE: 3
NIP [c0000000000155ac] perf_trace_tlbie+0x2c/0x1a0
LR [c0000000000c2430] do_tlbies+0x230/0x2f0
I suspect the reason is the per-cpu data is not in the linear chunk.
This could be restored if that was able to be fixed, but for now,
just remove the tracepoints.
Fixes: 0428491cba ("powerpc/mm: Trace tlbie(l) instructions")
Cc: stable@vger.kernel.org # v4.13+
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit de0aa7b2f9 upstream.
1. With the patch "x86/vector/msi: Switch to global reservation mode",
the recent v4.15 and newer kernels always hang for 1-vCPU Hyper-V VM
with SR-IOV. This is because when we reach hv_compose_msi_msg() by
request_irq() -> request_threaded_irq() ->__setup_irq()->irq_startup()
-> __irq_startup() -> irq_domain_activate_irq() -> ... ->
msi_domain_activate() -> ... -> hv_compose_msi_msg(), local irq is
disabled in __setup_irq().
Note: when we reach hv_compose_msi_msg() by another code path:
pci_enable_msix_range() -> ... -> irq_domain_activate_irq() -> ... ->
hv_compose_msi_msg(), local irq is not disabled.
hv_compose_msi_msg() depends on an interrupt from the host.
With interrupts disabled, a UP VM always hangs in the busy loop in
the function, because the interrupt callback hv_pci_onchannelcallback()
can not be called.
We can do nothing but work it around by polling the channel. This
is ugly, but we don't have any other choice.
2. If the host is ejecting the VF device before we reach
hv_compose_msi_msg(), in a UP VM, we can hang in hv_compose_msi_msg()
forever, because at this time the host doesn't respond to the
CREATE_INTERRUPT request. This issue exists the first day the
pci-hyperv driver appears in the kernel.
Luckily, this can also by worked around by polling the channel
for the PCI_EJECT message and hpdev->state, and by checking the
PCI vendor ID.
Note: actually the above 2 issues also happen to a SMP VM, if
"hbus->hdev->channel->target_cpu == smp_processor_id()" is true.
Fixes: 4900be8360 ("x86/vector/msi: Switch to global reservation mode")
Tested-by: Adrian Suhov <v-adsuho@microsoft.com>
Tested-by: Chris Valean <v-chvale@microsoft.com>
Signed-off-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Acked-by: Haiyang Zhang <haiyangz@microsoft.com>
Cc: <stable@vger.kernel.org>
Cc: Stephen Hemminger <sthemmin@microsoft.com>
Cc: K. Y. Srinivasan <kys@microsoft.com>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Jack Morgenstein <jackm@mellanox.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 021ad274d7 upstream.
When we hot-remove the device, we first receive a PCI_EJECT message and
then receive a PCI_BUS_RELATIONS message with bus_rel->device_count == 0.
The first message is offloaded to hv_eject_device_work(), and the second
is offloaded to pci_devices_present_work(). Both the paths can be running
list_del(&hpdev->list_entry), causing general protection fault, because
system_wq can run them concurrently.
The patch eliminates the race condition.
Since access to present/eject work items is serialized, we do not need the
hbus->enum_sem anymore, so remove it.
Fixes: 4daace0d8c ("PCI: hv: Add paravirtual PCI front-end for Microsoft Hyper-V VMs")
Link: https://lkml.kernel.org/r/KL1P15301MB00064DA6B4D221123B5241CFBFD70@KL1P15301MB0006.APCP153.PROD.OUTLOOK.COM
Tested-by: Adrian Suhov <v-adsuho@microsoft.com>
Tested-by: Chris Valean <v-chvale@microsoft.com>
Signed-off-by: Dexuan Cui <decui@microsoft.com>
[lorenzo.pieralisi@arm.com: squashed semaphore removal patch]
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Acked-by: Haiyang Zhang <haiyangz@microsoft.com>
Cc: <stable@vger.kernel.org> # v4.6+
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Jack Morgenstein <jackm@mellanox.com>
Cc: Stephen Hemminger <sthemmin@microsoft.com>
Cc: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d5654e156b upstream.
Make sure that the HPMC (High Priority Machine Check) handler is 16-byte
aligned and that it's length in the IVT is a multiple of 16 bytes.
Otherwise PDC may decide not to call the HPMC crash handler.
Signed-off-by: Helge Deller <deller@gmx.de>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 615b2665fd upstream.
As found by the ubsan checker, the value of the 'index' variable can be
out of range for the bc[] array:
UBSAN: Undefined behaviour in arch/parisc/kernel/drivers.c:655:21
index 6 is out of range for type 'char [6]'
Backtrace:
[<104fa850>] __ubsan_handle_out_of_bounds+0x68/0x80
[<1019d83c>] check_parent+0xc0/0x170
[<1019d91c>] descend_children+0x30/0x6c
[<1059e164>] device_for_each_child+0x60/0x98
[<1019cd54>] parse_tree_node+0x40/0x54
[<1019d86c>] check_parent+0xf0/0x170
[<1019d91c>] descend_children+0x30/0x6c
[<1059e164>] device_for_each_child+0x60/0x98
[<1019d938>] descend_children+0x4c/0x6c
[<1059e164>] device_for_each_child+0x60/0x98
[<1019cd54>] parse_tree_node+0x40/0x54
[<1019cffc>] hwpath_to_device+0xa4/0xc4
Signed-off-by: Helge Deller <deller@gmx.de>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit cc095f0ac1 upstream.
device_remove_group() was called on any cleanup, even if the
device attrs had not been added yet. That can occur in certain
error scenarios, so add a flag to know if it has been added.
Also make sure we remove the dev if we added it ourselves.
Signed-off-by: Corey Minyard <cminyard@mvista.com>
Cc: stable@vger.kernel.org # 4.15
Cc: Laura Abbott <labbott@redhat.com>
Tested-by: Bill Perkins <wmp@grnwood.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 613928e853 upstream.
To allow dual pipelines utilising two WPF entities when available, the
VSP was updated to support header-mode display list in continuous
pipelines.
A small bug in the status check of the command register causes the
second pipeline to be directly afflicted by the running of the first;
appearing as a perceived performance issue with stuttering display.
Fix the vsp1_dl_list_hw_update_pending() call to ensure that the read
comparison corresponds to the correct pipeline.
Fixes: eaf4bfad6a ("v4l: vsp1: Add support for header display lists in continuous mode")
Cc: "Stable v4.14+" <stable@vger.kernel.org>
Signed-off-by: Kieran Bingham <kieran.bingham+renesas@ideasonboard.com>
Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 49d7006d9f ]
Each Oracle DAX CCB has a corresponding completion area, and the required
number of areas must fit within a previously allocated array of completion
areas beginning at the requested index. Since the completion area index
is specified by a file offset, a user can pass arbitrary values, including
negative numbers. So the index must be thoroughly range checked to prevent
access to addresses outside the bounds of the allocated completion
area array. The index cannot be negative, and it cannot exceed the
total array size, less the number of CCBs requested. The old code did
not check for negative values and was off by one on the upper bound.
Signed-off-by: Rob Gardner <rob.gardner@oracle.com>
Signed-off-by: Jonathan Helman <jonathan.helman@oracle.com>
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 4bfc33807a ]
lan78xx_read_otp tries to return -EINVAL in the event of invalid OTP
content, but the value gets overwritten before it is returned and the
read goes ahead anyway. Make the read conditional as it should be
and preserve the error code.
Fixes: 55d7de9de6 ("Microchip's LAN7800 family USB 2/3 to 10/100/1000 Ethernet device driver")
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 1cc5954f44 ]
Commit dd9d598c66 ("ip_gre: add the support for i/o_flags update via
netlink") added the ability to change o_flags, but missed that the
GSO/LLTX features are disabled by default, and only enabled some gre
features are unused. Thus we also need to disable the GSO/LLTX features
on the device when the TUNNEL_SEQ or TUNNEL_CSUM flags are set.
These two examples should result in the same features being set:
ip link add gre_none type gre local 192.168.0.10 remote 192.168.0.20 ttl 255 key 0
ip link set gre_none type gre seq
ip link add gre_seq type gre local 192.168.0.10 remote 192.168.0.20 ttl 255 key 1 seq
Fixes: dd9d598c66 ("ip_gre: add the support for i/o_flags update via netlink")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Reviewed-by: Xin Long <lucien.xin@gmail.com>
Acked-by: William Tu <u9012063@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit f6cd651b05 ]
We can't use l2tp_tunnel_find() to prevent l2tp_nl_cmd_tunnel_create()
from creating a duplicate tunnel. A tunnel can be concurrently
registered after l2tp_tunnel_find() returns. Therefore, searching for
duplicates must be done at registration time.
Finally, remove l2tp_tunnel_find() entirely as it isn't use anywhere
anymore.
Fixes: 309795f4be ("l2tp: Add netlink control API for L2TP")
Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 6b9f34239b ]
l2tp_tunnel_create() inserts the new tunnel into the namespace's tunnel
list and sets the socket's ->sk_user_data field, before returning it to
the caller. Therefore, there are two ways the tunnel can be accessed
and freed, before the caller even had the opportunity to take a
reference. In practice, syzbot could crash the module by closing the
socket right after a new tunnel was returned to pppol2tp_create().
This patch moves tunnel registration out of l2tp_tunnel_create(), so
that the caller can safely hold a reference before publishing the
tunnel. This second step is done with the new l2tp_tunnel_register()
function, which is now responsible for associating the tunnel to its
socket and for inserting it into the namespace's list.
While moving the code to l2tp_tunnel_register(), a few modifications
have been done. First, the socket validation tests are done in a helper
function, for clarity. Also, modifying the socket is now done after
having inserted the tunnel to the namespace's tunnels list. This will
allow insertion to fail, without having to revert theses modifications
in the error path (a followup patch will check for duplicate tunnels
before insertion). Either the socket is a kernel socket which we
control, or it is a user-space socket for which we have a reference on
the file descriptor. In any case, the socket isn't going to be closed
from under us.
Reported-by: syzbot+fbeeb5c3b538e8545644@syzkaller.appspotmail.com
Fixes: fd558d186d ("l2tp: Split pppol2tp patch into separate l2tp and ppp parts")
Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit d14d2b7809 ]
Commit d65026c6c6 ("vhost: validate log
when IOTLB is enabled") introduced a regression. The logic was
originally:
if (vq->iotlb)
return 1;
return A && B;
After the patch the short-circuit logic for A was inverted:
if (A || vq->iotlb)
return A;
return B;
This patch fixes the regression by rewriting the checks in the obvious
way, no longer returning A when vq->iotlb is non-NULL (which is hard to
understand).
Reported-by: syzbot+65a84dde0214b0387ccd@syzkaller.appspotmail.com
Cc: Jason Wang <jasowang@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 3f01ddb962 ]
On receiving a packet the state index points to the rstate which must be
used to fill up IP and TCP headers. But if the state index points to a
rstate which is unitialized, i.e. filled with zeros, it gets stuck in an
infinite loop inside ip_fast_csum trying to compute the ip checsum of a
header with zero length.
89.666953: <2> [<ffffff9dd3e94d38>] slhc_uncompress+0x464/0x468
89.666965: <2> [<ffffff9dd3e87d88>] ppp_receive_nonmp_frame+0x3b4/0x65c
89.666978: <2> [<ffffff9dd3e89dd4>] ppp_receive_frame+0x64/0x7e0
89.666991: <2> [<ffffff9dd3e8a708>] ppp_input+0x104/0x198
89.667005: <2> [<ffffff9dd3e93868>] pppopns_recv_core+0x238/0x370
89.667027: <2> [<ffffff9dd4428fc8>] __sk_receive_skb+0xdc/0x250
89.667040: <2> [<ffffff9dd3e939e4>] pppopns_recv+0x44/0x60
89.667053: <2> [<ffffff9dd4426848>] __sock_queue_rcv_skb+0x16c/0x24c
89.667065: <2> [<ffffff9dd4426954>] sock_queue_rcv_skb+0x2c/0x38
89.667085: <2> [<ffffff9dd44f7358>] raw_rcv+0x124/0x154
89.667098: <2> [<ffffff9dd44f7568>] raw_local_deliver+0x1e0/0x22c
89.667117: <2> [<ffffff9dd44c8ba0>] ip_local_deliver_finish+0x70/0x24c
89.667131: <2> [<ffffff9dd44c92f4>] ip_local_deliver+0x100/0x10c
./scripts/faddr2line vmlinux slhc_uncompress+0x464/0x468 output:
ip_fast_csum at arch/arm64/include/asm/checksum.h:40
(inlined by) slhc_uncompress at drivers/net/slip/slhc.c:615
Adding a variable to indicate if the current rstate is initialized. If
such a packet arrives, move to toss state.
Signed-off-by: Tejaswi Tanikella <tejaswit@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit a43cced9a3 ]
rds_sendmsg() calls rds_send_mprds_hash() to find a c_path to use to
send a message. Suppose the RDS connection is not yet up. In
rds_send_mprds_hash(), it does
if (conn->c_npaths == 0)
wait_event_interruptible(conn->c_hs_waitq,
(conn->c_npaths != 0));
If it is interrupted before the connection is set up,
rds_send_mprds_hash() will return a non-zero hash value. Hence
rds_sendmsg() will use a non-zero c_path to send the message. But if
the RDS connection ends up to be non-MP capable, the message will be
lost as only the zero c_path can be used.
Signed-off-by: Ka-Cheong Poon <ka-cheong.poon@oracle.com>
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 53765341ee ]
The Cinterion AHS8 is a 3G device with one embedded WWAN interface
using cdc_ether as a driver.
The modem is controlled via AT commands through the exposed TTYs.
AT+CGDCONT write command can be used to activate or deactivate a WWAN
connection for a PDP context defined with the same command. UE
supports one WWAN adapter.
Signed-off-by: Bassem Boubaker <bassem.boubaker@actia.fr>
Acked-by: Oliver Neukum <oneukum@suse.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 1489bbd10e ]
The NSP default buffer is a piece of NFP memory where additional
command data can be placed. Its format has been copied from
host buffer, but the PCIe selection bits do not make sense in
this case. If those get masked out from a NFP address - writes
to random place in the chip memory may be issued and crash the
device.
Even in the general NSP buffer case, it doesn't make sense to have the
PCIe selection bits there anymore. These are unused at the moment, and
when it becomes necessary, the PCIe selection bits should rather be
moved to another register to utilise more bits for the buffer address.
This has never been an issue because the buffer used to be
allocated in memory with less-than-38-bit-long address but that
is about to change.
Fixes: 1a64821c6a ("nfp: add support for service processor access")
Signed-off-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit a9d48205d0 ]
We want to use dev_valid_name() to validate tunnel names,
so better use strnlen(name, IFNAMSIZ) than strlen(name) to make
sure to not upset KASAN.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit ec1d8ccb07 ]
Just like function ethtool_get_ts_info(), we should also consider the
phy_driver ts_info call back. For example, driver dp83640.
Fixes: 37dd9255b2 ("vlan: Pass ethtool get_ts_info queries to real device.")
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 71a1c91523 ]
At the end of ip6_forward(), IPSTATS_MIB_OUTFORWDATAGRAMS and
IPSTATS_MIB_OUTOCTETS are incremented immediately before the NF_HOOK call
for NFPROTO_IPV6 / NF_INET_FORWARD. As a result, these counters get
incremented regardless of whether or not the netfilter hook allows the
packet to continue being processed. This change increments the counters
in ip6_forward_finish() so that it will not happen if the netfilter hook
chooses to terminate the packet, which is similar to how IPv4 works.
Signed-off-by: Jeff Barnhill <0xeffeff@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit fc5f33768c ]
The Marvell switches under some conditions will pass a frame to the
host with the port being the CPU port. Such frames are invalid, and
should be dropped. Not dropping them can result in a crash when
incrementing the receive statistics for an invalid port.
Reported-by: Chris Healy <cphealy@gmail.com>
Fixes: 91da11f870 ("net: Distributed Switch Architecture protocol support")
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 58b35f2768 ]
arp_filter performs an ip_route_output search for arp source address and
checks if output device is the same where the arp request was received,
if it is not, the arp request is not answered.
This route lookup is always done on main route table so l3slave devices
never find the proper route and arp is not answered.
Passing l3mdev_master_ifindex_rcu(dev) return value as oif fixes the
lookup for l3slave devices while maintaining same behavior for non
l3slave devices as this function returns 0 in that case.
Fixes: 613d09b30f ("net: Use VRF device index for lookups on TX")
Signed-off-by: Miguel Fadon Perlines <mfadon@teldat.com>
Acked-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8420f71943 upstream.
The change moving addr_lsb into the _sigfault union failed to take
into account that _sigfault._addr_bnd._lower being a pointer forced
the entire union to have pointer alignment. The fix for
_sigfault._addr_bnd._lower having pointer alignment failed to take
into account that m68k has a pointer alignment less than the size
of a pointer. So simply making the padding members pointers changed
the location of later members in the structure.
Fix this by directly computing the needed size of the padding members,
and making the padding members char arrays of the needed size. AKA
if __alignof__(void *) is 1 sizeof(short) otherwise __alignof__(void *).
Which should be exactly the same rules the compiler whould have
used when computing the padding.
I have tested this change by adding BUILD_BUG_ONs to m68k to verify
the offset of every member of struct siginfo, and with those testing
that the offsets of the fields in struct siginfo is the same before
I changed the generic _sigfault member and after the correction
to the _sigfault member.
I have also verified that the x86 with it's own BUILD_BUG_ONs to verify
the offsets of the siginfo members also compiles cleanly.
Cc: stable@vger.kernel.org
Reported-by: Eugene Syromiatnikov <esyr@redhat.com>
Fixes: 859d880cf5 ("signal: Correct the offset of si_pkey in struct siginfo")
Fixes: b68a68d3dc ("signal: Move addr_lsb into the _sigfault union for clarity")
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit bd5c4facf5 upstream.
I'm getting a slab named "biovec-(1<<(21-12))". It is caused by unintended
expansion of the macro BIO_MAX_PAGES. This patch renames it to biovec-max.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org # v4.14+
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 65d9982d7e upstream.
ECMA-48 [1] (aka ISO 6429) has defined SGR 21 as "doubly underlined"
since at least March 1984. The Linux kernel has treated it as SGR 22
"normal intensity" since it was added in Linux-0.96b in June 1992.
Before that, it was simply ignored. Other terminal emulators have
either ignored it, or treat it as double underline now. xterm for
example added support in its 304 release (May 2014) [2] where it was
previously ignoring it.
Changing this behavior shouldn't be an issue:
- It isn't a named capability in ncurses's terminfo database, so no
script is using libtinfo/libcurses to look this up, or using tput
to query & output the right sequence.
- Any script assuming SGR 21 will reset intensity in all terminals
already do not work correctly on non-Linux VTs (including running
under screen/tmux/etc...).
- If someone has written a script that only runs in the Linux VT, and
they're using SGR 21 (instead of SGR 22), the output should still
be readable.
imo it's important to change this as the Linux VT's non-conformance
is sometimes used as an argument for other terminal emulators to not
implement SGR 21 at all, or do so incorrectly.
[1]: https://www.ecma-international.org/publications/standards/Ecma-048.htm
[2]: 2fd29cb98d
Signed-off-by: Mike Frysinger <vapier@chromium.org>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 04bb1719c4 upstream.
The touch sensor buttons on Sony VAIO VGN-CS series laptops (e.g.
VGN-CS31S) are a separate PS/2 device. As the MUX is disabled for all
VAIO machines by the nomux blacklist, the data from touch sensor
buttons and touchpad are combined. The protocol used by the buttons is
probably similar to the touchpad protocol (both are Synaptics) so both
devices get enabled. The controller combines the data, creating a mess
which results in random button clicks, touchpad stopping working and
lost sync error messages:
psmouse serio1: TouchPad at isa0060/serio1/input0 lost sync at byte 4
psmouse serio1: TouchPad at isa0060/serio1/input0 lost sync at byte 1
psmouse serio1: TouchPad at isa0060/serio1/input0 lost sync at byte 1
psmouse serio1: TouchPad at isa0060/serio1/input0 lost sync at byte 1
psmouse serio1: TouchPad at isa0060/serio1/input0 lost sync at byte 1
psmouse serio1: issuing reconnect request
Add a new i8042_dmi_forcemux_table whitelist with VGN-CS.
With MUX enabled, touch sensor buttons are detected as separate device
(and left disabled as there's currently no driver), fixing all touchpad
problems.
Signed-off-by: Ondrej Zary <linux@rainbow-software.org>
Cc: stable@vger.kernel.org
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b56af54ac7 upstream.
Reset i8042 before probing because of insufficient BIOS initialisation of
the i8042 serial controller. This makes Synaptics touchpad detection
possible. Without resetting the Synaptics touchpad is not detected because
there are always NACK messages from AUX port.
Signed-off-by: Dennis Wassenberg <dennis.wassenberg@secunet.com>
Cc: stable@vger.kernel.org
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 567b9b549c upstream.
The primary interface for the touchpad device in Thinkpad L570 is SMBus,
so ALPS overlooked PS2 interface Firmware setting of TrackStick, and
shipped with TrackStick otp bit is disabled.
The address 0xD7 contains device number information, so we can identify
the device by checking this value, but to access it we need to enable
Command mode, and then re-enable the device. Devices shipped in Thinkpad
L570 report either 0x0C or 0x1D as device numbers, if we see them we assume
that the devices are DualPoints.
The same issue exists on Dell Latitude 7370.
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=196929
Fixes: 646580f793 ("Input: ALPS - fix multi-touch decoding on SS4 plus touchpads")
Signed-off-by: Masaki Ota <masaki.ota@jp.alps.com>
Tested-by: Aaron Ma <aaron.ma@canonical.com>
Tested-by: Jonathan Liu <net147@gmail.com>
Tested-by: Jaak Ristioja <jaak@ristioja.ee>
Cc: stable@vger.kernel.org
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9de9a44948 upstream.
This reverts commit 452562abb5 ("base: arch_topology: fix section
mismatch build warnings"). It causes the notifier call hangs in some
use-cases.
In some cases with using maxcpus, some of cpus are booted first and
then the remaining cpus are booted. As an example, some users who want
to realize fast boot up often use the following procedure.
1) Define all CPUs on device tree (CA57x4 + CA53x4)
2) Add "maxcpus=4" in bootargs
3) Kernel boot up with CA57x4
4) After kernel boot up, CA53x4 is booted from user
When kernel init was finished, CPUFREQ_POLICY_NOTIFIER was not still
unregisterd. This means that "__init init_cpu_capacity_callback()"
will be called after kernel init sequence. To avoid this problem,
it needs to remove __init{,data} annotations by reverting this commit.
Also, this commit was needed to fix kernel compile issue below.
However, this issue was also fixed by another patch: commit 82d8ba717c
("arch_topology: Fix section miss match warning due to
free_raw_capacity()") in v4.15 as well.
Whereas commit 452562abb5 added all the missing __init annotations,
commit 82d8ba717c removed it from free_raw_capacity().
WARNING: vmlinux.o(.text+0x548f24): Section mismatch in reference
from the function init_cpu_capacity_callback() to the variable
.init.text:$x
The function init_cpu_capacity_callback() references
the variable __init $x.
This is often because init_cpu_capacity_callback lacks a __init
annotation or the annotation of $x is wrong.
Fixes: 82d8ba717c ("arch_topology: Fix section miss match warning due to free_raw_capacity()")
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Gaku Inami <gaku.inami.xh@renesas.com>
Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Tested-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Acked-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e1d9fc04c4 upstream.
Ack ai fifo error interrupts in interrupt handler to clear interrupt
after fifo overflow. It should prevent lock-ups after the ai fifo
overflows.
Cc: <stable@vger.kernel.org> # v4.2+
Signed-off-by: Frank Mori Hess <fmh6jj@gmail.com>
Signed-off-by: Ian Abbott <abbotti@mev.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5811375325 upstream.
Fstests generic/475 provides a way to fail metadata reads while
checking if checksum exists for the inode inside run_delalloc_nocow(),
and csum_exist_in_range() interprets error (-EIO) as inode having
checksum and makes its caller enter the cow path.
In case of free space inode, this ends up with a warning in
cow_file_range().
The same problem applies to btrfs_cross_ref_exist() since it may also
read metadata in between.
With this, run_delalloc_nocow() bails out when errors occur at the two
places.
cc: <stable@vger.kernel.org> v2.6.28+
Fixes: 17d217fe97 ("Btrfs: fix nodatasum handling in balancing code")
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4063cafa3b upstream.
Add 6 new ACPI HIDs to enable bluetooth on devices using these HIDs,
I've tested the following HIDs / devices:
BCM2E74: Jumper ezPad mini 3
BCM2E83: Acer Iconia Tab8 w1-810
BCM2E90: Meegopad T08
BCM2EAA: Chuwi Vi8 plus (CWI519)
The reporter of Red Hat bugzilla 1554835 has tested:
BCM2E84: Lenovo Yoga2
The reporter of kernel bugzilla 274481 has tested:
BCM2E38: Toshiba Encore
Note the Lenovo Yoga2 and Toshiba Encore also needs the earlier patch to
treat all Interrupt ACPI resources as active low.
Cc: stable@vger.kernel.org
Buglink: https://bugzilla.kernel.org/attachment.cgi?id=274481
Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=1554835
Reported-and-tested-by: Robert R. Howell <rhowell@uwyo.edu>
Reported-and-tested-by: Christian Herzog <daduke@daduke.org>
Tested-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8f461b1e02 upstream.
With ecb-cast5-avx, if a 128+ byte scatterlist element followed a
shorter one, then the algorithm accidentally encrypted/decrypted only 8
bytes instead of the expected 128 bytes. Fix it by setting the
encryption/decryption 'fn' correctly.
Fixes: c12ab20b16 ("crypto: cast5/avx - avoid using temporary stack buffers")
Cc: <stable@vger.kernel.org> # v3.8+
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6aaf49b495 upstream.
The decision to rebuild .S_shipped is made based on the relative
timestamps of .S_shipped and .pl files but git makes this essentially
random. This means that the perl script might run anyway (usually at
most once per checkout), defeating the whole purpose of _shipped.
Fix by skipping the rule unless explicit make variables are provided:
REGENERATE_ARM_CRYPTO or REGENERATE_ARM64_CRYPTO.
This can produce nasty occasional build failures downstream, for example
for toolchains with broken perl. The solution is minimally intrusive to
make it easier to push into stable.
Another report on a similar issue here: https://lkml.org/lkml/2018/3/8/1379
Signed-off-by: Leonard Crestez <leonard.crestez@nxp.com>
Cc: <stable@vger.kernel.org>
Reviewed-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0a9eb80e64 upstream.
rsa-pkcs1pad uses a value returned from a RSA implementation max_size
callback as a size of an input buffer passed to the RSA implementation for
encrypt and sign operations.
CCP RSA implementation uses a hardware input buffer which size depends only
on the current RSA key length, so it should return this key length in
the max_size callback, too.
This also matches what the kernel software RSA implementation does.
Previously, the value returned from this callback was always the maximum
RSA key size the CCP hardware supports.
This resulted in this huge buffer being passed by rsa-pkcs1pad to CCP even
for smaller key sizes and then in a buffer overflow when ccp_run_rsa_cmd()
tried to copy this large input buffer into a RSA key length-sized hardware
input buffer.
Signed-off-by: Maciej S. Szmigiero <mail@maciej.szmigiero.name>
Fixes: ceeec0afd6 ("crypto: ccp - Add support for RSA on the CCP")
Cc: stable@vger.kernel.org
Acked-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 900a081f69 upstream.
When we have an unaligned SG list entry where there is no leftover
aligned data, the hash walk code will incorrectly return zero as if
the entire SG list has been processed.
This patch fixes it by moving onto the next page instead.
Reported-by: Eli Cooper <elicooper@gmx.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 333e18c5cc upstream.
The RSA private key for the first form should have
version, prime1, prime2, exponent1, exponent2, coefficient
values 0.
With non-zero values for prime1,2, exponent 1,2 and coefficient
the Intel QAT driver will assume that values are provided for the
private key second form. This will result in signature verification
failures for modules where QAT device is present and the modules
are signed with rsa,sha256.
Cc: <stable@vger.kernel.org>
Signed-off-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com>
Signed-off-by: Conor McLoughlin <conor.mcloughlin@intel.com>
Reviewed-by: Stephan Mueller <smueller@chronox.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f962eb46e7 upstream.
In this driver the clock is got but never put when the driver is removed
or if there is an error in the probe.
Using the managed version of clk_get() allows to let the kernel take care
of it.
Fixes: 1b44c5a60c ("crypto: inside-secure - add SafeXcel EIP197 crypto
engine driver")
cc: stable@vger.kernel.org
Signed-off-by: Gregory CLEMENT <gregory.clement@bootlin.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ad4cd51fb8 upstream.
Commit 49f9783b0c ("crypto: talitos - do hw_context DMA mapping
outside the requests") introduced a persistent dma mapping of
req_ctx->hw_context
Commit 37b5e8897e ("crypto: talitos - chain in buffered data for ahash
on SEC1") introduced a persistent dma mapping of req_ctx->buf
As there is no destructor for req_ctx (the request context), the
associated dma handlers where set in ctx (the tfm context). This is
wrong as several hash operations can run with the same ctx.
This patch removes this persistent mapping.
Reported-by: Horia Geanta <horia.geanta@nxp.com>
Cc: <stable@vger.kernel.org>
Fixes: 49f9783b0c ("crypto: talitos - do hw_context DMA mapping outside the requests")
Fixes: 37b5e8897e ("crypto: talitos - chain in buffered data for ahash on SEC1")
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Tested-by: Horia Geantă <horia.geanta@nxp.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0ee991be4c upstream.
Any change to the result buffer should only happen on final, finup
and digest operations. Changes to the buffer for update, import, export,
etc, are not allowed.
Fixes: 66d7b9f6175e ("crypto: testmgr - test misuse of result in ahash")
Signed-off-by: Gary R Hook <gary.hook@amd.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 823f792383 upstream.
WCH CH382L is a PCI-E adapter with 1 parallel port. It is similair to CH382
but serial ports are not soldered on board. Detected as
Serial controller: Device 1c00:3050 (rev 10) (prog-if 05 [16850])
Signed-off-by: Alexander Gerasiov <gq@redlab-i.ru>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 50e7044535 upstream.
Quoting the original report:
It looks like there is a double-free vulnerability in Linux usbtv driver
on an error path of usbtv_probe function. When audio registration fails,
usbtv_video_free function ends up freeing usbtv data structure, which
gets freed the second time under usbtv_video_fail label.
usbtv_audio_fail:
usbtv_video_free(usbtv); =>
v4l2_device_put(&usbtv->v4l2_dev);
=> v4l2_device_put
=> kref_put
=> v4l2_device_release
=> usbtv_release (CALLBACK)
=> kfree(usbtv) (1st time)
usbtv_video_fail:
usb_set_intfdata(intf, NULL);
usb_put_dev(usbtv->udev);
kfree(usbtv); (2nd time)
So, as we have refcounting, use it
Reported-by: Yavuz, Tuba <tuba@ece.ufl.edu>
Signed-off-by: Oliver Neukum <oneukum@suse.com>
CC: stable@vger.kernel.org
Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit bb0829a741 upstream.
Currently the driver spams the kernel log on unsupported ioctls which is
unnecessary as the ioctl returns -ENOIOCTLCMD to indicate this anyway.
I suspect this was originally for debugging purposes but it really is not
required so remove it.
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f597fbce38 upstream.
The Nuvoton UART is almost compatible with the 8250 driver when probed
via the 8250_of driver, however it requires some extra configuration
at startup.
Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: Joel Stanley <joel@jms.id.au>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9608e5c0f0 upstream.
This patch adds a device ID for the RT Systems cable used to
program Yaesu VX-8R/VX-8DR handheld radios. It uses the main
FTDI VID instead of the common RT Systems VID.
Signed-off-by: Major Hayden <major@mhtx.net>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 21035965f6 upstream.
Commit 2a98dc028f ("include/linux/bitmap.h: turn bitmap_set and
bitmap_clear into memset when possible") introduced an optimization to
bitmap_{set,clear}() which uses memset() when the start and length are
constants aligned to a byte.
This is wrong on big-endian systems; our bitmaps are arrays of unsigned
long, so bit n is not at byte n / 8 in memory. This was caught by the
Btrfs selftests, but the bitmap selftests also fail when run on a
big-endian machine.
We can still use memset if the start and length are aligned to an
unsigned long, so do that on big-endian. The same problem applies to
the memcmp in bitmap_equal(), so fix it there, too.
Fixes: 2a98dc028f ("include/linux/bitmap.h: turn bitmap_set and bitmap_clear into memset when possible")
Fixes: 2c6deb0152 ("bitmap: use memcmp optimisation in more situations")
Cc: stable@kernel.org
Reported-by: "Erhard F." <erhard_f@mailbox.org>
Cc: Matthew Wilcox <mawilcox@microsoft.com>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-08 14:29:49 +02:00
2223 changed files with 317185 additions and 10800 deletions
Some files were not shown because too many files have changed in this diff
Show More
Reference in New Issue
Block a user
Blocking a user prevents them from interacting with repositories, such as opening or commenting on pull requests or issues. Learn more about blocking a user.